Efficacy Analysis in Clinical Trials: A Comprehensive Review of Statistical and Machine Learning Approaches

Ghosh, Dhrubajyoti; Pal, Samhita

doi:10.51387/26-NEJSDS104

The New England Journal of Statistics in Data Science

Dhrubajyoti Ghosh

Samhita Pal ¹

Pub. online: 1 April 2026

Type: Timely Review Article

Open Access

Area: Biomedical Research

¹ These authors contributed equally as first authors.

Accepted: 26 February 2026

Published: 1 April 2026

Abstract

Efficacy testing is a cornerstone of clinical trials: it establishes whether a medical intervention achieves its intended therapeutic effect. Over the decades, a wide range of statistical methodologies has been developed to address the complexities of clinical trial data, including parametric, nonparametric, Bayesian, and machine learning approaches. Parametric methods, such as t-tests, ANOVA, and linear mixed models (LMMs), have traditionally been the foundation of efficacy testing because they are efficient when their assumptions hold. Nonparametric techniques, including the Friedman test, the Brunner-Munzel test, and modern rank-based extensions such as those implemented in the nparLD package, have emerged as robust alternatives, particularly for skewed, ordinal, or otherwise non-normal data. Bayesian methodologies enable the incorporation of prior information and uncertainty quantification, while machine learning techniques, such as deep learning and reinforcement learning, are reshaping trial design and outcome prediction. Despite these advances, significant gaps remain, including the handling of high-dimensional data and missingness, and the need to ensure equitable efficacy testing across diverse populations. This review provides a comprehensive overview of these methods, highlighting their applications, strengths, limitations, and future directions. By bridging traditional statistical frameworks with modern computational techniques, the field can continue to advance toward more reliable and personalized clinical trial methodologies.

References

[1]

Abadi, E., Segars, W. P., Tsui, B. M., Kinahan, P. E., Bottenus, N., Frangi, A. F., Maidment, A., Lo, J. and Samei, E. (2020). Virtual clinical trials in medical imaging: a review. Journal of Medical Imaging 7(4) 042805.

[2]

Adcock, M., Fankhauser, M., Post, J., Lutz, K., Zizlsperger, L., Luft, A. R., Guimarães, V., Schättin, A. and de Bruin, E. D. (2020). Effects of an in-home multicomponent exergame training on physical functions, cognition, and brain volume of older adults: a randomized controlled trial. Frontiers in medicine 6 321.

[3]

Akritas, M. G., Arnold, S. F. and Brunner, E. (1997). Nonparametric hypotheses and rank statistics for unbalanced factorial designs. Journal of the American Statistical Association 92(437) 258–265. https://doi.org/10.2307/2291470. MR1436114

[4]

Algyar, M. F. and Abdelsamee, K. S. (2024). Laparoscopic assisted versus ultrasound guided transversus abdominis plane block in laparoscopic bariatric surgery: a randomized controlled trial. BMC anesthesiology 24(1) 133.

[5]

Altman, D. G. (1991) Practical Statistics for Medical Research. Chapman and Hall/CRC.

[6]

Arenas, I., Ujueta, F., Diaz, D., Yates, T., Olivieri, B., Beasley, R. and Lamas, G. (2019). Limb preservation using edetate disodium-based chelation in patients with diabetes and critical limb ischemia: an open-label pilot study. Cureus 11(12).

[7]

Bandelow, B., Brunner, E., Broocks, A., Beinroth, D., Hajak, G., Pralle, L. and Rüther, E. (1998). The use of the Panic and Agoraphobia Scale in a clinical trial. Psychiatry Research 77(1) 43–49.

[8]

Barber, D. and Williams, C. K. (2001). Gaussian Processes for Bayesian Modeling of Tumor Growth. Neural Information Processing Systems.

[9]

Beasley, R., Harrison, T., Peterson, S., Gustafson, P., Hamblin, A., Bengtsson, T. and Fagerås, M. (2022). Evaluation of budesonide-formoterol for maintenance and reliever therapy among patients with poorly controlled asthma: a systematic review and meta-analysis. JAMA network open 5(3) 220615.

[10]

Beaton, D. E. et al. (2002). Measuring the burden of musculoskeletal conditions. Arthritis & Rheumatism.

[11]

Berger, J. O. and Sellke, T. (1987). Testing a point null hypothesis: The irreconcilability of p values and evidence. Journal of the American statistical Association 82(397) 112–122. MR0883340

[12]

Blei, D. M., Kucukelbir, A. and McAuliffe, J. D. (2017). Variational inference: A review for statisticians. Journal of the American statistical Association 112(518) 859–877. https://doi.org/10.1080/01621459.2017.1285773. MR3671776

[13]

Boulware, D. R., Pullen, M. F., Bangdiwala, A. S., Pastick, K. A., Lofgren, S. M., Okafor, E. C., Skipper, C. P., Nascene, A. A., Nicol, M. R., Abassi, M. et al. (2020). A randomized trial of hydroxychloroquine as postexposure prophylaxis for Covid-19. New England journal of medicine 383(6) 517–525.

[14]

Bradburn, M. J. et al. (2003). Survival analysis part II: Multivariate data analysis—an introduction to concepts and methods. British Journal of Cancer.

[15]

Brunner, E. and Munzel, U. (2000). The nonparametric Behrens-Fisher problem: Asymptotic theory and a small-sample approximation. Biometrics 56(4) 1173–1182. https://doi.org/10.1002/(SICI)1521-4036(200001)42:1<17::AID-BIMJ17>3.0.CO;2-U. MR1744561

[16]

Camino, R. D., Hammerschmidt, C. A. and State, R. (2019). Improving missing data imputation with deep generative models. arXiv preprint arXiv:1902.10666.

[17]

Cao, W., Wang, D., Li, J., Zhou, H., Li, L. and Li, Y. (2018). Brits: Bidirectional recurrent imputation for time series. Advances in neural information processing systems 31.

[18]

Casy, T., Grasseau, A., Charras, A., Rouvière, B., Pers, J. -O., Foulquier, N. and Saraux, A. (2022). Assessing the robustness of clinical trials by estimating Jadad’s score using artificial intelligence approaches. Computers in Biology and Medicine 148 105851.

[19]

Chazard, E., Ficheur, G., Beuscart, J. -B. and Preda, C. (2017). How to compare the length of stay of two samples of inpatients? A simulation study to compare type I and type II errors of 12 statistical tests. Value in Health 20(7) 992–998.

[20]

Che, Z., Purushotham, S., Cho, K., Sontag, D. and Liu, Y. (2018). Recurrent neural networks for multivariate time series with missing values. Scientific reports 8(1) 6085.

[21]

Chen, G., Saad, Z. S., Britton, J. C., Pine, D. S. and Cox, R. W. (2013). Linear mixed-effects modeling approach to FMRI group analysis. Neuroimage 73 176–190.

[22]

Cheng, T., Wood, E., Nguyen, P., Kerr, T. and DeBeck, K. (2014). Increases and decreases in drug use attributed to housing status among street-involved youth in a Canadian setting. Harm reduction journal 11 1–6.

[23]

Covert, I. C., Krishnan, B., Najm, I., Zhan, J., Shore, M., Hixson, J. and Po, M. J. (2019). Temporal graph convolutional networks for automatic seizure detection. In Machine learning for healthcare conference 160–180. PMLR.

[24]

Dahal, L., Ghojoghnejad, M., Vancoillie, L., Ghosh, D., Bhandari, Y., Kim, D., Ho, F. C., Tushar, F. I., Luo, S., Lafata, K. J. et al. (2025). XCAT 3.0: A comprehensive library of personalized digital twins derived from CT scans. Medical Image Analysis 103636.

[25]

Dai, J. Y., Gilbert, P. B., Hughes, J. P. and Brown, E. R. (2013). Estimating the efficacy of preexposure prophylaxis for HIV prevention among participants with a threshold level of drug concentration. American journal of epidemiology 177(3) 256–263.

[26]

Daniels, M. and Hogan, J. (2008) Missing Data in Longitudinal Studies: Strategies for Bayesian Modeling and Sensitivity Analysis. CRC Press. https://doi.org/10.1201/9781420011180. MR2459796

[27]

Davidian, M. (2017) Nonlinear models for repeated measurement data. Routledge.

[28]

Davis, M., Conlon, K., Bohac, G. C., Barcenas, J., Leslie, W., Watkins, L., Lamzabi, I., Deng, Y., Li, Y. and Plate, J. M. (2012). Effect of pemetrexed on innate immune killer cells and adaptive immune T cells in subjects with adenocarcinoma of the pancreas. Journal of Immunotherapy 35(8) 629–640.

[29]

Dayan, I., Roth, H. R., Zhong, A., Harouni, A., Gentili, A., Abidin, A. Z., Liu, A., Costa, A. B., Wood, B. J., Tsai, C. -S. et al. (2021). Federated learning for predicting clinical outcomes in patients with COVID-19. Nature medicine 27(10) 1735–1743.

[30]

De Waele, J. J., Tellado, J. M., Weiss, G., Alder, J., Kruesmann, F., Arvis, P., Hussain, T. and Solomkin, J. S. (2014). Efficacy and safety of moxifloxacin in hospitalized patients with secondary peritonitis: pooled analysis of four randomized phase III trials. Surgical infections 15(5) 567–575.

[31]

De Winter, J. C. F. (2013). Using the Student’s t-test with extremely small sample sizes. Practical Assessment, Research, and Evaluation 18(10) 1–12.

[32]

Diggle, P. (2002) Analysis of longitudinal data. Oxford university press.

[33]

Dong, S., Yu, H., Poupart, P. and Ho, E. A. (2024). Gaussian processes modeling for the prediction of polymeric nanoparticle formulation design to enhance encapsulation efficiency and therapeutic efficacy. Drug Delivery and Translational Research 1–17.

[34]

Dorsey, E. R. and Topol, E. J. (2016). State of telehealth. New England journal of medicine 375(2) 154–161.

[35]

Doshi-Velez, F. and Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

[36]

Elias, S. M., Barney, C. E. and Bishop, J. W. (2013). The treatment of self-efficacy among psychology and management scholars. Journal of Applied Social Psychology 43(4) 811–822.

[37]

Esteva, F. J., Soh, L. -T., Holmes, F. A., Plunkett, W., Meyers, C. A., Forman, A. D. and Hortobagyi, G. N. (2000). Phase II trial and pharmacokinetic evaluation of cytosine arabinoside for leptomeningeal metastases from breast cancer. Cancer chemotherapy and pharmacology 46 382–386.

[38]

Fagerland, M. W. et al. (2011). Statistical analysis of contingency tables. BMC Medical Research Methodology 11(1) 47.

[39]

Farewell, V. T. (1982). The use of mixture models for the analysis of survival data with long-term survivors. Biometrics.

[40]

Ferrario, C. M., Jessup, J., Chappell, M. C., Averill, D. B., Brosnihan, K. B., Tallant, E. A., Diz, D. I. and Gallagher, P. E. (2005). Effect of angiotensin-converting enzyme inhibition and angiotensin II receptor blockers on cardiac angiotensin-converting enzyme 2. Circulation 111(20) 2605–2610.

[41]

Fine, J. P. and Gray, R. J. (1999). A proportional hazards model for the subdistribution of a competing risk. Journal of the American statistical association 94(446) 496–509. https://doi.org/10.2307/2670170. MR1702320

[42]

Fisher, R. A. (1925) Statistical methods for research workers. Oliver and Boyd. MR0346954

[43]

Fitzmaurice, G. M. et al. (2008) Applied Longitudinal Analysis. Wiley.

[44]

Fogelholm, M. and Kukkonen-Harjula, K. (2000). Does physical activity prevent weight gain–a systematic review. Obesity reviews 1(2) 95–111.

[45]

Friedman, M. (1937). The use of ranks to avoid the assumption of normality implicit in the analysis of variance. Journal of the American Statistical Association.

[46]

Furno, P., Dionisi, M. S., Bucaneve, G., Menichetti, F. and Del Favero, A. (2000). Ceftriaxone versus β-lactams with antipseudomonal activity for empirical, combined antibiotic therapy in febrile neutropenia: a meta-analysis. Supportive care in cancer 8 293–301.

[47]

Garber, A. J., Duncan, T. G., Goodman, A. M., Mills, D. J., Rohlf, J. L. et al. (1997). Efficacy of metformin in type II diabetes: results of a double-blind, placebo-controlled, dose-response trial. The American journal of medicine 103(6) 491–497.

[48]

Gelman, A. and Hill, J. (2013) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.

[49]

Ghosh, D. and Luo, S. (2025). A non-parametric U-statistic testing approach for multi-arm clinical trials with multivariate longitudinal data. Journal of Multivariate Analysis 105447. https://doi.org/10.1016/j.jmva.2025.105447. MR4901559

[50]

Ghosh, D., Pal, S., Lutz, M., Luo, S. and Initiative, A. D. N. (2025). Ensemble survival analysis for preclinical cognitive decline prediction in Alzheimer’s disease using longitudinal biomarkers. Journal of Alzheimer’s Disease 13872877251365621.

[51]

Ghosh, D., Xu, X., Luo, S. and Database, C. I. P. (2025). Power and sample size calculation for multivariate longitudinal trials using the longitudinal rank sum test. Statistics in Medicine 44(20–22) 70261. https://doi.org/10.1002/sim.70261. MR4960437

[52]

Ghosh, D., Boettcher, W. A., Johnston, R. and Lahiri, S. (2025). THANOS: A Predictive Model of Electoral Campaigns Using Twitter Data and Opinion Polls. Data Science in Science 4(1) 2484180.

[53]

Gibbons, J. D. and Chakraborti, S. (2010) Nonparametric Statistical Inference. CRC Press. MR2681063

[54]

Gligorijevic, J., Gligorijevic, D., Pavlovski, M., Milkovits, E., Glass, L., Grier, K., Vankireddy, P. and Obradovic, Z. (2019). Optimizing clinical trials recruitment via deep learning. Journal of the American Medical Informatics Association 26(11) 1195–1202.

[55]

Gray, R. J. (1988). A class of K-sample tests for comparing the cumulative incidence of a competing risk. Annals of Statistics 16(3) 1141–1154. https://doi.org/10.1214/aos/1176350951. MR0959192

[56]

Gueorguieva, R. and Krystal, J. H. (2004). Move over ANOVA: Progress in analyzing repeated-measures data and its reflection in papers published in the Archives of General Psychiatry. Archives of General Psychiatry 61(3) 310–317.

[57]

Guo, S., Jiang, X., Mao, B. and Li, Q. -X. (2019). The design, analysis and application of mouse clinical trials in oncology drug development. BMC cancer 19 1–14.

[58]

Halder, J. B., Benton, J., Julé, A. M., Guérin, P. J., Olliaro, P. L., Basáñez, M. -G. and Walker, M. (2017). Systematic review of studies generating individual participant data on the efficacy of drugs for treating soil-transmitted helminthiases and the case for data-sharing. PLoS Neglected Tropical Diseases 11(10) 0006053.

[59]

Harutyunyan, H., Khachatrian, H., Kale, D. and Ver Steeg, G. (2019). Multitask Learning and Benchmarking with Clinical Time-Series Data. Scientific Reports.

[60]

Heinonen, M., Arora, S., Remes, S., Saarinen, I. and Lähdesmäki, H. (2021). Bayesian Multivariate Gaussian Processes for Longitudinal Clinical Data. Journal of Biomedical Informatics 114 103654.

[61]

Henderson, C. R. (1954). Estimation of variance and covariance components. Biometrics 9(2) 226–252. https://doi.org/10.2307/3001853. MR0055650

[62]

Hess, K. R. (1994). Assessing time-by-covariate interactions in proportional hazards regression models using cubic spline functions. Statistics in medicine 13(10) 1045–1062. https://doi.org/10.1007/978-0-387-68639-4. MR2400249

[63]

Ho, M. -W., Tu, W., Ghosh, P. and Tiwari, R. C. (2013). A nested Dirichlet process analysis of cluster randomized trial data with application in geriatric care assessment. Journal of the American Statistical Association 108(501) 48–68. https://doi.org/10.1080/01621459.2012.734164. MR3174602

[64]

Höfler, J., Rohracher, A., Kalss, G., Zimmermann, G., Dobesberger, J., Pilz, G., Leitinger, M., Kuchukhidze, G., Butz, K., Taylor, A. et al. (2016). (S)-Ketamine in refractory and super-refractory status epilepticus: a retrospective study. CNS drugs 30 869–876.

[65]

Hollander, M., Wolfe, D. A. and Chicken, E. (2013) Nonparametric Statistical Methods. Wiley. MR3221959

[66]

Hong, J. and Chun, H. (2023). A prediction model for healthcare time-series data with a mixture of deep mixed effect models using Gaussian processes. Biomedical Signal Processing and Control 84 104753.

[67]

Hoo, J. -X., Yang, Y. -F., Tan, J. -Y., Yang, J., Yang, A. and Lim, L. -L. (2023). Impact of multicomponent integrated care on mortality and hospitalization after acute coronary syndrome: a systematic review and meta-analysis. European Heart Journal-Quality of Care and Clinical Outcomes 9(3) 258–267.

[68]

Huang, J. -Z., Chen, C. -N., Lee, C. -P., Kao, C. -H., Hsu, H. -C. and Chou, A. -K. (2022). Evaluation of the effects of skin-to-skin contact on newborn sucking, and breastfeeding abilities: a quasi-experimental study design. Nutrients 14(9) 1846.

[69]

Ibrahim, J. G. et al. (2001). Bayesian approaches to joint modeling of longitudinal and survival data. Statistics in Medicine 20(13) 1993–2015.

[70]

Ibrahim, J. G. and Molenberghs, G. (2009). Missing data methods in longitudinal studies: a review. Test 18(1) 1–43. https://doi.org/10.1007/s11749-009-0138-x. MR2495958

[71]

Izmailova, E. S., Wagner, J. A., Ammour, N., Amondikar, N., Bell-Vlasov, A., Berman, S., Bloomfield, D., Brady, L. S., Cai, X., Calle, R. A. et al. (2021). Remote digital monitoring for medical product development. Clinical and Translational Science 14(1) 94–101.

[72]

Kang, Q., Vahl, C. I., Fan, H., Geurden, T., Ameiss, K. A. and Taylor, L. P. (2019). Statistical analyses of chicken intestinal lesion scores in battery cage studies of anti-coccidial drugs. Veterinary parasitology 272 83–94.

[73]

Kay, R. (1977). The AFT model in survival analysis: Theory and applications. Biometrics.

[74]

Ketema, T., Bacha, K., Getahun, K. and Bassat, Q. (2021). In vivo efficacy of anti-malarial drugs against clinical Plasmodium vivax malaria in Ethiopia: a systematic review and meta-analysis. Malaria Journal 20 1–19.

[75]

Khan, J., Ooka, J., Miller, S., Madden, L. and Hoitink, H. (2004). Systemic resistance induced by Trichoderma hamatum 382 in cucumber against Phytophthora crown rot and leaf blight. Plant Disease 88(3) 280–286.

[76]

Kim, A. Y., Jang, E. H., Kim, S., Choi, K. W., Jeon, H. J., Yu, H. Y. and Byun, S. (2018). Automatic detection of major depressive disorder using electrodermal activity. Scientific reports 8(1) 17030.

[77]

Kok, C., Jahmunah, V., Oh, S. L., Zhou, X., Gururajan, R., Tao, X., Cheong, K. H., Gururajan, R., Molinari, F. and Acharya, U. R. (2020). Automated prediction of sepsis using temporal convolutional network. Computers in Biology and Medicine 127 103957.

[78]

Komorowski, M., Celi, L. A., Badawi, O., Gordon, A. C. and Faisal, A. A. (2018). The Artificial Intelligence Clinician Learns Optimal Treatment Strategies for Sepsis in Intensive Care. Nature Medicine 24 1716–1720.

[79]

Koopmeiners, J. S. and Modiano, J. (2014). A bayesian adaptive phase i–ii clinical trial for evaluating efficacy and toxicity with delayed outcomes. Clinical Trials 11(1) 38–48.

[80]

Kumar, D. (2018) Stress-Strength Estimation and its applications in Clinical Trials. State University of New York at Albany. MR3908068

[81]

Laird, N. and Ware, J. (1982). Random-Effects Models for Longitudinal Data. Biometrics 38 963–974.

[82]

Lehmann, E. L. and D’Abrera, H. J. M. (2006) Nonparametrics: Statistical Methods Based on Ranks. Springer. MR2279708

[83]

Li, J., Chitwood, J., Menda, N., Mueller, L. and Hutton, S. F. (2018). Linkage between the I-3 gene for resistance to Fusarium wilt race 3 and increased sensitivity to bacterial spot in tomato. Theoretical and applied genetics 131 145–155.

[84]

Li, L., Shen, C., Li, X. and Robins, J. M. (2013). On weighting approaches for missing data. Statistical methods in medical research 22(1) 14–30. https://doi.org/10.1177/0962280211403597. MR3190643

[85]

Liang, K. Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73(1) 13–22. https://doi.org/10.1093/biomet/73.1.13. MR0836430

[86]

Little, R. J. and Rubin, D. B. (2019) Statistical analysis with missing data. John Wiley & Sons. https://doi.org/10.1002/9781119013563. MR1925014

[87]

Luo, N., Di, W., Zhang, A., Wang, Y., Ding, M., Qi, W., Zhu, Y., Massing, M. W. and Fang, Y. (2012). A randomized, one-year clinical trial comparing the efficacy of topiramate, flunarizine, and a combination of flunarizine and topiramate in migraine prophylaxis. Pain Medicine 13(1) 80–86.

[88]

Mann, H. B. and Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics 18(1) 50–60. https://doi.org/10.1214/aoms/1177730491. MR0022058

[89]

Mao, J. J., Bryl, K., Gillespie, E. F., Green, A., Hung, T. K., Baser, R., Panageas, K., Postow, M. A. and Daly, B. (2025). Randomized clinical trial of a digital integrative medicine intervention among patients undergoing active cancer treatment. npj Digital Medicine 8(1) 29.

[90]

Mashaly, O. A., El Mahallawy, A. S. and Amer, T. A. (2023). Intralesional Injection of Ethanolamine Oleate With or Without Local Anaesthetic Agent to Assess Postoperative Pain in Oral Venous Malformations (a Randomized Controlled Clinical Trial). Alexandria Dental Journal 48(3) 102–108.

[91]

Maxwell, S. E. and Delaney, H. D. (2004) Designing Experiments and Analyzing Data: A Model Comparison Perspective. Routledge.

[92]

Mayan, I., Roth, H., Ghosh, D., Whitson, H. E. and Johnson, K. G. (2025). Genetic and biomarker disclosure process in a memory and aging study. Journal of Alzheimer’s Disease 104(2) 312–318.

[93]

Mohamed, Y. R. E., El-Attar, A. M. I., Anwar, D. M. F. and Shehab, A. S. A. (2024). The efficacy of ultrasound and fluoroscopy-guided caudal epidural prolotherapy versus steroids for chronic pain management in failed back surgery syndrome. Alexandria Journal of Medicine 60(1) 238–243.

[94]

Molenberghs, G. and Kenward, M. (2007) Missing data in clinical studies. John Wiley & Sons.

[95]

Nath, S., Korot, E., Fu, D. J., Zhang, G., Mishra, K., Lee, A. Y. and Keane, P. A. (2022). Reinforcement learning in ophthalmology: potential applications and challenges to implementation. The Lancet Digital Health 4(9) 692–697.

[96]

Nemati, S., Ghassemi, M. M. and Clifford, G. D. (2016). Optimal medication dosing from suboptimal clinical examples: A deep reinforcement learning approach. In 2016 38th annual international conference of the IEEE engineering in medicine and biology society (EMBC) 2978–2981. IEEE.

[97]

Neuhaus, J. M. et al. (1991). Estimation of covariate effects in generalized linear models for longitudinal data. Biometrics 47(4) 985–996.

[98]

Noguchi, K., Gel, Y. R., Brunner, E. and Konietschke, F. (2012). nparLD: An R software package for the nonparametric analysis of longitudinal data in factorial experiments. Journal of Statistical Software 50(1) 1–23.

[99]

Olson, C. L. (1976). On choosing a test statistic in multivariate analysis of variance. Psychological Bulletin.

[100]

Oyamada, S., Chiu, S. -W. and Yamaguchi, T. (2022). Comparison of statistical models for estimating intervention effects based on time-to-recurrent-event in stepped wedge cluster randomized trial using open cohort design. BMC Medical Research Methodology 22(1) 123.

[101]

Pan, C., Tian, Y., Zhou, T. and Li, J. (2024). Personalized Prediction of Parkinson’s Disease Progression Based on Deep Gaussian Processes. In MEDINFO 2023—The Future Is Accessible 765–769 IOS Press.

[102]

Park, Y. and Chang, W. (2024). A Personalized Dose-Finding Algorithm Based on Adaptive Gaussian Process Regression. Pharmaceutical Statistics 23(6) 1181–1205.

[103]

Pi-Sunyer, X., Astrup, A., Fujioka, K., Greenway, F., Halpern, A., Krempf, M., Lau, D. C., Le Roux, C. W., Violante Ortiz, R., Jensen, C. B. et al. (2015). A randomized, controlled trial of 3.0 mg of liraglutide in weight management. New England Journal of Medicine 373(1) 11–22.

[104]

Pinheiro, J. C. and Bates, D. M. (2000) Mixed-Effects Models in S and S-PLUS. Springer.

[105]

Pocock, S. J. et al. (1987). Statistical considerations in the design of clinical trials: Beta-blockers in cardiovascular medicine. Statistics in Medicine.

[106]

Pocock, S. J. et al. (2002). Subgroup analysis, covariate adjustment, and baseline comparisons in clinical trial reporting. Statistics in Medicine.

[107]

President, T. (2019). Analyzing the Influence of Key Factors for Patient Willingness to Participate in Clinical Trials. Semantic Scholar.

[108]

Proust-Lima, C. et al. (2009). Joint latent class models for longitudinal and time-to-event data: A review. Statistical Methods in Medical Research 18(2) 147–166.

[109]

Psaty, B. M., Smith, N. L., Siscovick, D. S., Koepsell, T. D., Weiss, N. S., Heckbert, S. R., Lemaitre, R. N., Wagner, E. H. and Furberg, C. D. (1997). Health outcomes associated with antihypertensive therapies used as first-line agents: a systematic review and meta-analysis. Jama 277(9) 739–745.

[110]

Rasch, D. et al. (2011). The robustness of parametric statistical methods. Psychological Science 22(9) 1211–1213.

[111]

Rasmussen, C. E. and Williams, C. K. I. (2006) Gaussian Processes for Machine Learning. MIT Press. MR2514435

[112]

Reitsma, A., Chu, R., Thorpe, J., McDonald, S., Thabane, L. and Hutton, E. (2014). Accounting for center in the Early External Cephalic Version trials: an empirical comparison of statistical methods to adjust for center in a multicenter trial with binary outcomes. Trials 15 1–11.

[113]

Ricciardi, F., Liverani, S. and Baio, G. (2023). Dirichlet process mixture models for regression discontinuity designs. Statistical methods in medical research 32(1) 55–70. https://doi.org/10.1177/09622802221129044. MR4528435

[114]

Rizopoulos, D. (2012) Joint Models for Longitudinal and Time-to-Event Data: With Applications in R. Chapman and Hall/CRC.

[115]

Rogers, S., Farlow, M., Doody, R., Mohs, R., Friedhoff, L. and Group*, D. S. (1998). A 24-week, double-blind, placebo-controlled trial of donepezil in patients with Alzheimer’s disease. Neurology 50(1) 136–145.

[116]

Rorden, C., Karnath, H. -O. and Bonilha, L. (2007). Improving lesion-symptom mapping. Journal of cognitive neuroscience 19(7) 1081–1088.

[117]

Ruxton, G. D. (2006). The unequal variance t-test is an underused alternative to Student’s t-test. Behavioral Ecology 17(4) 688–690.

[118]

Schafer, J. L. (1997) Analysis of incomplete multivariate data. CRC press. https://doi.org/10.1201/9781439821862. MR1692799

[119]

Scharf, A. -C., Gronewold, J., Eilers, A., Todica, O., Moenninghoff, C., Doeppner, T. R., de Haan, B., Bassetti, C. L. and Hermann, D. M. (2023). Depression and anxiety in acute ischemic stroke involving the anterior but not paramedian or inferolateral thalamus. Frontiers in psychology 14 1218526.

[120]

Schuler, M. S., Lechner, W. V., Carter, R. E. and Malcolm, R. (2009). Temporal and gender trends in concordance of urine drug screens and self-reported use in cocaine treatment studies. Journal of addiction medicine 3(4) 211–217.

[121]

enk, S., Denerel, N., Köyaasolu, O. and Tunç, S. (2021). The effect of isolation on athletes’ mental health during the COVID-19 pandemic. The Physician and sportsmedicine 49(2) 187–193.

[122]

Senn, S. (1993). Repeated measures ANOVA in clinical trials: Applications in weight loss and metabolic outcomes. Biometrika.

[123]

Senn, S. (2006). Change from baseline and analysis of covariance revisited. Statistics in Medicine. https://doi.org/10.1002/sim.2682. MR2307596

[124]

Shapiro, R. E., Hochstetler, H. M., Dennehy, E. B., Khanna, R., Doty, E. G., Berg, P. H. and Starling, A. J. (2019). Lasmiditan for acute treatment of migraine in patients with cardiovascular risk factors: post-hoc analysis of pooled results from 2 randomized, double-blind, placebo-controlled, phase 3 trials. The journal of headache and pain 20 1–10.

[125]

Sheller, M. J., Edwards, B., Reina, G. A., Martin, J. and Bakas, S. (2020). Federated Learning in Medicine: Facilitating Multi-Institutional Collaborations without Sharing Patient Data. Scientific Reports.

[126]

Siddique, J. et al. (2008). Missing data in randomized controlled trials for weight loss. Obesity.

[127]

Simoni, J. M., Wiebe, J. S., Sauceda, J. A., Huh, D., Sanchez, G., Longoria, V., Andres Bedoya, C. and Safren, S. A. (2013). A preliminary RCT of CBT-AD for adherence and depression among HIV-positive Latinos on the US-Mexico border: the Nuevo Dia study. AIDS and Behavior 17 2816–2829.

[128]

Sirima, S. B., Ouédraogo, A., Tiono, A. B., Kaboré, J. M., Bougouma, E. C., Ouattara, M. S., Kargougou, D., Diarra, A., Henry, N., Ouédraogo, I. N. et al. (2022). A randomized controlled trial showing safety and efficacy of a whole sporozoite vaccine against endemic malaria. Science translational medicine 14(674) 3776.

[129]

Song, S., Matsushima, N., Lee, J. and Mendell, J. (2015). Linear mixed-effects model of QTc prolongation for olmesartan medoxomil. Journal of Clinical Pharmacology 56(1) 96.

[130]

Sotoudeh-Paima, S., Segars, W. P., Ghosh, D., Luo, S., Samei, E. and Abadi, E. (2024). A systematic assessment and optimization of photon-counting CT for lung density quantifications. Medical Physics 51(4) 2893–2904.

[131]

Sousa, M. R. d. and Ribeiro, A. L. P. (2009). Systematic review and meta-analysis of diagnostic and prognostic studies: a tutorial. Arquivos brasileiros de cardiologia 92 241–251.

[132]

Student (1908). The probable error of a mean. Biometrika 1–25.

[133]

Subbaswamy, A. and Saria, S. (2020). From development to deployment: dataset shift, causality, and shift-stable models in health AI. Biostatistics 21(2) 345–352. https://doi.org/10.1093/biostatistics/kxz041. MR4132548

[134]

Takada, M., Sozu, T. and Sato, T. (2015). Practical approaches for design and analysis of clinical trials of infertility treatments: crossover designs and the Mantel–Haenszel method are recommended. Pharmaceutical statistics 14(3) 198–204.

[135]

Takahashi, A. and Suzuki, T. (2021). Bayesian optimization design for dose-finding based on toxicity and efficacy outcomes in phase I/II clinical trials. Pharmaceutical Statistics 20(3) 422–439.

[136]

Teh, J. L., Purwin, T. J., Han, A., Chua, V., Patel, P., Baqai, U., Liao, C., Bechtel, N., Sato, T., Davies, M. A. et al. (2020). Metabolic adaptations to MEK and CDK4/6 cotargeting in uveal melanoma. Molecular cancer therapeutics 19(8) 1719–1726.

[137]

Therneau, T. M. and Grambsch, P. M. (2000) Modeling Survival Data: Extending the Cox Model. Springer. https://doi.org/10.1007/978-1-4757-3294-8. MR1774977

[138]

Tian, W., Ding, W., Kim, S., Zheng, L., Zhang, L., Li, X., Gu, J., Zhang, L., Pan, M. and Chen, S. (2013). Efficacy and safety profile of combining vandetanib with chemotherapy in patients with advanced non-small cell lung cancer: a meta-analysis. PLoS One 8(7) 67929.

[139]

Trella, A. L., Zhang, K. W., Jajal, H., Nahum-Shani, I., Shetty, V., Doshi-Velez, F. and Murphy, S. A. (2024). A Deployed Online Reinforcement Learning Algorithm In An Oral Health Clinical Trial. arXiv preprint arXiv:2409.02069.

[140]

Tsiatis, A. A. and Davidian, M. (2004). Joint modeling of longitudinal and time-to-event data: An overview. Statistica Sinica 14(3) 809–834. MR2087974

[141]

Tsuboyama, K., Koyama-Honda, I., Sakamaki, Y., Koike, M., Morishita, H. and Mizushima, N. (2016). The ATG conjugation systems are important for degradation of the inner autophagosomal membrane. Science 354(6315) 1036–1041.

[142]

Tuomainen, K., Al-Samadi, A., Potdar, S., Turunen, L., Turunen, M., Karhemo, P. -R., Bergman, P., Risteli, M., Åström, P., Tiikkaja, R. et al. (2019). Human tumor–derived matrix improves the predictability of head and neck cancer drug testing. Cancers 12(1) 92.

[143]

Tushar, F. I., Vancoillie, L., McCabe, C., Kavuri, A., Dahal, L., Harrawood, B., Fryling, M., Zarei, M., Sotoudeh-Paima, S., Ho, F. C. et al. (2025). Virtual lung screening trial (VLST): An in silico study inspired by the national lung screening trial for lung cancer detection. Medical Image Analysis 103 103576.

[144]

Vangeneugden, T., Laenen, A., Geys, H., Renard, D. and Molenberghs, G. (2004). Applying linear mixed models to estimate reliability in clinical trial data with repeated measurements. Controlled clinical trials 25(1) 13–30.

[145]

Verbeke, G. and Molenberghs, G. (2000) Linear Mixed Models for Longitudinal Data. Springer. https://doi.org/10.1007/978-1-4419-0300-6. MR1880596

[146]

Vickers, A. J. (2001). The use of percentage change from baseline as an outcome in a controlled trial is statistically inefficient. Trials.

[147]

Vinkers, D. J., Gussekloo, J., Stek, M. L., Westendorp, R. G. and Van Der Mast, R. C. (2004). The 15-item Geriatric Depression Scale (GDS-15) detects changes in depressive symptoms after a major negative life event. The Leiden 85-plus Study. International journal of geriatric psychiatry 19(1) 80–84.

[148]

Walker, M., Churcher, T. S. and Basáñez, M. -G. (2014). Models for measuring anthelmintic drug efficacy for parasitologists. Trends in parasitology 30(11) 528–537.

[149]

Wei, L. J. (1992). The accelerated failure time model: A useful alternative to the Cox regression model in survival analysis. Statistics in Medicine.

[150]

Welch, B. L. (1947). The generalization of ‘Student’s’ problem when several different population variances are involved. Biometrika 34(1-2) 28–35. https://doi.org/10.2307/2332510. MR0019277

[151]

Whitehead, J., Thygesen, H. and Whitehead, A. (2011). Bayesian procedures for phase I/II clinical trials investigating the safety and efficacy of drug combinations. Statistics in Medicine 30(16) 1952–1970. https://doi.org/10.1002/sim.4267. MR2829058

[152]

Wilcoxon, F. (1945). Individual Comparisons by Ranking Methods. Biometrics Bulletin 1(6) 80–83. https://doi.org/10.2307/3001946. MR0025133

[153]

Wiles, N., Fischer, K., Cowen, P., Nutt, D., Peters, T., Lewis, G. and White, I. (2014). Allowing for non-adherence to treatment in a randomized controlled trial of two antidepressants (citalopram versus reboxetine): an example from the GENPOD trial. Psychological medicine 44(13) 2855–2866.

[154]

Wood, S. N. (2017) Generalized additive models: an introduction with R. Chapman and Hall/CRC. MR2206355

[155]

Wright, S. P. (1992). Adjusting for baseline in longitudinal clinical trials. Journal of Clinical Epidemiology.

[156]

Xu, C., Hadjipantelis, P. Z. and Wang, J. -L. (2020). Semi-parametric joint modeling of survival and longitudinal data: the R package JSM. Journal of Statistical Software 93 1–29.

[157]

Xu, R., Huang, S., Song, Z., Gao, Y. and Wu, J. (2024). A deep mixed-effects modeling approach for real-time monitoring of metal additive manufacturing process. IISE Transactions 56(9) 945–959.

[158]

Xu, X., Ghosh, D., Luo, S. and Database, C. I. P. (2025). A novel longitudinal rank-sum test for multiple primary endpoints in clinical trials: Applications to neurodegenerative disorders. Statistics in Biopharmaceutical Research 1–11.

[159]

Yakar, N., Emingil, G., Türedi, A., Şahin, Ç., Köse, T., Bostanci, N. and Silbereisen, A. (2023). Value of gingival crevicular fluid TREM-1, PGLYRP1, and IL-1β levels during menopause. Journal of Periodontal Research 58(5) 1052–1060.

[160]

Yoon, J., Jordon, J. and van der Schaar, M. (2018). GAIN: Missing data imputation using generative adversarial nets. In International conference on machine learning 5689–5698. PMLR.

[161]

Zeger, S. L. and Liang, K. Y. (1988). Longitudinal data analysis for discrete and continuous outcomes. Biometrics 44(4) 1049–1060. https://doi.org/10.2307/2532076. MR0999450

[162]

Zhao, Y., Kosorok, M. R. and Zeng, D. (2009). Reinforcement learning design for cancer clinical trials. Statistics in Medicine 28(26) 3294–3315. https://doi.org/10.1002/sim.3720. MR2750277

[163]

Zhao, Y., Zeng, D., Socinski, M. A. and Kosorok, M. R. (2011). Reinforcement learning strategies for clinical trials in nonsmall cell lung cancer. Biometrics 67(4) 1422–1433. https://doi.org/10.1111/j.1541-0420.2011.01572.x. MR2872393

[164]

Zhu, Y., Bi, D., Saunders, M. and Ji, Y. (2023). Prediction of chronic kidney disease progression using recurrent neural network and electronic health records. Scientific Reports 13(1) 22091.

[165]

Zimmerman, D. W. (2004). A note on preliminary tests of equality of variances. British Journal of Mathematical and Statistical Psychology 57(1) 173–181. https://doi.org/10.1348/000711004849222. MR2087822

Open access article under the CC BY license.

Keywords

62P10, 62F03, 62G10; Efficacy testing; Longitudinal; Cross-sectional; Clinical trials; Parametric methods; Nonparametric methods; Bayesian methods; Machine learning; Deep learning