Modeling the Mean with Time as a Categorical Variable in Longitudinal Designs for Smaller-Sized Clinical Trials: A Case Studies Approach Based on Three Phase 3 Clinical Trials in Rare Diseases

Zahrieh, David; Wang, Yi; Le-Rademacher, Jennifer; Koutsoukos, Tony

doi:10.51387/26-NEJSDS96

The New England Journal of Statistics in Data Science


David Zahrieh, Yi Wang, Jennifer Le-Rademacher ¹


Pub. online: 28 January 2026

Type: Case Study, Application, and/or Practice Article

Open Access

Area: Biomedical Research

¹ Contributed equally.

Accepted: 8 December 2025

Published: 28 January 2026

Abstract

Background: Generalized estimating equations (GEE) and mixed-model repeated measures (MMRM) can both handle longitudinal continuous outcomes when the mean is modeled with time included as a categorical variable. Because sample sizes in rare diseases are small, a compound symmetry (CS) covariance pattern is sometimes adopted. For this setting, there is scant literature in the rare disease community that provides practical advice, based on real datasets from rare disease trials, about the use of both methods, including when to use the sandwich variance estimator with or without a bias correction.
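
As a minimal illustration of the CS pattern named above (a sketch, not the authors' code): CS assumes the same variance at every measurement occasion and the same covariance between any two occasions, so only two covariance parameters are estimated however many visits there are. The function name `compound_symmetry` and the numeric values below are hypothetical and chosen only for illustration.

```python
def compound_symmetry(n_times, variance, rho):
    """Build an n_times x n_times compound symmetry covariance matrix.

    Diagonal entries equal `variance`; every off-diagonal entry equals
    `rho * variance`, i.e. one common correlation for all pairs of visits.
    """
    if n_times < 1:
        raise ValueError("n_times must be at least 1")
    if variance <= 0:
        raise ValueError("variance must be positive")
    # A CS matrix is positive definite only if -1/(n_times - 1) < rho < 1.
    lower = -1.0 / (n_times - 1) if n_times > 1 else -1.0
    if not (lower < rho < 1.0):
        raise ValueError("rho is outside the positive-definite range")
    return [
        [variance if j == k else rho * variance for k in range(n_times)]
        for j in range(n_times)
    ]


# Hypothetical example: 4 visits, variance 2.0, common correlation 0.5.
cov = compound_symmetry(n_times=4, variance=2.0, rho=0.5)
for row in cov:
    print(row)
```

If the true variances or covariances change over time, this two-parameter structure is misspecified, which is the situation in which the choice between model-based and sandwich standard errors matters.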

Methods: To fill this gap, we simulated data from three longitudinal phase 3 trials conducted in rare diseases to jointly review the operating characteristics of both methods: randomized trials in GNE myopathy (N = 44 placebo; N = 45 treatment) and pediatric X-linked hypophosphatemia (XLH) (N = 32 control; N = 29 treatment), and a single-arm trial in adult XLH (N = 14).

Results: In each trial, few participants discontinued; furthermore, <1.5% of the measurement occasions were missing outcome data, no missing outcome data pattern occurred in >1 participant, and the missing completely at random (MCAR) assumption was clinically justified. In the two trials with nonconstant variances/covariances over time, bias-corrected sandwich variance estimators with t-based inference were needed with both MMRM and GEE. If the CS pattern was a good approximation, as seen in the pediatric XLH trial, then model-based standard errors with t-based inference performed well with both methods.
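
For readers less familiar with the terminology, the estimators contrasted above can be sketched for a linear mean model as follows. The notation is an assumption introduced here, not taken from the article: $y_i$ and $X_i$ are the outcome vector and design matrix for participant $i$, $V_i$ is the working (for example, CS) covariance matrix, and $r_i = y_i - X_i\hat\beta$ is the residual vector. The abstract does not say which bias correction was evaluated; the Mancl and DeRouen form is shown only as one widely used example.

```latex
% Model-based variance: valid when the working covariance V_i is correct
B = \sum_{i=1}^{N} X_i^{\top} V_i^{-1} X_i ,
\qquad
\widehat{\operatorname{Var}}_{\text{model}}(\hat\beta) = B^{-1}

% Sandwich (robust) variance: remains consistent when V_i is misspecified
\widehat{\operatorname{Var}}_{\text{sandwich}}(\hat\beta)
  = B^{-1}
    \left( \sum_{i=1}^{N} X_i^{\top} V_i^{-1} r_i r_i^{\top} V_i^{-1} X_i \right)
    B^{-1}

% One bias-corrected version (Mancl--DeRouen type): inflate each residual
% vector by its leverage before forming the middle term
H_i = X_i B^{-1} X_i^{\top} V_i^{-1} ,
\qquad
\tilde r_i = (I - H_i)^{-1} r_i
```

Because fitted residuals tend to be smaller than the true errors, the uncorrected sandwich estimator tends to underestimate variability when $N$ is small, which is why small-sample corrections paired with t-based rather than normal-based inference are of interest in trials of this size.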

Conclusion: Based on a review of three case studies, the MCAR assumption was plausible and missingness was low. When modeling the mean response with time included categorically and with a parsimonious CS covariance structure, each method required careful consideration in its use.



Open access article under the CC BY license.

Keywords

Generalized estimating equations; Longitudinal designs; Mixed model repeated measures; Rare diseases; Sandwich variance estimator; Small-sample correction
