The New England Journal of Statistics in Data Science

Modeling Disease Progression in the Presence of an Outcome-Dependent Visiting Process with Application to Cystic Fibrosis Clinical Data
Weiji Su, Xia Wang, Pedro Miranda-Afonso, et al. (5 authors in total)

https://doi.org/10.51387/26-NEJSDS99
Pub. online: 26 March 2026      Type: Methodology Article      Open Access
Area: Biomedical Research

Accepted: 26 January 2026
Published: 26 March 2026

Abstract

The timing of longitudinal measurements may depend on the outcome or on disease severity. In biomedical studies that rely on clinical encounter data, patients often have dense, irregularly spaced visit data while their health is worse. In parallel, the longitudinal measurements may be affected by the period of irregular visiting. Ignoring the outcome-dependent visiting process when constructing a longitudinal disease progression model can produce biased results. We propose a Bayesian joint model that links a mixed-effects model for the longitudinal marker to a Weibull proportional hazards model with a log frailty for the visiting process, adjusting both the longitudinal marker and the event process for covariates. We examine different random-effect structures and their performance in characterizing the disease trajectory. Motivated by clinical data on cystic fibrosis lung disease, we estimate the longitudinal process for lung function decline. Individuals with lower lung function tend to have more frequent clinical visits than those with higher lung function. Simulation studies suggest that incorporating a time-dependent Gaussian process is more important for model fit than adding the survival model via joint modeling; the random-intercepts model exhibits the largest bias, especially when there is an outcome-dependent visiting process.
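
To make the bias mechanism described in the abstract concrete, the following is a minimal simulation sketch; it is not the authors' implementation, which is distributed with the supplementary material. It assumes a random-intercept trajectory for lung function and Weibull gap times between visits whose hazard is multiplied by a subject-level frailty, with the log frailty tied to the random intercept through a hypothetical association parameter `gamma`. All numerical values and the names `simulate_visits` and `pooled_ols` are illustrative assumptions. Under these assumptions, a naive pooled regression that ignores the visiting process should recover the population intercept when `gamma = 0` and underestimate it when individuals with lower lung function visit more often.

```python
import math
import random


def simulate_visits(n_subjects=1000, follow_up=5.0, gamma=1.0, seed=1):
    """Simulate (subject, time, outcome) rows under an assumed toy model.

    gamma is a hypothetical association parameter: gamma = 0 makes the
    visiting process independent of the outcome, gamma > 0 makes subjects
    with a lower random intercept visit more often.
    """
    rng = random.Random(seed)
    beta0, beta1 = 80.0, -1.5        # assumed population intercept and yearly slope
    sd_b, sd_e = 10.0, 3.0           # assumed random-intercept and residual SDs
    shape, base_scale = 1.2, 0.5     # assumed Weibull shape and baseline scale (years)
    rows = []
    for subject in range(n_subjects):
        b = rng.gauss(0.0, sd_b)     # subject-level random intercept
        log_frailty = -gamma * b / sd_b
        # Proportional hazards: multiplying a Weibull hazard by exp(log_frailty)
        # is the same as multiplying its scale by exp(-log_frailty / shape).
        scale = base_scale * math.exp(-log_frailty / shape)
        t = 0.0
        while True:
            t += rng.weibullvariate(scale, shape)  # gap time to the next visit
            if t > follow_up:
                break
            y = beta0 + beta1 * t + b + rng.gauss(0.0, sd_e)
            rows.append((subject, t, y))
    return rows


def pooled_ols(rows):
    """Naive least-squares fit of outcome on time, ignoring subjects and visits."""
    n = len(rows)
    t_bar = sum(t for _, t, _ in rows) / n
    y_bar = sum(y for _, _, y in rows) / n
    sxy = sum((t - t_bar) * (y - y_bar) for _, t, y in rows)
    sxx = sum((t - t_bar) ** 2 for _, t, _ in rows)
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


if __name__ == "__main__":
    for gamma in (0.0, 1.0):
        intercept, slope = pooled_ols(simulate_visits(gamma=gamma))
        print(f"gamma={gamma}: intercept={intercept:.2f}, slope={slope:.2f}")
```

Comparing the two fitted intercepts against the assumed true value of 80 shows how much of the apparent difference comes from the visiting process alone. A joint model of the kind the abstract proposes would instead estimate the outcome and visit-time parameters together.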

Supplementary material

ESM1.pdf: Extended results on parameter estimates of the joint models with random slopes or intercepts only (Tables S1-S2); performance across models according to fit statistics (Tables S3-S4); simulation study results on model parameter estimates and performance (Tables S5-S6); trace plots and residual diagnostics from the real data application (Figures S1-S2). ESM2: Collection of files that includes the implementation code for the models and a simulated dataset.

Disclosure statement

The authors report that there are no competing interests to declare with respect to the research, authorship, and/or publication of this article.


Copyright
© 2026 New England Statistical Society
Open access article under the CC BY license.

Keywords
Bayesian; Frailty; Gaussian process; Irregular visits; Joint model; Longitudinal model; Medical monitoring

Funding
The authors received financial support for this research from the National Institutes of Health (NIH) under Grants R01HL141286 and K25HL125954.

ISSN: 2693-7166