The Anytime-Valid Logrank Test: Error Control Under Continuous Monitoring with Unlimited Horizon
Volume 2, Issue 2 (2024), pp. 190–214
Pub. online: 29 May 2024
Type: Statistical Methodology
Open Access
Accepted: 23 January 2024
Published: 29 May 2024
Abstract
We introduce the anytime-valid (AV) logrank test, a version of the logrank test that provides type-I error guarantees under optional stopping and optional continuation. The test is sequential but does not require a maximum sample size or a stopping rule to be specified in advance, and it allows for cumulative meta-analysis with type-I error control. The method can be extended to define anytime-valid confidence intervals. The AV logrank test is an instance of the recently developed martingale tests based on E-variables. We demonstrate type-I error guarantees for the test in a semiparametric setting of proportional hazards, show explicitly how to extend it to ties and confidence sequences, and indicate further extensions to the full Cox regression model. Using a Gaussian approximation of the logrank statistic, we show that the AV logrank test (which itself is always exact) has a rejection region similar to that of O’Brien-Fleming α-spending, but with the potential to achieve 100% power through optional continuation. Although our approach to study design requires a larger sample size, the expected sample size is competitive thanks to optional stopping.
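To make the idea of a martingale test based on E-variables concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes no tied event times, a single fixed alternative hazard ratio `theta1` chosen in advance, and that each event multiplies the running e-value by a partial-likelihood ratio (alternative versus null hazard ratio 1) for the group in which the event occurred. The function names `av_logrank_e_process` and `first_rejection`, and the input layout, are hypothetical and introduced only for illustration.

```python
from math import exp, log


def av_logrank_e_process(events, theta1):
    """Running e-values for a sketch of an anytime-valid logrank test.

    events: iterable of (y_control, y_treatment, in_treatment) tuples, one per
            observed event in time order, where y_* are the numbers at risk
            just before the event (assumption: no tied event times).
    theta1: hazard ratio (treatment vs. control) under the alternative
            (assumption: fixed before seeing the data).
    """
    log_e = 0.0
    path = []
    for y0, y1, in_treatment in events:
        # Conditional probability that the event falls in the treatment group
        p_alt = theta1 * y1 / (y0 + theta1 * y1)   # under hazard ratio theta1
        p_null = y1 / (y0 + y1)                    # under hazard ratio 1
        if in_treatment:
            log_e += log(p_alt / p_null)
        else:
            log_e += log((1.0 - p_alt) / (1.0 - p_null))
        path.append(exp(log_e))
    return path


def first_rejection(path, alpha=0.05):
    """Index of the first event at which the e-value reaches 1/alpha, else None."""
    for i, e in enumerate(path):
        if e >= 1.0 / alpha:
            return i
    return None


if __name__ == "__main__":
    # Two events, both in the control group, with 10 vs. 10 initially at risk.
    toy_events = [(10, 10, False), (9, 10, False)]
    e_values = av_logrank_e_process(toy_events, theta1=0.5)
    print(e_values, first_rejection(e_values))
```

Under the null hypothesis each factor has conditional expectation 1, so the running product is a nonnegative martingale and, by Ville's inequality, the probability that it ever reaches 1/α is at most α. This is why the threshold can be checked after every event and monitoring can continue indefinitely.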