<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">NEJSDS</journal-id>
<journal-title-group><journal-title>The New England Journal of Statistics in Data Science</journal-title></journal-title-group>
<issn pub-type="ppub">2693-7166</issn><issn-l>2693-7166</issn-l>
<publisher>
<publisher-name>New England Statistical Society</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">NEJSDS98</article-id>
<article-id pub-id-type="doi">10.51387/26-NEJSDS98</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Case Study, Application, and/or Practice Article</subject></subj-group><subj-group subj-group-type="area">
<subject>Machine Learning and Data Mining</subject></subj-group></article-categories>
<title-group>
<article-title>Predictive Performance of Statistical and Machine Learning Survival Models with Time-Dependent Covariates: An Evaluation</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Lu</surname><given-names>Zhaohua</given-names></name><email xlink:href="mailto:zhaohua.lu@daiichisankyo.com">zhaohua.lu@daiichisankyo.com</email><xref ref-type="aff" rid="j_nejsds98_aff_001"/>
</contrib>
<contrib contrib-type="author">
<name><surname>He</surname><given-names>Philip</given-names></name><email xlink:href="mailto:philip.he@daiichisankyo.com">philip.he@daiichisankyo.com</email><xref ref-type="aff" rid="j_nejsds98_aff_002"/><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_nejsds98_aff_001">211 Mt Airy Rd, Basking Ridge, NJ 07920, <institution>Daiichi-Sankyo Inc.</institution>, <country>USA</country>. E-mail address: <email xlink:href="mailto:zhaohua.lu@daiichisankyo.com">zhaohua.lu@daiichisankyo.com</email></aff>
<aff id="j_nejsds98_aff_002">211 Mt Airy Rd, Basking Ridge, NJ 07920, <institution>Daiichi-Sankyo Inc.</institution>, <country>USA</country>. E-mail address: <email xlink:href="mailto:philip.he@daiichisankyo.com">philip.he@daiichisankyo.com</email></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2026</year></pub-date><pub-date pub-type="epub"><day>26</day><month>3</month><year>2026</year></pub-date><volume content-type="ahead-of-print">0</volume><issue>0</issue><fpage>1</fpage><lpage>11</lpage><history><date date-type="accepted"><day>16</day><month>1</month><year>2026</year></date></history>
<permissions><copyright-statement>© 2026 New England Statistical Society</copyright-statement><copyright-year>2026</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>Time-to-event (TTE) endpoints are widely used in drug development and biomedical research. Traditional statistical models, for example the Cox regression model, have been used to predict TTE outcomes. Recent studies have also employed flexible machine learning (ML) methods, for example, tree models, to obtain superior prediction performance. In addition, post-baseline time-varying predictors have recently been reported to improve prediction using ML methods. In this study, we applied the Cox model and ML methods to predict the onset of TTE with both baseline and post-baseline predictors. We evaluated the predictive performance of these models using various metrics, including the time-dependent area under the receiver operating characteristic curve (AUC), the concordance index (C-index), and integrated Brier scores. We also used these metrics as criteria to guide the selection of predictors in the predictive models. Our findings indicate that the Cox model remains a robust choice, often comparable to ML methods in moderate sample sizes, provided the proportional hazards assumption holds. However, tree-based methods demonstrate superior performance in capturing complex, nonlinear interactions, albeit requiring larger sample sizes to stabilize predictions.</p>
</abstract>
<kwd-group>
<label>Keywords and phrases</label>
<kwd>Time-dependent predictors</kwd>
<kwd>Cox proportional hazards model</kwd>
<kwd>Survival random forest</kwd>
<kwd>Survival tree</kwd>
<kwd>Concordance index</kwd>
<kwd>Brier score</kwd>
<kwd>Time-dependent AUC</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="j_nejsds98_s_001">
<label>1</label>
<title>Introduction</title>
<p>Time-to-event (TTE) endpoints are fundamental in drug development and biomedical research. These endpoints are used to measure the duration until a specific event, such as disease progression, treatment failure, or death, which makes them critical to understanding treatment efficacy and patient prognosis. Analyzing TTE data requires specialized statistical and computational methods to account for censoring, where the event of interest may not be observed at the time of analysis. Traditional statistical models, such as the Cox proportional hazards model, have long been the cornerstone of the analysis of TTE outcomes. The Cox model [<xref ref-type="bibr" rid="j_nejsds98_ref_014">14</xref>] provides a semiparametric framework that estimates the effect of covariates on the hazard function while making minimal assumptions about the baseline hazard. Its interpretability and flexibility have made it a preferred choice in clinical and biomedical research. However, the Cox model assumes proportional hazards over time, which may not hold in real-world datasets, and it may face challenges when handling high-dimensional data or complex relationships, e.g., interactions among predictors.</p>
<p>In recent years, machine learning (ML) methods have gained attention for predicting TTE outcomes due to their flexibility and capacity to model nonlinear relationships and interactions among variables. Techniques such as tree-based models [<xref ref-type="bibr" rid="j_nejsds98_ref_031">31</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_020">20</xref>] and deep learning approaches [<xref ref-type="bibr" rid="j_nejsds98_ref_034">34</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_036">36</xref>] have been shown to achieve improved predictive performance in scenarios where traditional models fall short. ML methods can accommodate large-scale data, automatically select relevant features, and handle complex, high-dimensional predictors. These strengths have positioned ML as a promising alternative to classical statistical models in survival analysis.</p>
<p>Moreover, recent studies report improved performance when post-baseline time-varying predictors are incorporated into predictive models [<xref ref-type="bibr" rid="j_nejsds98_ref_020">20</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_056">56</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_048">48</xref>]. Post-baseline predictors, which evolve during the course of observation, can capture dynamic changes in patient status or in responses to treatment that may significantly influence TTE outcomes. Integrating these predictors into ML models has been reported to further enhance prediction accuracy, offering a more comprehensive and adaptive representation of survival dynamics.</p>
<p>Given these advances, a critical question remains: in typical clinical trials with moderate sample sizes, which approach predicts better, traditional statistical models such as the Cox model or flexible ML approaches, and does the answer change when post-baseline predictors are included? Cuthbert et al. [<xref ref-type="bibr" rid="j_nejsds98_ref_015">15</xref>] compared the prediction performance of survival analysis methods with only baseline predictors present. This paper addresses the question through a systematic comparison of these approaches using multiple evaluation metrics. We aim to provide insights into their respective strengths and limitations, particularly when post-baseline time-varying predictors are incorporated, and to guide researchers in selecting the most appropriate methodology for their TTE analyses.</p>
<p>In this study, we first review a range of statistical and ML models for TTE outcomes, including the Cox proportional hazards model and tree-based models, with or without post-baseline predictors. Then, we examine several widely used model evaluation metrics for TTE outcomes, including the time-dependent area under the receiver operating characteristic (ROC) curve [AUC, <xref ref-type="bibr" rid="j_nejsds98_ref_029">29</xref>], the concordance index [C-index, <xref ref-type="bibr" rid="j_nejsds98_ref_026">26</xref>], and the Brier score and integrated Brier score [<xref ref-type="bibr" rid="j_nejsds98_ref_009">9</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_023">23</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_058">58</xref>], which are commonly employed to assess predictive performance in survival analysis. In addition, we evaluate and compare the predictive performance of traditional statistical models, such as the Cox proportional hazards model, and ML models for TTE outcomes. This evaluation considers scenarios with only baseline covariates and scenarios that also incorporate post-baseline covariates. Lastly, we investigate the behavior of the model evaluation metrics under different true data-generating mechanisms. By comparing these metrics across different models, we aim to elucidate their strengths and limitations, offering a perspective on their capability to handle complex survival data.</p>
<p>While machine learning methods offer substantial modeling flexibility, their suitability depends critically on the data characteristics typical of clinical trials. Advanced approaches such as gradient boosting machines [<xref ref-type="bibr" rid="j_nejsds98_ref_018">18</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_013">13</xref>] and deep learning–based survival models (e.g., DeepSurv [<xref ref-type="bibr" rid="j_nejsds98_ref_034">34</xref>]) have demonstrated strong performance in large-scale observational datasets. However, clinical trials, particularly Phase II and III oncology studies, usually have moderate sample sizes, where such methods may face challenges related to optimization stability, overfitting, and sensitivity to hyperparameter tuning. Empirical evidence suggests that deep learning and boosting-based methods often require substantially larger sample sizes to achieve stable and competitive performance in clinical tabular data [<xref ref-type="bibr" rid="j_nejsds98_ref_005">5</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_044">44</xref>]. In contrast, random forest–based approaches have been shown to exhibit greater robustness under moderate sample sizes, with relatively low sensitivity to hyperparameter choices [<xref ref-type="bibr" rid="j_nejsds98_ref_042">42</xref>]. In addition, boosting algorithms may overemphasize noisy or mislabeled observations in small samples, which increases the risk of overfitting [<xref ref-type="bibr" rid="j_nejsds98_ref_016">16</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_010">10</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_019">19</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_052">52</xref>], and they add tuning complexity [<xref ref-type="bibr" rid="j_nejsds98_ref_042">42</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_006">6</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_004">4</xref>].
Given these considerations, this study focuses on methods that balance nonlinear flexibility, robustness under moderate sample sizes, and feasibility for modeling post-baseline time-varying covariates, namely the Cox proportional hazards model, tree-based methods for left-truncated and right-censored (LTRC) data [<xref ref-type="bibr" rid="j_nejsds98_ref_020">20</xref>], and survival random forest [<xref ref-type="bibr" rid="j_nejsds98_ref_031">31</xref>]. These choices are discussed in more detail in the Discussion section.</p>
<p>This paper is structured as follows. In Section <xref rid="j_nejsds98_s_002">2</xref>, we describe the time-to-event (TTE) models, including the traditional Cox proportional hazards model, and modern ML methods tailored for TTE outcomes. These models incorporate baseline predictors and, where applicable, post-baseline predictors to enhance predictive performance. In Section <xref rid="j_nejsds98_s_003">3</xref>, we explore model evaluation metrics specific to TTE variables. In Section <xref rid="j_nejsds98_s_007">4</xref>, simulation studies are conducted in various settings to demonstrate the strengths and limitations of the TTE prediction models and their associated evaluation metrics. In Section <xref rid="j_nejsds98_s_010">5</xref>, we conclude with a summary and a discussion.</p>
</sec>
<sec id="j_nejsds98_s_002">
<label>2</label>
<title>Time-to-Event Models</title>
<p>Let <inline-formula id="j_nejsds98_ineq_001"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbf{X}_{i}}={({X_{i1}},\dots ,{X_{ip}})^{\prime }}$]]></tex-math></alternatives></inline-formula> denote the <italic>p</italic> time-invariant covariates for subject <italic>i</italic>. The hazard function for the Cox proportional hazards model [<xref ref-type="bibr" rid="j_nejsds98_ref_014">14</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_047">47</xref>] is given by: 
<disp-formula id="j_nejsds98_eq_001">
<label>(2.1)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo movablelimits="false">exp</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi mathvariant="bold-italic">β</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \lambda (t\mid {\mathbf{X}_{i}})={\lambda _{0}}(t)\exp \left({\mathbf{X}^{\prime }_{i}}\boldsymbol{\beta }\right),\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_nejsds98_ineq_002"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\lambda _{0}}(t)$]]></tex-math></alternatives></inline-formula> is the baseline hazard function and <inline-formula id="j_nejsds98_ineq_003"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">β</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{\beta }$]]></tex-math></alternatives></inline-formula> denotes the vector of regression coefficients for the time-invariant covariates. The model assumes that the hazard ratios between different levels of covariates are constant over time; that is, the effect of the covariates on the hazard is multiplicative and does not change as time progresses. The model does not assume a specific form for the baseline hazard function, which allows flexibility in capturing time-dependent risk. The regression coefficients can be estimated by maximizing the partial likelihood, which does not involve the baseline hazard [<xref ref-type="bibr" rid="j_nejsds98_ref_014">14</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_047">47</xref>]. The baseline hazard function can then be estimated by the Breslow estimator [<xref ref-type="bibr" rid="j_nejsds98_ref_008">8</xref>]. We use the R package <italic>survival</italic> to fit the Cox proportional hazards model and to obtain predictions [<xref ref-type="bibr" rid="j_nejsds98_ref_049">49</xref>].</p>
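<p>As an illustration of how the partial likelihood and the Breslow estimator follow from model (2.1), the Python sketch below evaluates both on a small dataset. It is a minimal sketch under stated assumptions, not the implementation in the <italic>survival</italic> package: the function names <monospace>linear_predictors</monospace>, <monospace>cox_partial_loglik</monospace>, and <monospace>breslow_cumhaz</monospace> and the toy data are hypothetical, the coefficient vector is supplied rather than optimized, and tied event times are handled with the Breslow approximation.</p>
<preformat><![CDATA[
```python
import math


def linear_predictors(X, beta):
    """Return the linear predictor X_i' beta for every subject."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


def cox_partial_loglik(times, events, X, beta):
    """Log partial likelihood of model (2.1); ties use the Breslow approximation."""
    lin = linear_predictors(X, beta)
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i == 1:
            # Risk set at t_i: subjects whose observed time is at least t_i.
            denom = sum(math.exp(lp) for lp, t_k in zip(lin, times) if t_k >= t_i)
            loglik += lin[i] - math.log(denom)
    return loglik


def breslow_cumhaz(times, events, X, beta):
    """Breslow estimate of the baseline cumulative hazard at each event time."""
    lin = linear_predictors(X, beta)
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    cumhaz, steps = 0.0, []
    for t in event_times:
        n_events = sum(1 for t_k, d_k in zip(times, events) if t_k == t and d_k == 1)
        denom = sum(math.exp(lp) for lp, t_k in zip(lin, times) if t_k >= t)
        cumhaz += n_events / denom
        steps.append((t, cumhaz))
    return steps


# Toy data (hypothetical): three subjects, one binary covariate, last subject censored.
times, events, X = [2.0, 4.0, 6.0], [1, 1, 0], [[1.0], [0.0], [1.0]]
print(cox_partial_loglik(times, events, X, beta=[0.0]))
print(breslow_cumhaz(times, events, X, beta=[0.0]))
```
]]></preformat>
<p>The baseline hazard does not appear anywhere in <monospace>cox_partial_loglik</monospace>, which is why the coefficients can be estimated first and the baseline cumulative hazard recovered afterwards.</p>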
<p>Ishwaran et al. [<xref ref-type="bibr" rid="j_nejsds98_ref_031">31</xref>] developed survival random forests (SRFs), which offer a more flexible nonparametric alternative that can capture nonlinear relationships and interactions between covariates, often achieving superior predictive accuracy in complex data settings. An SRF is an ensemble of survival trees. Each tree is grown on a bootstrap sample and, at each node, a subset of variables is randomly selected as candidates for the optimal split. Nodes are split according to survival differences, measured by statistics such as the logrank test statistic [<xref ref-type="bibr" rid="j_nejsds98_ref_038">38</xref>]. Each tree produces a cumulative hazard function, and these functions are averaged across all trees in the forest to produce the ensemble prediction. In this study, we use the SRF to model survival outcomes using only baseline covariates. The SRF is implemented in the R package <italic>randomForestSRC</italic> [<xref ref-type="bibr" rid="j_nejsds98_ref_032">32</xref>]. To optimize the performance of the model, we tune key parameters, including the number of trees in the forest (<italic>ntree</italic>), the number of variables considered for splitting at each node (<italic>mtry</italic>), and the minimum size of terminal nodes (<italic>nodesize</italic>). Five-fold cross-validation is used to determine the combination of tuning parameters that produces the locally optimal concordance index (C-index), a widely used evaluation metric for survival models.</p>
<p>When both baseline and post-baseline covariates are incorporated into the analysis, the modeling framework becomes more challenging due to the time-varying nature of some predictors. In the presence of time-varying predictors, let <inline-formula id="j_nejsds98_ineq_004"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">q</mml:mi>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbf{Z}_{ij}}={({Z_{i1j}},\dots ,{Z_{iqj}})^{\prime }}$]]></tex-math></alternatives></inline-formula> denote the vector of <italic>q</italic> time-varying covariates for subject <italic>i</italic> observed at the <italic>j</italic>-th time interval. The hazard function for the Cox proportional hazards model can be conceptually extended to: 
<disp-formula id="j_nejsds98_eq_002">
<label>(2.2)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">λ</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">λ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo movablelimits="false">exp</mml:mo>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi mathvariant="bold-italic">β</mml:mi>
<mml:mo>+</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi mathvariant="bold-italic">γ</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \lambda (t\mid {\mathbf{X}_{i}},{\mathbf{Z}_{ij}})={\lambda _{0}}(t)\exp \left({\mathbf{X}^{\prime }_{i}}\boldsymbol{\beta }+{\mathbf{Z}^{\prime }_{ij}}\boldsymbol{\gamma }\right),\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_nejsds98_ineq_005"><alternatives><mml:math>
<mml:mi mathvariant="bold-italic">γ</mml:mi></mml:math><tex-math><![CDATA[$\boldsymbol{\gamma }$]]></tex-math></alternatives></inline-formula> denotes the coefficients for the <italic>q</italic> time-varying covariates. However, a key rule for time-dependent covariates is that they must not “look into the future”; that is, the hazard at an event time may depend only on covariate values observed before that time [<xref ref-type="bibr" rid="j_nejsds98_ref_048">48</xref>]. One approach to handling time-dependent covariates is to split the follow-up into time intervals and to use methods that account for left truncation and right censoring [LTRC <xref ref-type="bibr" rid="j_nejsds98_ref_050">50</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_048">48</xref>]. Therneau et al. [<xref ref-type="bibr" rid="j_nejsds98_ref_048">48</xref>] provide an example of coding time-dependent covariates in intervals of time. Consider a subject whose follow-up period extends from time 0 to death at 185 days. Suppose a time-dependent covariate, such as a repeated laboratory measurement of creatinine in a clinical trial, is measured on day 0, day 90, and day 120, with recorded values of 0.9, 1.5, and 1.2 mg/dL, respectively. To represent these data in a suitable format for analysis, the follow-up time can be divided into three intervals: 0–90, 90–120, and 120–185 days. Each interval is represented as a separate row of data. The structured data appear in Table <xref rid="j_nejsds98_tab_001">1</xref>. The predictors of TTE in each interval are observed at the beginning of the interval, preventing the use of future data that would bias the results. Each row is also referred to as a pseudo-subject [<xref ref-type="bibr" rid="j_nejsds98_ref_020">20</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_056">56</xref>], because the interval-based TTE models treat each row as a “subject”, although the rows do not correspond to distinct true subjects.
Of note, the number of pseudo-subjects can be substantially larger than that of the true subjects, depending on the number of post-baseline observations for each true subject.</p>
<table-wrap id="j_nejsds98_tab_001">
<label>Table 1</label>
<caption>
<p>An example of the time-interval approach for expressing a time-dependent predictor of a time-to-event outcome for a single subject.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: justify; border-top: double; border-bottom: solid thin"><bold>Subject</bold></td>
<td style="vertical-align: top; text-align: justify; border-top: double; border-bottom: solid thin"><bold>Start (days)</bold></td>
<td style="vertical-align: top; text-align: justify; border-top: double; border-bottom: solid thin"><bold>Stop (days)</bold></td>
<td style="vertical-align: top; text-align: justify; border-top: double; border-bottom: solid thin"><bold>Event</bold></td>
<td style="vertical-align: top; text-align: justify; border-top: double; border-bottom: solid thin"><bold>Creatinine (mg/dL)</bold></td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: justify">1</td>
<td style="vertical-align: top; text-align: justify">0</td>
<td style="vertical-align: top; text-align: justify">90</td>
<td style="vertical-align: top; text-align: justify">0</td>
<td style="vertical-align: top; text-align: justify">0.9</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: justify">1</td>
<td style="vertical-align: top; text-align: justify">90</td>
<td style="vertical-align: top; text-align: justify">120</td>
<td style="vertical-align: top; text-align: justify">0</td>
<td style="vertical-align: top; text-align: justify">1.5</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: justify; border-bottom: solid thin">1</td>
<td style="vertical-align: top; text-align: justify; border-bottom: solid thin">120</td>
<td style="vertical-align: top; text-align: justify; border-bottom: solid thin">185</td>
<td style="vertical-align: top; text-align: justify; border-bottom: solid thin">1</td>
<td style="vertical-align: top; text-align: justify; border-bottom: solid thin">1.2</td>
</tr>
</tbody>
</table>
</table-wrap>
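<p>The interval coding in Table 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the helper name <monospace>to_intervals</monospace> and its arguments are hypothetical, the measurement days are assumed to be sorted in increasing order, and the event indicator is attached to the last interval only.</p>
<preformat><![CDATA[
```python
def to_intervals(subject, visit_days, values, end_day, event):
    """Return (subject, start, stop, event, value) rows, one per time interval.

    visit_days must be sorted; each value is carried forward from the start of
    its interval, so no row uses a measurement taken after the interval begins.
    """
    if not visit_days or len(visit_days) != len(values):
        raise ValueError("visit_days and values must be non-empty and of equal length")
    rows = []
    for k, (start, value) in enumerate(zip(visit_days, values)):
        if start >= end_day:
            break  # measurements at or after the end of follow-up are not used
        is_last = k + 1 == len(visit_days) or visit_days[k + 1] >= end_day
        stop = end_day if is_last else visit_days[k + 1]
        # Only the final interval can carry the event; earlier ones are censored.
        rows.append((subject, start, stop, event if is_last else 0, value))
    return rows


# The subject from Table 1: creatinine on days 0, 90, and 120; death on day 185.
rows = to_intervals(1, [0, 90, 120], [0.9, 1.5, 1.2], end_day=185, event=1)
for row in rows:
    print(row)
```
]]></preformat>
<p>Each returned row corresponds to one pseudo-subject, so a subject with many post-baseline measurements contributes many rows.</p>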
<p>Additionally, ML methods such as survival trees for LTRC data [<xref ref-type="bibr" rid="j_nejsds98_ref_020">20</xref>] provide a nonparametric or semiparametric approach to TTE predictive modeling with post-baseline predictors. The survival tree model segments pseudo-subjects into homogeneous subgroups according to time-invariant and time-dependent covariates. Two survival tree models were proposed: the LTRC tree based on the conditional inference tree (LTRCIT) and the LTRC tree based on the classification and regression tree (LTRCART). The construction of LTRCIT is based on the logrank test score specifically adjusted for LTRC data, ensuring unbiased selection of splitting predictors. At each terminal node of the tree, the Kaplan-Meier estimate of the survival function is used to summarize survival times within that subgroup. In comparison, LTRCART is a likelihood-based method that uses deviance reduction and the proportional hazards assumption to select splits. The survival distribution at each terminal node of the tree is estimated from the baseline cumulative hazard function, obtained by the Nelson-Aalen estimator, and the estimated relative risk of the node. Consequently, LTRCART is well suited for small datasets. In addition, practical experience suggests that model fitting and prediction are considerably faster for LTRCART than for LTRCIT. Hence, in this paper, we use LTRCART as the ML model to predict TTE with predictors varying over time.</p>
<p>The LTRCART method is implemented in the R package <italic>LTRCtrees</italic> [<xref ref-type="bibr" rid="j_nejsds98_ref_021">21</xref>]. The key tuning parameters include the minimum sum of weights required in a terminal node (<italic>minbucket</italic>) and the maximum allowable depth of the tree (<italic>maxdepth</italic>), both of which influence the complexity and performance of the tree. To optimize these parameters, we use five-fold cross-validation, with the C-index as the evaluation metric, to determine the locally optimal settings. This methodology enables the survival tree to provide accurate and interpretable survival predictions while addressing the challenges posed by left-truncated and right-censored data.</p>
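<p>The five-fold cross-validated grid search used for tuning can be sketched generically as follows. This is a hedged sketch rather than the code used in this study: <monospace>k_fold_indices</monospace>, <monospace>grid_search_cv</monospace>, and <monospace>toy_score</monospace> are hypothetical names, the scorer is a stand-in for a routine that would fit a model on the training folds and return the C-index on the held-out fold, and the folds are assumed to be formed over true subjects so that all pseudo-subject rows of one subject stay in the same fold.</p>
<preformat><![CDATA[
```python
import itertools
import random


def k_fold_indices(n, k=5, seed=0):
    """Randomly partition subject indices 0..n-1 into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def grid_search_cv(param_grid, n, fit_and_score, k=5, seed=0):
    """Return the parameter combination with the highest mean held-out score."""
    folds = k_fold_indices(n, k, seed)
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        scores = []
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            # fit_and_score is user supplied: fit on train, score on folds[f].
            scores.append(fit_and_score(params, train, folds[f]))
        mean_score = sum(scores) / k
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_score(params, train_idx, test_idx):
    """Stand-in scorer (hypothetical); a real one would return a held-out C-index."""
    return 0.7 - 0.01 * abs(params["maxdepth"] - 3) - 0.001 * abs(params["minbucket"] - 10)


grid = {"maxdepth": [2, 3, 4], "minbucket": [5, 10, 20]}
best_params, best_score = grid_search_cv(grid, n=100, fit_and_score=toy_score)
print(best_params, round(best_score, 3))
```
]]></preformat>
<p>The selected setting is optimal only among the candidate values in the grid, which is the sense in which the tuned parameters are locally optimal.</p>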
</sec>
<sec id="j_nejsds98_s_003">
<label>3</label>
<title>Evaluation Metrics for TTE Prediction Models</title>
<p>In this paper, we investigate three metrics for evaluating survival models: the C-index, the integrated Brier score, and the time-dependent AUC. They are briefly described below.</p>
<sec id="j_nejsds98_s_004">
<label>3.1</label>
<title>C-Index</title>
<p>The C-index is a widely used metric to assess the performance of survival models. It quantifies the concordance between the predicted risk and the order of survival times for all comparable pairs of subjects. The C-index is defined as: 
<disp-formula id="j_nejsds98_eq_003">
<label>(3.1)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">C</mml:mi>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mfenced separators="" open="[" close="]">
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&lt;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>·</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">δ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mfenced separators="" open="[" close="]">
<mml:mrow>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>·</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">δ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ C=\frac{{\textstyle\sum _{(i,j)}}\left[I({Y_{i}}\gt {Y_{j}})\cdot I({r_{i}}\lt {r_{j}})\cdot {\delta _{j}}\right]}{{\textstyle\sum _{(i,j)}}\left[I({Y_{i}}\gt {Y_{j}})\cdot {\delta _{j}}\right]},\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_nejsds98_ineq_006"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Y_{i}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds98_ineq_007"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Y_{j}}$]]></tex-math></alternatives></inline-formula> are the follow-up times for subjects <italic>i</italic> and <italic>j</italic>, respectively, <inline-formula id="j_nejsds98_ineq_008"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${r_{i}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds98_ineq_009"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${r_{j}}$]]></tex-math></alternatives></inline-formula> are the predicted risks, and <inline-formula id="j_nejsds98_ineq_010"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">δ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\delta _{j}}$]]></tex-math></alternatives></inline-formula> is the censoring indicator for subject <italic>j</italic> (with <inline-formula id="j_nejsds98_ineq_011"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">δ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${\delta _{j}}=1$]]></tex-math></alternatives></inline-formula> indicating that an event occurred). The numerator counts the concordant pairs, in which the subject with the shorter survival time is assigned the higher predicted risk score, while the denominator is the total number of comparable pairs (excluding pairs in which the event at the shorter follow-up time <inline-formula id="j_nejsds98_ineq_012"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Y_{j}}$]]></tex-math></alternatives></inline-formula> is not observed).</p>
<p>The C-index is interpreted as the proportion of comparable pairs that are concordant. A value of <inline-formula id="j_nejsds98_ineq_013"><alternatives><mml:math>
<mml:mi mathvariant="italic">C</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.5</mml:mn></mml:math><tex-math><![CDATA[$C=0.5$]]></tex-math></alternatives></inline-formula> indicates that the model predictions are no better than random guessing, while a value of <inline-formula id="j_nejsds98_ineq_014"><alternatives><mml:math>
<mml:mi mathvariant="italic">C</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$C=1$]]></tex-math></alternatives></inline-formula> represents perfect discrimination. Higher C-index values reflect a model’s ability to accurately predict the order of survival times, making it a valuable tool for assessing the predictive performance of survival models. In the simulation study, we used the R package <italic>intsurv</italic> to calculate the C-index [<xref ref-type="bibr" rid="j_nejsds98_ref_053">53</xref>].</p>
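<p>As an illustration of Equation (3.1), the following Python sketch counts comparable and concordant pairs directly. It is not the <italic>intsurv</italic> implementation used in the simulation study; the function name <monospace>c_index</monospace> and the toy data are hypothetical, and ties in predicted risk are counted as non-concordant because the indicator in (3.1) is strict.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def c_index(time, event, risk):
    """C-index of Eq. (3.1) by direct pair counting (hypothetical helper)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    # comparable[i, j]: Y_i > Y_j and delta_j = 1 (subject j has the earlier, observed event)
    comparable = (time[:, None] > time[None, :]) & event[None, :]
    # concordant[i, j]: comparable and r_i < r_j (the earlier event has the higher risk)
    concordant = comparable & (risk[:, None] < risk[None, :])
    n_comparable = comparable.sum()
    if n_comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant.sum() / n_comparable


# Toy data (illustrative only): follow-up times, event indicators, predicted risks
print(c_index([5, 3, 8, 2], [1, 0, 1, 1], [0.4, 0.9, 0.1, 0.8]))  # 0.75
```
]]></preformat>
<p>In the toy data, four pairs are comparable and three of them are concordant, giving 0.75. The pairwise comparison stores an n-by-n array, which is acceptable for a sketch but not for large samples.</p>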
</sec>
<sec id="j_nejsds98_s_005">
<label>3.2</label>
<title>Brier Score and Integrated Brier Score</title>
<p>The Brier score is a widely used metric for the predictive accuracy of survival models. It measures the squared difference between the predicted survival probability and the observed survival status at a specific time point <italic>t</italic>. It is defined as: 
<disp-formula id="j_nejsds98_eq_004">
<label>(3.2)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right center left" columnspacing="10.0pt 10.0pt">
<mml:mtr>
<mml:mtd class="eqnarray-1">
<mml:mi mathvariant="italic">B</mml:mi>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mtd>
<mml:mtd class="eqnarray-2">
<mml:mo>=</mml:mo>
</mml:mtd>
<mml:mtd class="eqnarray-3">
<mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mo maxsize="2.45em" minsize="2.45em" fence="true">{</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">δ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="eqnarray-1"/>
<mml:mtd class="eqnarray-2"/>
<mml:mtd class="eqnarray-3">
<mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced separators="" open="(" close=")">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo><mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo maxsize="2.45em" minsize="2.45em" fence="true">}</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle BS(t)& \displaystyle =& \displaystyle \frac{1}{n}{\sum \limits_{i=1}^{n}}\Bigg\{\frac{{\hat{S}^{2}}(t\mid {\mathbf{X}_{i}})}{\hat{G}({Y_{i}})}I({Y_{i}}\le t,{\delta _{i}}=1)+\\ {} & & \displaystyle \frac{{\left(1-\hat{S}(t\mid {\mathbf{X}_{i}})\right)^{2}}}{\hat{G}(t)}I({Y_{i}}\gt t)\Bigg\},\end{array}\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_nejsds98_ineq_015"><alternatives><mml:math><mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\hat{S}(t\mid {\mathbf{X}_{i}})$]]></tex-math></alternatives></inline-formula> is the estimated survival function for subject <italic>i</italic> at time <italic>t</italic>. The term <inline-formula id="j_nejsds98_ineq_016"><alternatives><mml:math><mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="italic">G</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">ˆ</mml:mo></mml:mover>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\hat{G}({Y_{i}})$]]></tex-math></alternatives></inline-formula> is the Kaplan-Meier estimate of the survival function of the censoring times, evaluated at the follow-up time of subject <italic>i</italic>; it weights each contribution by the inverse probability of remaining uncensored. The Brier score reflects both calibration (how well predicted survival probabilities match observed outcomes) and discrimination (the ability to separate subjects with different risks). A lower Brier score indicates better model performance, as the predictions are closer to the observed outcomes.</p>
<p>The Brier score measures the predictive performance of a survival model at a single time point. To summarize overall performance, the integrated Brier score averages the Brier score over the observed range of follow-up times, from 0 to <inline-formula id="j_nejsds98_ineq_017"><alternatives><mml:math>
<mml:mtext>max</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\text{max}(Y)$]]></tex-math></alternatives></inline-formula>. It is defined as: 
<disp-formula id="j_nejsds98_eq_005">
<label>(3.3)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mi mathvariant="italic">B</mml:mi>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mtext>max</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:msubsup>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∫</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mtext>max</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mi mathvariant="italic">B</mml:mi>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ IBS=\frac{1}{\text{max}(Y)}{\int _{0}^{\text{max}(Y)}}BS(t)dt.\]]]></tex-math></alternatives>
</disp-formula> 
The integrated Brier score provides a single overall measure of predictive accuracy, making it an intuitive evaluation metric for survival models. Lower <inline-formula id="j_nejsds98_ineq_018"><alternatives><mml:math>
<mml:mi mathvariant="italic">I</mml:mi>
<mml:mi mathvariant="italic">B</mml:mi>
<mml:mi mathvariant="italic">S</mml:mi></mml:math><tex-math><![CDATA[$IBS$]]></tex-math></alternatives></inline-formula> values indicate better overall performance of the model throughout the follow-up period. In the simulation study, we used the R package <italic>survex</italic> to calculate the Brier score and the integrated Brier score [<xref ref-type="bibr" rid="j_nejsds98_ref_046">46</xref>].</p>
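<p>Equations (3.2) and (3.3) can be sketched in Python as follows. This is a minimal illustration and not the <italic>survex</italic> implementation; all function names are hypothetical. It assumes a right-continuous Kaplan-Meier estimate of the censoring distribution evaluated at the follow-up time itself (some implementations use the left limit instead), no tied event and censoring times, and a trapezoidal approximation of the integral on a user-supplied time grid.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def km_censoring(time, event):
    """Kaplan-Meier estimate of the censoring survival function G(t).

    Censored observations (event == 0) play the role of 'events' here.
    """
    time = np.asarray(time, dtype=float)
    censored = ~np.asarray(event, dtype=bool)
    cens_times = np.unique(time[censored])
    steps = [1.0]
    for u in cens_times:
        at_risk = np.sum(time >= u)
        n_cens = np.sum((time == u) & censored)
        steps.append(steps[-1] * (1.0 - n_cens / at_risk))
    steps = np.array(steps)

    def G(t):
        # the number of distinct censoring times <= t selects the current step
        return steps[np.searchsorted(cens_times, t, side="right")]

    return G


def brier_score(t, time, event, surv_at_t, G):
    """Eq. (3.2): surv_at_t[i] is the predicted S(t | X_i)."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv_at_t = np.asarray(surv_at_t, dtype=float)
    had_event = (time <= t) & event      # I(Y_i <= t, delta_i = 1)
    still_at_risk = time > t             # I(Y_i > t)
    score = np.zeros(time.shape[0])      # subjects censored before t contribute 0
    score[had_event] = surv_at_t[had_event] ** 2 / G(time[had_event])
    score[still_at_risk] = (1.0 - surv_at_t[still_at_risk]) ** 2 / G(t)
    return score.mean()


def integrated_brier_score(time, event, surv, grid):
    """Eq. (3.3) by the trapezoidal rule; surv[i, k] is S(grid[k] | X_i)."""
    G = km_censoring(time, event)
    surv = np.asarray(surv, dtype=float)
    grid = np.asarray(grid, dtype=float)
    bs = np.array([brier_score(t, time, event, surv[:, k], G)
                   for k, t in enumerate(grid)])
    area = np.sum(np.diff(grid) * (bs[:-1] + bs[1:]) / 2.0)
    return area / (grid[-1] - grid[0])
```
]]></preformat>
<p>With a grid running from 0 to the largest follow-up time, the final division matches the normalization in (3.3). A constant prediction of 0.5 for every subject yields a Brier score of 0.25 at any time point when there is no censoring, which provides a convenient sanity check.</p>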
</sec>
<sec id="j_nejsds98_s_006">
<label>3.3</label>
<title>Time-Dependent ROC and AUC</title>
<p>The time-dependent ROC curve is a tool for evaluating the discriminatory performance of a continuous predictor, such as a risk score <italic>r</italic>, at a specific time <italic>t</italic>. For a given threshold cutoff <italic>c</italic>, the sensitivity and specificity at time <italic>t</italic> can be defined as follows. The sensitivity at <inline-formula id="j_nejsds98_ineq_019"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(c,t)$]]></tex-math></alternatives></inline-formula> is the probability of correctly identifying subjects who have experienced the event by time <italic>t</italic>: 
<disp-formula id="j_nejsds98_eq_006">
<label>(3.4)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mtext>Sensitivity</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \text{Sensitivity}(c,t)=P(r\gt c\mid Y\le t),\]]]></tex-math></alternatives>
</disp-formula> 
where <italic>Y</italic> is the event time. Similarly, the specificity at <inline-formula id="j_nejsds98_ineq_020"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(c,t)$]]></tex-math></alternatives></inline-formula> is the probability of correctly identifying subjects who have not experienced the event by time <italic>t</italic>, that is, the probability that the predictor is less than or equal to the threshold <italic>c</italic> given that the event has not occurred by time <italic>t</italic>: 
<disp-formula id="j_nejsds98_eq_007">
<label>(3.5)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mtext>Specificity</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mo stretchy="false">∣</mml:mo>
<mml:mi mathvariant="italic">Y</mml:mi>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \text{Specificity}(c,t)=P(r\le c\mid Y\gt t).\]]]></tex-math></alternatives>
</disp-formula> 
By varying the cutoff <italic>c</italic>, the sensitivity and <inline-formula id="j_nejsds98_ineq_021"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mtext>specificity</mml:mtext></mml:math><tex-math><![CDATA[$1-\text{specificity}$]]></tex-math></alternatives></inline-formula> pairs can be plotted to construct the time-dependent ROC curve at time <italic>t</italic>. This curve provides a graphical representation of the predictor’s ability to discriminate between subjects who have experienced the event and those who have not by a given time point <italic>t</italic>.</p>
<p>The AUC at time <italic>t</italic>, denoted as <inline-formula id="j_nejsds98_ineq_022"><alternatives><mml:math>
<mml:mtext>AUC</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\text{AUC}(t)$]]></tex-math></alternatives></inline-formula>, summarizes the ROC curve as a single numerical value. It represents the probability that a randomly chosen subject who has experienced the event by time <italic>t</italic> has a higher predictor value <italic>r</italic> than a randomly chosen subject who has not experienced the event by time <italic>t</italic>. Mathematically, <inline-formula id="j_nejsds98_ineq_023"><alternatives><mml:math>
<mml:mtext>AUC</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\text{AUC}(t)$]]></tex-math></alternatives></inline-formula> can be expressed as: 
<disp-formula id="j_nejsds98_eq_008">
<label>(3.6)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mtext>AUC</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">P</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">∣</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">≤</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \text{AUC}(t)=P({r_{i}}\gt {r_{j}}\mid {Y_{i}}\le t,{Y_{j}}\gt t),\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_nejsds98_ineq_024"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${r_{i}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds98_ineq_025"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${r_{j}}$]]></tex-math></alternatives></inline-formula> are the risk scores for subjects <italic>i</italic> and <italic>j</italic>, respectively, and <inline-formula id="j_nejsds98_ineq_026"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Y_{i}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds98_ineq_027"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Y_{j}}$]]></tex-math></alternatives></inline-formula> represent their event times [<xref ref-type="bibr" rid="j_nejsds98_ref_025">25</xref>]. A higher <inline-formula id="j_nejsds98_ineq_028"><alternatives><mml:math>
<mml:mtext>AUC</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\text{AUC}(t)$]]></tex-math></alternatives></inline-formula> value indicates better discrimination, with <inline-formula id="j_nejsds98_ineq_029"><alternatives><mml:math>
<mml:mtext>AUC</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>0.5</mml:mn></mml:math><tex-math><![CDATA[$\text{AUC}(t)=0.5$]]></tex-math></alternatives></inline-formula> corresponding to random prediction and <inline-formula id="j_nejsds98_ineq_030"><alternatives><mml:math>
<mml:mtext>AUC</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$\text{AUC}(t)=1$]]></tex-math></alternatives></inline-formula> representing perfect discrimination [<xref ref-type="bibr" rid="j_nejsds98_ref_001">1</xref>]. In the simulation study, we used the R package <italic>survivalROC</italic> to calculate the time-dependent AUC [<xref ref-type="bibr" rid="j_nejsds98_ref_028">28</xref>].</p>
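<p>The definitions in (3.4) to (3.6) can be illustrated with empirical proportions. The Python sketch below assumes that all event times are fully observed (no censoring), which is a simplification; the censoring-aware estimators provided by <italic>survivalROC</italic> are not reproduced here. The function names and toy data are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def sensitivity(c, t, event_time, risk):
    """Eq. (3.4): P(r > c | Y <= t), assuming no censoring."""
    event_time = np.asarray(event_time, dtype=float)
    risk = np.asarray(risk, dtype=float)
    return np.mean(risk[event_time <= t] > c)


def specificity(c, t, event_time, risk):
    """Eq. (3.5): P(r <= c | Y > t), assuming no censoring."""
    event_time = np.asarray(event_time, dtype=float)
    risk = np.asarray(risk, dtype=float)
    return np.mean(risk[event_time > t] <= c)


def auc_at_t(t, event_time, risk):
    """Eq. (3.6): proportion of (case, control) pairs with r_case > r_control."""
    event_time = np.asarray(event_time, dtype=float)
    risk = np.asarray(risk, dtype=float)
    cases = risk[event_time <= t]      # event occurred by time t
    controls = risk[event_time > t]    # event-free at time t
    if cases.size == 0 or controls.size == 0:
        raise ValueError("need at least one case and one control at time t")
    return np.mean(cases[:, None] > controls[None, :])


# Toy data (illustrative only)
event_time = [1, 2, 3, 4, 5]
risk = [0.9, 0.4, 0.6, 0.2, 0.1]
print(auc_at_t(2.5, event_time, risk))  # 5 of 6 case-control pairs are ordered correctly
```
]]></preformat>
<p>At <italic>t</italic> = 2.5 the toy data contain two cases and three controls, so six pairs are compared. Evaluating <monospace>sensitivity</monospace> and <monospace>specificity</monospace> over a range of cutoffs <italic>c</italic> traces the corresponding empirical ROC curve.</p>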
</sec>
</sec>
<sec id="j_nejsds98_s_007">
<label>4</label>
<title>Simulation</title>
<sec id="j_nejsds98_s_008">
<label>4.1</label>
<title>Simulation Settings</title>
<p>We conduct simulation studies to evaluate the performance of statistical and ML methods under different model evaluation metrics. In simulation setting 1, data are generated under a Cox proportional hazards model. The model includes both baseline covariates, measured at the beginning of follow-up, and post-baseline covariates that evolve over time. Specifically, one covariate is time-varying, while the other four are time-invariant. The time-varying covariate is assigned a coefficient of 0.8, and the four time-invariant covariates are assigned coefficients of 0.5, −0.5, 0, and 0, respectively. These values represent varying degrees of association, including positive and negative effects as well as no effect (coefficients of 0). The baseline hazard is that of an exponential distribution, that is, constant over time. In each replication, the training and testing datasets each contain 1,000 observations, and 100 replications are generated to evaluate the performance of model prediction and evaluation metrics. To mimic real-world survival data, censoring is introduced at a rate of 20%. The resulting median time-to-event or censoring is 7 time units, with a maximum follow-up time of 100 time units. This setting evaluates model fitting and evaluation in scenarios where both baseline and post-baseline covariates influence the hazard.</p>
<p>In the second setting, data are generated under a Cox proportional hazards model where only baseline covariates are assumed to have an effect on the hazard function. The simulation incorporates two time-invariant covariates and three time-varying covariates; however, only the time-invariant covariates are assigned non-zero effects. Specifically, the coefficients for the two time-invariant covariates are set to (1, 1), indicating a strong and positive association. In contrast, the coefficients for the time-varying covariates are set to 0, ensuring that these covariates have no influence on the outcome. The sample size of each replication for this setting is 1,000, with an average of 7 post-baseline observations per subject. The censoring rate is slightly higher than in the first setting, with approximately 22% of the observations censored. The median time-to-event or censoring is 16 time units, and the maximum follow-up time is again set to 100 time units. One hundred replications are generated to evaluate the performance of model prediction and evaluation metrics.</p>
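<p>A data-generating step of the kind used in the second setting can be sketched in Python as follows. Only the coefficients (1, 1), the sample size, and the maximum follow-up time of 100 are taken from the description above. The standard normal covariates, the constant baseline hazard, the exponential censoring distribution, and the rate values are assumptions made for illustration; they are not tuned to reproduce the reported censoring rate or median follow-up. The null time-varying covariates are omitted because their coefficients are 0 and they do not affect the event times.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def simulate_setting2(n=1000, beta=(1.0, 1.0), base_rate=0.05,
                      cens_rate=0.01, max_follow_up=100.0, seed=0):
    """Hypothetical sketch: Cox model with two baseline covariates.

    base_rate and cens_rate are illustrative values, not the paper's.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, 2))              # assumed covariate distribution
    lin_pred = x @ np.asarray(beta)
    # Constant baseline hazard: event time is exponential with
    # rate base_rate * exp(linear predictor)
    event_time = rng.exponential(scale=1.0 / (base_rate * np.exp(lin_pred)))
    # Random censoring (assumed exponential) plus administrative
    # censoring at the maximum follow-up time
    cens_time = np.minimum(rng.exponential(scale=1.0 / cens_rate, size=n),
                           max_follow_up)
    time = np.minimum(event_time, cens_time)
    event = event_time <= cens_time          # True when the event is observed
    return x, time, event


x, time, event = simulate_setting2()
print(x.shape, time.max() <= 100.0)
```
]]></preformat>
<p>Fixing the seed makes each replication reproducible; a full study would repeat the call with different seeds and generate separate training and testing datasets.</p>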
<p>In the third setting, data are generated under a tree-based survival model where post-baseline time-varying covariates play an important role in determining the hazard function. The simulation includes three time-varying covariates and two time-invariant covariates. However, the primary focus is on the time-varying covariates, which are designed to exhibit complex non-linear effects over time. Specifically, the true underlying model is defined as a survival tree with hazard ratios structured as a step function. This step function is based on 9 distinct regions defined by the interaction between the first two time-varying covariates. The hazard ratios assigned to these regions are specified as (3, 0.5, 0.1; 0.1, 3, 0.5; 0.5, 0.1, 0.3), indicating heterogeneous hazard relationships across regions. The inclusion of the remaining time-varying and time-invariant covariates provides additional variability in the model fitting, but does not have a direct impact on the defined step-function hazard structure. The sample size for this setting remains at 1,000 subjects, with each subject contributing an average of 7 post-baseline observations. The censoring rate is set at a relatively low level of 10%. The median time-to-event or censoring is 4 time units, while the maximum follow-up time is capped at 100 time units. One hundred replications are generated to evaluate the performance of model prediction and evaluation metrics.</p>
<p>In the fourth simulation setting, the true data-generating mechanism is deliberately designed to deviate from the assumptions of standard survival models, creating a scenario in which none of the candidate models are correctly specified. Specifically, the hazard function is defined as: 
<disp-formula id="j_nejsds98_eq_009">
<label>(4.1)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">h</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mn>4</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo fence="true" stretchy="false">}</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ h(t)={h_{0}}(t)\{4({x_{i1}}+{x_{i3}}(t)+{x_{i4}}(t))\}\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_nejsds98_ineq_031"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${h_{0}}(t)$]]></tex-math></alternatives></inline-formula> represents the baseline hazard function, and <inline-formula id="j_nejsds98_ineq_032"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${x_{i1}}$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_nejsds98_ineq_033"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${x_{i3}}(t)$]]></tex-math></alternatives></inline-formula>, and <inline-formula id="j_nejsds98_ineq_034"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${x_{i4}}(t)$]]></tex-math></alternatives></inline-formula> correspond to covariates influencing the hazard. This structure introduces a non-standard form of covariate effects on the hazard, which includes both time-invariant (<inline-formula id="j_nejsds98_ineq_035"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${x_{i1}}$]]></tex-math></alternatives></inline-formula>) and time-varying (<inline-formula id="j_nejsds98_ineq_036"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${x_{i3}}(t)$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds98_ineq_037"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${x_{i4}}(t)$]]></tex-math></alternatives></inline-formula>) predictors. <inline-formula id="j_nejsds98_ineq_038"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${x_{i1}}$]]></tex-math></alternatives></inline-formula> is generated from a Bernoulli distribution with probability equal to 0.5. <inline-formula id="j_nejsds98_ineq_039"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${x_{i3}}(t)$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_nejsds98_ineq_040"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">t</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${x_{i4}}(t)$]]></tex-math></alternatives></inline-formula> are generated from a uniform distribution on <inline-formula id="j_nejsds98_ineq_041"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[0,1]$]]></tex-math></alternatives></inline-formula>. Because the covariates enter the hazard additively rather than through an exponential link, the logarithm of the hazard ratio is a nonlinear function of one static and two dynamic covariates, which challenges the validity of standard survival models such as the Cox proportional hazards model. The sample size for this simulation is set to a moderate level of 300 subjects, each contributing an average of six post-baseline observations. The censoring rate is about 17%. The median time to event or censoring is 17 time units, and the maximum follow-up time is capped at 100 time units. One hundred replications are generated to evaluate the performance of model prediction and evaluation metrics.</p>
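<p>To make the data-generating mechanism in (4.1) concrete, the following Python sketch draws one event time by accumulating the hazard over successive visit intervals until it exceeds a unit-exponential threshold. Several ingredients are assumptions introduced only for illustration and are not stated in the simulation design: a constant baseline hazard <monospace>h0</monospace>, equally spaced visits, independent redrawing of the two time-varying covariates at each visit, and administrative censoring at the follow-up cap as the only censoring mechanism.</p>
<preformat>
```python
import random


def simulate_event_time(rng, h0=0.01, visit_gap=1.0, max_time=100.0):
    """Draw (time, event) under h(t) = h0 * 4 * (x1 + x3(t) + x4(t)).

    h0, visit_gap, and the redrawing of x3 and x4 at every visit are
    illustrative assumptions, not values taken from the article.
    """
    x1 = 1.0 if rng.random() >= 0.5 else 0.0  # Bernoulli(0.5), fixed over time
    target = rng.expovariate(1.0)  # unit-exponential threshold for the cumulative hazard
    t, cum = 0.0, 0.0
    while max_time > t:
        x3, x4 = rng.random(), rng.random()  # Uniform[0, 1] covariates for this interval
        rate = h0 * 4.0 * (x1 + x3 + x4)  # hazard is constant within the interval
        width = min(visit_gap, max_time - t)
        if rate > 0.0 and cum + rate * width >= target:
            # The threshold is crossed inside this interval: solve for the exact time.
            return t + (target - cum) / rate, True
        cum += rate * width
        t += width
    return max_time, False  # administratively censored at the follow-up cap
```
</preformat>
<p>Calling <monospace>simulate_event_time(random.Random(1))</monospace> returns a pair consisting of the observed time and an event indicator; repeating the call for each subject yields one simulated data set under these assumptions.</p>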
</sec>
<sec id="j_nejsds98_s_009">
<label>4.2</label>
<title>Simulation Results</title>
<p>In simulation setting 1, data are generated under the Cox proportional hazards model with both effective baseline and post-baseline covariates. The results for this setting are presented in Table <xref rid="j_nejsds98_tab_002">2</xref>. For model evaluation, we use the C-index, the integrated Brier score, and three time-dependent AUCs: predicting the event status at t = 6 with covariates up to t = 3, at t = 12 with covariates up to t = 6, and at t = 18 with covariates up to t = 12. The rows labeled ‘% wrong’ report, for each misspecified model, the proportion of replications in which that model, rather than the true model, attains the best value of the corresponding evaluation metric. The true model (Cox-post-baseline), which correctly specifies the underlying data-generating process, achieves the best mean value on every evaluation criterion. The comparison highlights the advantage of using the correct model specification, as reflected in superior values for the C-index, integrated Brier score, and time-dependent AUC. Specifically, the Cox-post-baseline model outperforms the ML tree-based models and the Cox PH model that incorporates only baseline covariates (Cox-baseline). This demonstrates the importance of incorporating both baseline and post-baseline covariates when they are relevant to the hazard function, as failure to do so leads to suboptimal performance.</p>
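<p>The time-dependent AUC used here can be illustrated with a simplified landmark calculation. The following Python sketch assumes that each subject's risk score has already been computed from covariates observed up to the landmark time; it treats subjects with an event in the prediction window as cases and subjects still event-free at the horizon as controls. Subjects censored inside the window are simply dropped, which is a simplification: an estimator with inverse-probability-of-censoring weights would handle them differently, and this sketch is not necessarily the estimator used to produce the tables. The function name <monospace>landmark_auc</monospace> is hypothetical.</p>
<preformat>
```python
def landmark_auc(times, events, risk, landmark, horizon):
    """AUC for predicting an event in (landmark, horizon] among subjects at risk at landmark."""
    cases, controls = [], []
    for t, d, r in zip(times, events, risk):
        if landmark >= t:
            continue  # event or censoring at or before the landmark: not at risk
        if horizon >= t:
            if d:
                cases.append(r)  # event inside the prediction window
            # censored inside the window: status unknown, dropped in this sketch
        else:
            controls.append(r)  # still event-free at the horizon
    pairs = len(cases) * len(controls)
    if pairs == 0:
        raise ValueError("need at least one case and one control")
    # Count case-control pairs ranked correctly; ties in the risk score count one half.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / pairs
```
</preformat>
<p>For example, <monospace>landmark_auc(times, events, risk, landmark=3, horizon=6)</monospace> corresponds to the first of the three AUCs described above under these simplifying assumptions.</p>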
<table-wrap id="j_nejsds98_tab_002">
<label>Table 2</label>
<caption>
<p>Simulation results of setting 1.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: double; border-bottom: solid thin">Metric</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-post-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">SRF-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">LTRCART-post-baseline</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">C-index</td>
<td style="vertical-align: top; text-align: center">0.696 (0.010)</td>
<td style="vertical-align: top; text-align: center">0.681 (0.010)</td>
<td style="vertical-align: top; text-align: center">0.671 (0.011)</td>
<td style="vertical-align: top; text-align: center">0.615 (0.021)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0.05</td>
<td style="vertical-align: top; text-align: center">0</td>
<td style="vertical-align: top; text-align: center">0</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">IBS</td>
<td style="vertical-align: top; text-align: center">0.112 (0.005)</td>
<td style="vertical-align: top; text-align: center">0.116 (0.004)</td>
<td style="vertical-align: top; text-align: center">0.119 (0.004)</td>
<td style="vertical-align: top; text-align: center">0.132 (0.006)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0.01</td>
<td style="vertical-align: top; text-align: center">0</td>
<td style="vertical-align: top; text-align: center">0</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=6‖t=3)</td>
<td style="vertical-align: top; text-align: center">0.756 (0.022)</td>
<td style="vertical-align: top; text-align: center">0.701 (0.024)</td>
<td style="vertical-align: top; text-align: center">0.691 (0.025)</td>
<td style="vertical-align: top; text-align: center">0.711 (0.025)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0.01</td>
<td style="vertical-align: top; text-align: center">0</td>
<td style="vertical-align: top; text-align: center">0</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=12‖t=6)</td>
<td style="vertical-align: top; text-align: center">0.761 (0.017)</td>
<td style="vertical-align: top; text-align: center">0.714 (0.018)</td>
<td style="vertical-align: top; text-align: center">0.700 (0.017)</td>
<td style="vertical-align: top; text-align: center">0.721 (0.019)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0</td>
<td style="vertical-align: top; text-align: center">0</td>
<td style="vertical-align: top; text-align: center">0</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=18‖t=12)</td>
<td style="vertical-align: top; text-align: center">0.785 (0.015)</td>
<td style="vertical-align: top; text-align: center">0.748 (0.016)</td>
<td style="vertical-align: top; text-align: center">0.734 (0.017)</td>
<td style="vertical-align: top; text-align: center">0.724 (0.021)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">% wrong</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">-</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.01</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.02</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Note: The numbers outside the parentheses represent the means, while those inside the parentheses are the standard deviations. C-index: concordance index; IBS: integrated Brier score; AUC: area under the receiver operating characteristic curve; Cox-post-baseline: Cox proportional hazards model with both baseline and post-baseline predictors; Cox-baseline: Cox proportional hazards model with baseline predictors only; SRF-baseline: survival random forest with baseline predictors only; LTRCART-post-baseline: left-truncated and right-censored tree based on the classification and regression tree, incorporating both baseline and post-baseline predictors. AUC(t=x ‖ t=y) represents the time-varying AUC used to predict the presence of an event at time x, given the post-baseline predictors up to time y. % wrong represents the proportion of simulation replications in which the specific misspecified model achieves the best model evaluation metric among the four fitted models.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="j_nejsds98_tab_003">
<label>Table 3</label>
<caption>
<p>Simulation results of setting 2.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: double; border-bottom: solid thin">Metric</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-post-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">SRF-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">LTRCART-post-baseline</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">C-index</td>
<td style="vertical-align: top; text-align: center">0.690 (0.011)</td>
<td style="vertical-align: top; text-align: center">0.692 (0.011)</td>
<td style="vertical-align: top; text-align: center">0.690 (0.011)</td>
<td style="vertical-align: top; text-align: center">0.671 (0.014)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0.05</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0.18</td>
<td style="vertical-align: top; text-align: center">0</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">IBS</td>
<td style="vertical-align: top; text-align: center">0.144 (0.003)</td>
<td style="vertical-align: top; text-align: center">0.144 (0.003)</td>
<td style="vertical-align: top; text-align: center">0.145 (0.003)</td>
<td style="vertical-align: top; text-align: center">0.148 (0.003)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0.05</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0.05</td>
<td style="vertical-align: top; text-align: center">0</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=6‖t=3)</td>
<td style="vertical-align: top; text-align: center">0.721 (0.022)</td>
<td style="vertical-align: top; text-align: center">0.721 (0.022)</td>
<td style="vertical-align: top; text-align: center">0.713 (0.023)</td>
<td style="vertical-align: top; text-align: center">0.699 (0.023)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0.4</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0.14</td>
<td style="vertical-align: top; text-align: center">0.02</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=12‖t=6)</td>
<td style="vertical-align: top; text-align: center">0.741 (0.017)</td>
<td style="vertical-align: top; text-align: center">0.742 (0.017)</td>
<td style="vertical-align: top; text-align: center">0.735 (0.018)</td>
<td style="vertical-align: top; text-align: center">0.717 (0.020)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0.33</td>
<td style="vertical-align: top; text-align: center">-</td>
<td style="vertical-align: top; text-align: center">0.08</td>
<td style="vertical-align: top; text-align: center">0</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=18‖t=12)</td>
<td style="vertical-align: top; text-align: center">0.756 (0.014)</td>
<td style="vertical-align: top; text-align: center">0.758 (0.014)</td>
<td style="vertical-align: top; text-align: center">0.754 (0.014)</td>
<td style="vertical-align: top; text-align: center">0.732 (0.018)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">% wrong</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.25</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">-</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.15</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Note: The numbers outside the parentheses represent the means, while those inside the parentheses are the standard deviations. C-index: concordance index; IBS: integrated Brier score; AUC: area under the receiver operating characteristic curve; Cox-post-baseline: Cox proportional hazards model with both baseline and post-baseline predictors; Cox-baseline: Cox proportional hazards model with baseline predictors only; SRF-baseline: survival random forest with baseline predictors only; LTRCART-post-baseline: left-truncated and right-censored tree based on the classification and regression tree, incorporating both baseline and post-baseline predictors. AUC(t=x ‖ t=y) represents the time-varying AUC used to predict the presence of an event at time x, given the post-baseline predictors up to time y. % wrong represents the proportion of simulation replications in which the specific misspecified model achieves the best model evaluation metric among the four fitted models.</p>
</table-wrap-foot>
</table-wrap>
<p>When comparing the tree model that incorporates both baseline and post-baseline covariates (LTRCART-post-baseline) with the SRF that uses only baseline covariates (SRF-baseline), we observe that despite its ability to account for post-baseline predictors, the performance of LTRCART-post-baseline is inferior to that of SRF-baseline, which does not model the effective post-baseline predictors. A possible explanation for this is that SRF, due to its ensemble nature, is better equipped to capture the continuous covariate effects present in the true data-generating model. In contrast, LTRCART relies on a single-tree structure, which represents relationships using step functions. This approach is inefficient for modeling the continuous effects of predictors, limiting its performance in scenarios where such effects are significant.</p>
<p>In simulation setting 2, data are generated under a Cox proportional hazards model with effective baseline covariates only. The results for this setting are presented in Table <xref rid="j_nejsds98_tab_003">3</xref>. Under this scenario, the Cox model with baseline covariates (Cox-baseline) and the Cox model incorporating both baseline and post-baseline covariates (Cox-post-baseline) yield similar overall values for the C-index, integrated Brier score, and time-dependent AUC. When the models are compared within each replication, the C-index and the integrated Brier score distinguish the true model from the Cox-post-baseline model reasonably well: the latter is selected in 5% of replications under either metric. In contrast, the AUC metrics often prefer the over-fitted Cox-post-baseline model, which includes ineffective predictors, selecting it in 25% to 40% of replications. Similarly, the SRF model using only baseline covariates (SRF-baseline) shows comparable overall performance, indicating that post-baseline covariates provide no additional predictive value when the true underlying model involves only baseline covariates. All evaluation metrics tend to select the parametric Cox-baseline model over the nonparametric SRF-baseline model when the parametric assumptions hold. The model evaluation metrics for LTRCART-post-baseline, however, are inferior to those of the other methods, likely for the same reasons observed in setting 1. Overall, these evaluation metrics effectively distinguish models that adequately characterize the effects of relevant predictors from those that do not fit the data, but they are less sensitive in identifying over-fitted models such as Cox-post-baseline.</p>
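<p>The per-replication comparison summarized in the ‘% wrong’ rows can be sketched as follows. This Python example assumes that the metric values are stored as one dictionary per replication, keyed by model name; the function name <monospace>proportion_wrong</monospace>, the data layout, and the tie-breaking rule (the first model in dictionary order wins a tie) are illustrative assumptions rather than the procedure used to produce the tables.</p>
<preformat>
```python
def proportion_wrong(scores, true_model, higher_is_better=True):
    """Proportion of replications in which each misspecified model is selected as best.

    scores: list of dicts, one per replication, mapping model name to metric value.
    Set higher_is_better=False for metrics such as the integrated Brier score.
    """
    pick = max if higher_is_better else min
    counts = {name: 0 for name in scores[0] if name != true_model}
    for rep in scores:
        best = pick(rep, key=rep.get)  # ties go to the first model in dictionary order
        if best != true_model:
            counts[best] += 1
    return {name: count / len(scores) for name, count in counts.items()}
```
</preformat>
<p>With 100 replications, a returned value of 0.05 for a given model means that this model attained the best metric value in 5 replications.</p>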
<p>In simulation setting 3, data are generated based on a survival tree model where effective post-baseline time-varying covariates play a critical role in forming the tree structure. The results for this setting, shown in Table <xref rid="j_nejsds98_tab_004">4</xref>, demonstrate key findings related to model performance when nonlinear relationships and post-baseline predictors are involved. ML models, particularly survival tree-based methods, outperform the Cox proportional hazards model in identifying complex, nonlinear predictor impacts and interactions. This highlights the strength of ML approaches when dealing with intricate relationships that deviate from the linear assumptions of the Cox model.</p>
<table-wrap id="j_nejsds98_tab_004">
<label>Table 4</label>
<caption>
<p>Simulation results of setting 3.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: double; border-bottom: solid thin">Metric</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-post-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">SRF-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">LTRCART-post-baseline</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">C-index</td>
<td style="vertical-align: top; text-align: center">0.481 (0.015)</td>
<td style="vertical-align: top; text-align: center">0.500 (0.011)</td>
<td style="vertical-align: top; text-align: center">0.613 (0.017)</td>
<td style="vertical-align: top; text-align: center">0.601 (0.050)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0</td>
<td style="vertical-align: top; text-align: center">0.07</td>
<td style="vertical-align: top; text-align: center">0.54</td>
<td style="vertical-align: top; text-align: center">-</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">IBS</td>
<td style="vertical-align: top; text-align: center">0.172 (0.001)</td>
<td style="vertical-align: top; text-align: center">0.172 (0.001)</td>
<td style="vertical-align: top; text-align: center">0.161 (0.003)</td>
<td style="vertical-align: top; text-align: center">0.161 (0.010)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0.11</td>
<td style="vertical-align: top; text-align: center">0.12</td>
<td style="vertical-align: top; text-align: center">0.53</td>
<td style="vertical-align: top; text-align: center">-</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=6‖t=3)</td>
<td style="vertical-align: top; text-align: center">0.490 (0.023)</td>
<td style="vertical-align: top; text-align: center">0.499 (0.019)</td>
<td style="vertical-align: top; text-align: center">0.711 (0.024)</td>
<td style="vertical-align: top; text-align: center">0.674 (0.087)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0.04</td>
<td style="vertical-align: top; text-align: center">0.08</td>
<td style="vertical-align: top; text-align: center">0.63</td>
<td style="vertical-align: top; text-align: center">-</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=12‖t=6)</td>
<td style="vertical-align: top; text-align: center">0.485 (0.021)</td>
<td style="vertical-align: top; text-align: center">0.498 (0.018)</td>
<td style="vertical-align: top; text-align: center">0.650 (0.022)</td>
<td style="vertical-align: top; text-align: center">0.640 (0.070)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">% wrong</td>
<td style="vertical-align: top; text-align: center">0.01</td>
<td style="vertical-align: top; text-align: center">0.05</td>
<td style="vertical-align: top; text-align: center">0.40</td>
<td style="vertical-align: top; text-align: center">-</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=18‖t=12)</td>
<td style="vertical-align: top; text-align: center">0.478 (0.025)</td>
<td style="vertical-align: top; text-align: center">0.499 (0.020)</td>
<td style="vertical-align: top; text-align: center">0.614 (0.022)</td>
<td style="vertical-align: top; text-align: center">0.624 (0.064)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">% wrong</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.02</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.06</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.36</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">-</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Note: The numbers outside the parentheses represent the means, while those inside the parentheses are the standard deviations. C-index: concordance index; IBS: integrated Brier score; AUC: area under the receiver operating characteristic curve; Cox-post-baseline: Cox proportional hazards model with both baseline and post-baseline predictors; Cox-baseline: Cox proportional hazards model with baseline predictors only; SRF-baseline: survival random forest with baseline predictors only; LTRCART-post-baseline: left-truncated and right-censored tree based on the classification and regression tree, incorporating both baseline and post-baseline predictors. AUC(t=x ‖ t=y) represents the time-varying AUC used to predict the presence of an event at time x, given the post-baseline predictors up to time y. % wrong represents the proportion of simulation replications in which the specific misspecified model achieves the best model evaluation metric among the four fitted models.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap id="j_nejsds98_tab_005">
<label>Table 5</label>
<caption>
<p>Simulation results of setting 4.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: double; border-bottom: solid thin">Metric</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-post-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">Cox-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">SRF-baseline</td>
<td style="vertical-align: top; text-align: center; border-top: double; border-bottom: solid thin">LTRCART-post-baseline</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">C-index</td>
<td style="vertical-align: top; text-align: center">0.599 (0.024)</td>
<td style="vertical-align: top; text-align: center">0.600 (0.023)</td>
<td style="vertical-align: top; text-align: center">0.597 (0.021)</td>
<td style="vertical-align: top; text-align: center">0.584 (0.035)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">IBS</td>
<td style="vertical-align: top; text-align: center">0.164 (0.005)</td>
<td style="vertical-align: top; text-align: center">0.163 (0.004)</td>
<td style="vertical-align: top; text-align: center">0.164 (0.004)</td>
<td style="vertical-align: top; text-align: center">0.166 (0.006)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=6‖t=3)</td>
<td style="vertical-align: top; text-align: center">0.629 (0.039)</td>
<td style="vertical-align: top; text-align: center">0.621 (0.042)</td>
<td style="vertical-align: top; text-align: center">0.603 (0.041)</td>
<td style="vertical-align: top; text-align: center">0.584 (0.056)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">AUC (t=12‖t=6)</td>
<td style="vertical-align: top; text-align: center">0.639 (0.034)</td>
<td style="vertical-align: top; text-align: center">0.633 (0.032)</td>
<td style="vertical-align: top; text-align: center">0.627 (0.037)</td>
<td style="vertical-align: top; text-align: center">0.609 (0.050)</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">AUC (t=18‖t=12)</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.650 (0.040)</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.646 (0.038)</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.647 (0.038)</td>
<td style="vertical-align: top; text-align: center; border-bottom: solid thin">0.627 (0.056)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>Note: The numbers outside the parentheses represent the means, while those inside the parentheses are the standard deviations. C-index: concordance index; IBS: integrated Brier score; AUC: area under the receiver operating characteristic curve; Cox-post-baseline: Cox proportional hazards model with both baseline and post-baseline predictors; Cox-baseline: Cox proportional hazards model with baseline predictors only; SRF-baseline: survival random forest with baseline predictors only; LTRCART-post-baseline: left-truncated and right-censored tree based on the classification and regression tree, incorporating both baseline and post-baseline predictors. AUC(t=x ‖ t=y) represents the time-varying AUC used to predict the presence of an event at time x, given the post-baseline predictors up to time y.</p>
</table-wrap-foot>
</table-wrap>
<p>Among the evaluated ML models, the SRF model using baseline predictors shows slightly better performance, reflecting its robustness in this setting. However, the LTRCART model incorporating post-baseline predictors exhibits notable challenges. The results indicate that LTRCART requires larger sample sizes to stabilize its performance, as the metrics for this model display considerable variation. A larger sample size, such as <inline-formula id="j_nejsds98_ineq_042"><alternatives><mml:math>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>2000</mml:mn></mml:math><tex-math><![CDATA[$n=2000$]]></tex-math></alternatives></inline-formula>, is necessary to reduce variability and enhance the reliability of LTRCART in capturing the nonlinear effects of post-baseline covariates. Overall, this setting emphasizes the superiority of ML models over the Cox model in identifying complex predictor impacts and interactions. It also underscores the importance of sufficient sample size when applying tree-based models with post-baseline time-varying covariates to ensure stable and accurate performance.</p>
<p>In simulation setting 4, no model is correctly specified to represent the true data-generating mechanism, creating a scenario where all evaluated models are misspecified. The results, shown in Table <xref rid="j_nejsds98_tab_005">5</xref>, reflect the challenges associated with model misspecification when the sample size is moderate. Interestingly, despite the non-linearity of the logarithmic hazard ratio function, Cox proportional hazards models perform similarly to the Survival Random Forest (SRF) model. This indicates that with a moderate sample size, the Cox models can still approximate the relationships reasonably well, even when the assumptions about linearity and proportional hazards are violated. The comparable performance of the SRF and Cox models suggests that moderate sample sizes may limit the ability to detect nonlinearity effectively. Overall, this setting highlights the robustness of the Cox model under mild misspecification and demonstrates that larger sample sizes may be necessary for ML models such as SRF to fully exploit their ability to capture complex nonlinear relationships in survival data.</p>
</sec>
</sec>
<sec id="j_nejsds98_s_010">
<label>5</label>
<title>Discussion</title>
<p>In this study, we systematically evaluated and compared the predictive performance of the Cox proportional hazards (PH) model and ML methods, including tree-based models, for analyzing time-to-event (TTE) outcomes. We investigated four distinct simulation settings that varied in complexity, incorporating both baseline and post-baseline time-varying covariates, as well as scenarios where no model was correctly specified. To assess model performance, we used the concordance index (C-index), the integrated Brier score (IBS), and the time-dependent area under the receiver operating characteristic curve (AUC).</p>
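<p>As a rough illustration of the first of these metrics, Harrell's C-index for right-censored data can be sketched in a few lines of Python. The function name <monospace>harrell_c_index</monospace> and the toy data are hypothetical and are not taken from the R code accompanying this paper; the sketch skips tied observation times and does not handle time-varying covariates.</p>
<preformat preformat-type="code"><![CDATA[
```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    A pair is comparable when the subject with the shorter observed time
    had an event. It is concordant when that subject also has the higher
    predicted risk; tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified version
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: true ordering unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: time in months, event indicator, predicted risk.
times = [5, 8, 12, 15, 20]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.4, 0.3, 0.7, 0.1]
c_index = harrell_c_index(times, events, risk)
```
]]></preformat>
<p>In this toy example, six of the seven comparable pairs are ordered correctly, giving a C-index of about 0.857.</p>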
<p>The simulation studies highlight the importance of correct model specification, the strengths and limitations of ML approaches, and the role of sample size in model performance. When a fitted model matches the true data-generating model (e.g., Cox-post-baseline in Setting 1), it consistently outperforms the alternatives, which emphasizes the value of including both baseline and post-baseline covariates when they are relevant. ML models such as SRF excel at capturing complex nonlinear relationships and interactions, as seen in Setting 3, but require larger sample sizes to stabilize performance, particularly tree-based models such as LTRCART. When only baseline covariates are relevant (Setting 2), post-baseline predictors add no value, and simpler models perform comparably. However, evaluation metrics such as the time-dependent AUC can fail to identify over-fitted models. When every model is misspecified (Setting 4), the Cox model is robust and performs similarly to the ML models even though its assumptions are violated. The Cox model may therefore serve as a valuable option in practical applications, offering performance comparable to that of ML methods, particularly when the sample size is moderate and/or the model assumptions, such as proportional hazards and linear effects of predictors on the log hazard ratio, are reasonably satisfied. In general, the studies underscore the utility of ML methods for complex data structures, the robustness of traditional models under mild misspecification, and the need for sufficient sample sizes for reliable performance of ML approaches.</p>
<p>Several studies in the literature have investigated model comparison criteria that penalize model complexity for TTE outcomes with baseline predictors only. Karabey and Tutkun [<xref ref-type="bibr" rid="j_nejsds98_ref_033">33</xref>] used the Akaike Information Criterion [AIC; <xref ref-type="bibr" rid="j_nejsds98_ref_002">2</xref>] and the Bayesian Information Criterion [BIC; <xref ref-type="bibr" rid="j_nejsds98_ref_043">43</xref>] to compare nested survival models. Habibi et al. [<xref ref-type="bibr" rid="j_nejsds98_ref_024">24</xref>] employed AIC to compare various survival models, including exponential, Weibull, Gompertz, log-normal, log-logistic, and generalized gamma models. Ozaki and Ninomiya [<xref ref-type="bibr" rid="j_nejsds98_ref_039">39</xref>] utilized AIC to identify change-points in the Cox proportional hazards model. Similarly, Fagbamigbe et al. [<xref ref-type="bibr" rid="j_nejsds98_ref_017">17</xref>] compared parametric and semi-parametric survival models using both AIC and BIC. However, these model evaluation methods require the specification of a likelihood function and may not be directly applicable to many machine learning models. More research is needed to develop model evaluation metrics that are more effective in detecting and addressing over-fitting in machine learning contexts and with post-baseline predictors.</p>
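<p>For reference, the two criteria cited above take the standard forms shown below. The notation is introduced here for illustration only and is not used elsewhere in this paper: <inline-formula><tex-math><![CDATA[$\hat{L}$]]></tex-math></inline-formula> denotes the maximized likelihood of a fitted model (the partial likelihood in the case of the Cox model), <inline-formula><tex-math><![CDATA[$k$]]></tex-math></inline-formula> the number of estimated parameters, and <inline-formula><tex-math><![CDATA[$n$]]></tex-math></inline-formula> the sample size.</p>
<disp-formula><tex-math><![CDATA[$$\mathrm{AIC} = -2\log \hat{L} + 2k, \qquad \mathrm{BIC} = -2\log \hat{L} + k\log n.$$]]></tex-math></disp-formula>
<p>Both criteria can be evaluated only when such a likelihood is available, which is why they do not carry over directly to algorithmic learners such as random forests.</p>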
<p>For ML methods with post-baseline predictors, we only considered the tree model LTRCART and did not include ensemble methods. In the literature, Yao et al. [<xref ref-type="bibr" rid="j_nejsds98_ref_056">56</xref>] proposed ensemble approaches to estimate survival functions with time-varying covariates, based on conditional inference [<xref ref-type="bibr" rid="j_nejsds98_ref_051">51</xref>] and relative risk forests [<xref ref-type="bibr" rid="j_nejsds98_ref_030">30</xref>]. These methods are implemented in the R package <italic>LTRCforests</italic> [<xref ref-type="bibr" rid="j_nejsds98_ref_057">57</xref>]. However, as noted in this study, the number of pseudo-subjects can be substantially larger than the number of true subjects, leading to computational challenges. The computational demands of <italic>LTRCforests</italic> are significantly higher than those of SRF and LTRCART, exceeding our computational capacity for simulation studies. Furthermore, the large sample size requirement observed for LTRCART in simulation studies is likely applicable to <italic>LTRCforests</italic> as well. Further research in this direction is warranted to better understand the performance of ensemble tree methods for time-to-event outcomes with post-baseline predictors.</p>
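<p>To make the growth in the number of pseudo-subjects concrete, the following Python sketch splits each subject's follow-up into (start, stop] intervals, one per covariate measurement, in the counting-process style used for left-truncated and right-censored data. The function <monospace>to_pseudo_subjects</monospace>, the three-month visit schedule, and the toy records are hypothetical illustrations and are not the data structures used in our R code.</p>
<preformat preformat-type="code"><![CDATA[
```python
def to_pseudo_subjects(subject_id, visit_times, covariate_values,
                       end_time, event):
    """Split one subject into (start, stop] rows, one per covariate update.

    visit_times must be increasing and start at 0 (baseline). Only the row
    that ends at `end_time` can carry the event; earlier rows are treated
    as censored at their `stop` time.
    """
    rows = []
    for k, (start, value) in enumerate(zip(visit_times, covariate_values)):
        if start >= end_time:
            break  # measured at or after the end of follow-up: not used
        if k + 1 < len(visit_times):
            next_visit = visit_times[k + 1]
        else:
            next_visit = end_time
        stop = min(next_visit, end_time)
        is_last = stop == end_time
        rows.append({
            "id": subject_id,
            "start": start,
            "stop": stop,
            "x": value,
            "event": int(bool(event) and is_last),
        })
    return rows


# Hypothetical cohort: one covariate measured every 3 months.
subjects = [
    # (id, visit times, covariate values, end of follow-up, event)
    (1, [0, 3, 6, 9], [1.2, 1.5, 1.9, 2.4], 10.5, 1),
    (2, [0, 3, 6, 9], [0.8, 0.7, 0.9, 1.0], 4.0, 0),
    (3, [0, 3, 6, 9], [1.0, 1.1, 1.3, 1.2], 12.0, 0),
]

pseudo = []
for sid, visits, values, end, ev in subjects:
    pseudo.extend(to_pseudo_subjects(sid, visits, values, end, ev))

n_subjects = len(subjects)
n_pseudo = len(pseudo)
```
]]></preformat>
<p>Here three subjects yield ten pseudo-subject rows, and the ratio grows with the number of post-baseline measurements per subject.</p>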
<p>It is important to emphasize that our comparison was not intended to serve as an exhaustive benchmark of all machine learning approaches for survival prediction. Rather, we deliberately focused on methods that are most appropriate for clinical trial settings characterized by moderate sample sizes and post-baseline time-varying covariates. We did not include gradient boosting or deep learning–based survival models for three interrelated reasons. First, sample size considerations play a critical role. Deep learning models generally require substantially larger datasets to achieve stable optimization and outperform classical approaches. As demonstrated by Billichová et al. [<xref ref-type="bibr" rid="j_nejsds98_ref_005">5</xref>], DeepSurv required a sample size of approximately 6,000 to match the performance of the Cox model. Likewise, Silvey and Liu [<xref ref-type="bibr" rid="j_nejsds98_ref_044">44</xref>] showed that gradient boosting methods required considerably larger sample sizes than random forests to achieve stable AUC estimates in clinical tabular data. These requirements can exceed the typical sample sizes available in Phase II and III clinical trials. Second, hyperparameter robustness was a key consideration. Survival random forests were selected because random forest–based methods have been shown to exhibit low hyperparameter sensitivity, with default parameter settings often yielding near-optimal performance [<xref ref-type="bibr" rid="j_nejsds98_ref_042">42</xref>]. In contrast, gradient boosting methods rely on extensive tuning of multiple interdependent hyperparameters, including learning rate, tree depth, number of boosting iterations, and regularization parameters, to achieve optimal performance [<xref ref-type="bibr" rid="j_nejsds98_ref_042">42</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_006">6</xref>]. 
In simulation studies, this sensitivity can introduce variability that reflects the efficiency of the tuning strategy rather than the intrinsic predictive capability of the method itself. Third, overfitting concerns are particularly relevant in smaller or noisy datasets. Boosting algorithms such as AdaBoost are known to overemphasize misclassified observations, which may include mislabeled or noisy data points, leading to excessive fitting of noise rather than the underlying signal [<xref ref-type="bibr" rid="j_nejsds98_ref_016">16</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_019">19</xref>]. Although regularization and early stopping can mitigate these effects, their successful application further depends on careful hyperparameter tuning, compounding the robustness issues discussed above [<xref ref-type="bibr" rid="j_nejsds98_ref_010">10</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_052">52</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_004">4</xref>]. For these reasons, we restricted our comparison to the Cox model, the tree-based method for LTRC data (LTRCART), and the survival random forest, which are computationally feasible, robust at moderate sample sizes, and well suited for time-varying covariates. Future work leveraging large-scale real-world evidence databases may enable more comprehensive comparisons that include gradient boosting and deep learning methods under conditions where their advantages can be more fully realized.</p>
<p>Handling missing data is critical in data analysis and modeling, particularly in survival analysis with longitudinal post-baseline predictors. In the current paper, we consider statistical and ML methods only for datasets without missingness. Specifically, only pseudo-subjects with complete observations of the baseline and post-baseline covariates under consideration are retained for model fitting. This approach allows us to focus on the fitting and evaluation of TTE models. In practice, however, such an approach could lead to a substantial reduction in sample size and to potential bias when the missingness mechanism is not trivial, for example, when the data are not missing at random [<xref ref-type="bibr" rid="j_nejsds98_ref_037">37</xref>]. Methods for fitting the Cox model with missing data have been proposed in the literature [<xref ref-type="bibr" rid="j_nejsds98_ref_012">12</xref>, <xref ref-type="bibr" rid="j_nejsds98_ref_054">54</xref>]. SRF includes an adaptive tree imputation algorithm that handles missing covariates and outcomes during the tree growth and prediction phases [<xref ref-type="bibr" rid="j_nejsds98_ref_031">31</xref>]. Evaluating TTE models with missing data will be considered in future studies. The R code supporting the computation in this paper is available at <uri>https://github.com/zhaohualu/SurvivalPredictiveModelEvaluation</uri>.</p>
</sec>
</body>
<back>
<ack id="j_nejsds98_ack_001">
<title>Disclosure</title>
<p>ZL and PH are employees of Daiichi Sankyo, Inc. and may own its stock.</p></ack>
<ack id="j_nejsds98_ack_002">
<title>Disclaimer</title>
<p>Contributions by the authors are solely their own and are not intended to express the views of their employer.</p></ack>
<ref-list id="j_nejsds98_reflist_001">
<title>References</title>
<ref id="j_nejsds98_ref_001">
<label>[1]</label><mixed-citation publication-type="book"><string-name><surname>Agresti</surname>, <given-names>A.</given-names></string-name> (<year>2010</year>) <source>Analysis of Ordinal Categorical Data</source> <edition>2nd</edition> ed. <series>Wiley Series in Probability and Statistics</series>. <publisher-name>John Wiley &amp; Sons</publisher-name>, <publisher-loc>Hoboken, NJ</publisher-loc>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/9780470594001" xlink:type="simple">https://doi.org/10.1002/9780470594001</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2742515">MR2742515</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_002">
<label>[2]</label><mixed-citation publication-type="journal"><string-name><surname>Akaike</surname>, <given-names>H.</given-names></string-name> (<year>1974</year>). <article-title>A New Look at the Statistical Model Identification</article-title>. <source>IEEE Transactions on Automatic Control</source> <volume>19</volume>(<issue>6</issue>) <fpage>716</fpage>–<lpage>723</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/TAC.1974.1100705" xlink:type="simple">https://doi.org/10.1109/TAC.1974.1100705</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=0423716">MR0423716</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_003">
<label>[3]</label><mixed-citation publication-type="journal"><string-name><surname>Andersen</surname>, <given-names>P. K.</given-names></string-name> and <string-name><surname>Gill</surname>, <given-names>R. D.</given-names></string-name> (<year>1982</year>). <article-title>Cox’s Regression Model for Counting Processes: A Large Sample Study</article-title>. <source>The Annals of Statistics</source> <volume>10</volume>(<issue>4</issue>) <fpage>1100</fpage>–<lpage>1120</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/aos/1176345976" xlink:type="simple">https://doi.org/10.1214/aos/1176345976</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_004">
<label>[4]</label><mixed-citation publication-type="journal"><string-name><surname>Bentéjac</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Csörgő</surname>, <given-names>A.</given-names></string-name> and <string-name><surname>Martínez-Muñoz</surname>, <given-names>G.</given-names></string-name> (<year>2021</year>). <article-title>A comparative analysis of gradient boosting algorithms</article-title>. <source>Artificial Intelligence Review</source> <volume>54</volume>(<issue>3</issue>) <fpage>1937</fpage>–<lpage>1967</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s10462-020-09896-5" xlink:type="simple">https://doi.org/10.1007/s10462-020-09896-5</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_005">
<label>[5]</label><mixed-citation publication-type="journal"><string-name><surname>Billichová</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Coan</surname>, <given-names>L. J.</given-names></string-name>, <string-name><surname>Czanner</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Kováčová</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Sharifian</surname>, <given-names>F.</given-names></string-name> and <string-name><surname>Czanner</surname>, <given-names>G.</given-names></string-name> (<year>2024</year>). <article-title>Comparing the performance of statistical, machine learning, and deep learning algorithms to predict time-to-event: A simulation study for conversion to mild cognitive impairment</article-title>. <source>PLOS ONE</source> <volume>19</volume>(<issue>1</issue>) <fpage>0297190</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1371/journal.pone.0297190" xlink:type="simple">https://doi.org/10.1371/journal.pone.0297190</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_006">
<label>[6]</label><mixed-citation publication-type="journal"><string-name><surname>Boldini</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Grisoni</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Kuhn</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Friedrich</surname>, <given-names>L.</given-names></string-name> and <string-name><surname>Sieber</surname>, <given-names>S. A.</given-names></string-name> (<year>2023</year>). <article-title>Practical guidelines for the use of gradient boosting for molecular property prediction</article-title>. <source>Journal of Cheminformatics</source> <volume>15</volume>(<issue>1</issue>) <fpage>73</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1186/s13321-023-00743-7" xlink:type="simple">https://doi.org/10.1186/s13321-023-00743-7</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_007">
<label>[7]</label><mixed-citation publication-type="journal"><string-name><surname>Breiman</surname>, <given-names>L.</given-names></string-name> (<year>2001</year>). <article-title>Random forests</article-title>. <source>Machine Learning</source> <volume>45</volume>(<issue>1</issue>) <fpage>5</fpage>–<lpage>32</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_008">
<label>[8]</label><mixed-citation publication-type="journal"><string-name><surname>Breslow</surname>, <given-names>N. E.</given-names></string-name> (<year>1972</year>). <article-title>Contribution to the Discussion of the Paper by D. R. Cox</article-title>. <source>Journal of the Royal Statistical Society: Series B (Methodological)</source> <volume>34</volume> <fpage>187</fpage>–<lpage>220</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_009">
<label>[9]</label><mixed-citation publication-type="journal"><string-name><surname>Brier</surname>, <given-names>G. W.</given-names></string-name> (<year>1950</year>). <article-title>Verification of forecasts expressed in terms of probability</article-title>. <source>Monthly Weather Review</source> <volume>78</volume>(<issue>1</issue>) <fpage>1</fpage>–<lpage>3</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_010">
<label>[10]</label><mixed-citation publication-type="journal"><string-name><surname>Bühlmann</surname>, <given-names>P.</given-names></string-name> and <string-name><surname>Hothorn</surname>, <given-names>T.</given-names></string-name> (<year>2007</year>). <article-title>Boosting algorithms: Regularization, prediction and model fitting</article-title>. <source>Statistical Science</source> <volume>22</volume>(<issue>4</issue>) <fpage>477</fpage>–<lpage>505</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/07-STS242" xlink:type="simple">https://doi.org/10.1214/07-STS242</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2420454">MR2420454</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_011">
<label>[11]</label><mixed-citation publication-type="journal"><string-name><surname>Chen</surname>, <given-names>C. -Y.</given-names></string-name> and <string-name><surname>Chang</surname>, <given-names>Y. -W.</given-names></string-name> (<year>2024</year>). <article-title>Missing data imputation using classification and regression trees</article-title>. <source>PeerJ Computer Science</source> <volume>10</volume> <fpage>2119</fpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_012">
<label>[12]</label><mixed-citation publication-type="journal"><string-name><surname>Chen</surname>, <given-names>M. -H.</given-names></string-name>, <string-name><surname>Ibrahim</surname>, <given-names>J. G.</given-names></string-name> and <string-name><surname>Shao</surname>, <given-names>Q. -M.</given-names></string-name> (<year>2009</year>). <article-title>Maximum likelihood inference for the Cox regression model with applications to missing covariates</article-title>. <source>Journal of Multivariate Analysis</source> <volume>100</volume>(<issue>9</issue>) <fpage>2018</fpage>–<lpage>2030</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.jmva.2009.03.013" xlink:type="simple">https://doi.org/10.1016/j.jmva.2009.03.013</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2543083">MR2543083</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_013">
<label>[13]</label><mixed-citation publication-type="chapter"><string-name><surname>Chen</surname>, <given-names>T.</given-names></string-name> and <string-name><surname>Guestrin</surname>, <given-names>C.</given-names></string-name> (<year>2016</year>). <chapter-title>XGBoost: A scalable tree boosting system</chapter-title>. In <source>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source> <fpage>785</fpage>–<lpage>794</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_014">
<label>[14]</label><mixed-citation publication-type="journal"><string-name><surname>Cox</surname>, <given-names>D. R.</given-names></string-name> (<year>1972</year>). <article-title>Regression Models and Life-Tables</article-title>. <source>Journal of the Royal Statistical Society: Series B (Methodological)</source> <volume>34</volume>(<issue>2</issue>) <fpage>187</fpage>–<lpage>202</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/j.2517-6161.1972.tb00899.x" xlink:type="simple">https://doi.org/10.1111/j.2517-6161.1972.tb00899.x</ext-link>. <uri>https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.2517-6161.1972.tb00899.x</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_015">
<label>[15]</label><mixed-citation publication-type="journal"><string-name><surname>Cuthbert</surname>, <given-names>A. R.</given-names></string-name>, <string-name><surname>Giles</surname>, <given-names>L. C.</given-names></string-name>, <string-name><surname>Glonek</surname>, <given-names>G.</given-names></string-name> <etal>et al.</etal> (<year>2022</year>). <article-title>A comparison of survival models for prediction of eight-year revision risk following total knee and hip arthroplasty</article-title>. <source>BMC Medical Research Methodology</source> <volume>22</volume> <fpage>164</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1186/s12874-022-01644-3" xlink:type="simple">https://doi.org/10.1186/s12874-022-01644-3</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_016">
<label>[16]</label><mixed-citation publication-type="journal"><string-name><surname>Dietterich</surname>, <given-names>T. G.</given-names></string-name> (<year>2000</year>). <article-title>An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization</article-title>. <source>Machine Learning</source> <volume>40</volume>(<issue>2</issue>) <fpage>139</fpage>–<lpage>157</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_017">
<label>[17]</label><mixed-citation publication-type="journal"><string-name><surname>Fagbamigbe</surname>, <given-names>A. F.</given-names></string-name>, <string-name><surname>Norrman</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Bergh</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Wennerholm</surname>, <given-names>U. -B.</given-names></string-name> and <string-name><surname>Petzold</surname>, <given-names>M.</given-names></string-name> (<year>2021</year>). <article-title>Comparison of the performances of survival analysis regression models for analysis of conception modes and risk of type-1 diabetes among 1985–2015 Swedish birth cohort</article-title>. <source>PLOS ONE</source> <volume>16</volume>(<issue>6</issue>) <fpage>1</fpage>–<lpage>23</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1371/journal.pone.0253389" xlink:type="simple">https://doi.org/10.1371/journal.pone.0253389</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_018">
<label>[18]</label><mixed-citation publication-type="journal"><string-name><surname>Friedman</surname>, <given-names>J. H.</given-names></string-name> (<year>2001</year>). <article-title>Greedy function approximation: a gradient boosting machine</article-title>. <source>Annals of Statistics</source> <volume>29</volume>(<issue>5</issue>) <fpage>1189</fpage>–<lpage>1232</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/aos/1013203451" xlink:type="simple">https://doi.org/10.1214/aos/1013203451</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=1873328">MR1873328</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_019">
<label>[19]</label><mixed-citation publication-type="journal"><string-name><surname>Frénay</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Verleysen</surname>, <given-names>M.</given-names></string-name> (<year>2014</year>). <article-title>Classification in the presence of label noise: A survey</article-title>. <source>IEEE Transactions on Neural Networks and Learning Systems</source> <volume>25</volume>(<issue>5</issue>) <fpage>845</fpage>–<lpage>869</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/TNNLS.2013.2292894" xlink:type="simple">https://doi.org/10.1109/TNNLS.2013.2292894</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_020">
<label>[20]</label><mixed-citation publication-type="journal"><string-name><surname>Fu</surname>, <given-names>W.</given-names></string-name> and <string-name><surname>Simonoff</surname>, <given-names>J. S.</given-names></string-name> (<year>2016</year>). <article-title>Survival trees for left-truncated and right-censored data, with application to time-varying covariate data</article-title>. <source>Biostatistics</source> <volume>18</volume>(<issue>2</issue>) <fpage>352</fpage>–<lpage>369</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1093/biostatistics/kxw047" xlink:type="simple">https://doi.org/10.1093/biostatistics/kxw047</ext-link>. <uri>https://academic.oup.com/biostatistics/article-pdf/18/2/352/11057459/kxw047.pdf</uri>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3825124">MR3825124</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_021">
<label>[21]</label><mixed-citation publication-type="other"><string-name><surname>Fu</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Simonoff</surname>, <given-names>J.</given-names></string-name> and <string-name><surname>Jing</surname>, <given-names>W.</given-names></string-name> (2021). LTRCtrees: Survival Trees to Fit Left-Truncated and Right-Censored and Interval-Censored Survival Data. R package version 1.1.1. <uri>https://CRAN.R-project.org/package=LTRCtrees</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_022">
<label>[22]</label><mixed-citation publication-type="journal"><string-name><surname>Fu</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Jung</surname>, <given-names>A. W.</given-names></string-name>, <string-name><surname>Torne</surname>, <given-names>R. V.</given-names></string-name> <etal>et al.</etal> (<year>2020</year>). <article-title>Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis</article-title>. <source>Nature Cancer</source>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1038/s43018-020-0085-8" xlink:type="simple">https://doi.org/10.1038/s43018-020-0085-8</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_023">
<label>[23]</label><mixed-citation publication-type="journal"><string-name><surname>Graf</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Schmoor</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Sauerbrei</surname>, <given-names>W.</given-names></string-name> and <string-name><surname>Schumacher</surname>, <given-names>M.</given-names></string-name> (<year>1999</year>). <article-title>Assessment and comparison of prognostic classification schemes for survival data</article-title>. <source>Statistics in Medicine</source> <volume>18</volume>(<issue>17-18</issue>) <fpage>2529</fpage>–<lpage>2545</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_024">
<label>[24]</label><mixed-citation publication-type="journal"><string-name><surname>Habibi</surname>, <given-names>D.</given-names></string-name>, <string-name><surname>Rafiei</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Chehrei</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Shayan</surname>, <given-names>Z.</given-names></string-name> and <string-name><surname>Tafaqodi</surname>, <given-names>S.</given-names></string-name> (<year>2018</year>). <article-title>Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients</article-title>. <source>Asian Pacific Journal of Cancer Prevention</source> <volume>19</volume>(<issue>3</issue>) <fpage>749</fpage>–<lpage>753</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.22034/APJCP.2018.19.3.749" xlink:type="simple">https://doi.org/10.22034/APJCP.2018.19.3.749</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_025">
<label>[25]</label><mixed-citation publication-type="journal"><string-name><surname>Hanley</surname>, <given-names>J. A.</given-names></string-name> and <string-name><surname>McNeil</surname>, <given-names>B. J.</given-names></string-name> (<year>1982</year>). <article-title>The meaning and use of the area under a receiver operating characteristic (ROC) curve.</article-title> <source>Radiology</source> <volume>143</volume>(<issue>1</issue>) <fpage>29</fpage>–<lpage>36</lpage>. <comment>PMID: 7063747</comment>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1148/radiology.143.1.7063747" xlink:type="simple">https://doi.org/10.1148/radiology.143.1.7063747</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_026">
<label>[26]</label><mixed-citation publication-type="journal"><string-name><surname>Harrell Jr</surname>, <given-names>F. E.</given-names></string-name>, <string-name><surname>Lee</surname>, <given-names>K. L.</given-names></string-name> and <string-name><surname>Mark</surname>, <given-names>D. B.</given-names></string-name> (<year>1996</year>). <article-title>Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors</article-title>. <source>Statistics in Medicine</source> <volume>15</volume>(<issue>4</issue>) <fpage>361</fpage>–<lpage>387</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_027">
<label>[27]</label><mixed-citation publication-type="book"><string-name><surname>Hastie</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Tibshirani</surname>, <given-names>R.</given-names></string-name> and <string-name><surname>Friedman</surname>, <given-names>J.</given-names></string-name> (<year>2009</year>) <source>The Elements of Statistical Learning: Data Mining, Inference, and Prediction</source>. <publisher-name>Springer Science &amp; Business Media</publisher-name>, <publisher-loc>New York</publisher-loc>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/978-0-387-84858-7" xlink:type="simple">https://doi.org/10.1007/978-0-387-84858-7</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=2722294">MR2722294</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_028">
<label>[28]</label><mixed-citation publication-type="other"><string-name><surname>Heagerty</surname>, <given-names>P. J.</given-names></string-name> and <string-name><surname>Saha-Chaudhuri</surname>, <given-names>P.</given-names></string-name> (2022). survivalROC: Time-Dependent ROC Curve Estimation from Censored Survival Data. R package version 1.0.3.1. <uri>https://CRAN.R-project.org/package=survivalROC</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_029">
<label>[29]</label><mixed-citation publication-type="journal"><string-name><surname>Heagerty</surname>, <given-names>P. J.</given-names></string-name>, <string-name><surname>Lumley</surname>, <given-names>T.</given-names></string-name> and <string-name><surname>Pepe</surname>, <given-names>M. S.</given-names></string-name> (<year>2000</year>). <article-title>Time-dependent ROC curves for censored survival data and a diagnostic marker</article-title>. <source>Biometrics</source> <volume>56</volume>(<issue>2</issue>) <fpage>337</fpage>–<lpage>344</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_030">
<label>[30]</label><mixed-citation publication-type="journal"><string-name><surname>Ishwaran</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Blackstone</surname>, <given-names>E. H.</given-names></string-name>, <string-name><surname>Pothier</surname>, <given-names>C. E.</given-names></string-name> and <string-name><surname>Lauer</surname>, <given-names>M. S.</given-names></string-name> (<year>2004</year>). <article-title>Relative Risk Forests for Exercise Heart Rate Recovery as a Predictor of Mortality</article-title>. <source>Journal of the American Statistical Association</source> <volume>99</volume>(<issue>467</issue>) <fpage>591</fpage>–<lpage>600</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1198/016214504000000638" xlink:type="simple">https://doi.org/10.1198/016214504000000638</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_031">
<label>[31]</label><mixed-citation publication-type="journal"><string-name><surname>Ishwaran</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Kogalur</surname>, <given-names>U. B.</given-names></string-name>, <string-name><surname>Blackstone</surname>, <given-names>E. H.</given-names></string-name> and <string-name><surname>Lauer</surname>, <given-names>M. S.</given-names></string-name> (<year>2008</year>). <article-title>Random survival forests</article-title>. <source>The Annals of Applied Statistics</source> <volume>2</volume>(<issue>3</issue>) <fpage>841</fpage>–<lpage>860</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/08-AOAS169" xlink:type="simple">https://doi.org/10.1214/08-AOAS169</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_032">
<label>[32]</label><mixed-citation publication-type="other"><string-name><surname>Ishwaran</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Lauer</surname>, <given-names>M. S.</given-names></string-name>, <string-name><surname>Blackstone</surname>, <given-names>E. H.</given-names></string-name>, <string-name><surname>Lu</surname>, <given-names>M.</given-names></string-name> and <string-name><surname>Kogalur</surname>, <given-names>U. B.</given-names></string-name> (2021). <italic>randomForestSRC: random survival forests vignette</italic>. [accessed date]. <uri>http://randomforestsrc.org/articles/survival.html</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_033">
<label>[33]</label><mixed-citation publication-type="journal"><string-name><surname>Karabey</surname>, <given-names>U.</given-names></string-name> and <string-name><surname>Tutkun</surname>, <given-names>N. A.</given-names></string-name> (<year>2017</year>). <article-title>Model selection criterion in survival analysis</article-title>. <source>AIP Conference Proceedings</source> <volume>1863</volume>(<issue>1</issue>) <fpage>120003</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1063/1.4992296" xlink:type="simple">https://doi.org/10.1063/1.4992296</ext-link>. <uri>https://pubs.aip.org/aip/acp/article-pdf/doi/10.1063/1.4992296/13748246/120003_1_online.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_034">
<label>[34]</label><mixed-citation publication-type="journal"><string-name><surname>Katzman</surname>, <given-names>J. L.</given-names></string-name>, <string-name><surname>Shaham</surname>, <given-names>U.</given-names></string-name>, <string-name><surname>Cloninger</surname>, <given-names>A.</given-names></string-name> <etal>et al.</etal> (<year>2018</year>). <article-title>DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network</article-title>. <source>BMC Medical Research Methodology</source> <volume>18</volume> <fpage>24</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1186/s12874-018-0482-1" xlink:type="simple">https://doi.org/10.1186/s12874-018-0482-1</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_035">
<label>[35]</label><mixed-citation publication-type="book"><string-name><surname>Klein</surname>, <given-names>J. P.</given-names></string-name> and <string-name><surname>Moeschberger</surname>, <given-names>M. L.</given-names></string-name> (<year>2003</year>) <source>Survival Analysis: Techniques for Censored and Truncated Data</source>, <edition>2nd</edition> ed. <publisher-name>Springer</publisher-name>, <publisher-loc>New York, NY</publisher-loc>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/978-1-4419-6646-9" xlink:type="simple">https://doi.org/10.1007/978-1-4419-6646-9</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_036">
<label>[36]</label><mixed-citation publication-type="journal"><string-name><surname>Lee</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Zame</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Yoon</surname>, <given-names>J.</given-names></string-name> and <string-name><surname>van der Schaar</surname>, <given-names>M.</given-names></string-name> (<year>2018</year>). <article-title>DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks</article-title>. <source>Proceedings of the AAAI Conference on Artificial Intelligence</source> <volume>32</volume>(<issue>1</issue>). <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1609/aaai.v32i1.11842" xlink:type="simple">https://doi.org/10.1609/aaai.v32i1.11842</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_037">
<label>[37]</label><mixed-citation publication-type="book"><string-name><surname>Little</surname>, <given-names>R. J.</given-names></string-name> and <string-name><surname>Rubin</surname>, <given-names>D. B.</given-names></string-name> (<year>2019</year>) <source>Statistical analysis with missing data</source> <volume>793</volume>. <publisher-name>John Wiley &amp; Sons</publisher-name>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/9781119013563" xlink:type="simple">https://doi.org/10.1002/9781119013563</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=1925014">MR1925014</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_038">
<label>[38]</label><mixed-citation publication-type="journal"><string-name><surname>Mantel</surname>, <given-names>N.</given-names></string-name> (<year>1966</year>). <article-title>Evaluation of survival data and two new rank order statistics arising in its consideration</article-title>. <source>Cancer Chemother Rep</source> <volume>50</volume>(<issue>3</issue>) <fpage>163</fpage>–<lpage>170</lpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_039">
<label>[39]</label><mixed-citation publication-type="journal"><string-name><surname>Ozaki</surname>, <given-names>R.</given-names></string-name> and <string-name><surname>Ninomiya</surname>, <given-names>Y.</given-names></string-name> (<year>2023</year>). <article-title>Information criteria for detecting change-points in the Cox proportional hazards model</article-title>. <source>Biometrics</source> <volume>79</volume>(<issue>4</issue>) <fpage>3050</fpage>–<lpage>3065</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1111/biom.13855" xlink:type="simple">https://doi.org/10.1111/biom.13855</ext-link>. <uri>https://onlinelibrary.wiley.com/doi/pdf/10.1111/biom.13855</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_040">
<label>[40]</label><mixed-citation publication-type="journal"><string-name><surname>Park</surname>, <given-names>S. Y.</given-names></string-name>, <string-name><surname>Park</surname>, <given-names>J. E.</given-names></string-name>, <string-name><surname>Kim</surname>, <given-names>H.</given-names></string-name> and <string-name><surname>Park</surname>, <given-names>S. H.</given-names></string-name> (<year>2021</year>). <article-title>Review of Statistical Methods for Evaluating the Performance of Survival or Other Time-to-Event Prediction Models (from Conventional to Deep Learning Approaches)</article-title>. <source>Korean Journal of Radiology</source> <volume>22</volume>(<issue>10</issue>) <fpage>1697</fpage>–<lpage>1707</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.3348/kjr.2021.0223" xlink:type="simple">https://doi.org/10.3348/kjr.2021.0223</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_041">
<label>[41]</label><mixed-citation publication-type="chapter"><string-name><surname>Pölsterl</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Sarasua</surname>, <given-names>I.</given-names></string-name>, <string-name><surname>Gutiérrez-Becker</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Wachinger</surname>, <given-names>C.</given-names></string-name> (<year>2020</year>). <chapter-title>A Wide and Deep Neural Network for Survival Analysis from Anatomical Shape and Tabular Clinical Data</chapter-title>. In <source>Machine Learning and Knowledge Discovery in Databases</source> (<string-name><given-names>P.</given-names> <surname>Cellier</surname></string-name> and <string-name><given-names>K.</given-names> <surname>Driessens</surname></string-name>, eds.) <fpage>453</fpage>–<lpage>464</lpage>. <publisher-name>Springer International Publishing</publisher-name>, <publisher-loc>Cham</publisher-loc>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_042">
<label>[42]</label><mixed-citation publication-type="journal"><string-name><surname>Probst</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Bischl</surname>, <given-names>B.</given-names></string-name> and <string-name><surname>Boulesteix</surname>, <given-names>A. -L.</given-names></string-name> (<year>2019</year>). <article-title>Tunability: Importance of hyperparameters of machine learning algorithms</article-title>. <source>Journal of Machine Learning Research</source> <volume>20</volume>(<issue>53</issue>) <fpage>1</fpage>–<lpage>32</lpage>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=3948093">MR3948093</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_043">
<label>[43]</label><mixed-citation publication-type="journal"><string-name><surname>Schwarz</surname>, <given-names>G.</given-names></string-name> (<year>1978</year>). <article-title>Estimating the Dimension of a Model</article-title>. <source>The Annals of Statistics</source> <volume>6</volume>(<issue>2</issue>) <fpage>461</fpage>–<lpage>464</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1214/aos/1176344136" xlink:type="simple">https://doi.org/10.1214/aos/1176344136</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_044">
<label>[44]</label><mixed-citation publication-type="journal"><string-name><surname>Silvey</surname>, <given-names>S.</given-names></string-name> and <string-name><surname>Liu</surname>, <given-names>J.</given-names></string-name> (<year>2024</year>). <article-title>Sample size requirements for popular classification algorithms in tabular clinical data: Empirical study</article-title>. <source>Journal of Medical Internet Research</source> <volume>26</volume> <fpage>60231</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.2196/60231" xlink:type="simple">https://doi.org/10.2196/60231</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_045">
<label>[45]</label><mixed-citation publication-type="journal"><string-name><surname>Spooner</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Sowmya</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Sachdev</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Kochan</surname>, <given-names>N. A.</given-names></string-name>, <string-name><surname>Trollor</surname>, <given-names>J.</given-names></string-name> and <string-name><surname>Brodaty</surname>, <given-names>H.</given-names></string-name> (<year>2020</year>). <article-title>A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction</article-title>. <source>Scientific Reports</source> <volume>10</volume>(<issue>1</issue>) <fpage>20410</fpage>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_046">
<label>[46]</label><mixed-citation publication-type="other"><string-name><surname>Spytek</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Krzyziński</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Langbein</surname>, <given-names>S. H.</given-names></string-name>, <string-name><surname>Baniecki</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Wright</surname>, <given-names>M. N.</given-names></string-name> and <string-name><surname>Biecek</surname>, <given-names>P.</given-names></string-name> (2023). survex: an R package for explaining machine learning survival models. <italic>arXiv preprint arXiv:</italic><ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2308.16113"><italic>2308.16113</italic></ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_047">
<label>[47]</label><mixed-citation publication-type="other"><string-name><surname>Therneau</surname>, <given-names>T.</given-names></string-name> (2023). A Package for Survival Analysis in R. R Core Team. Available at: <uri>https://cran.r-project.org/web/packages/survival/vignettes/survival.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_048">
<label>[48]</label><mixed-citation publication-type="other"><string-name><surname>Therneau</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Crowson</surname>, <given-names>C.</given-names></string-name> and <string-name><surname>Atkinson</surname>, <given-names>E.</given-names></string-name> (2024). Using Time Dependent Covariates and Time Dependent Coefficients in the Cox Model.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_049">
<label>[49]</label><mixed-citation publication-type="other"><string-name><surname>Therneau</surname>, <given-names>T. M.</given-names></string-name> (2023). A Package for Survival Analysis in R. R package version 3.5-5. <uri>https://CRAN.R-project.org/package=survival</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_050">
<label>[50]</label><mixed-citation publication-type="book"><string-name><surname>Therneau</surname>, <given-names>T. M.</given-names></string-name> and <string-name><surname>Grambsch</surname>, <given-names>P. M.</given-names></string-name> (<year>2000</year>) <source>Modeling Survival Data: Extending the Cox Model</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>New York</publisher-loc>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/978-1-4757-3294-8" xlink:type="simple">https://doi.org/10.1007/978-1-4757-3294-8</ext-link>. <ext-link ext-link-type="uri" xlink:href="https://mathscinet.ams.org/mathscinet-getitem?mr=1774977">MR1774977</ext-link></mixed-citation>
</ref>
<ref id="j_nejsds98_ref_051">
<label>[51]</label><mixed-citation publication-type="journal"><string-name><surname>Hothorn</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Hornik</surname>, <given-names>K.</given-names></string-name> and <string-name><surname>Zeileis</surname>, <given-names>A.</given-names></string-name> (<year>2006</year>). <article-title>Unbiased Recursive Partitioning: A Conditional Inference Framework</article-title>. <source>Journal of Computational and Graphical Statistics</source> <volume>15</volume>(<issue>3</issue>) <fpage>651</fpage>–<lpage>674</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1198/106186006X133933" xlink:type="simple">https://doi.org/10.1198/106186006X133933</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_052">
<label>[52]</label><mixed-citation publication-type="journal"><string-name><surname>Wang</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Zhuang</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Zheng</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Fan</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Kong</surname>, <given-names>J.</given-names></string-name> and <string-name><surname>Zhan</surname>, <given-names>J.</given-names></string-name> (<year>2021</year>). <article-title>Application of Bayesian Hyperparameter Optimized Random Forest and XGBoost Model for Landslide Susceptibility Mapping</article-title>. <source>Frontiers in Earth Science</source> <volume>9</volume> <fpage>712240</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.3389/feart.2021.712240" xlink:type="simple">https://doi.org/10.3389/feart.2021.712240</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_053">
<label>[53]</label><mixed-citation publication-type="other"><string-name><surname>Wang</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Chen</surname>, <given-names>K.</given-names></string-name> and <string-name><surname>Yan</surname>, <given-names>J.</given-names></string-name> (2021). intsurv: Integrative Survival Models. R package version 0.2.2. <uri>https://github.com/wenjie2wang/intsurv</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_054">
<label>[54]</label><mixed-citation publication-type="journal"><string-name><surname>White</surname>, <given-names>I. R.</given-names></string-name> and <string-name><surname>Royston</surname>, <given-names>P.</given-names></string-name> (<year>2009</year>). <article-title>Imputing missing covariate values for the Cox model</article-title>. <source>Statistics in Medicine</source> <volume>28</volume>(<issue>15</issue>) <fpage>1982</fpage>–<lpage>1998</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1002/sim.3618" xlink:type="simple">https://doi.org/10.1002/sim.3618</ext-link>. <uri>https://onlinelibrary.wiley.com/doi/pdf/10.1002/sim.3618</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_055">
<label>[55]</label><mixed-citation publication-type="other"><string-name><surname>Yao</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Frydman</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Larocque</surname>, <given-names>D.</given-names></string-name> and <string-name><surname>Simonoff</surname>, <given-names>J. S.</given-names></string-name> (2022). <italic>Ensemble Methods for Survival Function Estimation with Time-Varying Covariates</italic>. arXiv. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.48550/arXiv.2006.00567" xlink:type="simple">https://doi.org/10.48550/arXiv.2006.00567</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_056">
<label>[56]</label><mixed-citation publication-type="journal"><string-name><surname>Yao</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Frydman</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Larocque</surname>, <given-names>D.</given-names></string-name> and <string-name><surname>Simonoff</surname>, <given-names>J. S.</given-names></string-name> (<year>2022</year>). <article-title>Ensemble methods for survival function estimation with time-varying covariates</article-title>. <source>Statistical Methods in Medical Research</source> <volume>31</volume>(<issue>11</issue>) <fpage>2217</fpage>–<lpage>2236</lpage>. <comment>PMID: 35895510</comment>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1177/09622802221111549" xlink:type="simple">https://doi.org/10.1177/09622802221111549</ext-link>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_057">
<label>[57]</label><mixed-citation publication-type="other"><string-name><surname>Yao</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Frydman</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Larocque</surname>, <given-names>D.</given-names></string-name> and <string-name><surname>Simonoff</surname>, <given-names>J. S.</given-names></string-name> (2023). LTRCforests: Ensemble Methods for Survival Data with Time-Varying Covariates. R package version 0.7.0. <uri>https://CRAN.R-project.org/package=LTRCforests</uri>.</mixed-citation>
</ref>
<ref id="j_nejsds98_ref_058">
<label>[58]</label><mixed-citation publication-type="other"><string-name><surname>Zhou</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Cheng</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Zou</surname>, <given-names>Y.</given-names></string-name> and <string-name><surname>Wang</surname>, <given-names>H.</given-names></string-name> (2022). SurvMetrics: Predictive Evaluation Metrics in Survival Analysis. R package version 0.5.0. <uri>https://CRAN.R-project.org/package=SurvMetrics</uri>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
