1 Introduction
Modeling time-to-event data for cancer treatments has become an increasingly important area of research. In recent years, survival models with cure fraction, also called cure models, have gained increasing interest, in which a considerable proportion of patients are considered “cured,” that is, to remain disease-free after a certain follow-up period. It has been used for different forms of cancer such as breast cancer, non-Hodgkins lymphoma, leukemia, prostate cancer, melanoma, and head and neck cancer [13].
Two types of cure models, the mixture cure model (MCM) and the promotion time cure model (PTCM), are commonly used. The mixture cure model (MCM), proposed by Berkson and Gage [1], is formed by two mixture components, and it has been commonly used and discussed in [7, 8, 12, 23]. Other than the standard mixture model for cure rates, Chen et al. [3] proposed a promotion time cure model (PTCM) based on tumor growth characteristics using the Poisson distribution. These models are built on different assumptions, challenging researchers to choose the appropriate model for their study and interpret the different definitions of the treatment effect. The connection between the two types of models was discussed in [3], and Yin and Ibrahim [24] proposed a unified approach to bridge the mixture cure model and the promotion cure model via Box-Cox transformation. However, with different choices of the power parameter, the interpretations of the parameters are different and, thus, it is not straightforward to define the treatment effect. Therefore, a unified approach is needed to decide between these two types of cure models and to evaluate the treatment effect with different model choices.
In 2021, the Food and Drug Administration (FDA) released a guidance entitled “E9(R1) Statistical Principles for Clinical Trials: Addendum: Estimands and Sensitivity Analysis in Clinical Trials,”[10] which introduced the concept of “Estimand.” The addendum provides a structured estimand framework to enhance the communication amongst disciplines about the clinical trial objectives, design, conduct, analysis, and interpretation. Also, it emphasizes the attention regarding the treatment effects of interest that a clinical trial should investigate, which properly informs decision-making for the pharmaceutical industry. Furthermore, FDA published a final guidance in 2023 entitled “Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products,”[11] which mainly focuses on the use of prognostic covariates to improve statistical efficiency for estimating and testing treatment effects. It points out that adjusting for prognostic covariates in the analyses of efficacy endpoints in randomized clinical trials will generally reduce the variability of estimation of treatment effects and thus lead to narrower confidence intervals and power gains in hypothesis testing. Specifically, it emphasizes the need to differentiate conditional and unconditional treatment effects to ensure the specific objectives defined within the “Estimand” framework are examined when making statistical inferences. It should be noted that the conditional treatment effects, such as the hazard ratio in survival models, are influenced by choices in model and variable selections, posing challenges to the intentions of the invariant “Estimand” concept. Therefore, the guidance and the topics of “Estimand” and “Covariate Adjustment” are particularly relevant to the discussion of treatment effects in survival models with cure fraction.
This article sets out with two primary objectives. Firstly, we propose a unified estimand for assessing treatment effects under survival models with cure fraction that is invariant to model and variable selections and examine its theoretical properties. Secondly, we develop a Bayesian inference framework for the unified estimand, and illustrate its usefulness through simulation and a case study.
The rest of the paper is organized as follows. Section 2 provides a summary of the survival models with cure fraction. Section 3 introduces the proposed unified estimand under survival models with cure fraction. The Bayesian inference of the treatment effects is discussed in Section 4, including Bayesian hypothesis testing, model comparisons, and Bayesian computation. Section 5 evaluates the performance of the unified estimand through a comprehensive simulation study. An application of the proposed methodology to the E1684 melanoma cancer data is presented in Section 6. Section 7 concludes the paper with a discussion of main findings with future research directions.
2 Survival Models with Cure Fraction
2.1 Mixture Cure Models
The concept of the mixture cure models (MCM) was first introduced by Berkson and Gage [1], in which it is assumed that a certain fraction π of the population is cured $(y=1)$ and the remaining $1-\pi $ is not cured $(y=0)$. For the i-th individual, ${y_{i}}$ denotes the unobserved cured indicator, and ${\pi _{i}}=\mathbb{P}({y_{i}}=1)\in (0,1)$ denotes the cured probability. If a patient is cured, it is assumed that the patient will not die for a sufficiently long period of time. The survival (failure) time ${t_{i}^{surv}}$ can be written as
\[ {t_{i}^{surv}}=\left\{\begin{array}{l@{\hskip10.0pt}l}{t_{i}^{nc}}\hspace{1em}& \hspace{2.5pt}\text{if}\hspace{2.5pt}{y_{i}}=0\\ {} \infty \hspace{1em}& \hspace{2.5pt}\text{if}\hspace{2.5pt}{y_{i}}=1\end{array}\right.,\]
where ${t_{i}^{nc}}$ is the survival time of subject i if the individual is not cured. Let ${t_{i}}=\min \{{t_{i}^{surv}},{c_{i}}\}$ denote the observed right-censored survival time, where ${c_{i}}$ denotes the censoring time and ${\delta _{i}}=1\{{t_{i}^{surv}}\lt {c_{i}}\}$ denotes the censoring indicator for subject i such that ${\delta _{i}}=1$ if ${t_{i}}$ is a failure time and 0 if ${t_{i}}$ is right-censored. If subject i is not cured, i.e., ${y_{i}}=0$, denote the probability density function, the survival function, and hazard function for the failure time ${t_{i}^{nc}}$ by ${f_{i}^{nc}}(t)$, ${S_{i}^{nc}}(t)$ and ${h_{i}^{nc}}(t)$.Since the cure indicator ${y_{i}}$ is unobserved, considering the existence of cure fraction, the unconditional probability density function of the survival time for the i-th individual can be expressed as
and the unconditional survival function is given by
The hazard function for the mixture cure model can be derived as
Since ${\lim \nolimits_{t\to \infty }}{S_{i}^{nc}}(t)=0$, ${\lim \nolimits_{t\to \infty }}{S_{i}}(t)={\pi _{i}}\gt 0$, the survival function for the mixture cure model is not a proper survival function. The hazard ratio between subject i and ${i^{\prime }}$ is a function of time t. Thus, the mixture cure model is a non-proportional hazards model. When ${\pi _{i}}\to 0$, ${S_{i}}(t)\to {S_{i}^{nc}}(t)$ and ${h_{i}}(t)\to {h_{i}^{nc}}(t)$, the mixture cure model reduces to the standard survival model.
(2.2)
\[ {h_{i}}(t)=\frac{(1-{\pi _{i}}){f_{i}^{nc}}(t)}{{\pi _{i}}+(1-{\pi _{i}}){S_{i}^{nc}}(t)}.\]2.2 Promotion Time Cure Model
The promotion time cure model (PTCM) was developed by Chen et al. [3], where a Poisson distribution is assumed for the number of active carcinogenic cells. For the i-th subject, let ${N_{i}}$ denote the number of active carcinogenic cells, and assume that ${N_{i}}\sim \mathcal{P}({\lambda _{i}})$ follows a Poisson distribution with mean ${\lambda _{i}}$. By assuming the distribution of ${N_{i}}$, the cured probability can be expressed as
Denote ${Z_{ij}},j=1,\dots ,{N_{i}}$ as i.i.d random variables of incubation time for the j-th carcinogenic cells. It is assumed that ${Z_{ij}}$ follows the same distribution independent of ${N_{i}}$, and let $f(t)$ and $S(t)$ denote its density function and survival function. The event time for i-th individual is defined as ${T_{i}}=\min \{{Z_{ij}},0\le j\le {N_{i}}\}$, where $P({Z_{i0}}=\infty )=1$.
The survival function for the i-th individual can be expressed as
Since ${\lim \nolimits_{t\to \infty }}S(t)=0$, ${\lim \nolimits_{t\to \infty }}{S_{i}}(t)=\exp (-{\lambda _{i}})\gt 0$. Therefore, similar to the MCM, the survival function for the subject in the PTCM ${S_{i}}(t)$ is also not a proper survival function. The density function and hazard function of the survival time for the i-th subject can be further expressed as
and
Under this promotion time cure model, the hazard ratio between i-th subject and ${i^{\prime }}$-th subject is denoted as
which is a constant over time. Thus, the promotion time cure model follows the proportional hazards assumption.
(2.3)
\[ \begin{aligned}{}{S_{i}}(t)& ={\sum \limits_{k=0}^{\infty }}\mathbb{P}({N_{i}}=k)\mathbb{P}({Z_{ij}}\gt t,j=0,\dots ,k)\\ {} & ={\sum \limits_{k=0}^{\infty }}\frac{{\lambda _{i}^{k}}}{k!}\exp (-{\lambda _{i}})S{(t)^{k}}\\ {} & =\exp (-{\lambda _{i}}(1-S(t))).\end{aligned}\]2.3 Regression Forms and Likelihood Functions
2.3.1 Mixture Cure Model
We consider the mixture cure model defined in Section 2.1. Since both the cure fraction component and the non-cured survival component can be affected by a set of prognostic covariates, these two components can be modeled separately in regression forms. For the i-th subject, let ${\boldsymbol{x}_{i}}={(1,{z_{i}},{\tilde{\boldsymbol{x}}_{i}^{\top }})^{\top }}$ denote a $(p+2)$-dimensional vector of covariates, where ${\tilde{\boldsymbol{x}}_{i}}$ is the prognostic factors and ${z_{i}}$ is the treatment group indicator such that ${z_{i}}=1$ if the i-th subject is in the treatment group and ${z_{i}}=0$ if the subject is assigned to the control group. We assume a logistic regression model for ${y_{i}}$ given by
where $\sigma (.)$ denotes the standard logistic function, $\boldsymbol{\beta }={({\beta _{0}},{\beta _{z}},{\tilde{\boldsymbol{\beta }}^{\top }})^{\top }}\in {\mathbb{R}^{p+2}}$ is a $(p+2)$-dimensional vector of regression coefficients. Especially, ${\beta _{0}}$, ${\beta _{z}}$, and $\tilde{\boldsymbol{\beta }}$ are the regression coefficients associated with the intercept, the treatment indicators ${z_{i}}$, and ${\tilde{\boldsymbol{x}}_{i}}$. After adjusting for the other covariates, ${\beta _{z}}$ represents the conditional treatment effect in the log odds ratio of being cured between the treatment and control groups. For the non-cured survival component, to cover a wide variety of survival curves while maintaining the proportional hazards assumption, the Weibull distributed failure time is assumed. If the i-th individual is not cured, the survival function is assumed to be
where α is a fixed but unknown Weibull shape parameter, $\boldsymbol{\gamma }={({\gamma _{0}},{\gamma _{z}},{\tilde{\boldsymbol{\gamma }}^{\top }})^{\top }}\in {\mathbb{R}^{p+2}}$ is the vector of regression coefficients associated with ${\boldsymbol{x}_{i}}$ for the non-cured survival component, and $\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\gamma })$ is the subject-specific scale parameter. After adjusting for the other covariates, ${\gamma _{z}}$ represents the conditional treatment effect in the log hazard ratio between the treatment and control groups for the non-cured patients.
(2.6)
\[ {\pi _{i}}=\mathbb{P}({y_{i}}=1)=\sigma (-{\boldsymbol{x}_{i}^{\top }}\boldsymbol{\beta })=\frac{1}{1+\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\beta })},\](2.7)
\[ {S_{i}^{nc}}(t\mid {\boldsymbol{x}_{i}})=\exp (-{t^{\alpha }}\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\gamma })),\]Plugging (2.6) and (2.7) in (2.1), the unconditional survival function for the mixture cure model is given by
(2.8)
\[ \begin{aligned}{}{S_{i}}(t\mid {\boldsymbol{x}_{i}})=& \frac{1}{1+\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\beta })}+\\ {} & \frac{\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\beta })}{1+\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\beta })}\exp (-{t^{\alpha }}\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\gamma })).\end{aligned}\]Let ${\boldsymbol{\theta }_{M}}=\{\boldsymbol{\beta },\boldsymbol{\gamma },\alpha \}$ denote the set of parameters, where $\{{\beta _{z}},{\gamma _{z}}\}$ serve as the conditional treatment effect parameters of interest. Suppose there are n subjects in the dataset. Let ${\mathcal{D}^{obs}}=\{{\boldsymbol{x}_{i}},{t_{i}},{\delta _{i}},i=1,\dots ,n\}$ denote the observed data under the mixture cure model, $\mathcal{Y}=\{{y_{i}},i=1,\dots ,n\}$ denote the hidden state of cured indicators, and let ${\mathcal{D}_{M}^{comp}}={\mathcal{D}^{obs}}\cup \mathcal{Y}$ denote the complete data.
For the individual with failure time (${\delta _{i}}=1$), the likelihood can be expressed by multiplying the non-cured probability $1-{\pi _{i}}$ with the probability density function of non-cured patient ${f_{i}^{nc}}(t)$. For the censored individual with ${\delta _{i}}=0$, the likelihood can be constructed by adding the cured probability with the probability of non-cured survival. The observed-data likelihood function is given by
In the second parenthesis of the observed-data likelihood, the mixture component can be converted into a multiplied form after including the hidden state $\mathcal{Y}$ in the likelihood function. The complete-data likelihood can be written as
Note that under mixture cure model, the cured fraction component (2.6) and non-cured survival component (2.7) are linked through the hidden state $\mathcal{Y}$. The following Theorem and Remarks imply that conditioning on ${\mathcal{D}^{obs}}$, the key parameters of interest ${\beta _{z}}$ and ${\gamma _{z}}$ interact with each other. Also, the maximum-likelihood estimators (MLE) and the posterior samples of ${\beta _{z}}$ and ${\gamma _{z}}$ are not independent.
(2.9)
\[ \begin{aligned}{}{\mathcal{L}_{M}^{obs}}({\boldsymbol{\theta }_{M}}\mid {\mathcal{D}^{obs}})=& \prod \limits_{{\delta _{i}}=1}\bigg((1-{\pi _{i}}){f_{i}^{nc}}({t_{i}}\mid {\boldsymbol{x}_{i}})\bigg)\times \\ {} & \prod \limits_{{\delta _{i}}=0}\bigg({\pi _{i}}+(1-{\pi _{i}}){S_{i}^{nc}}({t_{i}}\mid {\boldsymbol{x}_{i}})\bigg).\end{aligned}\](2.10)
\[ \begin{aligned}{}& {\mathcal{L}_{M}^{comp}}({\boldsymbol{\theta }_{M}}\mid {\mathcal{D}_{M}^{comp}})=\prod \limits_{{y_{i}}=1}{\pi _{i}}\times \\ {} & \hspace{1em}\prod \limits_{{y_{i}}=0}\left((1-{\pi _{i}}){f_{i}^{nc}}{({t_{i}}\mid {\boldsymbol{x}_{i}})^{{\delta _{i}}}}{S_{i}^{nc}}{({t_{i}}\mid {\boldsymbol{x}_{i}})^{(1-{\delta _{i}})}}\right).\end{aligned}\]Theorem 1.
Under the mixture cure regression model defined in (2.8), the $({\beta _{z}},{\gamma _{z}})$ entry of the observed Fisher information
The proof is given in Appendix A.
Remark 1.
The off-diagonal entry in observed Fisher information matrix $\mathcal{I}{({\boldsymbol{\theta }_{M}})_{({\beta _{z}},{\gamma _{z}})}}\gt 0$ likely implies the off-diagonal entry in the inverse Fisher information matrix ${\mathcal{I}^{-1}}{({\boldsymbol{\theta }_{M}})_{({\beta _{z}},{\gamma _{z}})}}\lt 0$.
Intuitively, since the hidden state $\mathcal{Y}$ is unknown, the conditional log odds ratio estimates from the cure rate component and the conditional log hazard ratio estimates from the non-cured population are correlated. The following Remarks reveal how the correlation will impact the maximum likelihood estimator (MLE) from the Frequentist perspective and posterior samples from Bayesian perspective.
Remark 2.
Since the covariance matrix associated with the maximum-likelihood estimators can be approximated by the inverse matrix of the Fisher information matrix, it is likely that under mixture cure model,
where $\widehat{{\beta _{z}}}$ and $\widehat{{\gamma _{z}}}$ refer to the MLEs of ${\beta _{z}}$ and ${\gamma _{z}}$. That is, $\widehat{{\beta _{z}}}$ and $\widehat{{\gamma _{z}}}$ are negatively correlated.
Remark 3.
By Bernstein–von Mises Theorem in Bayesian statistics, the asymptotic distribution of the posterior converges to a multivariate Gaussian distribution with covariance matrix given by ${n^{-1}}\mathcal{I}{({\boldsymbol{\theta }_{M}})^{-1}}$. Thus, the posterior samples of ${\beta _{z}}$ and ${\gamma _{z}}$ are correlated.
Considering the correlation, looking at the conditional treatment effects from the cured fraction component and the non-cured survival component separately for the mixture cure model is not ideal. A unified estimand (measure of treatment effect) should be proposed to unify the treatment effect measure and enable comparisons.
In one extreme case, when ${\beta _{0}}\to \infty $, the cured probability ${\pi _{i}}$ goes to 0 for all subjects, and the mixture cure model reduces to the Weibull regression model with survival function
where α is a fixed but unknown Weibull shape parameter, $\boldsymbol{\gamma }={({\gamma _{0}},{\gamma _{z}},{\tilde{\boldsymbol{\gamma }}^{\top }})^{\top }}\in {\mathbb{R}^{p+2}}$ is the vector of regression coefficients associated with ${\boldsymbol{x}_{i}}$ under Weibull regression model, and $\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\gamma })$ is the subject-specific scale parameter. The model becomes proportional hazards, and ${\gamma _{z}}$ represents the treatment effect in the log hazard ratio between the treatment and control groups after adjusting for other factors. The reduced model can be fitted by Cox regression by maximizing the partial likelihood function
where ${t_{(1)}}\lt {t_{(2)}}\lt \cdots \lt {t_{(d)}}$ are the d ordered distinct event times, $D(t)$ denotes the set of subjects who die at time t, and $R(t)$ represents the risk set at time ${t^{-}}$, that is, the set of individuals who have not failed or been censored by that time. The inference with respect to $\boldsymbol{\gamma }$ can be made via the semi-parametric partial likelihood $\mathcal{PL}(\boldsymbol{\gamma })$.
(2.11)
\[ {S_{i}}(t\mid {\boldsymbol{x}_{i}})=\exp (-{t^{\alpha }}\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\gamma })),\](2.12)
\[ \mathcal{PL}(\boldsymbol{\gamma })={\prod \limits_{j=1}^{d}}\prod \limits_{i\in D({t_{(j)}})}\frac{\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\gamma })}{{\textstyle\sum _{i\in R({t_{(j)}})}}\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\gamma })},\]2.3.2 Promotion Time Cure Model
For the promotion time cure model defined in (2.3), for the i-th subject, let
where $\boldsymbol{\zeta }={({\zeta _{0}},{\zeta _{z}},{\tilde{\boldsymbol{\zeta }}^{\top }})^{\top }}\in {\mathbb{R}^{p+2}}$ is a $(p+2)$-dimensional vector of regression coefficients. Especially, ${\zeta _{z}}$ is the regression coefficient associated with the treatment indicators ${z_{i}}$. After adjusting for the other covariates, ${\zeta _{z}}$ represents the conditional treatment effect in the log hazard ratio between the treatment and control groups under this model. The relationship serves as a canonical link under the Poisson regression model. Parameter ${\zeta _{z}}$ is the conditional treatment effect parameter of interest in the PTCM.
Assume the incubation time for each active carcinogenic cell follows an $i.i.d$. Weibull distribution with shape parameter α and scale parameter $\exp (\mu )$. The survival function for the incubation is given by
The survival function and probability density function for the promotion time cure model can be expressed as
and
Suppose there are n subjects in the dataset. Under the promotion time cure model defined in (2.15), let ${\mathcal{D}^{obs}}=\{{\boldsymbol{x}_{i}},{t_{i}},{\delta _{i}},i=1,\dots ,n\}$ denote the observed data under mixture cure model, $\mathcal{N}=\{{N_{i}},i=1,\dots ,n\}$ denote unobserved data for the number of active carcinogenic cells, and let ${\mathcal{D}_{P}^{comp}}={\mathcal{D}^{obs}}\cup \mathcal{N}$ denote the complete data. Let ${\boldsymbol{\theta }_{P}}=\{\boldsymbol{\zeta },\alpha ,\mu \}$ denote the parameters associated with the promotion time cure model. The observed-data likelihood function is given by
(2.17)
\[ \begin{aligned}{}& {\mathcal{L}_{P}^{obs}}({\boldsymbol{\theta }_{P}}\mid {\mathcal{D}^{obs}})\\ {} =& {\prod \limits_{i=1}^{n}}{f_{i}}{({t_{i}}\mid {\boldsymbol{x}_{i}})^{{\delta _{i}}}}{S_{i}}{({t_{i}}\mid {\boldsymbol{x}_{i}})^{(1-{\delta _{i}})}}\\ {} =& {\prod \limits_{i=1}^{n}}\exp (-\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\zeta })(1-\exp (-{t_{i}^{\alpha }}\exp (\mu ))))\times \\ {} & \prod \limits_{{\delta _{i}}=1}\alpha {t_{i}^{\alpha -1}}\exp ({\boldsymbol{x}_{i}^{\top }}\boldsymbol{\zeta }+\mu -{t_{i}^{\alpha }}\exp (\mu )).\end{aligned}\]3 A Unified Estimand under Cure Models
Under the proportional-hazards models, the conditional treatment effect in the hazard ratio between treatments is overwhelmingly used to characterize efficacy of the treatment. However, as shown in (2.2), the mixture cure model does not belong to the proportional hazards framework. Thus, how to develop an appropriate measure of the treatment effect (estimand) for mixture cure models has become an interesting topic. There are various approaches to summarise and make inference of the treatment effects under the non-proportional hazards models. The Weighted Log-Rank Test (WLRT) is one of the most popular statistical tests dealing with the non-proportional hazards models, and Fleming and Harrington [9] developed a class of test statistics $FH(\rho ,\gamma )$. Another popular type of test is developed based on the Kaplan-Meier curve, including the Weighted Kaplan-Meier tests (WKM) [20], the restricted mean survival time (RMST) comparison [21], and a divergence measure between the two survival functions [6]. Especially, the difference in restricted mean survival time (${\Delta _{RMST}}$) shows an increasing popularity as an alternative to hazard ratio, and it is defined as
where ${T_{0}}$ and ${T_{1}}$ denote the survival times for the control arm and the treatment arm, ${S_{pop,1}}(t)$ and ${S_{pop,0}}(t)$ represent the survival functions for the treatment and control arms, and ${t^{\ast }}$ is the duration of follow-up or truncation time. ${\Delta _{RMST}}({t^{\ast }})$ describes differences between treatment and control arms in their ${t^{\ast }}$-year life expectancy. Moreover, several classes of combination tests have been developed in recent years, including the Breslow test [2], Lee’s combo test [16], and the MaxCombo test [17]. Lin et al. [19] evaluated different tests for the non-proportional hazards models, and Li et al. [18] developed an overlapping approach to measure the comparability between two arms through resampling technique. Regarding the treatment effect measure, Chen et al. [5] developed the averaged hazard ratio estimates for the non-proportional hazards model.
(3.1)
\[ \begin{aligned}{}{\Delta _{RMST}}({t^{\ast }})=& \mathbb{E}[\min ({T_{1}},{t^{\ast }})-\min ({T_{0}},{t^{\ast }})]\\ {} =& {\int _{0}^{{t^{\ast }}}}[{S_{pop,1}}(t)-{S_{pop,0}}(t)]dt,\end{aligned}\]Motivated by the restricted mean survival time (RMST) [21] and a divergence measure proposed in [6], we define a new unified estimand under the survival models of the form
where ${T_{0}}$ and ${T_{1}}$ denote the survival times for the control arm and the treatment arm, ${S_{pop,1}}(t)$, ${S_{pop,0}}(t)$, ${f_{pop,0}}(t)$, and ${f_{pop,1}}(t)$ represent the survival functions and probability density functions for the treatment and control arms, and $sgn(.)$ is the sign function such that
(3.2)
\[ \begin{aligned}{}\Delta & =\mathbb{E}[sgn({T_{1}}-{T_{0}})]\\ {} & =\mathbb{P}({T_{1}}\gt {T_{0}})-\mathbb{P}({T_{0}}\gt {T_{1}})\\ {} & ={\int _{0}^{\infty }}({S_{pop,1}}(t){f_{pop,0}}(t)-{S_{pop,0}}(t){f_{pop,1}}(t))dt\\ {} & \in [-1,1],\end{aligned}\]Following the idea of the causal effect model, ${T_{0}}$ and ${T_{1}}$ can be viewed as the potential outcomes (survival time) if the individuals are assigned to the treatment arm and the control arm, respectively. An intuitive interpretation for Δ is that Δ characterizes the probability that the treatment extends survival for patients.
The proposed unified estimand, representative of the unconditional treatment effect between treatment and control groups, remains invariant to model and variable selections. This characteristic ensures its alignment with the FDA’s guidelines and discussions about the “Estimand” concept.
The following Theorems give parametric derivations for the unified estimand Δ under the proportional hazards model, mixture cure model, and promotion time cure model. In addition, it connects the unconditional treatment effect in the proposed estimand with the conditional treatment effect under different models.
See proof in Appendix A.
Theorem 2 shows that the unified estimand ${\Delta _{PH}}$ serves as a one-to-one transformation, negative hyperbolic tangent transformation, of half of the log hazard ratio ${\gamma _{z}}$. When the treatment benefits patients (${\gamma _{z}}\lt 0$), ${\Delta _{PH}}\gt 0$, and vice versa. ${\Delta _{PH}}$ quantifies the ratio between the differences in hazard functions to the sum of hazard functions.
Theorem 3.
Under the mixture cure model defined in (2.8), when the prognostic variables are available, the unified estimand ${\Delta _{M}}$ is defined as the average treatment effect (ATE) based on the finite-sample population by integrating out the covariates, such that
where ${\pi _{01}}=\mathbb{P}({T_{1}},{T_{0}}\lt \infty )=\textstyle\int \frac{\exp ({\beta _{0}}+{\beta _{z}}+{\tilde{\boldsymbol{X}}^{\top }}\tilde{\boldsymbol{\beta }})}{1+\exp ({\beta _{0}}+{\beta _{z}}+{\tilde{\boldsymbol{X}}^{\top }}\tilde{\boldsymbol{\beta }})}\frac{\exp ({\beta _{0}}+{\tilde{\boldsymbol{X}}^{\top }}\tilde{\boldsymbol{\beta }})}{1+\exp ({\beta _{0}}+{\tilde{\boldsymbol{X}}^{\top }}\tilde{\boldsymbol{\beta }})}dP(\tilde{\boldsymbol{X}})$, ${\pi _{1}}=\mathbb{P}({T_{1}}=\infty )=\textstyle\int \frac{1}{1+\exp ({\beta _{0}}+{\beta _{z}}+{\tilde{\boldsymbol{X}}^{\top }}\tilde{\boldsymbol{\beta }})}dP(\tilde{\boldsymbol{X}})$, and ${\pi _{0}}=\mathbb{P}({T_{0}}=\infty )=\textstyle\int \frac{1}{1+\exp ({\beta _{0}}+{\tilde{\boldsymbol{X}}^{\top }}\tilde{\boldsymbol{\beta }})}dP(\tilde{\boldsymbol{X}})$ are the cured probabilities for the treatment and control arms.
(3.4)
\[ \begin{aligned}{}{\Delta _{M}}& ={\mathbb{E}_{\tilde{\boldsymbol{X}}}}[\mathbb{E}[sgn({T_{1}}-{T_{0}})\mid \tilde{\boldsymbol{X}}]]\\ {} & ={\pi _{01}}\tanh (-\frac{{\gamma _{z}}}{2})+{\pi _{1}}-{\pi _{0}},\end{aligned}\]See proof in Appendix A.
The following Remarks reveal that the unified estimand connects and bridges the risk difference definition from the binary endpoint and the hazard ratio definition from the survival endpoint.
Remark 5.
When ${\beta _{0}}\to \infty $, the cured probabilities converge to 0 for both the treatment and control arms (${\lim \nolimits_{{\beta _{0}}\to \infty }}{\pi _{0}}={\lim \nolimits_{{\beta _{0}}\to \infty }}{\pi _{1}}=0$), the model reduces to the proportional hazards model.
Remark 6.
If there is no treatment effect for the non-cured population $({\gamma _{z}}=0)$, ${\Delta _{M}}={\pi _{1}}-{\pi _{0}}$, which coincides with the definition of risk difference.
Theorem 4.
Under the promotion time cure model defined in (2.15), let ${\zeta _{z}}$ denote the log hazard ratio between the treatment and control arms. The unified estimand is a one-to-one transformation of the log hazard ratio ${\zeta _{z}}$.
4 Bayesian Inference under Cure Models
4.1 Priors and Posteriors
4.1.1 Mixture Cure Model
For the mixture cure model, it has been shown in Theorem 2 of [3] that if we take an improper uniform prior for $\boldsymbol{\beta }$ [i.e., ${\pi _{0}}(\boldsymbol{\beta },\boldsymbol{\gamma },\alpha )\propto {\pi _{0}}(\boldsymbol{\gamma },\alpha )$], the posterior distribution
\[ \pi (\boldsymbol{\beta },\boldsymbol{\gamma },\alpha \mid {\mathcal{D}^{obs}})\propto L(\boldsymbol{\beta },\boldsymbol{\gamma },\alpha \mid {\mathcal{D}^{obs}}){\pi _{0}}(\boldsymbol{\gamma },\alpha )\]
is always improper regardless of the propriety of ${\pi _{0}}(\boldsymbol{\gamma },\alpha )$, i.e.,
\[ {\int _{\Theta }}\pi (\boldsymbol{\beta },\boldsymbol{\gamma },\alpha \mid {\mathcal{D}^{obs}})d\boldsymbol{\theta }=\infty .\]
Thus, a uniform improper prior ${\pi _{0}}(\boldsymbol{\theta })\propto 1$ cannot guarantee convergence of MCMC sampling. Instead, weak independent normal priors will be put on each parameter in $\boldsymbol{\beta }$, that is, ${\pi _{0}}(\boldsymbol{\theta })\sim {\textstyle\prod _{j}}\phi ({\beta _{j}};0,{\sigma _{\beta }^{2}})$, where $\phi (x;{\mu _{x}},{\sigma ^{2}})$ denotes the probability density function of the normal distribution $N({\mu _{x}},{\sigma ^{2}})$. The posterior distribution of $(\boldsymbol{\beta },\boldsymbol{\gamma },\alpha )$ given the observed data ${\mathcal{D}^{obs}}$ can be written as
4.1.2 Promotion Time Cure Model
Suppose that we consider a joint weak informative prior for the parameters under the promotion time cure model. Suppose ${\pi _{0}}(\boldsymbol{\zeta },\alpha ,\mu )\propto {\pi _{0}}(\alpha \mid {\nu _{0}},{\tau _{0}}){\pi _{0}}(\mu )$, where ${\pi _{0}}(\alpha \mid {\nu _{0}},{\tau _{0}})$ is a gamma prior for shape parameter α, and ${\pi _{0}}(\mu )$ is a week normal prior for μ. The posterior distribution of $(\boldsymbol{\zeta },\alpha ,\mu )$ given the observed data ${\mathcal{D}^{obs}}$ can be written as
4.2 Bayesian Hypothesis Testing
We consider a clinical trial that aims to demonstrate that a new treatment presents treatment benefits for the population. Each patient will be randomly allocated into one of the two arms: the treatment arm and the active control arm. We assume the randomization ratio is fixed as 1:1, and the total sample size is denoted by n. Progression-free survival (PFS) or Relapse-free survival (RFS) is the primary endpoint, and the survival time and event indicator $({t_{i}},{\delta _{i}})$ will be collected for each patient.
The general hypothesis can be expressed as
or
where the treatment effect parameter we are interested in, $g(\boldsymbol{\theta })$, can take any functional form of model parameters $\boldsymbol{\theta }$. Also, constructing the functional form of $g(\boldsymbol{\theta })$ may include prognostic factors as well.
(4.3)
\[ {H_{1}}(g):g(\boldsymbol{\theta })\gt 0\hspace{2.5pt}\text{v.s.}\hspace{2.5pt}{H_{0}}(g):g(\boldsymbol{\theta })\le 0\](4.4)
\[ {H_{1}}(g):g(\boldsymbol{\theta })\lt 0\hspace{2.5pt}\text{v.s.}\hspace{2.5pt}{H_{0}}(g):g(\boldsymbol{\theta })\ge 0,\]Under the mixture cure model, denote the vector of parameters as ${\boldsymbol{\theta }_{M}}=\{\boldsymbol{\beta },\boldsymbol{\gamma },\alpha \}$. The treatment effect parameter may include
-
(i) ${g_{1}}({\boldsymbol{\theta }_{M}})={\beta _{z}}$ for testing the conditional treatment effect in the cure fraction,
-
(ii) ${g_{2}}({\boldsymbol{\theta }_{M}})={\gamma _{z}}$ for testing the conditional treatment effect in the non-cured survival component, and
-
(iii) ${g_{3}}({\boldsymbol{\theta }_{M}})={\Delta _{M}}$ for testing the unconditional treatment effect in the unified estimand defined for MCM in Section 3.
Under the promotion time cure model, let ${\boldsymbol{\theta }_{P}}=\{\boldsymbol{\zeta },\alpha ,\mu \}$ be the vector of parameters. The treatment effect parameter could take the following form:
To be noted that although ${\Delta _{M}}$ and ${\Delta _{P}}$ are defined based on different model assumptions, both of them represent the unified estimand Δ.
-
(iv) ${g_{4}}({\boldsymbol{\theta }_{P}})={\zeta _{z}}$ for testing the conditional treatment effect in log hazard ratio, and
-
(v) ${g_{5}}({\boldsymbol{\theta }_{P}})={\Delta _{P}}$ for testing the unconditional treatment effect in the unified estimand defined for PTCM in Section 3.
From Bayesian hypothesis testing point of view, the hypothesis ${H_{0}}(g)$ can be rejected and the data is in favor of ${H_{1}}(g)$ if the posterior predictive probability
where ${\gamma ^{\ast }}$ is a pre-specified threshold, which equals 0.95 by default, for rejecting ${H_{0}}(g)$. On the other hand, the null hypothesis, ${H_{0}}(g)$, cannot be rejected if $\mathbb{P}({H_{1}}(g)\mid {\mathcal{D}^{(n)}})\lt {\gamma ^{\ast }}$.
4.3 Model Comparisons
In an ideal scenario, clinicians and researchers ought to predetermine the choice between the two types of cure models based on an in-depth understanding of the disease mechanism and the nature of the intervention. Nevertheless, when the cure model type has not been pre-specified, goodness-of-fit (GoF) can be evaluated to facilitate selection and comparison. By using the unified estimand proposed, the model selection and variable selection processes do not alter the definition of the estimand since it reflects the unconditional treatment effect.
In this section, we consider two Bayesian model comparison criteria, including the Deviance Information Criterion (DIC) [22] and the Logarithm of Pseudo-Marginal Likelihood (LPML) [14]. These two goodness-of-fit measures can help choose from the two types of cure model when it is needed.
The DIC is defined as
where deviance $\text{Dev}(\boldsymbol{\theta })=-2\log ({\mathcal{L}^{obs}}(\boldsymbol{\theta }\mid {\mathcal{D}^{obs}}))$ and ${\mathcal{L}^{obs}}(\boldsymbol{\theta }\mid {\mathcal{D}^{obs}})$ refers to either (2.9) or (2.17) depending on the model, $\text{Dev}(\overline{\boldsymbol{\theta }})$ is the deviance evaluated at the posterior mean of $\boldsymbol{\theta }$, and ${p_{D}}=\overline{\text{Dev}(\boldsymbol{\theta })}-\text{Dev}(\overline{\boldsymbol{\theta }})$ represents the effective number of parameters. The DIC offers a measure of the trade-off between the fit of the model and its complexity, selecting models that provide a good fit without overfitting.
Besides DIC, the LPML is another commonly used criterion for Bayesian model comparison, which is a summary statistics based on ${\text{CPO}_{i}}$’s and judges the predictive performance of different models based on the predictive distribution for each observation. The LPML is defined as
Under the cure models, we define the conditional predictive ordinate (CPO) statistics for the i-th subject as
\[\begin{aligned}{}{\text{CPO}_{i}}& =f{({t_{i}}\mid {\boldsymbol{x}_{i}},{\mathcal{D}^{obs(-i)}})^{{\delta _{i}}}}S{({t_{i}}\mid {\boldsymbol{x}_{i}},{\mathcal{D}^{obs(-i)}})^{1-{\delta _{i}}}}\\ {} & =\int {f_{i}}{({t_{i}}\mid {\boldsymbol{x}_{i}},\boldsymbol{\theta })^{{\delta _{i}}}}{S_{i}}{({t_{i}}\mid {\boldsymbol{x}_{i}},\boldsymbol{\theta })^{1-{\delta _{i}}}}\pi (\boldsymbol{\theta }\mid {\mathcal{D}^{obs(-i)}})d\boldsymbol{\theta },\end{aligned}\]
where ${\mathcal{D}^{obs(-i)}}$ is the observed data ${\mathcal{D}^{obs}}$ with the i-th observation ignored, and $\pi (\boldsymbol{\theta }\mid {\mathcal{D}^{(-i)}})$ is the posterior given ${\mathcal{D}^{obs(-i)}}$ given in (4.1) or (4.2).By [4],
\[ {\text{CPO}_{i}}={\left(\int \frac{1}{{f_{i}}{({t_{i}}|{\boldsymbol{x}_{i}},\boldsymbol{\theta })^{{\delta _{i}}}}{S_{i}}{({t_{i}}|{\boldsymbol{x}_{i}},\boldsymbol{\theta })^{1-{\delta _{i}}}}}\pi (\boldsymbol{\theta }\mid {\mathcal{D}^{obs}})d\boldsymbol{\theta }\right)^{-1}},\]
and it can be approximated via posterior computation after obtaining the MCMC samples $\{{\boldsymbol{\theta }^{(l)}},l=1,\dots ,L\}$ from the posterior distribution $\pi (\boldsymbol{\theta }\mid {\mathcal{D}^{obs}})$ by
\[ \widehat{{\text{CPO}_{i}}}={\left[\frac{1}{L}{\sum \limits_{l=1}^{L}}\frac{1}{{f_{i}}{({t_{i}}|{\boldsymbol{x}_{i}}\hspace{0.1667em}{\boldsymbol{\theta }^{(l)}})^{{\delta _{i}}}}{S_{i}}{({t_{i}}|{\boldsymbol{x}_{i}},{\boldsymbol{\theta }^{(l)}})^{1-{\delta _{i}}}}}\right]^{-1}}.\]
Plugging it in (4.6), we have
4.4 Bayesian Computation
4.4.1 Mixture Cure Model
Under the mixture cure model, conditioning on the hidden state $\mathcal{Y}$, the cure rate component and the non-cured survival component can be separated, and both are log-concave to the set of parameters. The Gibbs sampler can be developed to draw MCMC samples from the posterior distribution by an accept-reject sampling scheme, which yields dependent samples.
The pseudo-algorithm for the updating scheme is written in Algorithm 1.
After MCMC samples $\{{\boldsymbol{\theta }_{M}^{(l)}},l=1,\dots ,L\}$ are obtained, ${\Delta _{M}^{(l)}}={g_{3}}({\boldsymbol{\theta }_{M}^{(l)}})$ for the l-th iteration can be estimated by g-estimation as
where
are the l-th posterior predicted cured probabilities for patient i if the i-th patient is assigned to the treatment or control arms, correspondingly.
(4.7)
\[ \begin{aligned}{}& {\Delta _{M}^{(l)}}={g_{3}}({\boldsymbol{\theta }_{M}^{(l)}})\\ {} =& \tanh (-\frac{{\gamma _{z}^{(l)}}}{2})\frac{1}{n}{\sum \limits_{i=1}^{n}}\left((1-{\pi _{i0}^{(l)}})(1-{\pi _{i1}^{(l)}})\right)+\\ {} & \frac{1}{n}{\sum \limits_{i=1}^{n}}\bigg({\pi _{i1}^{(l)}}-{\pi _{i0}^{(l)}}\bigg),\end{aligned}\](4.8)
\[ \begin{aligned}{}{\pi _{i1}^{(l)}}& =\frac{1}{1+\exp ({\beta _{0}^{(l)}}+{\beta _{z}^{(l)}}+{\tilde{\boldsymbol{x}}_{i}^{\top }}{\tilde{\boldsymbol{\beta }}^{(l)}})}\hspace{2.5pt}\text{and}\hspace{2.5pt}\\ {} {\pi _{i0}^{(l)}}& =\frac{1}{1+\exp ({\beta _{0}^{(l)}}+{\tilde{\boldsymbol{x}}_{i}^{\top }}{\tilde{\boldsymbol{\beta }}^{(l)}})}\end{aligned}\]Then, the posterior mean $\overline{{\Delta _{M}}}=\frac{1}{L}{\textstyle\sum _{l=1}^{L}}{\Delta _{M}^{(l)}}$ will be used as a point estimate of ${\Delta _{M}}$ from the Bayesian computation, and the posterior inference can be made based on $\{{\Delta _{M}^{(l)}},l=1,\dots ,L\}$.
4.4.2 Promotion Time Cure Model
After MCMC samples $\{{\boldsymbol{\theta }_{P}^{(l)}},l=1,\dots ,L\}$ are obtained, ${\Delta _{P}^{(l)}}={g_{5}}({\boldsymbol{\theta }_{P}^{(l)}})$ for the l-th iteration can be estimated by g-estimation as
The posterior mean $\overline{{\Delta _{P}}}=\frac{1}{L}{\textstyle\sum _{l=1}^{L}}{\Delta _{P}^{(l)}}$ will be used as a point estimate of ${\Delta _{P}}$ from the Bayesian computation, and the posterior inference can be made based on $\{{\Delta _{M}^{(l)}},l=1,\dots ,L\}$.
(4.9)
\[ {\Delta _{P}^{(l)}}={g_{5}}({\boldsymbol{\theta }_{P}^{(l)}})=\tanh (-\frac{{\zeta _{z}^{(l)}}}{2}).\]5 Simulation
In the simulation study, we simulate independent data from (2.8) for mixture cure models or (2.15) for promotion time cure models with ${\boldsymbol{\theta }^{true}}$ based on a 1:1 randomization ratio with arm sizes ${n_{1}}={n_{0}}=200$. The covariates $\tilde{\boldsymbol{x}}$, including sex and gender, for each patient are randomly drawn from the melanoma cancer E1684 dataset. More details about the E1684 dataset are introduced in Section 6. The time-to-event data is generated with independent censoring, but no maximum follow-up period is set.
Our main focus is on comparing the performance of the Bayesian inference with different choices of the five functional forms of treatment effect parameters $g(\boldsymbol{\theta })$ mentioned in Section 4.2. In addition, we evaluate and compare the performance when the model is either correctly specified or misspecified between the two types of cure models. Based on the posterior samples obtained from MCMC, the rejection probability (RP), root mean square error (RMSE), and coverage probability (CP) serve as Frequentist evaluations of the Bayesian inference for those parameters based on B replicates. When the model is correctly specified, we assess the RP, RMSE, and CP of the treatment effect parameter $g(\boldsymbol{\theta })$ by
\[ RP(g(\boldsymbol{\theta }))=\frac{1}{B}{\sum \limits_{b=1}^{B}}\left[1\{0\in (g{(\boldsymbol{\theta })^{LL(b)}},g{(\boldsymbol{\theta })^{UL(b)}})\}\right],\]
\[ RMSE(g(\boldsymbol{\theta }))=\sqrt{\frac{1}{B}{\sum \limits_{b=1}^{B}}{\left({\overline{g(\boldsymbol{\theta })}^{(b)}}-g({\boldsymbol{\theta }^{true}})\right)^{2}}},\]
and
\[ CP(g(\boldsymbol{\theta }))=\frac{1}{B}{\sum \limits_{b=1}^{B}}\left[1\{g({\boldsymbol{\theta }^{true}})\in (g{(\boldsymbol{\theta })^{LL(b)}},g{(\boldsymbol{\theta })^{UL(b)}})\}\right],\]
where $(g{(\boldsymbol{\theta })^{LL(b)}},g{(\boldsymbol{\theta })^{UL(b)}})$ is the 95% highest posterior density (HPD) interval of $g(\boldsymbol{\theta })$ from the b-th replicate. However, when the type of cure model is misspecified, we assess the RP, RMSE, and CP of the unified estimand Δ by
\[ RP({\Delta _{mis}})=\frac{1}{B}{\sum \limits_{b=1}^{B}}\left[1\{0\in ({\Delta _{mis}^{LL(b)}},{\Delta _{mis}^{UL(b)}})\}\right],\]
\[ RMSE({\Delta _{mis}})=\sqrt{\frac{1}{B}{\sum \limits_{b=1}^{B}}{\left({\overline{{\Delta _{mis}}}^{(b)}}-{\Delta ^{true}}\right)^{2}}},\]
and
\[ CP({\Delta _{mis}})=\frac{1}{B}{\sum \limits_{b=1}^{B}}\left[1\{{\Delta ^{true}}\in ({\Delta _{mis}^{LL(b)}},{\Delta _{mis}^{UL(b)}})\}\right],\]
where ${\Delta ^{true}}=g({\boldsymbol{\theta }^{true}})$ is the associated with the true model, ${\overline{{\Delta _{mis}}}^{(b)}}$ is the posterior mean and $({\Delta _{mis}^{LL(b)}},{\Delta _{mis}^{UL(b)}})$ is the 95% highest posterior density (HPD) interval of Δ based on the misspecified model from the b-th replicate.To fit the simulated data by the mixture cure model, we apply independent normal initial priors for each parameter in $\boldsymbol{\beta }$ to ensure identifiability. The variance of the initial prior is set to ${\sigma _{\beta }^{2}}=100$ for each parameter of $\boldsymbol{\beta }$. We set the initial prior for the shape parameter α to ${\pi ^{(f)}}(\alpha )\propto 1$. To sum up, the initial prior is set as ${\pi ^{(f)}}({\boldsymbol{\theta }_{M}})={\textstyle\prod _{j=1}}\phi ({\beta _{j}};0,{10^{2}})$. For the initial priors of the promotion time cure model, we set ${\pi ^{(f)}}({\zeta _{j}})\sim \mathcal{N}(0,{100^{2}})$ for each dimension in $\boldsymbol{\zeta }$, consider the shape parameter α as unknown, and set the prior for μ and α to be flat ${\pi ^{(f)}}(\mu ,\alpha )\propto 1$. Therefore, for the promotion time cure model, the initial prior is set as ${\pi ^{(f)}}({\boldsymbol{\theta }_{P}})={\textstyle\prod _{j=1}}\phi ({\zeta _{j}};0,{100^{2}})$.
All Bayesian simulation studies are based on Markov Chain Monte Carlo samples of size $L=1000$, with additional 1000 burn-in iterations unless otherwise noted. We set ${\gamma ^{\ast }}=0.95$ for making the Bayesian decision. $B=30,000$ replications are used for evaluating the performance under null hypothesis at 5% level of significance and $B=5,000$ replications are used for evaluating the performance under alternative hypothesis.
5.1 Simulation Results
The MCMC samples demonstrate adequate convergence and mixing across all examined scenarios, which supports the posterior propriety and model identifiability. The MCMC trace plot and mixing diagnoses are thus omitted.
5.1.1 The Mixture Cure Model
Figure 5.1
The contour plots of the posterior sample means $\overline{{\beta _{z}}}$ and $\overline{{\gamma _{z}}}$ with different ${\beta _{z}},{\gamma _{z}}$.
For the true model, we choose ${\beta _{z}}$ from $\{0,-0.25,-0.5\}$, which correspond to odds ratios of $\{1,0.78,0.61\}$ for the cured probability, and ${\gamma _{z}}$ from $\{0,-0.2,-0.4\}$, which correspond to hazard ratios of $\{1,0.82,0.67\}$ for the non-cured population. We specify the shape of the Weibull distribution as 1 and assume an independent censoring rate of 0.2. The other hyperparameters are set to be ${\beta _{0}}=1$, $\tilde{\boldsymbol{\beta }}={(0.3,-0.5)^{\top }}$, ${\gamma _{0}}=0$, and $\tilde{\boldsymbol{\gamma }}={(0.2,-0.4)^{\top }}$ such that the event rates are ranged between 40% and 60%.
Figure 5.1 presents contour plots of the posterior samples means $\overline{{\beta _{z}}}$ and $\overline{{\gamma _{z}}}$ obtained from the mixture cure model. Different subplots correspond to different ${\beta _{z}}$ and ${\gamma _{z}}$ in the true model, with the red point indicating the true values of $({\beta _{z}},{\gamma _{z}})$. The plots illustrate that the posterior sample means converge and the mode of the posterior sample means is close to the truth of ${\beta _{z}}$ and ${\gamma _{z}}$. Remarkably, the negative correlations stated in Remark 2 are evident in all scenarios.
The results of $RP(g(\boldsymbol{\theta }))$ under the mixture cure model are shown in Table 5.1. Each row represents different true values for ${\beta _{z}}$, ${\gamma _{z}}$, and their resulting ${\Delta _{M}}$. When ${\beta _{z}}={\gamma _{z}}=0$, $RP(g(\boldsymbol{\theta }))$ characterizes the probability of rejecting ${H_{0}}(g)$ under null hypothesis (false rejection) for the treatment effect parameters $g(\boldsymbol{\theta })$. As shown in the first row of Table 5.1, $RP({\beta _{z}})$, $RP({\gamma _{z}})$, and $RP({\Delta _{M}})$ are right around the pre-specified level of significance 0.05. When the data are incorrectly fitted by the promotion time cure model, the rejection probability $RP({\Delta _{P}})=0.054$ is slightly over the level of 0.05. When ${\beta _{z}}$ and ${\gamma _{z}}$ become larger in absolute value, $RP(g(\boldsymbol{\theta }))$ characterizes the probability of favoring ${H_{1}}(g)$ under alternative hypothesis (true rejection) for the treatment effect parameters $g(\boldsymbol{\theta })$. When the investigational treatment shows efficacy for the cure fraction only (row 4 and 7 in Table 5.1), the rejection probability of the unified estimand ${\Delta _{M}}$, $RP(\Delta )$, is slightly smaller than $RP({\beta _{z}})$. When the treatment shows efficacy on the non-cured survival only (row 2 and 3 in Table 5.1), $RP(\Delta )$ is about 60%–70% of $RP({\gamma _{z}})$. When the investigational treatment shows efficacy for both cure fraction and non-cured survival (row 5,6,8, and 9 in Table 5.1), the rejection probability of the unified estimand ${\Delta _{M}}$, $RP(\Delta )$, is greater than $RP({\beta _{z}})$ or $RP({\gamma _{z}})$ when looking at ${\beta _{z}}$ and ${\gamma _{z}}$ alone. Also, when the model is misspecified, the $RP(\Delta )$ remains comparable with the correctly specified model in all simulation circumstances.
Table 5.1
Results of $RP(g(\boldsymbol{\theta }))$ under MCM.
Truth of ${\boldsymbol{\theta }_{M}}$ | Fitted model: | MCM | PTCM | ||||
${\beta _{z}}$ | ${\gamma _{z}}$ | ${\Delta _{M}}$ | $g(\boldsymbol{\theta })$: | ${\beta _{z}}$ | ${\gamma _{z}}$ | ${\Delta _{M}}$ | ${\Delta _{P}}$ |
0 | 0 | 0 | $RP(g(\boldsymbol{\theta }))$: | 0.053 | 0.052 | 0.051 | 0.054 |
0 | −0.2 | 0.052 | 0.049 | 0.373 | 0.234 | 0.212 | |
0 | −0.4 | 0.103 | 0.049 | 0.812 | 0.546 | 0.481 | |
-0.25 | 0 | 0.050 | 0.235 | 0.054 | 0.214 | 0.230 | |
-0.25 | −0.2 | 0.098 | 0.204 | 0.372 | 0.499 | 0.498 | |
-0.25 | −0.4 | 0.146 | 0.191 | 0.817 | 0.830 | 0.751 | |
-0.5 | 0 | 0.103 | 0.602 | 0.052 | 0.547 | 0.565 | |
-0.5 | −0.2 | 0.148 | 0.561 | 0.376 | 0.819 | 0.799 | |
-0.5 | −0.4 | 0.192 | 0.552 | 0.799 | 0.963 | 0.937 |
Table 5.2
Results of Bayesian $RMSE(g(\boldsymbol{\theta }))$ and $CP(g(\boldsymbol{\theta }))$ under MCM.
$RMSE(g(\boldsymbol{\theta }))$ | $CP(g(\boldsymbol{\theta }))$ | ||||||||||
Truth of ${\boldsymbol{\theta }_{P}}$ | Fitted Models $(f)$: | MCM | PTCM | MCM | PTCM | ||||||
${\beta _{z}}$ | ${\gamma _{z}}$ | ${\Delta _{M}}$ | $g(\boldsymbol{\theta })$: | ${\beta _{z}}$ | ${\gamma _{z}}$ | ${\Delta _{M}}$ | ${\Delta _{P}}$ | ${\beta _{z}}$ | ${\gamma _{z}}$ | ${\Delta _{M}}$ | ${\Delta _{P}}$ |
0 | 0 | 0 | 0.002 | 0.002 | 0.0001 | 0.0002 | 0.944 | 0.941 | 0.940 | 0.937 | |
0 | −0.2 | 0.052 | 0.021 | 0.005 | 0.001 | 0.001 | 0.944 | 0.942 | 0.945 | 0.937 | |
0 | −0.4 | 0.103 | 0.026 | 0.008 | 0.001 | 0.002 | 0.945 | 0.941 | 0.938 | 0.935 | |
-0.25 | 0 | 0.050 | 0.007 | 0.005 | 0.0002 | 0.009 | 0.940 | 0.940 | 0.943 | 0.933 | |
-0.25 | −0.2 | 0.098 | 0.014 | 0.006 | 0.002 | 0.011 | 0.942 | 0.943 | 0.945 | 0.931 | |
-0.25 | −0.4 | 0.146 | 0.005 | 0.019 | 0.005 | 0.012 | 0.941 | 0.935 | 0.941 | 0.931 | |
-0.5 | 0 | 0.103 | 0.026 | 0.004 | 0.001 | 0.019 | 0.943 | 0.942 | 0.942 | 0.920 | |
-0.5 | −0.2 | 0.148 | 0.017 | 0.014 | 0.003 | 0.021 | 0.943 | 0.936 | 0.940 | 0.922 | |
-0.5 | −0.4 | 0.192 | 0.026 | 0.014 | 0.006 | 0.023 | 0.940 | 0.942 | 0.940 | 0.917 |
Table 5.2 displays the results of root mean square error (RMSE) and coverage probability (CP). Within the Bayesian framework, the initial prior with a variance of 100 performs well recovering the truth since the RMSEs of all treatment effect parameters are relatively low and the coverage probabilities are close to 0.95. The coverage probabilities are slightly lower than 0.95 when an informative prior (shrinkage prior) is implemented when constructing the initial prior. The estimation is more accurate for the unified estimand Δ than looking at the conditional treatment effect parameters $\{{\beta _{z}},{\gamma _{z}}\}$ since the RMSE of Δ is close to zero for each and every simulation scenario. For the misspecified model fittings, the RMSEs are slightly larger than the correctly-specified model, and the coverage probabilities are slightly smaller. The difference appears to increase as ${\beta _{z}}$ and ${\gamma _{z}}$ deviate from 0, but the RMSE and the CP for the misspecified model are pretty robust overall.
5.1.2 The Promotion Time Cure Model
For the true model, we choose ${\zeta _{z}}$ from $\{0,-0.25,-0.5\}$, corresponding to hazard ratios of $\{1,0.78,0.61\}$ for the conditional treatment effect. We specify the shape of the Weibull survival time to be 1 with an independent censoring rate of 0.1. The other hyperparameters are set as ${\zeta _{0}}=1$ and $\tilde{\boldsymbol{\zeta }}={(0.3,-0.5)^{\top }}$.
Table 5.3
Results of $RP(g(\boldsymbol{\theta }))$ under PTCM.
Truth of ${\boldsymbol{\theta }_{P}}$ | Fitted Models $(f)$: | PTCM | MCM | ||||
${\zeta _{z}}$ | ${\Delta _{P}}$ | $g(\boldsymbol{\theta })$: | ${\zeta _{z}}$ | ${\Delta _{P}}$ | ${\beta _{z}}$ | ${\gamma _{z}}$ | ${\Delta _{M}}$ |
0 | 0 | $RP(g(\boldsymbol{\theta }))$: | 0.053 | 0.053 | 0.037 | 0.068 | 0.058 |
-0.25 | 0.124 | 0.612 | 0.612 | 0.070 | 0.583 | 0.636 | |
-0.5 | 0.24 | 0.981 | 0.981 | 0.132 | 0.899 | 0.980 |
The results of $RP(g(\boldsymbol{\theta }))$ under the promotion time cure model are shown in Table 5.3. When ${\zeta _{z}}$ and its resulting ${\Delta _{P}}$ are 0, $RP(g(\boldsymbol{\theta }))$ represents the probability of rejecting ${H_{0}}(g)$ under null hypothesis (false rejection) for $g(\boldsymbol{\theta })$. As shown in the first row of Table 5.3, $RP({\zeta _{z}})=0.053$ and $RP({\Delta _{P}})=0.053$. When the model is misspecified as the mixture cure model, $RP({\beta _{z}})$ is deflated to 0.037, and $RP({\gamma _{z}})$ is inflated to 0.068. Hence, making inference based on ${\beta _{z}}$ and ${\gamma _{z}}$ will not maintain the rejection probability at the proper level. However, the $RP({\Delta _{M}})$ is less affected by model misspecification, which is 0.058.
When the assumed treatment effect becomes larger as ${\zeta _{z}}$ decreases from 0, $RP({\Delta _{P}})$ is similar to $RP({\zeta _{z}})$ considering the one-to-one relationship under promotion time cure model shown in theorem 4. If the model is misspecified, among the examined scenarios, the rejection probability $RP({\Delta _{M}})$ is higher than $RP({\beta _{z}})$ or $RP({\gamma _{z}})$. In addition, $RP({\Delta _{M}})$ is comparable to the rejection probability $RP({\Delta _{P}})$ from the correctly specified model.
Table 5.4
Results of Bayesian $RMSE(g(\boldsymbol{\theta }))$ and $CP(g(\boldsymbol{\theta }))$ under PTCM.
$RMSE(g(\boldsymbol{\theta }))$ | $CP(g(\boldsymbol{\theta }))$ | |||||||
Truth of ${\boldsymbol{\theta }_{P}}$ | Fitted Models: | PTCM | MCM | PTCM | MCM | |||
${\zeta _{z}}$ | ${\Delta _{P}}$ | $g(\boldsymbol{\theta })$: | ${\zeta _{z}}$ | ${\Delta _{P}}$ | ${\Delta _{M}}$ | ${\zeta _{z}}$ | ${\Delta _{P}}$ | ${\Delta _{M}}$ |
0 | 0 | 0.001 | 0.0005 | 0.001 | 0.942 | 0.941 | 0.927 | |
-0.25 | 0.124 | 0.005 | 0.001 | 0.004 | 0.936 | 0.934 | 0.924 | |
-0.5 | 0.24 | 0.007 | 0.001 | 0.005 | 0.938 | 0.935 | 0.920 |
Table 5.4 displays the results of root mean square error (RMSE) and coverage probability (CP). Within the Bayesian framework, the models with a variance of ${100^{2}}$ in the initial prior converge pretty well and could recover the truth as the RMSEs of treatment effect parameters ${\zeta _{z}}$ and ${\Delta _{P}}$ are all rather low and the coverage probabilities are close to 0.95. The coverage probabilities with model misspecification are slightly lower, and RMSEs are slightly larger than those fitted by the correct model, but the misspecified model remains comparable with the correctly specified model.
6 Real Data Example
We consider data from the Eastern Cooperative Oncology Group’s phase III melanoma clinical trial, E1684 [15], involving three treatment arms, namely, high-dose interferon, low-dose interferon, or observation. The results of the E1684 clinical trial suggested that interferon has a significant impact on relapse-free and overall survival, which led to the approval of this regimen by the U.S. Food and Drug Administration as standard adjuvant therapy for high-risk melanoma patients. For our analysis, we only consider the high-dose interferon (treatment) and observation (control). The E1684 dataset includes four predictors: treatment (${x_{1}}$, coded as 2 for high-dose interferon and 1 for observation), type of primary tumor (${x_{2}}$, coded as 2 for nodular and 1 for other), age (${x_{3}}$, coded in years), and gender (${x_{4}}$, coded as 2 for female and 1 for male). We include treatment, age, and gender as predictors in our real data example.
Table 6.1
A comparison of E1684 results under different models.
Fitted Models:1 | Cox PH | RMST | MCM | PTCM | |||||
Parameter $g(\boldsymbol{\theta })$: | ${\gamma _{z}}$ | ${\Delta _{PH}}$ | ${\Delta _{RMST}}(9.63)$ | ${\Delta _{RMST}}(1.24)$ | ${\beta _{z}}$ | ${\gamma _{z}}$ | ${\Delta _{M}}$ | ${\zeta _{z}}$ | ${\Delta _{P}}$ |
Estimation1 | −0.360 | 0.178 | −0.022 | 0.148 | −0.596 | −0.103 | 0.114 | −0.376 | 0.185 |
95% lower CI1 | −0.642 | 0.039 | −1.156 | 0.043 | −1.169 | −0.409 | 0.005 | −0.669 | 0.055 |
95% upper CI1 | −0.079 | 0.310 | 1.112 | 0.253 | −0.067 | 0.221 | 0.231 | −0.109 | 0.322 |
p1 | 0.006 | 0.006 | 0.970 | 0.006 | 0.014 | 0.258 | 0.022 | 0.002 | 0.002 |
The Cox PH model and Restricted Mean Survival Time (RMST) comparison are performed under the Frequentist framework, while the MCM and PTCM are fitted by Bayesian procedures. The “parameter estimation” refers to MLE or posterior mean, the “CI” refers to 95% two-sided confidence intervals or 95% HPD intervals, and p refers to the two-sided p-value or posterior predictive probability of $\mathbb{P}({H_{0}}(g)\mid {\mathcal{D}^{obs}})$ depending on the models.
The outcome variable is relapse-free survival (RFS), which is defined as the time from randomization until the progression of the tumor or the occurrence of mortality, whichever comes first. The RFS time, ${t_{i}}$ in years, is continuous and subject to right censoring. ${\delta _{i}}$ denotes the censoring indicator, which equals one if the i-th subject relapsed and 0 otherwise.
After eliminating the incomplete data from the E1684 dataset, we have a total of $n=284$ patients, with ${n_{1}}=144$ in the high-dose interferon arm and ${n_{0}}=140$ in the observation arm. The Kaplan-Meier curves for the treatment and placebo arm of E1684 are shown in Figure 6.1. The Kaplan-Meier curves demonstrate plateau patterns of the cure models after a sufficiently long follow-up period. The median relapse-free survival time (95% confidence interval) for the high-dose interferon and the observation arms are $1.72\hspace{2.5pt}(1.09,3.02)$ and $0.98\hspace{2.5pt}(0.52,1.70)$, respectively. The median overall survival (OS) time (95% confidence interval) for the high-dose interferon and the observation arms are $3.82\hspace{2.5pt}(2.56,\infty )$ and $2.67\hspace{2.5pt}(1.83,4.24)$, respectively.
We carry out a comprehensive analysis for the relapse-free survival (RFS) of the E1684 dataset using different models, including the Cox proportional hazards model, restricted mean survival time comparison, mixture cure model, and promotion time cure model. The effect of age and gender are adjusted in all comparisons, and the unified estimand are analyzed and compared with individual model parameters under each model. The results are summarised in Table 6.1.
Under the Cox proportional hazards model, the conditional treatment effect of log hazards ratio ${\beta _{z}}$ between the treatment and control arms shows significance ($\widehat{{\beta _{z}}}=-0.360$, 95% confidence interval: (−0.642, −0.079)), resulting in a significant treatment effect of the unified estimand ${\Delta _{PH}}$ ($\widehat{{\Delta _{PH}}}=0.178$, 95% confidence interval: (0.039, 0.310)). The p-values associated with testing ${\beta _{z}}$ or ${\Delta _{PH}}$ are 0.006. For the RMST comparison, the results are sensitive to the choice of the truncation time ${t^{\ast }}$, and could lead to entirely different conclusion. When the truncation time is set to be the last observed event (by default), the difference in restricted mean survival time (RMST) between the treatment and control groups $\widehat{{\Delta _{RMST}}}(9.63)=-0.022$ with a $95\% $ confidence interval $(-1.156,1.112)$ and p-value 0.970. When the truncation time is set to be median RFS (1.24 months), $\widehat{{\Delta _{RMST}}}(1.24)=0.148$ with a $95\% $ confidence interval $(0.043,0.253)$ and p-value 0.006.
Under the mixture cure model, when looking at the conditional treatment effect parameters, the conditional treatment effect on the cured probability ${\beta _{z}}$ is significant (posterior mean: $-0.596$, 95% credible interval: $(-1.169,-0.067)$, posterior probability $\mathbb{P}({\beta _{z}}\lt 0\mid {\mathcal{D}^{obs}})=0.986$), but the conditional treatment effect on non-cured population ${\gamma _{z}}$ is not (posterior mean: $-0.103$, 95% credible interval: $(-0.409,0.221)$, posterior probability $\mathbb{P}({\gamma _{z}}\lt 0\mid {\mathcal{D}^{obs}})=0.742)$). However, the unconditional treatment effect on the unified estimand ${\Delta _{M}}$ is significant (posterior mean: 0.114, 95% credible interval: $(0.005,0.231)$, posterior probability $\mathbb{P}({\Delta _{M}}\gt 0\mid {\mathcal{D}^{obs}})=0.978$).
Under the promotion time cure model, the treatment shows significant efficacy in the conditional treatment effect parameter ${\zeta _{z}}$ (posterior mean: $-0.376$, 95% credible interval: $(-0.669,-0.109)$, posterior probability $\mathbb{P}({\zeta _{z}}\lt 0\mid {\mathcal{D}^{obs}})=0.998$), resulting in significance in the unified estimand ${\Delta _{P}}$ (posterior mean: 0.185, 95% credible interval: $(0.055,0.322)$, posterior probability $\mathbb{P}({\Delta _{P}}\gt 0\mid {\mathcal{D}^{obs}})=0.998$).
The two types of cure models are rigorously compared using the goodness-of-fit (GoF) criteria introduced in Section 4.3. For the MCM, the Deviance Information Criterion (DIC) stands at 772.01 with ${p_{d}}=8.81$, whereas for the PTCM, it’s 765.95 with ${p_{d}}=6.19$. Additionally, the Pseudo-Marginal Likelihood (LPML) values for MCM and PTCM are $-386.72$ and $-382.97$, respectively. In summary, these metrics suggest that the promotion time cure model exhibits a better fit and displays greater predictive power, as evidenced by its reduced DIC and enhanced LPML values.
For the mixture cure model, the high-dose interferon presents significance in ${\beta _{z}}$ but insignificance in ${\gamma _{z}}$, which confuses the overall efficacy of high-dose interferon and making a final decision. However, the results of the unified estimand are easier to interpret, and it allows direct comparisons across different models. Other than the inconsistent inference from RMST comparisons, the other three survival models lead to similar values for the estimators of Δ and reach the same conclusions. The similarity in the values of the estimators from different models reveals the robustness of the unified estimand against model selections between different types of cure models in capturing the overall unconditional treatment effect. Therefore, the proposed estimand provides a unified way to compare the efficacy on treatment and aids in making informed clinical decisions.
7 Discussions
In cancer research, time-to-event endpoints, such as progression-free survival (PFS) and overall survival (OS), are critical in evaluating the efficacy of cancer treatments. However, this process is often time-consuming and costly, necessitating advanced modeling and inference strategies about the treatment effect in drug development [25]. In recent years, cure models have gained increasing interest due to their ability to account for a subset of patients considered “cured,” who remain event-free after a certain follow-up period. These models provide more flexible and precise treatment outcome modeling and enhance decision-making about the treatment effect in clinical trials.
In this article, we have summarised two types of cure models and have proposed a unified estimand $\Delta =\mathbb{E}[sgn({T_{1}}-{T_{0}})]$ for the survival models with cure fraction, where ${T_{0}}$ and ${T_{1}}$ denote the potential outcome when assigning to the control group and treatment group, and $sgn()$ denotes the sign function. The proposed unified estimand measures unconditional treatment effect by examining whether the treatment extends survival for patients. The relationships between the unified estimand Δ and model parameters have been established explicitly for the proportional hazards model, the mixture cure model, and the promotion time cure model, making comparisons across different models more straightforward. Based on the recent hot topic about the “Estimand” and “Covariate Adjustment” from FDA’s guidances, including covariates that are prognostic in the regression models could potentially increase the power of RCTs, or it could make the results more interpretable in non-randomized trials. However, it is well noted that the estimand from the adjusted analysis about the conditional treatment effect is not always the same estimand for the unconditional treatment effect via direct comparison. The proposed unified estimand Δ is invariant to model and variable selections, and it can be applied to different types of cure models either without covariates or in regression forms adjusting for the prognostic covariates, without altering the definition of treatment effect. Therefore, when the unified estimand is utilized under the “Estimand” framework, the inference based on different models with or without covariate adjustment can serve as sensitivity analyses towards the same unified estimand instead of supplementary analyses with an altered definition of treatment effect.
Compared with ${\Delta _{RMST}}({t^{\ast }})$, which is highly sensitive to the choice of truncation time ${t^{\ast }}$ and the tail of the Kaplan-Meier curve, the proposed unified estimand Δ reflects an overall difference in survival profiles, providing a clear and robust basis for determining treatment efficacy. In addition, the value of ${\Delta _{RMST}}({t^{\ast }})$ alone does not clearly indicate the effect size without comparing it to the RMST of the control arm. However, the unified estimand Δ directly measures the size of the treatment effect by the proportion of patients who experience extended survival on treatment.
In addition to proposing the unified estimand, we describe a general Bayesian inference procedure for the cure models. The Bayesian computation for the cure models are developed via Gibbs sampling for Bayesian inference, and model comparisons among different types of survival models with cure fraction are made through DIC and LPML.
Based on the simulation study, the Bayesian inference of the unified estimand Δ maintains the probabilities of rejecting ${H_{0}}(g)$ under null hypothesis at the desired level for both cure models. The unified estimand Δ demonstrates larger probabilities of favoring ${H_{1}}(g)$ when treatment benefits both cure fraction and non-cured survival.
Also, the unified estimand Δ is robust against model misspecification between the two types of cure models. When data are generated from one of the cure models but fitted by the other one, the inference of the unified estimand Δ always leads to reasonable probability of rejecting ${H_{0}}(g)$ under null, probability of favoring ${H_{1}}(g)$ under alternative hypothesis, root mean square error, and coverage probability.
In the current simulation scenarios, ${\sigma _{\beta }^{2}}$ is chosen as ${10^{2}}$ for MCM. When ${\sigma _{\beta }^{2}}$ decreases from ∞ to 1, the probability of rejecting ${H_{0}}(g)$ will be expected to decrease generally, and the RMSEs of the treatment effect estimators will decrease and then increase, which is similar to the trend from the Ridge regressions or using Bayesian shrinkage priors. Thus, an appropriate variance for the initial prior should be specified under the mixture cure model. If ${\sigma _{\beta }^{2}}$ is too large, the prior will be less informative, and the parameters will suffer from a lack of identifiability, leading to unreliable inference. If ${\sigma _{\beta }^{2}}$ is too small, the prior imposes too much information on the model, which leads to a biased estimator. A trade-off choice of ${\sigma _{\beta }^{2}}$ will balance model identifiability and estimation unbiasedness. In the end, we conduct an in-depth analysis of ECOG’s melanoma cancer data E1684 using the unified estimand Δ under different models, including the proportional hazards model, mixture cure model, and promotion time cure model.
All three models lead to similar results and consistent conclusions, which demonstrates Δ’s consistency compared to the difference in RMST.
Although this article focuses on estimation and Bayesian inference, the proposed estimand and the Bayesian inference framework are also useful in the Bayesian clinical trial designs under survival models with cure fraction. However, it is crucial to note that the simulation under the null hypothesis in this article corresponds to a sharp null hypothesis, assuming all individual treatment effect parameters are zero. However, to develop a rigorous Bayesian clinical trial design applying the proposed estimand, the null hypothesis should be relaxed to only restrict the expectation form of the estimand Δ to be 0. Such a relaxation would impact the size of the hypothesis testing and the sample size determination. Consequently, there is a need for rigorous evaluations of the Bayesian type I error and Bayesian power, noting an important direction for further research.
All computations and implementations detailed in this paper are developed in the R environment. The corresponding code for the proposed methodologies is available at https://github.com/hongfei-li/UniCure.