1 Introduction
Characterising the dose-response relationship and finding the right dose are important but challenging in the pharmaceutical drug development process. Nearly half of failures in Phase III trials result in part from a lack of understanding of the dose-response relationship in Phase II trials (Sacks et al., 2014). Over the last decade, the Multiple Comparison Procedure-Modelling (MCP-Mod) method, developed by Bretz et al. (2005), has been increasingly popular for Phase II trials as it can provide superior statistical evidence for dose selection.
MCP-Mod is a two-step approach that combines MCP principles and modelling techniques. In the first MCP step, it establishes a dose-response signal (Proof of Concept, PoC), and in the second Mod step, it estimates the dose-response curve and target doses of interest. MCP-Mod overcomes shortcomings of traditional approaches for dose-finding studies (see e.g. Ting, 2006, for an excellent introduction). MCP-Mod requires the pre-specification of plausible candidate models and model parameters to capture model uncertainty, which however is primarily based on the limited knowledge of the agent/compound being studied, if it is available at all (Chen and Liu, 2020). Misspecification of candidate models and model parameters in MCP-Mod may cause a loss in power and unreliable model selection (see Saha and Brannath, 2019, and references therein).
Motivated by the limitations of MCP-Mod, non-parametric methods have gained popularity for detecting dose-response trends and estimating dose-response relationships in Phase II trials. These methods offer flexibility and adaptability to capture complex patterns in the data without imposing strong assumptions on the underlying functional form of the dose-response curve. West and Harrison (2006) introduced the normal dynamic linear model (NDLM), which leverages the flexibility of dynamic linear models to estimate the dose-response curve. By accommodating time-varying coefficients, NDLM can capture the dynamic nature of the dose-response relationship, allowing for more accurate and interpretable estimates. Kirby et al. (2009) applied cubic smoothing splines with generalised cross-validation for the smoothing parameter to estimate the dose-response curve.
However, these approaches still face challenges in incorporating prior knowledge about the curvature or shape of the dose-response curve. Determining the optimal level of smoothness or choosing the appropriate model can be subjective and dependent on the specific data at hand. To address these limitations, in this work, we develop a model-free Bayesian approach, which is a novel Bayesian hierarchical framework incorporating the total (in the ${L^{2}}$ sense) curvature of the dose-response curve as a prior parameter. Our approach avoids the requirement of a set of pre-specified candidate models. The responses at the given set of doses are estimated through maximum a posteriori (MAP), with which we construct a test statistic to establish PoC through simulations. We can then estimate the dose-response relationship with simple interpolation.
The remainder of this work is organised as follows. We describe in detail our MAP approach with a curvature prior, abbreviated LiMAP-curvature, in Section 2. In Section 3, we evaluate the operating characteristics of LiMAP-curvature through simulations and compare its performance to that of MCP-Mod and smoothing spline. We present concluding remarks and future directions in Section 4.
2 Methods
We consider a trial with a total of $M+1$ distinct doses ${x_{0}},{x_{1}},\dots ,{x_{M}}$, where ${x_{0}}$ represents placebo. We let ${N_{i}}$ be the number of patients in dose group i and $f(x)$ be the true dose-response function at dose x. We assume that $f(x)$ is defined on the interval $[0,1]$ to align with the typical range of doses in pharmaceutical drug development where doses are often scaled or normalised within this range for convenience and comparability. For $i=0,1,\dots ,M$ and $j=1,2,\dots ,{N_{i}}$, we let ${Y_{ij}}$ be the response observed for patient j allocated to dose ${x_{i}}$. We assume that
where ${\mu _{i}}=f({x_{i}})$ denotes the mean response at dose ${x_{i}}$, and ${\epsilon _{ij}}\stackrel{iid}{\sim }N(0,{\sigma ^{2}})$ denotes the error term for patient j in dose group i. For the sake of simplicity, we assume that the variance ${\sigma ^{2}}$ is known and constant across all dose groups. This assumption is commonly used in different areas, including MCP-Mod (Bretz et al., 2005), BMCPMod (Fleischer et al., 2022) and Bayesian meta-analysis (Burke et al., 2018). See Fleischer et al. (2022) for further discussion on the assumption on the standard deviation σ.
In MCP-Mod, a set of plausible candidate models is required. This constrains the possible set of dose-response curves. However, with limited knowledge of the agent/compounds in Phase II trials, it is more likely to mis-specify the set of candidate models. The procedure for specifying candidate models is also somewhat cumbersome. So we would like to avoid the pre-specification of possible models beforehand, but still want to impose a certain degree of smoothness on the dose-response curve. To this end, we introduce the ${L^{2}}$-total curvature
\[ Sf={\Bigg({\int _{{x_{0}}}^{{x_{M}}}}{f^{\prime\prime }}{(x)^{2}}\hspace{0.1667em}dx\Bigg)^{1/2}}\]
to measure how far the dose-response curve $f(x)$ is from being a straight line. The ${L^{2}}$-total curvature is chosen as the prior parameter because it provides a mathematically tractable and interpretable measure of the smoothness of the dose-response curve. Specifically, it quantifies the global curvature by integrating the squared second derivative of the function, ensuring that the method captures overall trends in the data rather than being overly influenced by local deviations. We will impose a half-normal $HN({\gamma ^{2}})$ prior on $Sf$ to give low prior probabilities to the dose-response relationship that is very curved, where the standard deviation γ controls the trade-off between the ${L^{2}}$-total curvature of $f(x)$ and fidelity to data $\boldsymbol{Y}$. To capture an appropriate level of curvature and allow for optimal model performance, we assign a hyperprior for γ with a half-normal distribution $HN({\tau ^{2}})$, which reflects our prior beliefs about the likely range of values for γ before observing any data. Smaller values of τ favour smoother curves, which may be appropriate when the dose-response relationship is expected to be close to linear. Larger values of τ allow for more curvature, which may be necessary when the dose-response relationship is highly non-linear. By incorporating a hyperprior, we introduce additional uncertainty and flexibility into the prior distribution of $Sf$, influencing the estimation of the dose-response curve. The specification of the standard deviation τ will be discussed in Section 4.Given the dose-response function $f(x)$ being available at $M+1$ distinct doses, for $i=1,2,\dots ,M-1$, we have
\[\begin{aligned}{}{f^{\prime\prime }}({x_{i}})& \approx 2\bigg(\frac{{\mu _{i+1}}-{\mu _{i}}}{({x_{i+1}}-{x_{i}})({x_{i+1}}-{x_{i-1}})}\\ {} & \hspace{1em}-\frac{{\mu _{i}}-{\mu _{i-1}}}{({x_{i}}-{x_{i-1}})({x_{i+1}}-{x_{i-1}})}\bigg)\end{aligned}\]
through the second-order central difference scheme (Burden et al., 2015), therefore the ${L^{2}}$-total curvature $Sf$ being approximated through numerical integration with
\[\begin{aligned}{}{S_{\boldsymbol{\mu }}}& =2\Bigg({\sum \limits_{i=1}^{M-1}}\bigg(\frac{{\mu _{i+1}}-{\mu _{i}}}{({x_{i+1}}-{x_{i}})({x_{i+1}}-{x_{i-1}})}\\ {} & \hspace{1em}-\frac{{\mu _{i}}-{\mu _{i-1}}}{({x_{i}}-{x_{i-1}})({x_{i+1}}-{x_{i-1}})}\bigg){^{2}}\Delta {x_{i}}\Bigg){^{1/2}},\end{aligned}\]
where $\Delta {x_{i}}=({x_{i+1}}-{x_{i-1}})/2$ for $i=2,3,\dots ,M-2$ with $\Delta {x_{1}}=({x_{2}}+{x_{1}})/2-{x_{0}}$ and $\Delta {x_{M-1}}={x_{M}}-({x_{M-1}}+{x_{M-2}})/2$.We let
and define our Bayesian hierarchical model to be
where the prior
(2.2)
\[ p(\boldsymbol{\mu },\gamma \mid \boldsymbol{Y})\propto p(\boldsymbol{\mu },\gamma )p(\boldsymbol{Y}\mid \boldsymbol{\mu }),\]
\[ p(\boldsymbol{\mu },\gamma )=p(\gamma ){\prod \limits_{i=0}^{M}}p({\mu _{i}})p({S_{\boldsymbol{\mu }}}\mid \gamma )\]
with
and the likelihood
Suppose the standard deviation τ in the hyperprior for γ is pre-specified. The MAP estimates of all parameters in the model defined in Eq. (2.2) can be obtained by maximising the corresponding log-likelihood
\[\begin{aligned}{}\hat{\boldsymbol{\mu }},\hat{\gamma }& =\arg \underset{\boldsymbol{\mu },\gamma }{\max }\log p(\boldsymbol{\mu },\gamma \mid \boldsymbol{Y})\\ {} & \propto \arg \underset{\boldsymbol{\mu },\gamma }{\max }\Bigg\{-{\sum \limits_{i=0}^{M}}{\sum \limits_{j=1}^{{N_{i}}}}{\bigg(\frac{{Y_{ij}}-{\mu _{i}}}{\sigma }\bigg)^{2}}-2\log \gamma \\ {} & \hspace{2.5pt}\hspace{2.5pt}\hspace{2em}\hspace{1em}\hspace{2em}-{\bigg(\frac{{S_{\boldsymbol{\mu }}}}{\gamma }\bigg)^{2}}-{\bigg(\frac{\gamma }{\tau }\bigg)^{2}}\Bigg\}\end{aligned}\]
through a numerical optimisation algorithm like the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method and its variants (see, e.g. Nocedal and Wright, 1999, for more details).To establish PoC, we propose a test statistic
with hypotheses
\[\begin{aligned}{}{H_{0}}& :{\mu _{1}}=\cdots ={\mu _{M}}={\mu _{0}}\\ {} {H_{1}}& :\max \{{\mu _{1}},{\mu _{2}},\dots ,{\mu _{M}}\}\gt {\mu _{0}}.\end{aligned}\]
We define a significance level α for a dose-response signal, such that the corresponding critical value c satisfies
For a given α, we compute the critical value c via simulation
\[ \mathbb{P}(T\gt c\mid {H_{0}})\approx \frac{1}{R}{\sum \limits_{r=1}^{R}}{1_{\{{T^{(r)}}\gt c\}}},\]
where ${T^{(r)}}$ is the test statistic computed from the data ${\boldsymbol{Y}^{(r)}}$ simulated under the null hypothesis ${H_{0}}$, R is the total number of replicates, and ${1_{A}}$ is the indicator function equal to 1 if condition A holds and 0 otherwise.Finally, the mean response estimates $\hat{\boldsymbol{\mu }}$ can be linearly interpolated (or using a more sophisticated interpolation scheme) to obtain an estimate of the dose-response curve $f(x)$, which will also yield an estimate of the target dose of interest. In this work, we only perform simple linear interpolation as we shall focus on the PoC stage of dose-finding trials.
3 Simulations
In this section, we assess the performance of LiMAP-curvature, in terms of power to detect dose-response signals as well as estimates of the dose-response curve and target doses of interest. We compare LiMAP-curvature to smoothing spline and MCP-Mod.
3.1 Simulation Settings
We simulate randomised, double-blind, placebo-controlled, parallel-group trials with patients being equally allocated to placebo (0) or one of four active doses (0.15, 0.50, 0.80 and 1). We take the placebo effect to be 0 and the maximum treatment effect to be 0.5, respectively. We vary the sample size per dose group in $\{10,20,30,40,50,60\}$. We choose one of 12 dose-response shapes to be the true dose-response model, which are plotted in Figure 1. These models are selected to represent a wide range of realistic and theoretical scenarios commonly encountered in Phase II trials. The first six models include linear, Emax, exponential, quadratic, and logistic shapes, which are widely used in dose-finding studies and are part of the candidate model set in MCP-Mod. These models are well-established in the literature and reflect practical dose-response relationships (Bretz et al., 2005; Chen and Liu, 2020; Fleischer et al., 2022). The remaining six models include variations of the candidate models (e.g., exponential2, power, and logistic2) as well as more distinct shapes like sigmoid Emax and beta. While most of these models are also commonly used in practice, the beta model is included as an extreme case to assess the performance of MCP-Mod when the true dose-response relationship significantly differs from the pre-specified candidate models. The corresponding parameters for these models are summarised in Supplemental Material, Table S1. We simulate each patient’s response according to Eq. (2.1) with a standard deviation of $\sigma =1$. For each of 72 combinations of parameters, consisting of sample size and dose-response shape, we run 10,000 simulated trials.
For each simulated trial, we run LiMAP-curvature to establish PoC and find the target doses of interest with a standard deviation of $\tau \in \{1,3,5\}$ for the hyperprior $HN({\tau ^{2}})$. To benchmark LiMAP-curvature against smoothing spline and MCP-Mod, we run smoothing spline through the stats package (version 3.6.2) in R and employ the generalised cross-validation to choose the smoothing parameter for smoothing spline (Green and Silverman, 1993). We also run MCP-Mod through the DoseFinding package (version 1.0-4) in R, where we specify a fixed set of candidate models made up of linear, emax1, emax2, exponential1, quadratic1 and logistic1 (top six figures of Figure 1).
3.2 Simulation Results
To evaluate the performance of LiMAP-curvature, smoothing spline and MCP-Mod, we plot receiver operating characteristic (ROC) curves for all true dose-response models for sample size 40 in Figure 2. The ROC curve is produced by plotting the true positive rate against the false positive rate across a range of critical values. The closer the ROC curve approaches the top left corner, the better the method performs overall. Although we do not compute confidence intervals for the ROC curve in this work, such intervals can be essential for assessing the uncertainty in the ROC curve estimates. Confidence intervals for the ROC curve can be calculated using bootstrapping methods. Specifically, this involves resampling the data with replacement and recalculating the ROC curve for each resample. This process generates a distribution of ROC curve estimates, from which percentiles can be taken to form confidence intervals around any point on the original ROC curve. See Supplemental Material, Figures S1–S5 for ROC curves for all other sample sizes. We also summarise their powers at 5% and 10% type I error rates in Supplemental Material, Table S2 and S3, respectively.
Figure 2
ROC curves of LiMAP-curvature, smoothing spline and MCP-Mod across different true dose-response models with the sample size of 40 patients per arm. The ROC curves of MCP-Mod in (a)–(f) are produced with the true dose-response model included in the candidate model set, and the ROC curves of MCP-Mod in (g)–(l) are produced with the true dose-response model not included in the candidate model set.
As illustrated in Figure 2, we find that LiMAP-curvature has better performance when the true dose-response relationship exhibits low curvature, e.g. linear, emax1, logistic1, logistic2 and sigEmax, achieving over 80% power at a type I error rate of 5%. Choosing an appropriate τ can further improve the performance of LiMAP-curvature. More specifically, LiMAP-curvature achieves better performance with a larger τ for the true dose-response curve that is more curved, e.g. quadratic1 and beta models. Otherwise, e.g. in logistic1 and power models, a smaller τ performs better. How to select an appropriate τ in practice will be discussed in Section 4.
To benchmark the performance of LiMAP-curvature, smoothing spline and MCP-Mod, we consider two cases:
Figures 2(a–f) compare the performance of LiMAP-curvature, smoothing spline and MCP-Mod for the first case, which is a fairly rare situation in practice. The resulting ROC curves are in favour of MCP-Mod as expected due to its utilisation of exact population parameter estimates in its candidate model set, but LiMAP-curvature with $\tau =3$ achieves comparable performance. We note that for the true dose-response curve with low curvature, e.g. linear, emax1 and logistic1, $\tau =1$ yields better performance than when $\tau =3$, resulting in a power gain of around 2–9% compared to MCP-Mod and around 8–11% compared to smoothing spline, all at 5% type I error rate.
Figures 2(g–l) compare the performance of LiMAP-curvature, smoothing spline and MCP-Mod for the second case. This is the situation we expect to encounter in practice. We see that with $\tau =3$, LiMAP-curvature uniformly outperforms MCP-Mod, especially when the true dose-response model drastically deviates from the candidate model set in MCP-Mod such as sigEmax and beta models. In the latter case, LiMAP-curvature has a power gain of approximately 5–13% compared to MCP-Mod at a type I error rate of 5%. Some true dose-response models such as exponential2, quadratic2 and power models, are well captured by the candidate model set of MCP-Mod so that we see the similar performance of LiMAP-curvature and MCP-Mod. Even with an inappropriate choice of τ, LiMAP-curvature is still able to achieve significantly better performance, e.g. a power gained by approximately 2–3% over MCP-Mod at a 5% type I error rate. Moreover, compared to the smoothing spline, LiMAP-curvature consistently outperforms it, except for the beta model. LiMAP-curvature demonstrates a power gain of about 3–9% over smoothing spline at a 5% type I error rate.
Table 1
Power values and their 95% confidence intervals for LiMAP-curvature, smoothing spline and MCP-Mod across different true dose-response models with the sample size of 40 patients per arm by controlling the type I error rate at 5%.
| Model | LiMAP-curvature | Smoothing spline | MCP-Mod | ||
| $\tau =1$ | $\tau =3$ | $\tau =5$ | |||
| linear | 0.839 (0.832, 0.846) | 0.819 (0.811, 0.827) | 0.786 (0.778, 0.794) | 0.755 (0.740, 0.763) | 0.788 (0.780, 0.794) |
| emax1 | 0.884 (0.878, 0.890) | 0.799 (0.791, 0.807) | 0.777 (0.769, 0.785) | 0.768 (0.761, 0.775) | 0.791 (0.782, 0.799) |
| emax2 | 0.685 (0.676, 0.694) | 0.710 (0.701, 0.719) | 0.722 (0.720, 0.731) | 0.738 (0.726, 0.744) | 0.774 (0.766, 0.782) |
| exponential1 | 0.724 (0.715, 0.73) | 0.709 (0.701, 0.717) | 0.698 (0.689, 0.707) | 0.655 (0.643, 0.661) | 0.763 (0.755, 0.771) |
| quadratic1 | 0.547 (0.538, 0.557) | 0.652 (0.643, 0.660) | 0.676 (0.670, 0.685) | 0.622 (0.611, 0.629) | 0.678 (0.665, 0.684) |
| logistic1 | 0.886 (0.881, 0.892) | 0.851 (0.846, 0.858) | 0.827 (0.824, 0.834) | 0.795 (0.787, 0.803) | 0.867 (0.860, 0.872) |
| exponential2 | 0.804 (0.798, 0.812) | 0.774 (0.766, 0.782) | 0.747 (0.749, 0.753) | 0.711 (0.703, 0.719) | 0.777 (0.769, 0.785) |
| quadratic2 | 0.738 (0.742, 0.747) | 0.778 (0.770, 0.786) | 0.761 (0.753, 0.769) | 0.727 (0.715, 0.733) | 0.752 (0.744, 0.760) |
| sigEmax | 0.889 (0.887, 0.895) | 0.878 (0.871, 0.885) | 0.858 (0.857, 0.865) | 0.832 (0.825, 0.839) | 0.830 (0.823, 0.837) |
| power | 0.809 (0.801, 0.817) | 0.786 (0.778, 0.794) | 0.760 (0.757, 0.762) | 0.740 (0.731, 0.749) | 0.779 (0.774, 0.787) |
| logistic2 | 0.891 (0.885, 0.891) | 0.863 (0.856, 0.870) | 0.843 (0.840, 0.850) | 0.812 (0.805, 0.820) | 0.826 (0.818, 0.834) |
| betaMod | 0.271 (0.261, 0.276) | 0.532 (0.523, 0.541) | 0.562 (0.556, 0.570) | 0.555 (0.546, 0.564) | 0.424 (0.415, 0.433) |
We also evaluate and compare the performance in estimating the dose-response curve and target doses of interest, known as the minimum effective dose (MED). See Supplemental Material, Figures S6–S11 for the dose-response curves estimated using LiMAP-curvature, smoothing spline and MCP-Mod for various true dose-response models and sample sizes. The corresponding results for MED estimation are summarised in Supplemental Material, Table S4.
4 Discussion
In this work, we have introduced LiMAP-curvature, a novel Bayesian approach for establishing PoC and estimating the dose-response curve alongside target doses of interest in Phase II trials. LiMAP-curvature is “model-free”, in the sense that it does not require pre-specification of candidate dose-response models, which can influence the performance of MCP-Mod. It is built on a Bayesian hierarchical framework incorporating prior information on the ${L^{2}}$-total curvature of the dose-response curve.
We have shown through simulations that LiMAP-curvature has performance comparable to that of MCP-Mod in establishing PoC and estimating MED when the true dose-response model is included in the candidate model set of MCP-Mod, which is fairly rare in practice. When the true dose-response model deviates from the candidate model set of MCP-Mod, LiMAP-curvature has been demonstrated to outperform MCP-Mod. Furthermore, we note that LiMAP-curvature also outperforms smoothing spline in terms of power to detect dose-response signals in most cases. This indicates the advantages of LiMAP-curvature over the widely used non-parametric method of smoothing spline. The ability of LiMAP-curvature to consistently outperform both MCP-Mod and smoothing spline across various scenarios highlights its effectiveness and potential as a valuable tool in Phase II trial analysis.
To obtain optimal performance, MCP-Mod requires specialised expertise to pre-specify plausible candidate models and model parameters, but the knowledge of the agent/compounds being studied is usually limited. Compared to MCP-Mod, the only requirement for pre-specification in LiMAP-curvature is the standard deviation τ for the hyperprior $\gamma \sim HN({\tau ^{2}})$, which encodes prior knowledge of how far the dose-response curve is from a straight line. With additional simulations with varying values of the standard deviation τ (see Supplemental Material, Figure S12 and Table S5), we recommend choosing
A number of relevant issues for LiMAP-curvature deserve further research. One such extension involves integrating the sigmoid Emax model as the default response function within the LiMAP-curvature framework. The sigmoid Emax model is noted for its ability to approximate most common monotonic dose-response relationships, providing a robust framework for modelling complex dose-response relationships. However, the challenge lies in selecting appropriate prior distributions for the additional parameters introduced by the sigmoid Emax model, which requires careful consideration to ensure accurate and reliable dose-response estimation. Another important extension involves adapting LiMAP-curvature to handle a wider variety of endpoints and trial designs. The current version of LiMAP-curvature is limited to the analysis Phase II dose-finding trials with continuous endpoints, i.e. a single normally distributed homoscedastic response measured at the end of the trial. To expand its applicability, extensions to other common types of endpoints (e.g. binary, counts and survival endpoints) and trials (e.g. longitudinal dose-finding trials) require further investigation. Other directions of future research include the investigation of trial designs tailored to LiMAP-curvature and the development of statistical software implementing LiMAP-curvature.