1 Introduction
Experimental design plays a fundamental role in scientific research. An optimal design thoughtfully organizes limited experimental resources to ensure reproducible and trustworthy outcomes. This paper focuses on searching for optimal designs for crossover and interference models using a general numerical algorithm proposed by [1].
In a typical experiment with a crossover design, there are n subjects, t treatments, p periods, and $m(\le n)$ unique sequences ${s_{i}}$ of length p. Here n, t, p, and m are positive integers, and we defer the mathematically rigorous definition of the experimental design to a later section. In the experiment, each subject is assigned to a sequence and receives treatments in the specific order defined by that sequence. Since subjects receive multiple treatments and have data collected throughout all periods, they serve as their own controls and typically yield more data than other designs with the same number of subjects. Hence, crossover designs are usually more statistically efficient.
However, the limitations are noteworthy as well. First, it is usually unknown whether the effects of previous treatments persist into subsequent periods. These effects are often referred to as “carryover effects”, and it is sensible to use a model that accounts for them. There are variants of crossover models, which differ in the way carryover effects are modeled; Section 2 provides details of two of them, and we refer to [2] for an extended list of models. Second, the comparative advantage of crossover designs is contingent on there being no or minimal dropout. In reality, however, as [3] pointed out, “Experience suggests that a dropout rate of between 5% and 10% is not uncommon and, in some areas, can be as high as 25%”. While remedies for missing values due to dropout have been established, the dropout issue is so common that it has to be prudently considered at the design stage. Related work in this direction includes [3, 4, 5, 6, 7, 8], with a focus on preserving the symmetric structure of the design in the presence of subject dropout. Recently, in [9], Kushner-type linear equations (see [10]) were developed as necessary and sufficient conditions for a design to be universally ${\phi _{1}}$-optimal, where ${\phi _{1}}$ is a new surrogate objective function accounting for dropout. This was later extended to unified results that apply to any configuration of the experiment. Optimal or efficient designs are obtained either by solving the linear equations or by solving integer optimization problems derived from these equation systems.
The interference model is commonly employed in agricultural studies to mitigate the systematic bias caused by neighbor effects within blocks. A design for the interference model consists of n subjects, t treatments, and $m(\le n)$ distinct blocks ${s_{i}}$ of size k. Optimal designs can be found in [11, 12, 13], among other sources.
The literature contains abundant theoretical work on optimal exact designs for crossover and interference models. However, despite the elegance of the theorems, they are generally not convenient to use. The optimal or efficient designs are usually derived for a particular combination of $(n,t,p)$ or $(n,t,k)$, and in most cases experimenters need to search the literature for the relevant theory and construct the design themselves. In addition, the relevant theory may not even exist for certain experimental configurations, which presents a major obstacle to utilizing optimal or efficient designs. Therefore, a general and efficient algorithm that is easily implemented in a computer program is necessary.
Numerical algorithms for optimal design have been developed for decades, aiming to solve the optimal design problem from a general perspective. Existing algorithms are typically modifications of either the Fedorov–Wynn algorithm (FWA, [14], [15]) or the multiplicative algorithm (MA, [16]); for more information, refer to [17, 18, 19, 20, 21, 22]. [23] combined multiple existing algorithms with modifications into the “cocktail algorithm” and achieved a dramatic speed improvement. However, all the aforementioned algorithms focus only on D-optimal designs, whereas different experimental objectives require a properly chosen optimality criterion. [1] proposed the optimal weights exchange algorithm (OWEA) for nonlinear models, which updates the support points in the same way as the FWA and optimizes the weights via Newton’s method. This is the algorithm we will use.
Nevertheless, the development of an efficient algorithm for crossover designs is faced with two main obstacles. Firstly, it is worth noting that unlike general statistical models where the number of parameters is fixed, the number of parameters in a crossover design varies depending on the number of subjects. This variability extends to the dimension of the information matrix, presenting a significant challenge for algorithm-based optimization. Secondly, constructing optimal crossover designs in the presence of possible subject dropout mechanisms is challenging, considering the variety of crossover design formats. In this paper, we aim to address these challenges.
Specifically, through the Schur complement operation, we derive a novel information matrix that is linear in the design points, with the corresponding weights serving as linear coefficients, paving the way for a successful implementation of the OWEA algorithm. Based on [1], we develop a general and efficient algorithmic framework for optimal designs of crossover models and later extend it to interference models. Our algorithm can handle any configuration of designs for the two models, including possible subject dropout mechanisms. The designs generated by our approach are highly efficient compared to existing designs in the literature. Additionally, to promote wider usage of our algorithm, we have developed a corresponding R package and R Shiny app, which provide a more user-friendly interface.
The remaining sections of this paper are organized as follows. Section 2 briefly introduces the necessary notation, the statistical models, and the derivations of the information matrices that fit into the framework of the algorithm. Section 3 gives details of the algorithm and the developed R package. Numerical results are presented in Section 4, and a short discussion is given in Section 5.
2 The Framework and Notations
In this section, we introduce the general framework, starting with basic concepts of optimal designs, followed by descriptions of the crossover and interference models as well as their information matrices. We show how the information matrices, after applying the Schur complement, fit into the algorithmic framework of [1].
2.1 Optimal Designs and Objective Functions
Despite the various types of designs, an experimental design with n runs can generally be written as
(2.1)
\[ d=\{({s_{i}},{n_{i}})|{s_{i}}\in \chi ,{\sum \limits_{i=1}^{m}}{n_{i}}=n\},\]
where $({s_{i}},{n_{i}})$ is the pair of a design point and its associated repetition. Each ${s_{i}}$ is a vector of length $p(\ge 1)$, the design space χ is the collection of all design points, and m is the number of distinct design points. If all of the repetitions $({n_{1}},\dots ,{n_{m}})$ are required to be positive integers, then design (2.1) is called an exact design and $m\le n$.
To derive an optimal design is to find a set of pairs $({s_{i}},{n_{i}})$ that optimizes an objective function. Due to the discrete nature of exact designs, integer solutions for ${n_{i}}$ are generally intractable. In that case, one may drop the integer requirement and search for an approximate design:
(2.2)
\[ \xi =\{({s_{i}},{w_{i}})|{s_{i}}\in \chi ,{\sum \limits_{i=1}^{m}}{w_{i}}=1,{w_{i}}\in (0,1]\},\]
where ${w_{i}}$ is the weight associated with design point ${s_{i}}$ and all weights sum to unity.
In after-experiment data analysis, a differentiable function of θ, say $g(\theta )$, is usually of interest. Note that g must be an estimable function; we maintain this assumption throughout the rest of the paper. The choice of g grants flexibility in the parameters of interest. For example, when we target all parameters, g is the identity function, i.e., $g(\theta )=\theta $; when comparison is the goal, one of many options is $g(\theta )=({\theta _{1}}-{\theta _{\nu }},{\theta _{2}}-{\theta _{\nu }},\dots ,{\theta _{\nu -1}}-{\theta _{\nu }})$, where ν is the length of θ. Denote by $\hat{\theta }$ the maximum likelihood estimator (MLE) of θ; it is well known that $g(\hat{\theta })$ is then the MLE of $g(\theta )$. Under mild assumptions, the Delta method yields the asymptotic covariance matrix of $g(\hat{\theta })$. With ${\mathbf{I}_{\xi }}(\theta )$ denoting the information matrix of θ under design ξ, we write ${C_{\xi }}(g)$ for the asymptotic variance-covariance matrix of $g(\hat{\theta })$ obtained via the Delta method under mild conditions.
With the notation of (2.1) and (2.2), we follow [24] in defining the objective function
(2.4)
\[ {\Phi _{r}}({C_{\xi }}(g))={\left[\frac{1}{v}Tr{({C_{\xi }}(g))^{r}}\right]^{1/r}},0\le r\lt \infty ,\]
where r is a non-negative real number, v is the dimension of g, g is a differentiable function of the parameter vector θ, and $Tr({C_{\xi }}(g))$ is the trace of the matrix ${C_{\xi }}(g)$.
Note that ${\Phi _{r}}({C_{\xi }}(g))$ is equivalent to many prevailing optimality criteria for particular values of r. For example, when $r=0$, ${\Phi _{r}}({C_{\xi }}(g))$ is interpreted as ${\lim \nolimits_{r\downarrow 0}}{\left[\frac{1}{v}Tr{({C_{\xi }}(g))^{r}}\right]^{1/r}}=|{C_{\xi }}(g){|^{1/v}}$, which is D-optimality; when $r=1$, we have ${\Phi _{r}}({C_{\xi }}(g))=Tr({C_{\xi }}(g))/v$, which is A-optimality.
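For concreteness, the following R sketch (our own illustration; the matrix C is a toy placeholder for ${C_{\xi }}(g)$, not a quantity from the paper) evaluates ${\Phi _{r}}$ for the D- ($r\to 0$) and A- ($r=1$) special cases.

```r
# Illustrative sketch: evaluate Phi_r for the D- (r -> 0) and A- (r = 1) cases.
# C is a placeholder variance-covariance matrix C_xi(g); v is its dimension.
Phi_r <- function(C, r) {
  v <- nrow(C)
  if (r == 0) {                       # limiting case: D-optimality
    det(C)^(1 / v)
  } else {                            # e.g. r = 1 gives A-optimality, Tr(C)/v
    eig <- eigen(C, symmetric = TRUE, only.values = TRUE)$values
    (mean(eig^r))^(1 / r)             # [ (1/v) * Tr(C^r) ]^(1/r)
  }
}

C <- matrix(c(2, 0.5, 0.5, 1), 2, 2)  # toy example
Phi_r(C, 0)   # D-criterion value, |C|^(1/v)
Phi_r(C, 1)   # A-criterion value, Tr(C)/v
```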
For the purpose of verifying that a design is optimal, the following theorem, adapted from Theorem 1 of [1], provides a Kiefer-type general equivalence theorem (GET), which serves as a necessary and sufficient condition.
Theorem 1.
Let $g(\theta )$ be an estimable function of θ, and suppose an arbitrary design ξ has information matrix ${\mathbf{I}_{\xi }}$. Then ξ is ${\Phi _{r}}$-optimal for $g(\theta )$ if and only if the directional derivative of ${\Phi _{r}}$, denoted by ${d_{r}}(s,\xi )$, satisfies ${d_{r}}(s,\xi )\le 0$
for any $s\in \chi $, with equality holding if s belongs to the support of ξ.
In addition, ${d_{r}}(s,\xi )$ can be calculated as follows. When $r=0$,
(2.6)
\[ {d_{r}}(s,\xi )=Tr[{({C_{\xi }}(g))^{-1}}(\frac{\partial g}{\partial {\theta ^{\prime }}}){\mathbf{I}_{\xi }^{-}}({\mathbf{I}_{s}}-{\mathbf{I}_{\xi }}){\mathbf{I}_{\xi }^{-}}{(\frac{\partial g}{\partial {\theta ^{\prime }}})^{\prime }}].\]
When $r\gt 1$,
(2.7)
\[\begin{aligned}{}{d_{r}}(s,\xi )=& {(\frac{1}{v})^{1/r}}Tr{[({C_{\xi }}(g))]^{1-r}}\times \\ {} & Tr[{({C_{\xi }}(g))^{r-1}}(\frac{\partial g}{\partial {\theta ^{\prime }}}){\mathbf{I}_{\xi }^{-}}({\mathbf{I}_{s}}-{\mathbf{I}_{\xi }}){\mathbf{I}_{\xi }^{-}}{(\frac{\partial g}{\partial {\theta ^{\prime }}})^{\prime }}].\end{aligned}\]
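In practice, verifying optimality amounts to checking that ${d_{r}}(s,\xi )$ is non-positive over all candidate design points. A minimal R sketch of the $r=0$ case of (2.6) is given below (our own illustration, not part of the paper's software; G is the Jacobian $\partial g/\partial {\theta ^{\prime }}$, and MASS::ginv supplies a generalized inverse since ${\mathbf{I}_{\xi }}$ is typically singular).

```r
# Illustrative check of the GET condition via (2.6): d_0(s, xi) <= 0 at a
# Phi_0-optimal design, with equality on the support.
d0 <- function(I_xi, I_s, G = diag(nrow(I_xi))) {
  Ig <- MASS::ginv(I_xi)                      # generalized inverse of I_xi
  C  <- G %*% Ig %*% t(G)                     # C_xi(g)
  sum(diag(solve(C, G %*% Ig %*% (I_s - I_xi) %*% Ig %*% t(G))))
}
```

A design ξ can then be accepted as ${\Phi _{0}}$-optimal once the maximum of d0 over all candidate points is numerically non-positive.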
2.2 Models and Information Matrix
2.2.1 Crossover Model with Carryover Effect
Suppose a crossover design d is conducted with n subjects, t treatments, and p periods. The observation from subject i at period j is typically modeled as
(2.8)
\[\begin{aligned}{}{y_{ij}}=\mu +{\zeta _{i}}+{\pi _{j}}& +{\tau _{d(i,j)}}+{\gamma _{d(i,j-1)}}+{\epsilon _{ij}};\\ {} & i=1,\dots ,n;\hspace{1em}j=1,\dots ,p,\end{aligned}\]
where μ is the grand mean, ${\zeta _{i}}$ is the effect of the ith subject, ${\pi _{j}}$ is the effect of the jth period, $d(i,j)$ denotes the treatment assigned to the ith subject in the jth period under design d, ${\tau _{d(i,j)}}$ is the direct effect of treatment $d(i,j)$, ${\gamma _{d(i,j-1)}}$ is the carryover effect of treatment $d(i,j-1)$ with ${\gamma _{d(i,0)}}$ set to 0 by convention, and ${\epsilon _{ij}}$ is the error term.
If the observations from design d are collected in the vector ${Y_{d}}={({y_{11}},{y_{12}},\dots ,{y_{1p}},{y_{21}},\dots ,{y_{np}})^{\prime }}$, then model (2.8) can be written in matrix form as
(2.9)
\[ {Y_{d}}=\mu {\mathsf{1}_{np}}+U\zeta +Z\pi +T\tau +R\gamma +\epsilon ,\]
where ϵ is a vector of independent and identically distributed errors with mean vector $\mathbf{0}$ and variance-covariance matrix ${\sigma ^{2}}{I_{np}}$, which can be relaxed to the Kronecker-product form ${I_{n}}\otimes \Sigma $ whenever the variance-covariance matrix of $({\epsilon _{i1}},\dots ,{\epsilon _{ip}})$ is Σ for $i=1,\dots ,n$. In addition, $\pi ={({\pi _{1}},\dots ,{\pi _{p}})^{\prime }}$, $\zeta ={({\zeta _{1}},\dots ,{\zeta _{n}})^{\prime }}$, $\tau ={({\tau _{1}},\dots ,{\tau _{t}})^{\prime }}$, $\gamma ={({\gamma _{1}},\dots ,{\gamma _{t}})^{\prime }}$, $U={I_{n}}\otimes {\mathsf{1}_{p}}={({U^{\prime }_{1}},\dots ,{U^{\prime }_{n}})^{\prime }}$, $Z={\mathsf{1}_{n}}\otimes {I_{p}}={({Z^{\prime }_{1}},\dots ,{Z^{\prime }_{n}})^{\prime }}$, $T={({T^{\prime }_{1}},\dots ,{T^{\prime }_{n}})^{\prime }}$, and $R={({R^{\prime }_{1}},\dots ,{R^{\prime }_{n}})^{\prime }}$. Here the ${U_{i}}$’s are $p\times n$ incidence matrices for subjects, ${Z_{i}}$ is the $p\times p$ subject-to-period incidence matrix, and ${T_{i}}$ and ${R_{i}}$ are $p\times t$ period-to-treatment incidence matrices for the ith subject; the structure of ${Z_{i}}$, ${T_{i}}$, and ${R_{i}}$ depends on the design d. ${I_{n}}$ is the identity matrix of dimension n and ${\mathsf{1}_{p}}$ stands for a $p\times 1$ vector of 1’s. The information matrix for the full parameter vector $\Theta ={({\zeta ^{\prime }},{\pi ^{\prime }},{\tau ^{\prime }},{\gamma ^{\prime }})^{\prime }}$ is then
(2.10)
\[\begin{aligned}{}\mathbf{I}(\Theta )& ={\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}U& Z& T& R\end{array}\right)^{\prime }}\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}U& Z& T& R\end{array}\right)\end{aligned}\]
(2.11)
\[\begin{aligned}{}& ={\sum \limits_{i=1}^{n}}{\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}{U_{i}}& {Z_{i}}& {T_{i}}& {R_{i}}\end{array}\right)^{\prime }}\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}{U_{i}}& {Z_{i}}& {T_{i}}& {R_{i}}\end{array}\right)\\ {} & ={\sum \limits_{i=1}^{n}}{\mathbf{I}_{i}}={\sum \limits_{i=1}^{m}}{n_{i}}{\mathbf{I}_{i}},\end{aligned}\]
where the positive integer $m(\le n)$ represents the number of unique design points (treatment sequences) of the design, and $m=n$ holds when exactly one subject is assigned to each unique design point. ${\mathbf{I}_{i}}$ is the information matrix for the ith unique design point, and ${n_{i}}$ is the number of subjects assigned to that design point, or equivalently the “repetition” of the design point; additionally, ${\textstyle\sum _{i}}{n_{i}}=n$. For the remainder of this paper, unless otherwise specified, the term “design point” refers to a unique design point of an experimental design.
An optimal exact design chooses a collection of sequences that optimizes an objective function related to (2.10). In this paper, we obtain an optimal approximate design by a numerical algorithm and carefully round it to an exact design that is either efficient or optimal. However, two issues regarding the information matrix have to be settled prior to the implementation of the algorithm. First, the direct treatment effects are the target of estimation, and yet their information matrix is not additive with respect to design points; as a result, it does not satisfy the prerequisite of the algorithm. On the other hand, although the information matrix for the full parameter vector is additive, its dimension keeps changing because of the inclusion of the subject effects ζ together with the iterative nature of the numerical algorithm, and optimizing an information matrix of varying dimension is problematic. Therefore, to fit the crossover design problem into our framework, we exclude the subject effects from (2.10) through the Schur complement operation. Excluding them and the general mean (noticing that the unity vector ${\mathsf{1}_{np}}$ belongs to the column space of U) and utilizing the Schur complement, we arrive at the information matrix of the marginal parameter vector $\theta ={({\pi ^{\prime }},{\tau ^{\prime }},{\gamma ^{\prime }})^{\prime }}$ as
(2.12)
\[\begin{aligned}{}\mathbf{I}(\theta )& ={\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}Z& T& R\end{array}\right)^{\prime }}p{r^{\perp }}(U)\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}Z& T& R\end{array}\right)\\ {} & ={\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}Z& T& R\end{array}\right)^{\prime }}[{I_{np}}-\frac{1}{p}{I_{n}}\otimes {1_{p}}{1^{\prime }_{p}}]\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}Z& T& R\end{array}\right)\\ {} & ={\sum \limits_{i=1}^{n}}{\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}{Z_{i}}& {T_{i}}& {R_{i}}\end{array}\right)^{\prime }}[{I_{p}}-\frac{1}{p}{1_{p}}{1^{\prime }_{p}}]\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}{Z_{i}}& {T_{i}}& {R_{i}}\end{array}\right)\\ {} & ={\sum \limits_{i=1}^{n}}{\mathbf{I}_{i}}={\sum \limits_{i=1}^{m}}{n_{i}}{\mathbf{I}_{i}}=n{\sum \limits_{i=1}^{m}}{w_{i}}{\mathbf{I}_{i}},\end{aligned}\]
where ${n_{i}}=n{w_{i}}$ for $i=1,2,\dots ,m$, and ${w_{i}}$ is called the “weight” associated with a design point. Here $p{r^{\perp }}(\cdot )$ is the orthogonal projection operator: for any matrix X, $p{r^{\perp }}(X)=I-X{({X^{\prime }}X)^{-1}}{X^{\prime }}$. Notice that ${\mathbf{I}_{i}}$ is the information matrix associated with the ith design point. With this derivation, the information matrix is linear in the design points, with the corresponding weights as linear coefficients.
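To illustrate how (2.12) is assembled in computation, the sketch below (our own minimal R construction, not the OWEA package source; the function name is ours and the within-subject covariance is taken as the identity) builds ${Z_{i}}$, ${T_{i}}$, ${R_{i}}$ for a single treatment sequence, forms ${\mathbf{I}_{i}}$, and aggregates the weighted sum.

```r
# Minimal sketch (not the OWEA package itself): information matrix I_i of one
# treatment sequence under (2.12), with identity within-subject covariance.
seq_info <- function(s, t) {              # s: treatment sequence, t: #treatments
  p  <- length(s)
  Zi <- diag(p)                           # subject-to-period incidence (p x p)
  Ti <- matrix(0, p, t); Ti[cbind(1:p, s)] <- 1           # direct-effect incidence
  Ri <- matrix(0, p, t)
  if (p > 1) Ri[cbind(2:p, s[-p])] <- 1                   # carryover incidence
  X  <- cbind(Zi, Ti, Ri)
  Q  <- diag(p) - matrix(1 / p, p, p)     # pr_perp(1_p): removes the subject effect
  t(X) %*% Q %*% X
}

# Example: aggregate over a design with three sequences and equal weights
seqs <- list(c(1, 2, 3), c(2, 3, 1), c(3, 1, 2))
w    <- rep(1 / 3, 3)
n    <- 12
I_theta <- n * Reduce(`+`, Map(function(s, wi) wi * seq_info(s, t = 3), seqs, w))
```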
2.2.2 Crossover Model with Dropout
To account for dropouts, following [3, 9], define the dropout mechanism of a subject as $\ell =({\ell _{1}},\dots ,{\ell _{p}})$, where ${\ell _{i}}$ is the probability that the longest time of stay is i periods and ${\textstyle\sum _{i=1}^{p}}{\ell _{i}}=1$.
Given a realization of ℓ, the information matrix (2.12) becomes
(2.13)
\[\begin{aligned}{}\mathbf{I}(\theta ,\ell )=& {\sum \limits_{i=1}^{n}}{\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}{Z_{i}}& {T_{i}}& {R_{i}}\end{array}\right)^{\prime }}[{M_{i}}-\\ {} & {M_{i}}{1_{p}}{({1^{\prime }_{p}}{M_{i}}{1_{p}})^{-1}}{1^{\prime }_{p}}{M_{i}}]({Z_{i}},{T_{i}},{R_{i}})\\ {} =& {\sum \limits_{i=1}^{n}}{\mathbf{I}_{i}}(\ell )={\sum \limits_{i=1}^{m}}{n_{i}}{\mathbf{I}_{i}}(\ell )=n{\sum \limits_{i=1}^{m}}{w_{i}}{\mathbf{I}_{i}}(\ell ),\end{aligned}\]
where ${M_{i}}$ is an indicator matrix depending on the dropout mechanism ℓ,
(2.14)
\[\begin{aligned}{}{M_{i}}(\ell )& ={\left(\begin{array}{c@{\hskip10.0pt}c}{I_{({a_{i}}\times {a_{i}})}}& {O_{1}}\\ {} {O^{\prime }_{1}}& O\end{array}\right)_{p\times p}},\end{aligned}\]
${I_{({a_{i}}\times {a_{i}})}}$ is the ${a_{i}}\times {a_{i}}$ identity matrix, ${a_{i}}$ is the longest period of stay for subject i, and O and ${O_{1}}$ are zero matrices of appropriate dimensions.
The information matrix is random due to the modeling of the dropout mechanism. The intuitively reasonable objective function, $E{\Phi _{p}}({C_{\xi }}(g))$ (with ${\Phi _{p}}$ defined in (2.4)), is not easy to deal with, and ${\Phi _{p}}(E({C_{\xi }}(g)))$ is chosen as a feasible replacement. Here the operator E denotes expectation with respect to the dropout mechanism. Suppose we obtain an optimal approximate design ${\xi ^{\ast }}$ by minimizing ${\Phi _{p}}(E({C_{\xi }}(g)))$ and round it to an exact design ${\xi _{1}}$. The efficiency of ${\xi _{1}}$ should be defined as $E{\Phi _{p}}({C_{{\xi _{opt}}}}(g))/E{\Phi _{p}}({C_{{\xi _{1}}}}(g))$, where ${\xi _{opt}}=\underset{\xi }{argmin}\hspace{2.5pt}E{\Phi _{p}}({C_{\xi }}(g))$. This efficiency is not accessible, however, because ${\xi _{opt}}$ is unknown. Nevertheless, one is still able to obtain a lower bound.
Lemma 1 (Zheng (2013a)).
\[ \frac{E{\Phi _{p}}({C_{{\xi _{opt}}}}(g))}{E{\Phi _{p}}({C_{{\xi _{1}}}}(g))}\ge \frac{{\Phi _{p}}(E({C_{{\xi _{opt}}}}(g)))}{E{\Phi _{p}}({C_{{\xi _{1}}}}(g))}\ge \frac{{\Phi _{p}}(E({C_{{\xi ^{\ast }}}}(g)))}{E{\Phi _{p}}({C_{{\xi _{1}}}}(g))},\]
where the first ‘≥’ is due to the convexity of the function ${\Phi _{p}}$ and Jensen’s inequality, and the second ‘≥’ is because of (2.15). Note that the problem in (2.16) fits the framework of (2.12) and hence can be solved by our algorithm. Therefore, the rightmost term, $\frac{{\Phi _{p}}(E({C_{{\xi ^{\ast }}}}(g)))}{E{\Phi _{p}}({C_{{\xi _{1}}}}(g))}$, is evaluable for any given design ${\xi _{1}}$ and dropout mechanism and serves as the lower bound on efficiency. Furthermore, [9] proposed converting the solution of (2.16) to an exact design and gauging its performance by this lower bound of efficiency. We use the same approach, except that our algorithm for deriving the exact design is much faster.
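Because ${\Phi _{p}}(E({C_{\xi }}(g)))$ only requires the expected information matrix, the per-sequence expectation ${E_{\ell }}[{\mathbf{I}_{i}}(\ell )]$ can be obtained by averaging the contributions in (2.13) over the possible stay lengths. The sketch below is our own illustration (the function name and incidence construction are ours; identity within-subject covariance is assumed).

```r
# Sketch: expected per-sequence information E_l[ I_i(l) ] under dropout
# mechanism l = (l_1, ..., l_p), where l_a = P(longest stay is a periods).
seq_info_dropout <- function(s, t, l) {
  p  <- length(s)
  Zi <- diag(p)
  Ti <- matrix(0, p, t); Ti[cbind(1:p, s)] <- 1
  Ri <- matrix(0, p, t); if (p > 1) Ri[cbind(2:p, s[-p])] <- 1
  X  <- cbind(Zi, Ti, Ri)
  EI <- 0
  for (a in seq_len(p)) {                  # stay length a occurs with prob. l[a]
    M <- diag(c(rep(1, a), rep(0, p - a)))               # M_i in (2.14)
    Q <- M - M %*% rep(1, p) %*% t(rep(1, p)) %*% M / sum(M)
    EI <- EI + l[a] * t(X) %*% Q %*% X
  }
  EI
}

# Example with l = (0, 0, 1/2, 1/2) as used in Section 4.1
seq_info_dropout(c(1, 2, 3, 4), t = 4, l = c(0, 0, 1/2, 1/2))
```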
2.2.3 Crossover Model with Proportional Carryover Effects
Another variant of the classical crossover model is the proportional model. For a design d with p periods and t treatments, individual outcomes are modeled as
(2.17)
\[\begin{aligned}{}{y_{ij}}=\mu +{\zeta _{i}}+{\pi _{j}}& +{\tau _{d(i,j)}}+\lambda {\tau _{d(i,j-1)}}+{\epsilon _{ij}};\\ {} & i=1,\dots ,n;\hspace{1em}j=1,\dots ,p,\end{aligned}\]
where the notations are exactly the same as in (2.8), except for the additional real-valued λ, which is the proportion of the direct treatment effect that is carried over. The proportional model can also be written in matrix form as
(2.18)
\[ {Y_{d}}=\mu {\mathsf{1}_{np}}+U\zeta +Z\pi +T\tau +\lambda R\tau +\epsilon ,\]
where ϵ is a vector of independent errors with mean $\mathbf{0}$ and variance-covariance matrix ${I_{n}}\otimes \Sigma $, and Σ is a $p\times p$ positive definite matrix. Utilizing the Schur complement operation, the information matrix for the parameter vector $\theta ={({\pi ^{\prime }},{\tau ^{\prime }},\lambda )^{\prime }}$ is
(2.19)
\[\begin{aligned}{}\mathbf{I}(\theta ,\lambda ,\tau )=& {\sum \limits_{i=1}^{n}}{({Z_{i}},{T_{i}}+\lambda {R_{i}},{R_{i}}\tau )^{\prime }}[{\Sigma ^{-1}}-\\ {} & {\Sigma ^{-1}}{1_{p}}{[{1^{\prime }_{p}}{\Sigma ^{-1}}{1_{p}}]^{-}}{1^{\prime }_{p}}{\Sigma ^{-1}}]({Z_{i}},{T_{i}}+\lambda {R_{i}},{R_{i}}\tau )\\ {} =& {\sum \limits_{i=1}^{n}}{\mathbf{I}_{i}}={\sum \limits_{i=1}^{m}}{n_{i}}{\mathbf{I}_{i}}=n{\sum \limits_{i=1}^{m}}{w_{i}}{\mathbf{I}_{i}}.\end{aligned}\]
Therefore, the design problem for the proportional model fits into the framework of (2.12). Note that the proportional model is nonlinear due to the term $\lambda \tau $, and its information matrix depends on the unknown parameters τ and λ.
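Because the proportional model is nonlinear, ${\mathbf{I}_{i}}$ in (2.19) depends on assumed local values of τ and λ. The following sketch (our own illustration with $\Sigma ={I_{p}}$ and placeholder parameter values, not the OWEA package code) shows how these values enter the per-sequence information matrix.

```r
# Sketch: per-sequence information (2.19) for the proportional model,
# evaluated at assumed local values of tau and lambda (identity Sigma).
seq_info_prop <- function(s, tau, lambda) {
  p  <- length(s); t <- length(tau)
  Zi <- diag(p)
  Ti <- matrix(0, p, t); Ti[cbind(1:p, s)] <- 1
  Ri <- matrix(0, p, t); if (p > 1) Ri[cbind(2:p, s[-p])] <- 1
  X  <- cbind(Zi, Ti + lambda * Ri, Ri %*% tau)   # columns for (pi, tau, lambda)
  Q  <- diag(p) - matrix(1 / p, p, p)             # Sigma = I_p case of (2.19)
  t(X) %*% Q %*% X
}

# Example 4-style local values: all direct effects equal to 2, lambda = 0.2
seq_info_prop(c(1, 2, 3), tau = rep(2, 3), lambda = 0.2)
```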
2.3 Interference Model
The individual observations of the interference model with t treatments and n blocks of size k are written as
(2.20)
\[\begin{aligned}{}{y_{ij}}=\mu +{\gamma _{i}}+{\tau _{d(i,j)}}& +{\lambda _{d(i,j-1)}}+{\rho _{d(i,j+1)}}+{\epsilon _{ij}}\\ {} & i=1,\dots ,n;\hspace{1em}j=1,\dots ,k,\end{aligned}\]
where ${y_{ij}}$ denotes the response from the jth plot of the ith block, μ is the general mean, $d(i,j)$ stands for the treatment assigned to the jth plot of the ith block according to design d, and ${\tau _{d(i,j)}}$, ${\lambda _{d(i,j-1)}}$, and ${\rho _{d(i,j+1)}}$ are the effects of the treatment itself and of the treatments in its left and right neighboring plots. By convention, we set ${\lambda _{d(i,0)}}={\rho _{d(i,k+1)}}=0$. Again, the responses can be gathered in a vector and modeled in matrix form as
(2.21)
\[ {Y_{d}}=\mu {\mathsf{1}_{nk}}+U\gamma +{T_{d}}\tau +{L_{d}}\lambda +{R_{d}}\rho +\epsilon ,\]
where ${Y_{d}}={({y_{11}},\dots ,{y_{1k}},{y_{21}},\dots ,{y_{nk}})^{\prime }}$, $\gamma ={({\gamma _{1}},\dots ,{\gamma _{n}})^{\prime }}$, $\tau ={({\tau _{1}},\dots ,{\tau _{t}})^{\prime }}$, $\lambda ={({\lambda _{1}},\dots ,{\lambda _{t}})^{\prime }}$, $\rho ={({\rho _{1}},\dots ,{\rho _{t}})^{\prime }}$, $U={I_{n}}\otimes {\mathsf{1}_{k}}$, ${T_{d}}={({T^{\prime }_{1}},\dots ,{T^{\prime }_{n}})^{\prime }}$, ${L_{d}}={({L^{\prime }_{1}},\dots ,{L^{\prime }_{n}})^{\prime }}$, and ${R_{d}}={({R^{\prime }_{1}},\dots ,{R^{\prime }_{n}})^{\prime }}$. Here the ${U_{i}}$’s are incidence matrices for blocks, and ${T_{i}}$, ${L_{i}}$, and ${R_{i}}$ are $k\times t$ plot-to-treatment incidence matrices for the ith block that depend on the design d. The error vector ϵ is assumed to follow $N(0,{I_{n}}\otimes \Sigma )$, where Σ is an arbitrary positive definite matrix of order k. Let $\theta ={({\tau ^{\prime }},{\lambda ^{\prime }},{\rho ^{\prime }})^{\prime }}$ be the vector of parameters of interest. Note that block effects are excluded for reasons similar to those for the subject effects in the crossover model. The information matrix for θ, after applying the Schur complement, is
(2.22)
\[\begin{aligned}{}\mathbf{I}(\theta )=& {\sum \limits_{i=1}^{n}}{({T_{i}},{L_{i}},{R_{i}})^{\prime }}[{\Sigma ^{-1}}-\\ {} & {\Sigma ^{-1}}{1_{k}}{({1^{\prime }_{k}}{\Sigma ^{-1}}{1_{k}})^{-1}}{1^{\prime }_{k}}{\Sigma ^{-1}}]({T_{i}},{L_{i}},{R_{i}})\\ {} =& {\sum \limits_{i=1}^{n}}{\mathbf{I}_{i}}={\sum \limits_{i=1}^{m}}{n_{i}}{\mathbf{I}_{i}}=n{\sum \limits_{i=1}^{m}}{w_{i}}{\mathbf{I}_{i}}.\end{aligned}\]
At this point, we have fitted the design problems for the crossover model with dropout, the proportional model, and the interference model into the framework of (2.12), to which our algorithm applies. In the next section, we introduce the algorithm and the accompanying R package.
3 The Algorithm and R-package
Numerical algorithms are powerful and convenient tools for finding optimal designs. [1] proposed the optimal weights exchange algorithm (OWEA) for nonlinear models; it updates the support points in the same way as the Fedorov–Wynn algorithm and optimizes the weights via Newton’s method. This is the algorithm we use, and its implementation procedure is briefly introduced as follows.
Phase I: Finding optimal approximate designs
Start with an initial support ${S^{(1)}}$ consisting of $\nu +1$ randomly picked sequences and equal weights ${w^{(1)}}$. At iteration t:
-
1. Input the support ${S^{(t)}}$ and weights ${w^{(t)}}$, and update the weights using Newton’s method; support points whose weights reach zero are deleted during the optimization (a minimal sketch of this weight-update loop is given after the list). Newton’s method: start with ${S_{1}^{(t)}}={S^{(t)}}$ and ${w_{1}^{(t)}}={w^{(t)}}$. After $j-1$ iterations, given ${S_{j}^{(t)}}$ and ${w_{j}^{(t)}}$, update the weights by the following steps:
-
(a) Update the weights via ${w_{j+1}^{(t)}}={w_{j}^{(t)}}-\alpha {(\frac{{\partial ^{2}}\Phi }{\partial w\partial {w^{\prime }}})^{-1}}\frac{\partial \Phi }{\partial w}{|_{{w_{j}^{(t)}}}}$, where α is a step length.
-
(b) If ${w_{j+1}^{(t)}}$ has a negative component, go to step $(c)$; otherwise proceed to step $(d)$.
-
(c) Set α to $\alpha /2$ and go back to step $(a)$. If $\alpha \lt {\epsilon _{\alpha }}$, remove the point with the smallest weight and go back to step $(a)$.
-
(d) Check whether $\frac{\partial \Phi }{\partial w}{|_{{w_{j+1}^{(t)}}}}\lt {\epsilon _{0}}$; if true, ${\tilde{w}^{(t)}}={w_{j+1}^{(t)}}$ gives the optimal weights; otherwise go to the next iteration.
-
-
2. Derive ${s_{t}^{\ast }}=\underset{s\in \chi }{argmax}\hspace{5.0pt}{d_{r}}(s,{\xi ^{(t)}})$, where ${\xi ^{(t)}}=\{({s_{i}},{w_{i}})|{s_{i}}\in {\tilde{S}^{(t)}},{w_{i}}\in {\tilde{w}^{(t)}},\displaystyle\sum \limits_{i}{w_{i}}=1\}$
-
3. Check ${d_{r}}({s_{t}^{\ast }},{\xi ^{(t)}})\lt {\epsilon _{d}}$, where ${\epsilon _{d}}$ is a pre-selected small positive value. If true, ${\xi ^{(t)}}$ is the desired design. Otherwise, let ${S^{(t+1)}}={\tilde{S}^{(t)}}\textstyle\bigcup \{{s_{t}^{\ast }}\}$, ${w^{(t+1)}}={\tilde{w}^{(t)}}\textstyle\bigcup \{0\}$, and go to step 1.
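As a concrete, non-authoritative illustration of steps (a)–(d), the following R sketch implements the weight-update loop with step halving. Here grad_Phi and hess_Phi are placeholders for the gradient and Hessian of the chosen criterion with respect to the weights, and handling of the sum-to-one constraint (e.g., by reparameterization) is assumed to be built into them; the renormalization after deleting a support point is our own simplification.

```r
# Minimal sketch of the Newton weight update in Phase I, step 1 (our own
# paraphrase, not the OWEA package source). grad_Phi(w, S) and hess_Phi(w, S)
# are user-supplied placeholders for the criterion's derivatives in the weights.
newton_weights <- function(w, S, grad_Phi, hess_Phi,
                           eps_alpha = 1e-6, eps0 = 1e-10, max_iter = 100) {
  for (j in seq_len(max_iter)) {
    alpha <- 1
    repeat {
      step  <- solve(hess_Phi(w, S), grad_Phi(w, S))   # Newton direction
      w_new <- w - alpha * step                        # step (a)
      if (all(w_new >= 0)) break                       # step (b)
      alpha <- alpha / 2                               # step (c): halve the step
      if (alpha < eps_alpha) {                         # drop the smallest-weight point
        drop  <- which.min(w)
        S     <- S[-drop]; w <- w[-drop] / sum(w[-drop])  # renormalize (our choice)
        alpha <- 1
      }
    }
    w <- w_new
    if (max(abs(grad_Phi(w, S))) < eps0) break          # step (d): converged
  }
  list(S = S, w = w)
}
```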
Table 1
t = p = 4, n = 16, a = (0,0,1/2,1/2).
Optimality | Rel-Eff (to Gurobi) | Rel-Eff (rounding) | Efficiency (Lower Bound) | Rel-Eff (to ${d_{2}}$ in [25]) | Time (sec) | Gurobi Time (sec) |
A | 0.9994 | 0.9917 | 0.9366 | 0.9844 | 2.18 | 20.84 |
D | 0.9979 | 0.9757 | 0.9519 | 0.9538 | 2.63 | 20.84 |
Phase II: Rounding
Suppose the optimal approximate design from Phase I is ${\xi ^{\ast }}=\{({s_{i}},{w_{i}^{\ast }})|{s_{i}}\in {S^{\ast }},{w_{i}^{\ast }}\in (0,1),{\displaystyle\sum \limits_{i}^{m}}{w_{i}^{\ast }}=1\}$. For a given number of total runs, say N, proceed as follows (a schematic implementation is sketched after the list):
-
1. Compute $N{w_{i}^{\ast }}$ for $i=1,\dots ,m$ and round each to its nearest integer. This results in a tentative exact design $d=\{({s_{i}},{n_{i}})|{s_{i}}\in {S^{\ast }},{\textstyle\sum _{i}^{m}}{n_{i}}=N\}$.
-
2. For every ${s_{j}}\in \Omega \setminus {S^{\ast }}$,
-
(a) For every unique ${s_{i}}\in {S^{\ast }}$, reduce ${n_{i}}$ by 1 and append ${s_{j}}$ to the design with repetition 1; this results in a new design, say ${d^{(-i,j)}}$.
-
(b) Calculate $V1=\Phi (d)$ and $V2=\Phi ({d^{(-i,j)}})$. If $V1\gt V2$, keep ${d^{(-i,j)}}$ as the current tentative design; otherwise, keep the tentative design unchanged.
-
(c) Repeat these two steps for $i=1,\dots ,m$;
-
-
3. After traversing all possible j’s, output the current design as the efficient design ${d^{\ast }}$.
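The exchange step in Phase II only requires evaluating the criterion of candidate exact designs. A schematic version is sketched below; Phi_exact is a placeholder that returns the criterion value of an exact design given its sequences and repetition numbers, and the guard against zero repetitions is our own addition.

```r
# Schematic Phase II rounding/exchange (illustrative only; Phi_exact is a
# placeholder returning the criterion value of an exact design given a list of
# sequences S and their repetition counts n).
round_and_exchange <- function(S_star, w_star, N, candidates, Phi_exact) {
  n_rep <- round(N * w_star)                      # step 1: naive rounding
  for (sj in candidates) {                        # step 2: every s_j outside S*
    for (i in seq_along(S_star)) {                # steps 2(a)-(c)
      if (n_rep[i] == 0) next                     # guard: cannot reduce below zero
      S_try <- c(S_star, list(sj))
      n_try <- c(n_rep, 1); n_try[i] <- n_try[i] - 1
      if (Phi_exact(S_star, n_rep) > Phi_exact(S_try, n_try)) {   # step 2(b)
        S_star <- S_try; n_rep <- n_try
      }
    }
  }
  list(sequences = S_star, repetitions = n_rep)   # step 3: efficient exact design
}
```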
Note that the tuning parameters ${\epsilon _{d}}$, ${\epsilon _{\alpha }}$, and ${\epsilon _{0}}$ may affect the time to convergence. In all examples tested in this paper, the combination of ${\epsilon _{d}}={10^{-15}}$, ${\epsilon _{\alpha }}={10^{-6}}$, and ${\epsilon _{0}}={10^{-10}}$ works well, and optimal or highly efficient designs are found within a relatively short period of time. It is also known that Newton’s method is sensitive to the choice of initial points and sometimes has overshooting issues; however, in our experience, efficient designs are always found.
We have developed an R package called OWEA that applies the algorithm to the three models described in the previous section, namely (i) the crossover model accounting for subject dropout, (ii) the proportional model, and (iii) the interference model. It is readily applicable to all three optimal design problems; to use it, one only needs to input basic information such as the numbers of periods, treatments, and subjects, plus the dropout probabilities for model (i). The detailed implementation is available at https://cran.r-project.org/package=OWEA.
4 Examples
In this section, utilizing the information matrices derived in Section 2 and the OWEA algorithm in Section 3, exact designs are presented for three scenarios: the crossover model with dropout, the crossover model with proportional carryover effects, and the interference model. For convenience, we call a design obtained from the OWEA algorithm an OWEA design and an optimal design available from the literature a literature design; for example, the A-optimal design obtained from OWEA is the “A-optimal OWEA design”. Moreover, for all three scenarios, [9, 13, 25] developed linear equation systems as necessary and sufficient conditions for universally optimal designs and suggested searching for optimal designs with an integer optimizer such as [26]. More specifically, the conditions take the form of linear equations in the replication numbers of the treatment sequences. Since there is no guarantee of integer solutions, we transform the problem into a quadratic program that minimizes the total squared violation of any equations that do not hold exactly. In this section, we call designs derived in this way Gurobi designs.
Relative efficiency, shown as “Rel-Eff (to xx)”, is provided when the intention is to compare the OWEA designs to existing designs. In addition, the relative efficiency due to rounding from the approximate design to the exact design, shown as “Rel-Eff (rounding)”, is also presented.
All the algorithms and examples are programmed in R (R Core Team, 2022) [27] and are executed on an Apple Desktop with 8GB RAM. Computing times are recorded and provided.
4.1 Crossover with Subject Dropout
Example 1.
$(t,p)=(4,4)$, $n=16$, $\ell =(0,0,1/2,1/2)$. Exact designs and their performance under A- and D-optimality are summarized in Table 1. There is only a tiny loss in efficiency when rounding from the approximate design to the exact design, though the lower bound efficiencies are only 0.9366 and 0.9519 for the A- and D-criteria. Compared to ${d_{2}}$ from [9], which is proved to be highly efficient, the relative efficiencies of the OWEA designs are 0.9844 and 0.9538, respectively; not very high, but still satisfactory. The design obtained with the integer programming approach of [9] is also available and is almost identical to the OWEA designs in efficiency, but the OWEA framework takes only a small fraction of the computing time.
(4.1)
\[ A-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 2& 2& 2& 2& 3& 3& 3& 3& 4& 4& 4& 4\\ {} 2& 3& 4& 4& 1& 1& 3& 4& 1& 2& 2& 4& 1& 2& 2& 3\\ {} 3& 4& 3& 3& 3& 4& 1& 1& 2& 4& 4& 2& 2& 1& 3& 1\\ {} 4& 2& 2& 2& 4& 3& 4& 3& 4& 1& 1& 1& 3& 3& 1& 2\end{array}\](4.2)
\[ D-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 2& 2& 2& 2& 3& 3& 3& 3& 4& 4& 4& 4\\ {} 2& 3& 4& 4& 1& 1& 3& 4& 1& 2& 2& 4& 1& 2& 2& 3\\ {} 3& 4& 3& 3& 3& 4& 1& 1& 2& 4& 4& 2& 2& 1& 3& 1\\ {} 4& 2& 2& 2& 4& 3& 4& 3& 4& 1& 1& 1& 3& 3& 1& 2\end{array}\](4.3)
\[ {d_{2}}=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 2& 2& 2& 2& 3& 3& 3& 3& 4& 4& 4& 4\\ {} 2& 2& 3& 4& 1& 3& 4& 4& 1& 1& 2& 4& 1& 2& 3& 3\\ {} 3& 3& 4& 2& 4& 1& 1& 3& 2& 4& 4& 1& 3& 1& 2& 2\\ {} 4& 4& 2& 2& 4& 1& 3& 3& 2& 2& 4& 1& 3& 3& 1& 1\end{array}\](4.4)
\[ \text{Gurobi Design}=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 2& 2& 2& 2& 3& 3& 3& 3& 4& 4& 4& 4\\ {} 2& 2& 3& 4& 1& 3& 4& 4& 1& 2& 4& 4& 1& 2& 3& 3\\ {} 3& 3& 2& 2& 4& 1& 1& 3& 2& 4& 1& 2& 3& 1& 1& 2\\ {} 4& 4& 4& 2& 4& 4& 3& 3& 2& 4& 1& 1& 3& 3& 1& 1\end{array}\]Example 2.
$(t,p)=(4,4)$, $n=19$, $\ell =(0,0,1/2,1/2)$. This is a case where n is not a multiple of either t or p, and optimal designs are not available in the literature. Exact OWEA designs and the Gurobi designs are summarized in Table 2. Again, the lower bound efficiency is around 0.95, and the efficiency loss due to rounding is less than 0.015. In addition, the OWEA designs outperform the Gurobi designs, with relative efficiencies of 1.0261 and 1.0785 under the A- and D-optimality criteria, while requiring no more than 3% of the computing time.
(4.5)
\[ A-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 3& 3& 3& 3& 4& 4& 4& 4& 4\\ {} 3& 3& 4& 4& 4& 1& 1& 3& 3& 4& 1& 2& 2& 4& 1& 1& 2& 2& 3\\ {} 2& 4& 3& 3& 3& 3& 4& 1& 4& 1& 2& 4& 4& 2& 2& 2& 1& 3& 1\\ {} 4& 2& 2& 2& 2& 4& 3& 4& 1& 3& 4& 1& 1& 1& 3& 3& 3& 1& 2\end{array}\](4.6)
\[ D-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 3& 3& 3& 3& 3& 4& 4& 4& 4\\ {} 2& 3& 3& 4& 4& 1& 1& 3& 3& 4& 1& 2& 2& 4& 4& 1& 1& 2& 3\\ {} 3& 2& 4& 3& 3& 3& 4& 1& 1& 1& 4& 4& 4& 2& 2& 2& 2& 1& 1\\ {} 4& 4& 2& 2& 2& 4& 3& 4& 4& 3& 2& 1& 1& 1& 1& 3& 3& 3& 2\end{array}\](4.7)
\[ \text{Gurobi Design}=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 3& 3& 3& 3& 4& 4& 4& 4& 4\\ {} 2& 3& 4& 4& 4& 1& 3& 3& 4& 4& 1& 2& 2& 4& 1& 1& 3& 2& 2\\ {} 4& 4& 2& 2& 3& 3& 1& 1& 1& 3& 2& 1& 4& 1& 2& 2& 2& 1& 1\\ {} 3& 4& 3& 3& 3& 4& 1& 4& 3& 3& 2& 4& 4& 1& 3& 3& 2& 3& 3\end{array}\]Table 2
t = p = 4, n = 19, a = (0,0,1/2,1/2).
Optimality | Rel-Eff (to Gurobi) | Rel-Eff (Rounding) | Efficiency (Lower Bound) | Time (sec) | Gurobi Time (sec) |
A | 1.0261 | 0.9944 | 0.9489 | 2.28 | 100.41 |
D | 1.0785 | 0.9855 | 0.9576 | 2.71 | 100.41 |
Example 3.
$(t,p)=(4,4)$, $n=250$. This is a real-world application based on [28]. The Best African American Response to Asthma Drugs (BARD) trial contained two four-treatment, four-period, four-sequence crossover trials, one involving 294 adolescents and adults and the other involving 280 children. For the purpose of illustration, we focus on the BARD trial for the children. The four treatment regimens are represented by the letters Q, R, S, and T. A total of 280 participants were evenly randomized to one of the four sequences QRST, RTQS, SQTR, and TSRQ; the study design was uniform and balanced. Of the 280 children, 30 dropped out before period 1, 16 dropped out after period 1, 17 after period 2, 19 after period 3, and the remaining 198 completed the study. Since subjects who drop out before period 1 provide no information at all, for the illustration of our method we study the optimal crossover design with $280-30=250$ subjects, 4 periods, 4 treatments, and the adjusted dropout mechanism $\ell =(16,17,19,198)/250=(0.064,0.068,0.076,0.792)$. The resulting OWEA designs and their performance are reported in Table 3.
Table 3
t = p = 4, n = 250, a = (0.064, 0.068, 0.076, 0.792).
Optimality | Rel-Eff (rounding) | Efficiency (Lower Bound) | Rel-Eff (to [28] with $n=250$) | Time (sec) |
A | 0.9999 | 0.9977 | 1.0023 | 2.24 |
D | 0.9999 | 0.9916 | 1.0063 | 2.15 |
Table 4
t = p = 3, n = 36.
Optimality | Rel-Eff (to Gurobi) | Rel-Eff (to ${d_{2}}$ in [25]) | Time (sec) | Gurobi Time (sec) |
A | 1.0390 | 1 | 0.01 | 0.06 |
D | 1.0794 | 1 | 0.01 | 0.06 |
Table 5
t = 4, p = 3, n = 12.
Optimality | Rel-Eff (to Gurobi) | Rel-Eff (to ${d_{1}}$ in [29]) | Time (sec) | Gurobi Time (sec) |
A | 1.009 | 0.9980 | 0.17 | 0.03 |
D | 1.027 | 0.9997 | 0.22 | 0.03 |
The OWEA exact designs under A- and D-optimality happen to be exactly the same. Their relative efficiencies due to rounding are almost 1, and the lower bound efficiencies are all above 0.99. The relative efficiencies compared to the designs in [28] are 1.0023 for the A-criterion and 1.0063 for the D-criterion, which means that the OWEA framework improves the design efficiency after incorporating the dropout mechanism. Finally, the computing time is around 2.20 seconds under both criteria.
4.2 Crossover Model with Proportional Carryover Effects
Example 4.
$(t,p)=(3,3)$, $n=36$. The proportional model is nonlinear, so all OWEA designs in this paper are locally optimal. For this particular example, we assume all direct treatment effects equal to 2 and $\lambda =0.2$, and the covariance structure is taken to be the identity matrix. According to Table 4, the OWEA designs have efficiency identical to the optimal designs in the literature under the A- and D-optimality criteria. In fact, the OWEA design is equivalent to the symmetric block $\langle 123\rangle $ defined in [25], which is shown to have unity efficiency under the A-, D-, and T-criteria and efficiency 0.9931 under E-optimality. The Gurobi design is also calculated and is comparatively less efficient. The difference in computing times is negligible, as both algorithms take less than 0.1 seconds.
(4.8)
\[ A-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 2& 3\\ {} 2& 3& 1\\ {} 3& 1& 2\end{array}\times 12\](4.9)
\[ D-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 2& 3\\ {} 2& 3& 1\\ {} 3& 1& 2\end{array}\times 12\](4.10)
\[ \textit{Gurobi Design}=\left\{\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 1& 1& 1& 1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 2\\ {} 2& 2& 2& 2& 2& 2& 3& 3& 3& 3& 3& 3& 1& 1& 1& 1& 1& 1\\ {} 2& 3& 3& 3& 3& 3& 2& 2& 2& 2& 2& 3& 1& 3& 3& 3& 3& 3\\ \\ {} 2& 2& 2& 2& 2& 2& 3& 3& 3& 3& 3& 3& 3& 3& 3& 3& 3& 3\\ {} 3& 3& 3& 3& 3& 3& 1& 1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 2\\ {} 1& 1& 1& 1& 1& 3& 1& 2& 2& 2& 2& 2& 1& 1& 1& 1& 1& 2\end{array}\right.\]Example 5.
$(t,p)=(4,4)$, $n=12$. Similarly, we assume all direct treatment effects equal to 1.732 and $\lambda =0.1$, and the covariance structure is taken to be the identity matrix. As shown in Table 5, the efficiency of the OWEA designs relative to the Gurobi designs is 1.009 and 1.027 under the A- and D-optimality criteria. There are minor differences in computing time, but all computing times are below 0.3 seconds. In addition, design 1 (shown as ${d_{1}}$ in Table 5) in [29] is a set of 3 mutually orthogonal Latin squares, which is balanced and neighbor-balanced; the OWEA design is almost identical to it in efficiency under the A- and D-optimality criteria.
(4.11)
\[ A-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 2& 2& 2& 3& 3& 3& 4& 4& 4\\ {} 2& 3& 3& 1& 3& 4& 2& 4& 4& 1& 1& 2\\ {} 4& 2& 4& 3& 4& 3& 1& 1& 1& 2& 2& 3\\ {} 3& 4& 2& 4& 1& 1& 4& 2& 2& 3& 3& 1\end{array}\](4.12)
\[ D-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 2& 2& 2& 3& 3& 3& 4& 4& 4\\ {} 2& 2& 3& 3& 3& 4& 1& 4& 4& 1& 1& 2\\ {} 4& 4& 2& 1& 1& 3& 4& 1& 2& 2& 3& 3\\ {} 3& 3& 4& 4& 4& 1& 2& 2& 1& 3& 2& 1\end{array}\](4.13)
\[ \text{Gurobi Design}=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 2& 2& 2& 3& 3& 3& 4& 4& 4\\ {} 2& 3& 4& 1& 3& 4& 1& 2& 4& 1& 2& 2\\ {} 4& 2& 3& 3& 4& 3& 4& 1& 1& 2& 1& 3\\ {} 3& 4& 2& 4& 1& 1& 2& 4& 2& 3& 3& 1\end{array}\]4.3 Interference Model
Example 6.
$(t,k)=(4,4)$, $n=10$. As shown in [11], the optimal weights of the universally optimal design do not correspond to integer repetitions, and hence the Gurobi designs proposed therein are used as benchmarks. As indicated in Table 6, the relative efficiencies of the A- and D-optimal OWEA designs are 0.9807 and 0.9265 compared to the Gurobi designs. The relatively lower D-efficiency reflects a limitation of the OWEA framework when n is small. There is no obvious winner in computing time: the OWEA framework is faster in searching for A-optimal designs but slower for D-optimal designs compared with Gurobi.
(4.14)
\[ A-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 2& 2& 3& 3& 4& 4& 4\\ {} 2& 3& 4& 1& 2& 1& 3& 1& 2& 3\\ {} 4& 2& 3& 3& 3& 4& 4& 2& 1& 2\\ {} 4& 4& 3& 4& 1& 2& 1& 3& 1& 2\end{array}\](4.15)
\[ D-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 2& 2& 3& 3& 4& 4& 4\\ {} 2& 3& 4& 2& 4& 2& 3& 2& 3& 4\\ {} 3& 4& 2& 4& 1& 1& 1& 3& 2& 1\\ {} 3& 4& 2& 3& 3& 1& 4& 1& 1& 2\end{array}\](4.16)
\[ \text{Gurobi Design}=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 2& 2& 2& 3& 3& 4& 4& 4\\ {} 1& 2& 1& 2& 3& 1& 3& 1& 3& 4\\ {} 4& 4& 3& 4& 1& 4& 2& 3& 2& 2\\ {} 4& 3& 4& 1& 1& 2& 1& 3& 2& 3\end{array}\]Example 7.
$(t,k)=(4,5)$, $n=24$. Based on Theorem 6 (ii) of [13], the universally optimal design is a symmetric design assigning equal weights to sequences and their dual sequences, where a dual sequence has the same elements as the original sequence but in reverse order. The design referred to as “Universal” in Table 7 consists of the symmetric blocks $\langle 11234\rangle $ and $\langle 12344\rangle $. From the summary in Table 7, the OWEA designs are as efficient as the universally optimal designs from [13] and the Gurobi design, with relative efficiencies of 0.9970 and 0.9981 under A- and D-optimality. The computing times in this case are 14.17 and 7.89 seconds. Interestingly, the integer programming for the Gurobi design takes only 0.20 seconds; this is because universally optimal designs exist here, and solving the resulting system of linear equations via integer programming is straightforward and fast.
(4.17)
\[ A-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 2& 3& 3& 3& 3& 3& 3& 3& 4& 4& 4& 4& 4\\ {} 1& 1& 1& 1& 3& 2& 2& 2& 2& 4& 4& 4& 2& 2& 2& 2& 3& 3& 4& 1& 3& 4& 4& 4\\ {} 2& 3& 3& 3& 4& 4& 1& 3& 3& 1& 1& 3& 1& 1& 1& 2& 1& 1& 1& 3& 2& 2& 2& 3\\ {} 3& 4& 4& 4& 2& 3& 4& 4& 4& 3& 3& 3& 4& 4& 4& 1& 2& 2& 2& 2& 1& 3& 3& 1\\ {} 4& 2& 2& 2& 3& 3& 3& 1& 1& 3& 3& 1& 4& 4& 4& 4& 4& 4& 2& 2& 1& 1& 1& 2\end{array}\](4.18)
\[ D-opt=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 2& 3& 3& 3& 3& 3& 3& 4& 4& 4& 4& 4& 4\\ {} 1& 1& 1& 3& 3& 3& 2& 2& 2& 4& 4& 4& 2& 2& 2& 3& 3& 3& 1& 1& 1& 4& 4& 4\\ {} 4& 4& 4& 4& 4& 4& 3& 3& 3& 3& 3& 3& 1& 1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 2\\ {} 3& 3& 3& 2& 2& 2& 4& 4& 4& 1& 1& 1& 4& 4& 4& 2& 2& 2& 3& 3& 3& 1& 1& 1\\ {} 2& 2& 2& 2& 2& 2& 1& 1& 1& 1& 1& 1& 4& 4& 4& 4& 4& 4& 3& 3& 3& 3& 3& 3\end{array}\](4.19)
\[ \text{Gurobi Design}=\begin{array}{c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c@{\hskip2.0pt}c}1& 1& 1& 1& 1& 1& 2& 2& 2& 2& 2& 2& 3& 3& 3& 3& 3& 3& 4& 4& 4& 4& 4& 4\\ {} 1& 1& 1& 2& 2& 4& 2& 2& 2& 3& 3& 4& 1& 2& 3& 3& 3& 4& 1& 1& 3& 4& 4& 4\\ {} 2& 2& 3& 3& 4& 3& 3& 4& 4& 1& 4& 1& 4& 1& 1& 1& 4& 2& 2& 3& 2& 1& 2& 3\\ {} 3& 4& 4& 4& 3& 2& 1& 1& 3& 4& 1& 3& 2& 4& 2& 4& 2& 1& 3& 2& 1& 2& 1& 2\\ {} 4& 3& 2& 4& 3& 2& 4& 3& 1& 4& 1& 3& 2& 4& 4& 2& 1& 1& 3& 2& 1& 3& 3& 1\end{array}\]5 Discussion
In this paper, we provide an algorithmic framework for optimal or efficient crossover designs and extend it to the interference model. Information matrices are derived in a form suitable for the optimal weights exchange algorithm. The OWEA exact designs in the numerical examples are shown to be efficient or identical to the optimal exact designs. The most noteworthy advantages of the OWEA algorithm are its generality and convenience. It works for a myriad of models, as long as the information matrix is linear in the design points; although this requirement is a limitation of the algorithm, in practice it is typically met for linear models and for some nonlinear models. From one model to another, the OWEA algorithm is applicable with only minor modifications to the information matrix. Within the same model, one only needs to adjust the configuration values, such as t, p, and n in the examples above; these configurations can be arbitrary, including the specification of a dropout mechanism. To facilitate the usage of our algorithm, we have prepared an R package and provided an online R Shiny app as a more user-friendly interface (https://cran.r-project.org/package=OWEA).
Although the theoretical results indicate that optimal designs within a subclass of pseudo-symmetric designs are automatically globally optimal, the OWEA algorithm does not guarantee symmetry in its output. However, in our experience, OWEA exact designs are close to symmetric optimal designs and always have relatively high efficiency or a satisfactory lower bound. We also observed a limitation of the framework when n is small, where the efficiency is not very high. In our opinion, one reason is that an optimal approximate design usually consists of many distinct sequences with small weights, and rounding it into an exact design with small n loses many sequences that contribute significantly to the information matrix. Furthermore, the OWEA algorithm offers flexibility in the choice of parameter vectors. All the designs in this paper focus on direct treatment effects; in other cases, for example, [25] derived corresponding results for estimating λ in the proportional model, whereas with the OWEA algorithm one only needs to change the function g.
Speed is also an important aspect of numerical algorithms. In the numerical examples, the OWEA algorithm takes only a reasonable amount of time to complete the tasks; however, there is still room for improvement. The search spaces implemented in the numerical examples comprise almost all possible sequences or blocks. According to Kushner's approximate design theory, for example [11, 13, 25], one can narrow the search space down to a much smaller one depending on the combination of $(t,p,n)$ or $(t,k,n)$; however, there is a trade-off between convenience and computing time. Another possible improvement lies in Newton's method. In some cases we do encounter overshooting problems, but the resulting design after reaching the maximum number of iterations is still highly efficient. If this issue could be resolved, the computing time would be shorter than reported.
Finally, although we compare examples and algorithms with [9, 13, 25], one has to point out that those papers aim at developing powerful theoretical tools, while this paper is more practically oriented. Often, practitioners and even statisticians cannot easily understand and use good theoretical results in design studies, which greatly hinders the application of excellent theoretical work. The developed R package and R Shiny app not only offer a user-friendly interface but also generate optimal and efficient designs that are not available in existing publications. Deriving such designs theoretically is often challenging or impossible, but our user-friendly package simplifies the process. This development is essential for making theoretical results more accessible to practitioners, and it will contribute to the continued advancement of this important field.