The New England Journal of Statistics in Data Science

Linear Mixed-effects Models for Censored Data with Serial Correlation Errors Using the Multivariate Student’s t-distribution
Volume 3, Issue 1 (2025), pp. 119–134
Kelin Zhong, Rommy C. Olivari, Aldo M. Garay, et al.
https://doi.org/10.51387/24-NEJSDS68
Pub. online: 30 July 2024 · Type: Methodology Article · Open Access
Area: Statistical Methodology

Accepted: 6 July 2024
Published: 30 July 2024

Abstract

The purpose of this paper is to develop a practical framework for the analysis of linear mixed-effects models for censored or missing data with serially correlated errors, using the multivariate Student’s t-distribution as a flexible alternative to the corresponding normal distribution. We propose an efficient ECM algorithm for computing the maximum likelihood estimates of these models, with the standard errors of the fixed effects and the likelihood function obtained as by-products. The algorithm uses closed-form expressions at the E-step, relying on formulas for the mean and variance of the truncated multivariate Student’s t-distribution. To illustrate the usefulness of the proposed methodology, artificial and real datasets are analyzed. The proposed algorithm and methods are implemented in the R package ARpLMEC.

1 Introduction

Linear mixed-effects (LME) models are frequently used to analyze repeated-measures data because they flexibly model the within-subject correlation often present in this type of data. Usually, for mathematical convenience, both the random effects and the error terms are assumed to follow normal distributions (N-LME). These restrictive assumptions, however, may result in a lack of robustness against departures from normality and in invalid statistical inferences, especially when the data show heavy tails. For instance, substantial bias in the maximum likelihood (ML) estimates of the regression parameters can result when the random-effects distribution is misspecified [11].
Several researchers have investigated alternative distributions for the errors in LME models. For example, [28] proposed a robust hierarchical linear mixed model in which the random effects and the within-subject errors follow a multivariate Student’s t-distribution (t-LME), and [18] and [16] developed additional tools for t-LME models from likelihood-based and Bayesian perspectives. However, in longitudinal studies, such as those on environmental pollution and infectious diseases, measurements of some variables may be subject to threshold values below or above which they are not quantifiable. For instance, viral load measures the amount of actively replicating virus and, depending on the diagnostic assay used, is subject to lower or upper detection limits beyond which it cannot be quantified. For this kind of data, [30] proposed the LME model for censored responses (LMEC), together with an exact EM-type algorithm. Recently, [21] and [23] considered the Student’s t-distribution in the context of LMEC models (t-LMEC), including influence diagnostics under different perturbation schemes. A common feature of these classes of LMEC models is that the error terms are unconditionally independent. However, in longitudinal studies, repeated measures are collected over time, and hence the error term tends to be serially correlated.
There are several proposals in the literature that account for time dependence in longitudinal data. For instance, [9] studied longitudinal data with random effects and AR(1) errors under the Bayesian paradigm; [4] extended the classical random-effects model to deal with time dependence, leaving the random-effects distribution unspecified; and [13] proposed a robust structure for a censored linear model based on the multivariate Student’s t-distribution, considering a damped exponential correlation (DEC) structure. More recently, [26] proposed a full likelihood-based approach for Gaussian N-LMEC modeling with autoregressive errors of order p (AR(p)-LMEC), including an EM algorithm for maximum likelihood (ML) estimation.
Even though some proposals address serial correlation among the observations in LMEC models, to the best of our knowledge there are no studies of t-LMEC models with serially correlated error structures such as DEC or AR(p). Thus, in this paper we propose t-LMEC modeling under several useful correlation structures (CSs). We develop a full likelihood-based treatment, including a computationally efficient estimation method via the EM algorithm, with the likelihood function, predictions of unobservable values of the response, and the asymptotic standard errors as by-products. The model developed here extends those previously presented by [21], [22] and [26] for the analysis of mixed-effects models with censored responses and HIV data.
The proposed algorithm and methods are implemented in the R package ARpLMEC [27], available on the CRAN repository.
The rest of the paper is organized as follows: Section 2 discusses some preliminary results related to the multivariate Student’s t-distribution and its truncated version. Some of its key properties are also presented. In Section 3, the related likelihood-based inference is presented, including estimation of the standard errors for the regression parameters. The application of the proposed method is presented in Sections 4 and 5 through some simulation studies and the analysis of data from a real HIV case study, respectively. Finally, Section 6 concludes with a short discussion of issues raised by this study and some possible directions for future research.

2 Preliminaries

We begin by defining the notation and presenting some basic concepts used throughout the development of our methodology. We denote a random variable by an upper-case letter and its realization by the corresponding lower-case letter, and use boldface letters for vectors and matrices. ${\mathbf{I}_{p}}$ represents the $p\times p$ identity matrix and ${\mathbf{A}^{\top }}$ denotes the transpose of the matrix A. Thus, for $\mathbf{a}={({a_{1}},\dots ,{a_{p}})^{\top }}$ and $\mathbf{b}={({b_{1}},\dots ,{b_{p}})^{\top }}$, if a Borel set in ${\mathbb{R}^{p}}$ has the form:
(2.1)
\[\begin{aligned}{}\mathbb{A}& =\big\{({y_{1}},\dots ,{y_{p}})\in {\mathbb{R}^{p}}:\hspace{0.1667em}\hspace{0.1667em}\hspace{0.1667em}{a_{1}}\le {y_{1}}\le {b_{1}},\dots ,{a_{p}}\le {y_{p}}\le {b_{p}}\big\}\\ {} & =\big\{\mathbf{y}\in {\mathbb{R}^{p}}:\mathbf{a}\le \mathbf{y}\le \mathbf{b}\big\}\end{aligned}\]
then, we use the shorthand notations $\{\mathbf{Y}\in \mathbb{A}\}=\{\mathbf{a}\le \mathbf{Y}\le \mathbf{b}\}$ and
\[ {\int _{\mathbf{a}}^{\mathbf{b}}}f(\mathbf{y})d\mathbf{y}={\int _{{a_{1}}}^{{b_{1}}}}\dots {\int _{{a_{p}}}^{{b_{p}}}}f({y_{1}},\dots ,{y_{p}})\mathrm{d}{y_{p}}\dots \mathrm{d}{y_{1}}.\]

2.1 The Multivariate Student’s t-distribution

$\mathbf{Y}\sim {t_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )$ denotes a random vector Y following a p-variate Student’s t-distribution with location vector $\boldsymbol{\mu }$, positive-definite scale-covariance matrix $\boldsymbol{\Sigma }$, and degrees of freedom ν; ${t_{p}}(\mathbf{y}\mid \boldsymbol{\mu },\boldsymbol{\Sigma },\nu )$ denotes its probability density function (pdf).
The probability that Y lies in $[\mathbf{a},\mathbf{b}]$ is denoted by:
(2.2)
\[ {L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )={\int _{\mathbf{a}}^{\mathbf{b}}}{t_{p}}(\mathbf{y}|\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )d\mathbf{y}.\]
An important property of the random vector Y is that it can be written as a scale mixture of the MVN (multivariate normal) random vector coupled with a positive random variable, i.e.,
(2.3)
\[ \mathbf{Y}=\boldsymbol{\mu }+{U^{-1/2}}\mathbf{Z},\]
where $\mathbf{Z}\sim {N_{p}}(\mathbf{0},\boldsymbol{\Sigma })$ independent of $U\sim \text{Gamma}(\nu /2,\nu /2)$, where ${N_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma })$ and $\text{Gamma}(a,b)$ denote the p-variate normal with mean vector $\boldsymbol{\mu }$ and variance-covariance matrix $\boldsymbol{\Sigma }$ and the gamma distribution with mean $a/b$, respectively.
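Representation (2.3) also gives a direct recipe for simulating from the multivariate Student’s t-distribution; the following is a minimal Python/NumPy sketch for illustration only (the paper’s own implementation is the R package ARpLMEC):

```python
import numpy as np

def rmvt(n, mu, Sigma, nu, rng):
    """Draw n samples from t_p(mu, Sigma, nu) via the scale mixture
    Y = mu + U^{-1/2} Z with Z ~ N_p(0, Sigma), U ~ Gamma(nu/2, nu/2)."""
    p = len(mu)
    Z = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    # NumPy's gamma uses shape/scale, so rate nu/2 means scale = 2/nu.
    U = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=n)
    return mu + Z / np.sqrt(U)[:, None]

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Y = rmvt(50_000, mu, Sigma, nu=5.0, rng=rng)
# For nu > 2, Cov(Y) = (nu / (nu - 2)) * Sigma, heavier-tailed than N_p.
```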

2.2 The Multivariate Truncated Student’s t-distribution

A p-dimensional random vector Y is said to follow a doubly truncated Student’s t-distribution with location vector $\boldsymbol{\mu }$, scale-covariance matrix $\boldsymbol{\Sigma }$ and degrees of freedom ν over the truncation region $\mathbb{A}$, defined in Eq. (2.1), denoted by $\mathbf{Y}\sim T{t_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu ;\mathbb{A})$, if it has the pdf:
\[ T{t_{p}}(\mathbf{y}|\boldsymbol{\mu },\boldsymbol{\Sigma },\nu ;\mathbb{A})=\frac{{t_{p}}(\mathbf{y}|\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )}{{L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )},\hspace{0.1667em}\hspace{0.1667em}\mathbf{a}\le \mathbf{y}\le \mathbf{b},\]
where ${L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )$ is as defined in Eq. (2.2).
The cdf of Y evaluated at the region $\{\mathbf{a}\le \mathbf{y}\le \mathbf{b}\}$ is:
(2.4)
\[\begin{aligned}{}T{T_{p}}(\mathbf{y}|\boldsymbol{\mu },\boldsymbol{\Sigma },\nu ;\mathbb{A})& =\frac{1}{{L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )}{\int _{\mathbf{a}}^{\mathbf{y}}}{t_{p}}(\mathbf{x}|\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )d\mathbf{x}\\ {} & =\frac{{L_{p}}(\mathbf{a},\mathbf{y};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )}{{L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )}.\end{aligned}\]
Some propositions and properties related to the p-variate Student’s t-distribution, together with the first two marginal and conditional moments of the TMVT distribution under double truncation, which are useful for our theoretical developments, can be found in Appendices A and B, respectively.

3 Linear Mixed-effects with Censored Response

3.1 Model Specification

We proceed as in [21, 28], considering a generalization of the classical N-LME model in which the random terms are assumed to follow a Student’s t-distribution:
(3.1)
\[ {\mathbf{Y}_{i}}={\mathbf{X}_{i}}\boldsymbol{\beta }+{\mathbf{Z}_{i}}{\mathbf{b}_{i}}+{\boldsymbol{\epsilon }_{i}},\hspace{0.1667em}\hspace{0.1667em}\text{with}\]
(3.2)
\[ {({\mathbf{b}_{i}},{\boldsymbol{\epsilon }_{i}})^{\top }}\sim {t_{{n_{i}}+q}}\big(\mathbf{0},Diag(\mathbf{D},{\boldsymbol{\Omega }_{i}}),\nu \big).\]
The subscript i is the subject index and Diag(A, B) is a block-diagonal matrix with blocks A and B. ${\mathbf{Y}_{i}}={({Y_{i1}},\dots ,{Y_{i{n_{i}}}})^{\top }}$ is an ${n_{i}}\times 1$ vector of observed continuous responses for sample unit i, ${\mathbf{X}_{i}}$ is the ${n_{i}}\times s$ design matrix corresponding to the fixed effects $\boldsymbol{\beta }$, the $s\times 1$ vector of population-averaged regression coefficients, and ${\mathbf{Z}_{i}}$ is the ${n_{i}}\times q$ design matrix corresponding to the $q\times 1$ vector of random effects ${\mathbf{b}_{i}}$. ${\boldsymbol{\epsilon }_{i}}$ is the ${n_{i}}\times 1$ vector of random errors, and the dispersion matrix $\mathbf{D}=\mathbf{D}(\boldsymbol{\alpha })$ depends on a reduced set of unknown parameters $\boldsymbol{\alpha }$. The covariance structure of the error vector is assumed to be ${\boldsymbol{\Omega }_{i}}={\sigma ^{2}}{\mathbf{R}_{i}}$, where ${\mathbf{R}_{i}}={\mathbf{R}_{i}}(\boldsymbol{\phi })$, with $\boldsymbol{\phi }={({\phi _{1}},\dots ,{\phi _{p}})^{\top }}$, is an ${n_{i}}\times {n_{i}}$ matrix that incorporates a time-dependence structure.
Note that ${\mathbf{b}_{i}}$ and ${\boldsymbol{\epsilon }_{i}}$ are uncorrelated, since $\text{Cov}({\mathbf{b}_{i}},{\boldsymbol{\epsilon }_{i}})=\mathbb{E}[{\mathbf{b}_{i}}{\boldsymbol{\epsilon }_{i}^{\top }}]=\mathbb{E}[\mathbb{E}({\mathbf{b}_{i}}{\boldsymbol{\epsilon }_{i}^{\top }}|{U_{i}})]=\mathbf{0}$, where ${U_{i}}$ is a scalar generated from $\text{Gamma}(\nu /2,\nu /2)$. Classical inference on the parameter vector $\boldsymbol{\theta }={({\boldsymbol{\beta }^{\top }},{\sigma ^{2}},{\boldsymbol{\alpha }^{\top }},{\boldsymbol{\phi }^{\top }},\nu )^{\top }}$ is based on the marginal distribution of ${\mathbf{Y}_{i}}$; thus ${\mathbf{Y}_{i}}\stackrel{\mathrm{ind}.}{\sim }{t_{{n_{i}}}}({\boldsymbol{\mu }_{i}},{\boldsymbol{\Sigma }_{i}},\nu )$, for $i=1,\dots ,n$, where ${\boldsymbol{\mu }_{i}}={\mathbf{X}_{i}}\boldsymbol{\beta }$ and ${\boldsymbol{\Sigma }_{i}}={\sigma ^{2}}{\mathbf{R}_{i}}+{\mathbf{Z}_{i}}\mathbf{D}{\mathbf{Z}_{i}^{\top }}$. The estimates from the multivariate t-LME are more robust against outliers than those based on the standard LME; in this sense, [28] showed via a simulation study that the t-LME substantially outperforms the normal LME when outliers are present in the data. This issue has also been discussed by [21] in the context of censored mixed-effects models.
In this study, we follow [13, 15, 21, 23, 26] in considering non-informative censoring (or censoring at random), i.e., we allow the response ${Y_{ij}}$ to be only partially observed. Thus, let $({\mathbf{V}_{i}},{\mathbf{C}_{i}})$ be the observed data for the i-th subject, where ${\mathbf{V}_{i}}$ represents the vector of uncensored readings (${V_{ij}}={V_{0i}}$) or censoring intervals (${V_{1ij}}$, ${V_{2ij}}$) and ${\mathbf{C}_{i}}$ is the vector of censoring indicators, such that:
(3.3)
\[ {C_{ij}}=\left\{\begin{array}{l@{\hskip10.0pt}l}1\hspace{1em}& \text{if}\hspace{1em}{V_{1ij}}\le {Y_{ij}}\le {V_{2ij}},\\ {} 0\hspace{1em}& \text{if}\hspace{1em}{Y_{ij}}={V_{0i}},\end{array}\right.\]
for all $i\in \{1,\dots ,n\}$ and $j\in \{1,\dots ,{n_{i}}\}$, i.e., ${C_{ij}}=1$ if ${Y_{ij}}$ is located within a specific interval. Note that for a right-censored observation ${V_{2ij}}=+\infty $ and for a left-censored observation ${V_{1ij}}=-\infty $. Moreover, missing at random (MAR) observations can be handled by considering ${V_{1ij}}=-\infty $ and ${V_{2ij}}=+\infty $. The model defined in Eqs. (3.1)–(3.3) is henceforth called the t-LMEC model.
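The scheme in Eq. (3.3) encodes left, right, and interval censoring, as well as MAR missingness, purely through the choice of (${V_{1ij}}$, ${V_{2ij}}$). A small illustrative encoder (a hypothetical helper written in Python, not part of ARpLMEC):

```python
from math import inf

def encode(value=None, lower=None, upper=None):
    """Return (V1, V2, C) for a single response Y_ij, following Eq. (3.3).

    value : the exact reading when Y_ij is fully observed (C = 0);
    lower/upper : the censoring-interval endpoints otherwise (C = 1).
    """
    if value is not None:                  # fully observed: Y_ij = V_0i
        return (value, value, 0)
    v1 = -inf if lower is None else lower  # left-censored: V1 = -inf
    v2 = inf if upper is None else upper   # right-censored: V2 = +inf
    return (v1, v2, 1)

examples = [
    encode(value=3.2),   # observed reading
    encode(upper=1.0),   # left-censored below a detection limit of 1.0
    encode(lower=5.0),   # right-censored above a detection limit of 5.0
    encode(),            # missing at random: interval (-inf, +inf)
]
```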

3.2 Within-subject Dependence Structures

To enable some flexibility when modeling the error covariance, we consider three dependence structures: unconditional independence (UNC), the autoregressive process of order p, and the damped exponential correlation, which are discussed next.
  • Unconditional independence
    The most common and simplest approach is to assume that the error terms are unconditionally independent (UNC), i.e., ${\mathbf{R}_{i}}={\mathbf{I}_{{n_{i}}}}$ for each $i=1,\dots ,n$. In practice, linear mixed models with UNC errors are very common; for example, they were considered by Matos et al. [21]. This model will be denoted UNC-t-LMEC.
    However, in longitudinal studies, repeated measures are collected over time, and hence the error term might be serially correlated. To account for the within-subject serial correlation, we consider two other general structures.
  • Autoregressive dependence of order p
    In this case, we propose ${\mathbf{R}_{i}}$ as a structured AR(p) dependence matrix [8]. Specifically,
    (3.4)
    \[ {\mathbf{R}_{i}}={\mathbf{R}_{i}}(\boldsymbol{\phi })=\frac{1}{1-{\phi _{1}}{\rho _{1}}-\cdots -{\phi _{p}}{\rho _{p}}}[{\rho _{|r-s|}}],\]
    where $r,s=1,\dots ,{n_{i}}$ and ${\rho _{1}},\dots ,{\rho _{p}}$ are the theoretical autocorrelations of the process, and thereby they are functions of autoregressive parameters $\boldsymbol{\phi }={({\phi _{1}},\dots ,{\phi _{p}})^{\top }}$ and satisfy the Yule–Walker equations [7], i.e.,
    \[ {\rho _{k}}={\phi _{1}}{\rho _{k-1}}+\cdots +{\phi _{p}}{\rho _{k-p}},\hspace{0.1667em}\hspace{0.1667em}{\rho _{0}}=1,\hspace{0.1667em}\hspace{0.1667em}\hspace{0.1667em}k=1,\dots ,p.\]
    In addition, the roots of $1-{\phi _{1}}B-{\phi _{2}}{B^{2}}-\cdots -{\phi _{p}}{B^{p}}=0$ must lie outside the unit circle to ensure stationarity of the AR(p) model.
    Following [3], the autoregressive process can be reparameterized using a one-to-one, continuous and differentiable transformation in order to simplify the conditions for stationarity. For details on the estimation of the autoregressive coefficients, we refer to [29], see also [17, 16] for details on estimation of LME model with AR(p) dependence.
    The model formulated in Eqs. (3.1)–(3.3) with ${\boldsymbol{\Omega }_{i}}={\sigma ^{2}}{\mathbf{R}_{i}}$, where ${\mathbf{R}_{i}}$ is given by Eq. (3.4), $i=1,\dots ,n$, will be denoted AR(p)-t-LMEC. To accommodate situations in which measurements are taken irregularly over discrete time, we modify ${\mathbf{R}_{i}}$ by computing it over a regular grid of times and then deleting the rows and columns corresponding to the missing measurement times.
  • Damped exponential correlation
    In this case, following [25], we propose to structure ${\mathbf{R}_{i}}$ as a damped exponential correlation (DEC) matrix, as follows:
    (3.5)
    \[ {\mathbf{R}_{i}}={\mathbf{R}_{i}}(\boldsymbol{\phi },{\mathbf{t}_{i}})=\big[{\phi _{1}^{|{t_{ij}}-{t_{ik}}{|^{{\phi _{2}}}}}}\big],\hspace{0.1667em}\hspace{0.1667em}\hspace{0.1667em}\hspace{0.1667em}0\le {\phi _{1}}\lt 1,\hspace{0.1667em}\hspace{0.1667em}{\phi _{2}}\ge 0,\]
    where $j,k=1,\dots ,{n_{i}}$, for $i=1,\dots ,n$ and $\boldsymbol{\phi }={({\phi _{1}},{\phi _{2}})^{\top }}$. The correlation parameter ${\phi _{1}}$ describes the autocorrelation between observations separated by one unit of time, and the damping parameter ${\phi _{2}}$ controls the acceleration of the exponential decay of the autocorrelation function, defining a continuous-time autoregressive (AR) model. It is important to stress that the DEC structure encompasses several correlation structures as special cases. For example:
    • (i) if ${\phi _{2}}=0$, then ${\mathbf{R}_{i}}$ reduces to the compound symmetry correlation structure (DEC-SYM);
    • (ii) if ${\phi _{2}}=1$, then ${\mathbf{R}_{i}}$ reduces to the DEC-AR(1) correlation structure;
    • (iii) if $0\lt {\phi _{2}}\lt 1$, then ${\mathbf{R}_{i}}$ generates a decay rate slower than the DEC-AR(1) structure;
    • (iv) if ${\phi _{2}}\gt 1$, then ${\mathbf{R}_{i}}$ generates a decay rate faster than the DEC-AR(1) structure; and
    • (v) if ${\phi _{2}}\to \infty $, then ${\mathbf{R}_{i}}$ converges to the correlation matrix of a moving-average of order 1 (DEC-MA(1)).
    The model presented in Eqs. (3.1)–(3.3) considering ${\boldsymbol{\Omega }_{i}}={\sigma ^{2}}{\mathbf{R}_{i}}$, where ${\mathbf{R}_{i}}$ is given by Eq. (3.5), $i=1,\dots ,n$, will be denoted DEC-t-LMEC.
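To make the two serial structures concrete, the matrices in Eqs. (3.4) and (3.5) can be assembled in a few lines. The NumPy sketch below (an illustration under the paper’s notation, not code from ARpLMEC) solves the Yule–Walker system for the AR(p) autocorrelations, checks the stationarity condition via the roots of the characteristic polynomial, and builds the DEC matrix:

```python
import numpy as np

def ar_corr(phi, n):
    """AR(p) matrix R_i of Eq. (3.4): solve the Yule-Walker equations
    for rho_1..rho_p, extend by the AR recursion, and scale by
    1 / (1 - phi_1 rho_1 - ... - phi_p rho_p)."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    # Yule-Walker: rho_k = sum_j phi_j rho_{k-j}, using rho_0 = 1
    # and rho_{-k} = rho_k; assemble as (I - A) rho = b.
    A = np.zeros((p, p)); b = np.zeros(p)
    for k in range(1, p + 1):
        for j in range(1, p + 1):
            lag = abs(k - j)
            if lag == 0:
                b[k - 1] += phi[j - 1]          # the rho_0 = 1 term
            else:
                A[k - 1, lag - 1] += phi[j - 1]
    rho = np.concatenate(([1.0], np.linalg.solve(np.eye(p) - A, b)))
    for k in range(p + 1, n):                    # extend recursively
        rho = np.append(rho, sum(phi[j] * rho[k - 1 - j] for j in range(p)))
    scale = 1.0 / (1.0 - np.dot(phi, rho[1:p + 1]))
    idx = np.arange(n)
    return scale * rho[np.abs(idx[:, None] - idx[None, :])]

def is_stationary(phi):
    """All roots of 1 - phi_1 B - ... - phi_p B^p outside the unit circle."""
    coeffs = np.r_[-np.asarray(phi, dtype=float)[::-1], 1.0]
    return np.all(np.abs(np.roots(coeffs)) > 1.0)

def dec_corr(phi1, phi2, t):
    """DEC matrix of Eq. (3.5): [phi1 ** (|t_ij - t_ik| ** phi2)]."""
    d = np.abs(np.subtract.outer(t, t))
    R = phi1 ** (d ** phi2)
    np.fill_diagonal(R, 1.0)   # guard the phi2 = 0 corner, where 0**0 = 1
    return R
```

For AR(1), `ar_corr` reproduces the familiar $[\phi_1^{|r-s|}]/(1-\phi_1^2)$, and `dec_corr` with $\phi_2 = 0$ or $\phi_2 = 1$ gives the DEC-SYM and DEC-AR(1) special cases listed above.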

3.3 The Likelihood Function

Let ${\mathbf{Y}_{i}}={({\mathbf{Y}_{i}^{o}},{\mathbf{Y}_{i}^{c}})^{\top }}$, where ${\mathbf{Y}_{i}^{o}}$ and ${\mathbf{Y}_{i}^{c}}$ are the ${n_{i}^{o}}\times 1$ and ${n_{i}^{c}}\times 1$ vectors of observed and censored outcomes for subject i, respectively, with ${n_{i}}={n_{i}^{o}}+{n_{i}^{c}}$, such that ${C_{ij}}=0$ for all elements of ${\mathbf{Y}_{i}^{o}}$ and ${C_{ij}}=1$ for all elements of ${\mathbf{Y}_{i}^{c}}$. After reordering, ${\mathbf{Y}_{i}}$, ${\mathbf{V}_{i}}$, ${\boldsymbol{\mu }_{i}}$, and ${\boldsymbol{\Sigma }_{i}}$ can be partitioned as follows:
(3.6)
\[\begin{aligned}{}{\mathbf{Y}_{i}}& =\mathrm{vec}\big({\mathbf{Y}_{i}^{o}},{\mathbf{Y}_{i}^{c}}\big),\hspace{0.1667em}{\mathbf{V}_{i}}=\mathrm{vec}\big({\mathbf{V}_{i}^{o}},{\mathbf{V}_{i}^{c}}\big),\hspace{0.1667em}{\boldsymbol{\mu }_{i}^{\top }}=\big({\boldsymbol{\mu }_{i}^{o}},{\boldsymbol{\mu }_{i}^{c}}\big)\hspace{0.1667em}\hspace{0.1667em}\mathrm{and}\hspace{0.1667em}\hspace{0.1667em}\\ {} {\boldsymbol{\Sigma }_{i}}& =\left(\begin{array}{c@{\hskip10.0pt}c}{\boldsymbol{\Sigma }_{i}^{oo}}& {\boldsymbol{\Sigma }_{i}^{oc}}\\ {} {\boldsymbol{\Sigma }_{i}^{co}}& {\boldsymbol{\Sigma }_{i}^{cc}}\end{array}\right),\end{aligned}\]
where $\mathrm{vec}(\cdot )$ denotes the operator that stacks vectors or matrices with the same number of columns.
Using Proposition 1 (Appendix A), we have that
\[\begin{aligned}{}{\mathbf{Y}_{i}^{o}}& \sim {t_{{n_{i}^{o}}}}\big({\boldsymbol{\mu }_{i}^{o}},{\boldsymbol{\Sigma }_{i}^{oo}},\nu \big),\hspace{1em}\text{and}\hspace{1em}\\ {} {\mathbf{Y}_{i}^{c}}|{\mathbf{Y}_{i}^{o}}={\mathbf{y}_{i}^{o}}& \sim {t_{{n_{i}^{c}}}}\big({\boldsymbol{\mu }_{i}^{co}},{\mathbf{S}_{i}^{co}},\nu +{n_{i}^{o}}\big),\end{aligned}\]
where
\[\begin{aligned}{}{\boldsymbol{\mu }_{i}^{o}}& ={\mathbf{X}_{i}^{o}}\boldsymbol{\beta }\hspace{2.5pt}\hspace{2.5pt}\text{and}\hspace{2.5pt}\hspace{2.5pt}{\boldsymbol{\mu }_{i}^{c}}={\mathbf{X}_{i}^{c}}\boldsymbol{\beta },\\ {} {\boldsymbol{\mu }_{i}^{co}}& ={\boldsymbol{\mu }_{i}^{c}}+{\boldsymbol{\Sigma }_{i}^{co}}{\boldsymbol{\Sigma }_{i}^{oo-1}}({\mathbf{y}_{i}^{o}}-{\boldsymbol{\mu }_{i}^{o}}),\\ {} {\mathbf{S}_{i}^{co}}& =(\frac{\nu +{\delta ^{2}}({\mathbf{y}_{i}^{o}})}{\nu +{n_{i}^{o}}}){\mathbf{S}_{i}},\\ {} {\mathbf{S}_{i}}& ={\boldsymbol{\Sigma }_{i}^{cc}}-{\boldsymbol{\Sigma }_{i}^{co}}{\boldsymbol{\Sigma }_{i}^{oo-1}}{\boldsymbol{\Sigma }_{i}^{oc}}\hspace{2.5pt}\hspace{2.5pt}\text{and}\hspace{2.5pt}\\ {} {\delta ^{2}}({\mathbf{y}_{i}^{o}})& ={({\mathbf{y}_{i}^{o}}-{\boldsymbol{\mu }_{i}^{o}})^{\top }}{\boldsymbol{\Sigma }_{i}^{oo-1}}({\mathbf{y}_{i}^{o}}-{\boldsymbol{\mu }_{i}^{o}}).\end{aligned}\]
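The conditional quantities above are plain matrix algebra once ${\boldsymbol{\Sigma }_{i}}$ is partitioned; a minimal NumPy sketch (the index sets and values are made-up illustrations):

```python
import numpy as np

def conditional_t_params(mu, Sigma, nu, obs_idx, cen_idx, y_obs):
    """Parameters of Y^c | Y^o = y^o ~ t(mu_co, S_co, nu + n_o)
    for Y ~ t_p(mu, Sigma, nu), mirroring the formulas of Sec. 3.3."""
    mu_o, mu_c = mu[obs_idx], mu[cen_idx]
    S_oo = Sigma[np.ix_(obs_idx, obs_idx)]
    S_co = Sigma[np.ix_(cen_idx, obs_idx)]
    S_cc = Sigma[np.ix_(cen_idx, cen_idx)]
    r = np.linalg.solve(S_oo, y_obs - mu_o)   # Sigma_oo^{-1} (y^o - mu^o)
    mu_co = mu_c + S_co @ r
    delta2 = (y_obs - mu_o) @ r               # Mahalanobis distance delta^2
    n_o = len(obs_idx)
    S = S_cc - S_co @ np.linalg.solve(S_oo, S_co.T)
    return mu_co, (nu + delta2) / (nu + n_o) * S, nu + n_o

# Toy bivariate case: first component observed, second censored.
mu = np.zeros(2)
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
m, S, df = conditional_t_params(mu, Sigma, nu=4.0,
                                obs_idx=np.array([0]),
                                cen_idx=np.array([1]),
                                y_obs=np.array([1.0]))
# m = [0.5], S = [[0.75]], df = 5.0
```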
Let $\boldsymbol{\theta }={({\boldsymbol{\beta }^{\top }},{\sigma ^{2}},{\boldsymbol{\alpha }^{\top }},{\boldsymbol{\phi }^{\top }},\nu )^{\top }}$ be the parameter vector. As presented by [21], the likelihood function for subject i is given by:
(3.7)
\[\begin{aligned}{}{L_{i}}(\boldsymbol{\theta })& =f({\mathbf{y}_{i}}|\boldsymbol{\theta })=f({\mathbf{V}_{i}}|{\mathbf{C}_{i}},\boldsymbol{\theta })\\ {} & =f\big({\mathbf{y}_{i}^{o}}|\boldsymbol{\theta }\big)P\big({\mathbf{V}_{1i}^{c}}\le {\mathbf{Y}_{i}^{c}}\le {\mathbf{V}_{2i}^{c}}|{\mathbf{V}_{i}^{o}},\boldsymbol{\theta }\big)\\ {} & ={t_{{n_{i}^{o}}}}\big({\mathbf{V}_{i}^{o}};{\boldsymbol{\mu }_{i}^{o}},{\boldsymbol{\Sigma }_{i}^{oo}},\nu \big){L_{{n_{i}^{c}}}}\big({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}};{\boldsymbol{\mu }_{i}^{co}},{\mathbf{S}_{i}^{co}},\nu +{n_{i}^{o}}\big),\end{aligned}\]
where ${L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )$ is as defined in Eq. (2.2), which can be easily evaluated by using the R package MomTrunc.
The log-likelihood function for the observed data is given by $\ell (\boldsymbol{\theta }|\mathbf{y})={\textstyle\sum _{i=1}^{n}}\log {L_{i}}(\boldsymbol{\theta })$, and the estimates obtained by maximizing the log-likelihood function $\ell (\boldsymbol{\theta }|\mathbf{y})$ are the ML estimates.

3.4 The EM Algorithm

To obtain the ML estimates of the parameters of the t-LMEC model presented in Section 3.1, we implemented the ECM algorithm proposed by [24]. This algorithm preserves the simplicity and stability of the EM algorithm but replaces the M-step with a sequence of conditional maximization (CM) steps. Based on the scale-mixture property (2.3) of the multivariate Student’s t-distribution, the t-LMEC model can be expressed hierarchically as:
\[ \begin{aligned}{}{\mathbf{Y}_{i}}|{\mathbf{b}_{i}},{u_{i}}& \stackrel{\mathrm{ind}.}{\sim }{\mathrm{N}_{{n_{i}}}}\big({\boldsymbol{\mu }_{i}}+{\mathbf{Z}_{i}}{\mathbf{b}_{i}},{u_{i}^{-1}}{\boldsymbol{\Omega }_{i}}\big),\\ {} {\mathbf{b}_{i}}|{u_{i}}& \stackrel{\mathrm{ind}.}{\sim }{\mathrm{N}_{q}}\big(\mathbf{0},{u_{i}^{-1}}\mathbf{D}\big),\\ {} {u_{i}}& \stackrel{\mathrm{iid}.}{\sim }\text{Gamma}\bigg(\frac{\nu }{2},\frac{\nu }{2}\bigg).\end{aligned}\]
Thus, it is possible to apply the ECM algorithm by assuming that $\mathbf{Y}=({\mathbf{Y}_{1}^{\top }},\dots ,{\mathbf{Y}_{n}^{\top }})$, $\mathbf{b}=({\mathbf{b}_{1}^{\top }},\dots ,{\mathbf{b}_{n}^{\top }})$ and $\mathbf{u}={({u_{1}},\dots ,{u_{n}})^{\top }}$ are hypothetical missing variables, and augmenting with the observed variables $(\mathbf{V},\mathbf{C})$ where $\mathbf{V}=\mathrm{vec}({\mathbf{V}_{1}},\dots ,{\mathbf{V}_{n}})$ and $\mathbf{C}=\mathrm{vec}({\mathbf{C}_{1}},\dots ,{\mathbf{C}_{n}})$.
Considering ${\mathbf{Y}_{c}}={({\mathbf{C}^{\top }},{\mathbf{V}^{\top }},{\mathbf{y}^{\top }},{\mathbf{b}^{\top }},{\mathbf{u}^{\top }})^{\top }}$ and $\boldsymbol{\theta }={({\boldsymbol{\beta }^{\top }},{\sigma ^{2}},{\boldsymbol{\alpha }^{\top }},{\boldsymbol{\phi }^{\top }},\nu )^{\top }}$, the complete-data log-likelihood, is given by
(3.8)
\[ \begin{aligned}{}{\ell _{c}}(\boldsymbol{\theta }|{\mathbf{y}_{c}})& ={\sum \limits_{i=1}^{n}}\left[-\frac{{n_{i}}}{2}\log {\sigma ^{2}}-\frac{1}{2}\log (|{\mathbf{R}_{i}}|)\right.\\ {} & \hspace{1em}-\left.\frac{{u_{i}}}{2{\sigma ^{2}}}{({\mathbf{y}_{i}}-{\boldsymbol{\mu }_{i}}-{\mathbf{Z}_{i}}{\mathbf{b}_{i}})^{\top }}{\mathbf{R}_{i}^{-1}}({\mathbf{y}_{i}}-{\boldsymbol{\mu }_{i}}-{\mathbf{Z}_{i}}{\mathbf{b}_{i}})\right.\\ {} & \hspace{1em}-\left.\frac{1}{2}\log |\mathbf{D}|-\frac{{u_{i}}}{2}{\mathbf{b}_{i}^{\top }}{\mathbf{D}^{-1}}{\mathbf{b}_{i}}+\log h({u_{i}}|\nu )+C\right],\end{aligned}\]
where C is a constant that does not depend on $\boldsymbol{\theta }$, and $h({u_{i}}|\nu )$ denotes the pdf of a $\text{Gamma}(\nu /2,\nu /2)$ distribution.
Given the current estimate $\boldsymbol{\theta }={\widehat{\boldsymbol{\theta }}^{(k)}}$ at iteration k, the E-step computes the conditional expectation of the complete-data log-likelihood function:
\[\begin{aligned}{}Q\big(\boldsymbol{\theta }|{\widehat{\boldsymbol{\theta }}^{(k)}}\big)& =\mathbb{E}\big[{\ell _{c}}(\boldsymbol{\theta }|{\mathbf{y}_{c}})|\mathbf{V},\mathbf{C},{\widehat{\boldsymbol{\theta }}^{(k)}}\big]\\ {} & ={\sum \limits_{i=1}^{n}}{Q_{1i}}\big(\boldsymbol{\beta },{\sigma ^{2}},\boldsymbol{\phi }|{\widehat{\boldsymbol{\theta }}^{(k)}}\big)+{\sum \limits_{i=1}^{n}}{Q_{2i}}\big(\boldsymbol{\alpha }|{\widehat{\boldsymbol{\theta }}^{(k)}}\big),\end{aligned}\]
where
\[\begin{aligned}{}& {Q_{1i}}\big(\boldsymbol{\beta },{\sigma ^{2}},\boldsymbol{\phi }|{\widehat{\boldsymbol{\theta }}^{(k)}}\big)\\ {} & \hspace{1em}=-\frac{{n_{i}}}{2}\log {\sigma ^{2}}-\frac{1}{2}\log (|{\mathbf{R}_{i}}|)\\ {} & \hspace{2em}-\frac{1}{2{\sigma ^{2}}}\Big[{\widehat{a}_{i}^{(k)}}-2{\boldsymbol{\mu }_{i}^{\top }}{\mathbf{R}_{i}^{-1}}\big({\widehat{{u_{i}}{\mathbf{y}_{i}}}^{(k)}}-{\mathbf{Z}_{i}}{\widehat{{u_{i}}{\mathbf{b}_{i}}}^{(k)}}\big)\\ {} & \hspace{2em}+{\widehat{{u_{i}}}^{(k)}}{\boldsymbol{\mu }_{i}^{\top }}{\mathbf{R}_{i}^{-1}}{\boldsymbol{\mu }_{i}}\Big]\hspace{1em}\text{and}\end{aligned}\]
\[\begin{aligned}{}{Q_{2i}}\big(\boldsymbol{\alpha }|{\widehat{\boldsymbol{\theta }}^{(k)}}\big)& =-\frac{1}{2}\log |\mathbf{D}|-\frac{1}{2}\operatorname{tr}\big({\widehat{{u_{i}}{\mathbf{b}_{i}}{\mathbf{b}_{i}^{\top }}}^{(k)}}{\mathbf{D}^{-1}}\big),\hspace{2.5pt}\text{with}\\ {} {\widehat{a}_{i}^{(k)}}& =\operatorname{tr}\big({\widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}^{(k)}}{\mathbf{R}_{i}^{-1}}-2{\widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{b}_{i}^{\top }}}^{(k)}}{\mathbf{Z}_{i}^{\top }}{\mathbf{R}_{i}^{-1}}\\ {} & \hspace{1em}+{\widehat{{u_{i}}{\mathbf{b}_{i}}{\mathbf{b}_{i}^{\top }}}^{(k)}}{\mathbf{Z}_{i}^{\top }}{\mathbf{R}_{i}^{-1}}{\mathbf{Z}_{i}}\big),\\ {} {\widehat{{u_{i}}{\mathbf{b}_{i}}}^{(k)}}& =\mathbb{E}\big[{u_{i}}{\mathbf{b}_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},{\widehat{\boldsymbol{\theta }}^{(k)}}\big]={\boldsymbol{\varphi }_{i}}\big({\widehat{{u_{i}}{\mathbf{y}_{i}}}^{(k)}}-{\widehat{{u_{i}}}^{(k)}}{\boldsymbol{\mu }_{i}}\big),\\ {} {\widehat{{u_{i}}{\mathbf{b}_{i}}{\mathbf{b}_{i}^{\top }}}^{(k)}}& =\mathbb{E}\big[{u_{i}}{\mathbf{b}_{i}}{\mathbf{b}_{i}^{\top }}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},{\widehat{\boldsymbol{\theta }}^{(k)}}\big]\\ {} & ={\boldsymbol{\Lambda }_{i}}+{\boldsymbol{\varphi }_{i}}\big({\widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}^{(k)}}-2{\widehat{{u_{i}}{\mathbf{y}_{i}}}^{(k)}}{\boldsymbol{\mu }_{i}}\\ {} & \hspace{1em}+{\widehat{{u_{i}}}^{(k)}}{\boldsymbol{\mu }_{i}}{\boldsymbol{\mu }_{i}^{\top }}\big){\boldsymbol{\varphi }_{i}^{\top }},\\ {} {\widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{b}_{i}^{\top }}}^{(k)}}& =\mathbb{E}\big[{u_{i}}{\mathbf{y}_{i}}{\mathbf{b}_{i}^{\top }}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},{\widehat{\boldsymbol{\theta }}^{(k)}}\big]\\ {} & =\big({\widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}^{(k)}}-{\widehat{{u_{i}}{\mathbf{y}_{i}}}^{(k)}}{\boldsymbol{\mu }_{i}^{\top }}\big){\boldsymbol{\varphi }_{i}^{\top }},\end{aligned}\]
where ${\boldsymbol{\Lambda }_{i}}={({\mathbf{D}^{-1}}+{\mathbf{Z}_{i}^{\top }}{\mathbf{R}_{i}^{-1}}{\mathbf{Z}_{i}}/{\sigma ^{2}})^{-1}}$ and ${\boldsymbol{\varphi }_{i}}={\boldsymbol{\Lambda }_{i}}{\mathbf{Z}_{i}^{\top }}{\mathbf{R}_{i}^{-1}}/{\sigma ^{2}}$.
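The matrices ${\boldsymbol{\Lambda }_{i}}$ and ${\boldsymbol{\varphi }_{i}}$ involve only the current variance components; a direct NumPy transcription (a sketch with toy dimensions, not ARpLMEC code):

```python
import numpy as np

def estep_matrices(D, Z, R, sigma2):
    """Lambda_i = (D^{-1} + Z_i^T R_i^{-1} Z_i / sigma^2)^{-1} and
    varphi_i = Lambda_i Z_i^T R_i^{-1} / sigma^2, as defined above."""
    # solve(R.T, Z).T computes Z^T R^{-1} without forming R^{-1}.
    ZtRinv = np.linalg.solve(R.T, Z).T / sigma2
    Lam = np.linalg.inv(np.linalg.inv(D) + ZtRinv @ Z)
    return Lam, Lam @ ZtRinv

# Toy check with q = 1 random effect and n_i = 2 observations:
D = np.array([[1.0]])
Z = np.array([[1.0], [1.0]])
Lam, varphi = estep_matrices(D, Z, np.eye(2), sigma2=1.0)
# With R = I and sigma^2 = 1: Lambda = (1 + Z^T Z)^{-1} = 1/3,
# varphi = Lambda Z^T.
```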
In conditional maximization (CM) steps, we update ${\widehat{\boldsymbol{\theta }}^{(k+1)}}$ by maximizing $Q(\boldsymbol{\theta }|{\widehat{\boldsymbol{\theta }}^{(k)}})$ over $\boldsymbol{\theta }$, as follows:
(3.9)
\[\begin{aligned}{}{\widehat{\boldsymbol{\beta }}^{(k+1)}}& ={\Bigg({\sum \limits_{i=1}^{n}}{\widehat{{u_{i}}}^{(k)}}{\mathbf{X}_{i}^{\top }}{\widehat{\mathbf{R}}_{i}^{-1(k)}}{\mathbf{X}_{i}}\Bigg)^{-1}}{\sum \limits_{i=1}^{n}}{\mathbf{X}_{i}^{\top }}{\widehat{\mathbf{R}}_{i}^{-1(k)}}\\ {} & \hspace{1em}\times \big({\widehat{{u_{i}}{\mathbf{y}_{i}}}^{(k)}}-{\mathbf{Z}_{i}}{\widehat{{u_{i}}{\mathbf{b}_{i}}}^{(k)}}\big),\end{aligned}\]
(3.10)
\[\begin{aligned}{}{\widehat{{\sigma ^{2}}}^{(k+1)}}& =\frac{1}{N}{\sum \limits_{i=1}^{n}}\left[{\widehat{a}_{i}^{(k)}}-2{\widehat{\boldsymbol{\mu }}_{i}^{(k+1)\top }}{\mathbf{R}_{i}^{-1}}\right.\\ {} & \hspace{1em}\times \left.\big({\widehat{{u_{i}}{\mathbf{y}_{i}}}^{(k)}}-{\mathbf{Z}_{i}}{\widehat{{u_{i}}{\mathbf{b}_{i}}}^{(k)}}\big)\right.\\ {} & \hspace{1em}+\left.{\widehat{{u_{i}}}^{(k)}}{\widehat{\boldsymbol{\mu }}_{i}^{(k+1)\top }}{\mathbf{R}_{i}^{-1}}{\widehat{\boldsymbol{\mu }}_{i}^{(k+1)}}\right],\end{aligned}\]
(3.11)
\[\begin{aligned}{}{\widehat{\mathbf{D}}^{(k+1)}}& =\frac{1}{n}{\sum \limits_{i=1}^{n}}{\widehat{{u_{i}}{\mathbf{b}_{i}}{\mathbf{b}_{i}^{\top }}}^{(k)}},\end{aligned}\]
(3.12)
\[\begin{aligned}{}{\widehat{\phi }^{(k+1)}}& =\underset{\phi }{\arg \max }{\sum \limits_{i=1}^{n}}\bigg(-\frac{1}{2}\log (|{\mathbf{R}_{i}}|)\\ {} & \hspace{1em}-\frac{1}{2{\widehat{{\sigma ^{2}}}^{(k+1)}}}\Big[{\widehat{a}_{i}^{(k)}}-2{\widehat{\boldsymbol{\mu }}_{i}^{(k+1)\top }}{\mathbf{R}_{i}^{-1}}\\ {} & \hspace{1em}\times \big({\widehat{{u_{i}}{\mathbf{y}_{i}}}^{(k)}}-{\mathbf{Z}_{i}}{\widehat{{u_{i}}{\mathbf{b}_{i}}}^{(k)}}\big)\\ {} & \hspace{1em}+{\widehat{{u_{i}}}^{(k)}}{\widehat{\boldsymbol{\mu }}_{i}^{(k+1)\top }}{\mathbf{R}_{i}^{-1}}{\widehat{\boldsymbol{\mu }}_{i}^{(k+1)}}\Big]\bigg),\end{aligned}\]
(3.13)
\[\begin{aligned}{}{\widehat{\nu }^{(k+1)}}& =\underset{\nu }{\arg \max }\Bigg\{{\sum \limits_{i=1}^{n}}\log \\ {} & \hspace{1em}\hspace{2em}\hspace{2em}{L_{{n_{i}^{c}}}}\big({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}};{\boldsymbol{\mu }_{i}^{c{o^{(k+1)}}}},{\mathbf{S}_{i}^{c{o^{(k+1)}}}},\nu +{n_{i}^{o}}\big)\\ {} & \hspace{1em}+{\sum \limits_{i=1}^{n}}\log {t_{{n_{i}^{o}}}}\big({\mathbf{V}_{i}^{o}};{\boldsymbol{\mu }_{i}^{{o^{(k+1)}}}},{\boldsymbol{\Sigma }_{i}^{o{o^{(k+1)}}}},\nu \big)\Bigg\},\end{aligned}\]
where $N={\textstyle\sum _{i=1}^{n}}{n_{i}}$.
The algorithm is iterated until a suitable convergence rule is satisfied; here, iteration stops when the relative change between two successive evaluations of the log-likelihood, $|\ell ({\widehat{\boldsymbol{\theta }}^{(k+1)}})/\ell ({\widehat{\boldsymbol{\theta }}^{(k)}})-1|$, becomes small enough. We adopted $\epsilon ={10^{-6}}$.
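The stopping rule amounts to a one-line check inside the iteration; a schematic Python shell (the `loglik` and `update` functions are placeholders standing in for the E- and CM-steps):

```python
def ecm(loglik, update, theta0, tol=1e-6, max_iter=500):
    """Generic ECM shell: iterate until the relative change
    |loglik(theta_new) / loglik(theta_old) - 1| falls below tol."""
    theta, ll_old = theta0, loglik(theta0)
    for _ in range(max_iter):
        theta = update(theta)          # E-step + CM-steps combined
        ll_new = loglik(theta)
        if abs(ll_new / ll_old - 1.0) < tol:
            break
        ll_old = ll_new
    return theta

# Toy check: the surrogate "loglik" peaks at theta = 3 and each
# "update" halves the distance to the maximizer.
theta_hat = ecm(lambda t: -(t - 3.0) ** 2 - 1.0,
                lambda t: (t + 3.0) / 2.0, theta0=0.0)
```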
As proposed by [26], a set of starting values can be obtained by computing ${\widehat{\boldsymbol{\beta }}^{(0)}}$, ${\widehat{{\sigma ^{2}}}^{(0)}}$, ${\widehat{\mathbf{D}}^{(0)}}$ and ${\widehat{\boldsymbol{\phi }}^{(0)}}$ as the solution of the normal linear mixed-effects model, using the R package nlme.
It is important to stress that, from Eqs. (3.9)–(3.12), the E-step reduces to the computation of
\[\begin{aligned}{}\widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}& =\mathbb{E}\big[{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},\boldsymbol{\theta }\big],\hspace{0.1667em}\hspace{0.1667em}\hspace{1em}\\ {} \widehat{{u_{i}}{\mathbf{y}_{i}}}& =\mathbb{E}[{u_{i}}{\mathbf{y}_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},\boldsymbol{\theta }]\hspace{1em}\text{and}\\ {} \hspace{1em}\widehat{{u_{i}}}& =\mathbb{E}[{u_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},\boldsymbol{\theta }].\end{aligned}\]
These expected values can be determined using the first two marginal and conditional moments of the TMVT distribution under double truncation, presented in Propositions 3 and 4 (Appendix B). Thus:
  • 1. If the i-th subject has only non-censored components, then:
    \[ \widehat{{u_{i}}}=\bigg(\frac{\nu +{n_{i}}}{\nu +{\delta ^{2}}({\mathbf{y}_{i}})}\bigg),\hspace{1em}\widehat{{u_{i}}{\mathbf{y}_{i}}}=\widehat{{u_{i}}}{\mathbf{y}_{i}},\hspace{1em}\widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}\hspace{-0.1667em}=\hspace{-0.1667em}\widehat{{u_{i}}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }},\]
    where ${\delta ^{2}}({\mathbf{y}_{i}})={({\mathbf{y}_{i}}-{\boldsymbol{\mu }_{i}})^{\top }}{\boldsymbol{\Sigma }_{i}^{-1}}({\mathbf{y}_{i}}-{\boldsymbol{\mu }_{i}})$.
  • 2. If the i-th subject has only censored components, then we have:
    \[\begin{aligned}{}\widehat{{u_{i}}}& =\frac{{L_{{n_{i}}}}({\mathbf{V}_{1i}},{\mathbf{V}_{2i}};{\boldsymbol{\mu }_{i}},{\boldsymbol{\Sigma }_{i}^{\ast }},\nu +2)}{{L_{{n_{i}}}}({\mathbf{V}_{1i}},{\mathbf{V}_{2i}};{\boldsymbol{\mu }_{i}},{\boldsymbol{\Sigma }_{i}},\nu )},\\ {} \widehat{{u_{i}}{\mathbf{y}_{i}}}& =\widehat{{u_{i}}}\mathbb{E}({\mathbf{W}_{i}}),\\ {} \widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}& =\widehat{{u_{i}}}\mathbb{E}\big({\mathbf{W}_{i}}{\mathbf{W}_{i}^{\top }}\big),\end{aligned}\]
    where $\mathbf{W}\sim T{t_{{n_{i}}}}({\boldsymbol{\mu }_{i}},{\boldsymbol{\Sigma }_{i}^{\ast }},\nu +2;({\mathbf{V}_{1i}},{\mathbf{V}_{2i}}))$, ${\boldsymbol{\mu }_{i}}={\mathbf{X}_{i}}\boldsymbol{\beta }$, ${\boldsymbol{\Sigma }_{i}^{\ast }}=\frac{\nu }{\nu +2}{\boldsymbol{\Sigma }_{i}}$, ${\boldsymbol{\Sigma }_{i}}={\mathbf{Z}_{i}}\mathbf{D}{\mathbf{Z}_{i}^{\top }}+{\boldsymbol{\Omega }_{i}}$.
  • 3. If the i-th subject has both censored and uncensored components and, given that ${\mathbf{Y}_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}}$, ${\mathbf{Y}_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},{\mathbf{Y}_{i}^{o}}$ and ${\mathbf{Y}_{i}^{c}}|{\mathbf{V}_{i}}$, ${\mathbf{C}_{i}}$, ${\mathbf{Y}_{i}^{o}}$ are equivalent processes, we have:
    \[\begin{aligned}{}\widehat{{u_{i}}}& =\bigg(\frac{{n_{i}^{o}}+\nu }{\nu +{\delta ^{2}}({\mathbf{y}_{i}^{o}})}\bigg)\\ {} & \hspace{1em}\times \frac{{L_{{n_{i}^{c}}}}({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}};{\boldsymbol{\mu }_{i}^{co}},{\widetilde{\mathbf{S}}_{i}^{co}},\nu +{n_{i}^{o}}+2)}{{L_{{n_{i}^{c}}}}({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}};{\boldsymbol{\mu }_{i}^{co}},{\mathbf{S}_{i}^{co}},\nu +{n_{i}^{o}})},\\ {} \widehat{{u_{i}}{\mathbf{y}_{i}}}& =\mathrm{vec}\big(\widehat{{u_{i}}}{\mathbf{y}_{i}^{o}},\widehat{{u_{i}}}\mathbb{E}[{\mathbf{W}_{i}}]\big),\\ {} \widehat{{u_{i}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}& =\left(\begin{array}{c@{\hskip10.0pt}c}\widehat{{u_{i}}}{\mathbf{y}_{i}^{o}}{\mathbf{y}_{i}^{o\top }}& \widehat{{u_{i}}}{\mathbf{y}_{i}^{o}}{\mathbb{E}^{\top }}[{\mathbf{W}_{i}}]\\ {} \widehat{{u_{i}}}\mathbb{E}[{\mathbf{W}_{i}}]{\mathbf{y}_{i}^{{o^{\top }}}}& \widehat{{u_{i}}}\mathbb{E}[{\mathbf{W}_{i}}{\mathbf{W}_{i}^{\top }}]\end{array}\right),\end{aligned}\]
    where ${\mathbf{W}_{i}}\sim T{t_{{n_{i}^{c}}}}({\boldsymbol{\mu }_{i}^{co}},{\widetilde{\mathbf{S}}_{i}^{co}},\nu +{n_{i}^{o}}+2;({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}}))$, ${\widetilde{\mathbf{S}}_{i}^{co}}=(\frac{\nu +{\delta ^{2}}({\mathbf{y}_{i}^{o}})}{\nu +{n_{i}^{o}}+2}){\mathbf{S}_{i}}$ and ${\mathbf{S}_{i}}$, ${\mathbf{S}_{i}^{co}}$ and ${\boldsymbol{\mu }_{i}^{co}}$ are defined in Section 3.3.
Formulas for $\mathbb{E}[\mathbf{W}]$ and $\mathbb{E}[\mathbf{W}{\mathbf{W}^{\top }}]$, where $\mathbf{W}\sim T{t_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu ;\mathbb{A})$, have recently been implemented in the R package relliptical [31], using a slice sampling algorithm with Gibbs sampler steps.
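For a subject with no censored components (case 1 above), the E-step weights have the closed forms shown. A minimal numerical sketch, using Python/numpy with illustrative values of ${\boldsymbol{\mu }_{i}}$ and ${\boldsymbol{\Sigma }_{i}}$ (the paper's implementation is in R):

```python
import numpy as np

# Illustrative values (not from the paper): mu_i = X_i beta and
# Sigma_i = Z_i D Z_i^T + Omega_i for one subject with n_i = 3.
nu = 4
y = np.array([1.2, 0.7, 1.9])
mu = np.array([1.0, 1.0, 1.5])
Sigma = np.array([[1.0, 0.3, 0.1],
                  [0.3, 1.0, 0.3],
                  [0.1, 0.3, 1.0]])

d = y - mu
delta2 = d @ np.linalg.solve(Sigma, d)   # Mahalanobis distance delta^2(y_i)
u_hat = (nu + len(y)) / (nu + delta2)    # E[u_i | y_i, theta]
uy_hat = u_hat * y                       # E[u_i y_i | y_i, theta]
uyy_hat = u_hat * np.outer(y, y)         # E[u_i y_i y_i^T | y_i, theta]
```

Observations far from ${\boldsymbol{\mu }_{i}}$ yield a large ${\delta ^{2}}({\mathbf{y}_{i}})$ and hence a small weight $\widehat{{u_{i}}}$, which is the source of the model's robustness to heavy-tailed data.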

3.5 The Expected Information Matrix

As developed by [21] and [26], we used the technique of [20] to obtain an asymptotic approximation for the variances of the fixed effects in our t-LMEC model. This approximation is given by:
\[ {\mathbf{I}_{s}}(\boldsymbol{\beta }|\mathbf{y})={\mathbf{I}_{c}}(\boldsymbol{\beta }|\mathbf{y})-{\mathbf{I}_{m}}(\boldsymbol{\beta }|\mathbf{y}),\]
where ${\mathbf{I}_{s}}(\boldsymbol{\beta }|\mathbf{y})$ represents the information about $\boldsymbol{\beta }$ in the observed data y, ${\mathbf{I}_{c}}(\boldsymbol{\beta }|\mathbf{y})$ is the conditional expectation of the complete-data information and ${\mathbf{I}_{m}}(\boldsymbol{\beta }|\mathbf{y})$ is the missing information. Therefore, the approximated covariance-matrix of $\widehat{\boldsymbol{\beta }}$ is given by:
\[ \widehat{\mathrm{Cov}}(\widehat{\boldsymbol{\beta }})\approx {\mathbf{I}_{s}^{-1}}(\boldsymbol{\beta }){|_{\widehat{\boldsymbol{\theta }}}},\]
where
\[\begin{aligned}{}{\mathbf{I}_{s}}(\boldsymbol{\beta })& ={\sum \limits_{i=1}^{n}}\left\{\bigg(\frac{\nu +{n_{i}}}{\nu +{n_{i}}+2}\bigg){\mathbf{X}_{i}^{\top }}{\boldsymbol{\Sigma }_{i}^{-1}}{\mathbf{X}_{i}}\right.\\ {} & \hspace{1em}-\left.{\mathbf{X}_{i}^{\top }}{\boldsymbol{\Sigma }_{i}^{-1}}\bigg(\bigg(\frac{\nu +{n_{i}}+2}{\nu +{n_{i}}}\bigg){\mathrm{E}_{2}}-{\mathrm{E}_{1}}\bigg){\boldsymbol{\Sigma }_{i}^{-1}}{\mathbf{X}_{i}}\right\},\end{aligned}\]
where
\[\begin{aligned}{}{\mathrm{E}_{1}}& =(\widehat{{u_{i}}{\mathbf{y}_{i}}}-\widehat{{u_{i}}}{\boldsymbol{\mu }_{i}}){(\widehat{{u_{i}}{\mathbf{y}_{i}}}-\widehat{{u_{i}}}{\boldsymbol{\mu }_{i}})^{\top }}\hspace{1em}\text{and}\\ {} {\mathrm{E}_{2}}& =\big(\widehat{{u_{i}^{2}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}-\widehat{{u_{i}^{2}}{\mathbf{y}_{i}}}{\boldsymbol{\mu }_{i}^{\top }}-{\boldsymbol{\mu }_{i}}\widehat{{u_{i}^{2}}{\mathbf{y}_{i}^{\top }}}+\widehat{{u_{i}^{2}}}{\boldsymbol{\mu }_{i}}{\boldsymbol{\mu }_{i}^{\top }}\big).\end{aligned}\]
Note that ${\mathrm{E}_{1}}$ depends on the computation of $\widehat{{u_{i}}}$ and $\widehat{{u_{i}}{\mathbf{y}_{i}}}$, which can be obtained as in Section 3.4, while ${\mathrm{E}_{2}}$ depends on the following quantities:
\[\begin{aligned}{}\widehat{{u_{i}^{2}}}& =\mathbb{E}\bigg[{\bigg(\frac{\nu +{n_{i}}}{\nu +{\delta ^{2}}({\mathbf{y}_{i}})}\bigg)^{2}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},\boldsymbol{\theta }\bigg],\\ {} \widehat{{u_{i}^{2}}{\mathbf{y}_{i}}}& =\mathbb{E}\bigg[{\bigg(\frac{\nu +{n_{i}}}{\nu +{\delta ^{2}}({\mathbf{y}_{i}})}\bigg)^{2}}{\mathbf{y}_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},\boldsymbol{\theta }\bigg]\hspace{1em}\text{and}\\ {} \widehat{{u_{i}^{2}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}& =\mathbb{E}\bigg[{\bigg(\frac{\nu +{n_{i}}}{\nu +{\delta ^{2}}({\mathbf{y}_{i}})}\bigg)^{2}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},\boldsymbol{\theta }\bigg].\end{aligned}\]
These expected values can be determined in closed form using Propositions 3 and 4, as follows:
  • 1. If the i-th subject has only non-censored components, then,
    \[ \widehat{{u_{i}^{2}}}\hspace{-0.1667em}=\hspace{-0.1667em}{\bigg(\frac{\nu +{n_{i}}}{\nu +{\delta ^{2}}({\mathbf{y}_{i}})}\bigg)^{2}},\hspace{1em}\widehat{{u_{i}^{2}}{\mathbf{y}_{i}}}\hspace{-0.1667em}=\hspace{-0.1667em}\widehat{{u_{i}^{2}}}{\mathbf{y}_{i}},\hspace{1em}\widehat{{u_{i}^{2}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}\hspace{-0.1667em}=\hspace{-0.1667em}\widehat{{u_{i}^{2}}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }},\]
    where ${\delta ^{2}}({\mathbf{y}_{i}})={({\mathbf{y}_{i}}-{\boldsymbol{\mu }_{i}})^{\top }}{\boldsymbol{\Sigma }_{i}^{-1}}({\mathbf{y}_{i}}-{\boldsymbol{\mu }_{i}})$.
  • 2. If the i-th subject has only censored components, then
    \[\begin{aligned}{}\widehat{{u_{i}^{2}}}& ={c_{p}}(\nu ,2)\frac{{L_{{n_{i}}}}({\mathbf{V}_{1i}},{\mathbf{V}_{2i}};{\boldsymbol{\mu }_{i}},{\boldsymbol{\Sigma }_{i}^{\ast }},\nu +4)}{{L_{{n_{i}}}}({\mathbf{V}_{1i}},{\mathbf{V}_{2i}};{\boldsymbol{\mu }_{i}},{\boldsymbol{\Sigma }_{i}},\nu )},\\ {} \widehat{{u_{i}^{2}}{\mathbf{y}_{i}}}& =\widehat{{u_{i}^{2}}}\mathbb{E}[{\mathbf{W}_{i}}],\\ {} \widehat{{u_{i}^{2}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}& =\widehat{{u_{i}^{2}}}\mathbb{E}\big[{\mathbf{W}_{i}}{\mathbf{W}_{i}^{\top }}\big],\end{aligned}\]
    where ${c_{p}}(\nu ,2)=\frac{({n_{i}}+\nu )(\nu +2)}{\nu ({n_{i}}+\nu +2)}$, ${\mathbf{W}_{i}}\sim T{t_{{n_{i}}}}({\boldsymbol{\mu }_{i}},{\boldsymbol{\Sigma }_{i}^{\ast }},\nu +4$; $({\mathbf{V}_{1i}},{\mathbf{V}_{2i}}))$, ${\boldsymbol{\Sigma }_{i}^{\ast }}=\frac{\nu }{\nu +4}{\boldsymbol{\Sigma }_{i}}$, ${\boldsymbol{\mu }_{i}}={\mathbf{X}_{i}}\boldsymbol{\beta }$, ${\boldsymbol{\Sigma }_{i}}={\mathbf{Z}_{i}}\mathbf{D}{\mathbf{Z}_{i}^{\top }}+{\boldsymbol{\Omega }_{i}}$.
  • 3. If the i-th subject has both censored and uncensored components and, given that $\{{\mathbf{Y}_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}}\}$, $\{{\mathbf{Y}_{i}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},{\mathbf{Y}_{i}^{o}}\}$ and $\{{\mathbf{Y}_{i}^{c}}|{\mathbf{V}_{i}},{\mathbf{C}_{i}},{\mathbf{Y}_{i}^{o}}\}$ are equivalent processes, we have:
    \[\begin{aligned}{}\widehat{{u_{i}^{2}}}& =\frac{{d_{p}}({n_{i}^{o}},\nu ,2)}{{(\nu +{\delta ^{2}}({\mathbf{y}_{i}^{o}}))^{2}}}\\ {} & \hspace{1em}\times \frac{{L_{{n_{i}^{c}}}}({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}};{\boldsymbol{\mu }_{i}^{co}},{\widetilde{\mathbf{S}}_{i}^{co}},\nu +{n_{i}^{o}}+4)}{{L_{{n_{i}^{c}}}}({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}};{\boldsymbol{\mu }_{i}^{co}},{\mathbf{S}_{i}^{co}},\nu +{n_{i}^{o}})},\\ {} \widehat{{u_{i}^{2}}{\mathbf{y}_{i}}}& =\mathrm{vec}\big(\widehat{{u_{i}^{2}}}{\mathbf{y}_{i}^{o}},\widehat{{u_{i}^{2}}}\mathbb{E}[{\mathbf{W}_{i}}]\big),\\ {} \widehat{{u_{i}^{2}}{\mathbf{y}_{i}}{\mathbf{y}_{i}^{\top }}}& =\left(\begin{array}{c@{\hskip10.0pt}c}\widehat{{u_{i}^{2}}}{\mathbf{y}_{i}^{o}}{\mathbf{y}_{i}^{o\top }}& \widehat{{u_{i}^{2}}}{\mathbf{y}_{i}^{o}}{\mathbb{E}^{\top }}[{\mathbf{W}_{i}}]\\ {} \widehat{{u_{i}^{2}}}\mathbb{E}[{\mathbf{W}_{i}}]{\mathbf{y}_{i}^{{o^{\top }}}}& \widehat{{u_{i}^{2}}}\mathbb{E}[{\mathbf{W}_{i}}{\mathbf{W}_{i}^{\top }}]\end{array}\right),\end{aligned}\]
    where ${d_{p}}({n_{i}^{o}},\nu ,2)=\frac{(\nu +{n_{i}})({n_{i}^{o}}+\nu +2)({n_{i}^{o}}+\nu )}{{n_{i}}+\nu +2}$, ${\mathbf{W}_{i}}\sim T{t_{{n_{i}^{c}}}}({\boldsymbol{\mu }_{i}^{co}},{\widetilde{\mathbf{S}}_{i}^{co}},\nu +{n_{i}^{o}}+4;({\mathbf{V}_{1i}^{c}},{\mathbf{V}_{2i}^{c}}))$, ${\widetilde{\mathbf{S}}_{i}^{co}}=(\frac{\nu +{\delta ^{2}}({\mathbf{y}_{i}^{o}})}{\nu +{n_{i}^{o}}+4}){\mathbf{S}_{i}}$ and ${\mathbf{S}_{i}}$, ${\mathbf{S}_{i}^{co}}$ and ${\boldsymbol{\mu }_{i}^{co}}$ are presented in Section 3.3.
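For a fully observed subject, $\widehat{{u_{i}}{\mathbf{y}_{i}}}-\widehat{{u_{i}}}{\boldsymbol{\mu }_{i}}=\widehat{{u_{i}}}({\mathbf{y}_{i}}-{\boldsymbol{\mu }_{i}})$ and $\widehat{{u_{i}^{2}}}={\widehat{{u_{i}}}^{2}}$, so ${\mathrm{E}_{1}}$ and ${\mathrm{E}_{2}}$ coincide. The per-subject contribution to ${\mathbf{I}_{s}}(\boldsymbol{\beta })$ in this case can be sketched as follows (Python/numpy, illustrative values only; not the package code):

```python
import numpy as np

# Illustrative values for one fully observed subject (n_i = 3, two fixed effects).
nu, n_i = 4, 3
X = np.array([[1.0, 0.2],
              [1.0, 0.5],
              [1.0, 0.9]])
y = np.array([1.2, 0.7, 1.9])
beta = np.array([0.8, 0.4])
Sigma = np.array([[1.0, 0.3, 0.1],
                  [0.3, 1.0, 0.3],
                  [0.1, 0.3, 1.0]])

Si = np.linalg.inv(Sigma)
d = y - X @ beta
delta2 = d @ Si @ d
u = (nu + n_i) / (nu + delta2)      # u-hat in the uncensored case
E1 = u ** 2 * np.outer(d, d)        # (u y - u mu)(u y - u mu)^T = u^2 d d^T
E2 = E1.copy()                      # without censoring, E2 reduces to E1

c = (nu + n_i) / (nu + n_i + 2)     # note (nu + n_i + 2)/(nu + n_i) = 1/c
Is_i = (c * X.T @ Si @ X
        - X.T @ Si @ ((1.0 / c) * E2 - E1) @ Si @ X)
# Summing Is_i over subjects and inverting the total gives the approximate
# covariance of beta-hat; the SEs are the square roots of its diagonal.
```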

4 Simulation Studies

In this Section, we conduct two simulation studies to analyze the performance of our proposed methods. The first simulation study shows the asymptotic behavior of the ML estimators developed in Section 3.4, and the second examines the consistency of the approximate standard errors (SE) presented in Section 3.5. We consider the AR(p)-LMEC model, defined in Eqs. (3.1)–(3.3), with left-censored cases and ${n_{i}}=7$ measurements for each subject. As recommended by [26], for both simulation schemes, the parameter values used to generate the samples were $\beta ={(1,2,1)^{\top }}$ and ${\sigma ^{2}}=0.5$. For the dispersion matrices of the random errors and random effects we assume an autoregressive dependence of order $p=2$ with $\phi ={(0.6,-0.2)^{\top }}$ and $D=\left(\begin{array}{c@{\hskip10.0pt}c}0.490& 0.001\\ {} 0.001& 0.002\end{array}\right)$, respectively. The design matrices for each individual are given by ${\mathbf{X}_{i}}=({{\mathbf{X}_{i}}_{1}},{{\mathbf{X}_{i}}_{2}},{{\mathbf{X}_{i}}_{3}})$ and ${\mathbf{Z}_{i}}=({\mathbf{1}_{7}},{{\mathbf{Z}_{i}}_{2}})$, with ${\mathbf{X}_{ik}}={({X_{ik1}},\dots ,{X_{ik7}})^{\top }}$ and ${\mathbf{Z}_{i2}}={({Z_{i21}},\dots ,{Z_{i27}})^{\top }}$, where ${\mathbf{1}_{m}}$ represents a $1\times m$ vector of ones. Each element ${X_{ikj}}$ and ${Z_{i2j}}$ is generated independently from a uniform distribution, with ${X_{ikj}}\sim U(0,1)$ and ${Z_{i2j}}\sim U(1,3)$, $i=1,\dots ,n$, $k=2,3$ and $j=1,\dots ,7$. These sets of values are fixed across all replications. It is important to stress that, for both simulation studies, we consider the parameter ν fixed at 4. Thus, the parameter vector is $\boldsymbol{\theta }={({\boldsymbol{\beta }^{\top }},{\sigma ^{2}},{\boldsymbol{\alpha }^{\top }},{\boldsymbol{\phi }^{\top }})^{\top }}$.
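The data-generating mechanism above can be sketched via the scale-mixture representation of the multivariate-t model: draw ${u_{i}}\sim \mathrm{Gamma}(\nu /2,\nu /2)$ and then generate ${\mathbf{b}_{i}}|{u_{i}}$ and ${\boldsymbol{\epsilon }_{i}}|{u_{i}}$ as scaled normals, with the AR(2) correlation matrix built from the autocorrelation recursion. The quantile-based left-censoring rule and the seed are our illustrative choices (Python/numpy sketch; the paper's implementation is in R):

```python
import numpy as np

rng = np.random.default_rng(2024)
nu, sigma2 = 4, 0.5
beta = np.array([1.0, 2.0, 1.0])
phi = np.array([0.6, -0.2])
D = np.array([[0.490, 0.001],
              [0.001, 0.002]])
n, ni = 50, 7

def ar2_corr(phi, m):
    """Autocorrelation (Toeplitz) matrix of a stationary AR(2) process."""
    rho = np.empty(m)
    rho[0] = 1.0
    rho[1] = phi[0] / (1.0 - phi[1])
    for k in range(2, m):
        rho[k] = phi[0] * rho[k - 1] + phi[1] * rho[k - 2]
    return np.array([[rho[abs(j - k)] for k in range(m)] for j in range(m)])

R_i = ar2_corr(phi, ni)
Y = np.empty((n, ni))
for i in range(n):
    X = rng.uniform(0, 1, (ni, 3))                             # X_ikj ~ U(0, 1)
    Z = np.column_stack([np.ones(ni), rng.uniform(1, 3, ni)])  # Z_i2j ~ U(1, 3)
    u = rng.gamma(nu / 2.0, 2.0 / nu)                   # u_i ~ Gamma(nu/2, nu/2)
    b = rng.multivariate_normal(np.zeros(2), D / u)     # b_i | u_i
    e = rng.multivariate_normal(np.zeros(ni), sigma2 * R_i / u)  # AR(2) errors
    Y[i] = X @ beta + Z @ b + e

# Left-censoring at level l: replace values below the empirical l-quantile by
# the detection limit (our mechanism for fixing the censoring proportion).
l = 0.20
cutoff = np.quantile(Y, l)
C = Y < cutoff                  # censoring indicators
V = np.where(C, cutoff, Y)      # observed data
```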

4.1 Simulation Study 1

We generated $R=100$ datasets from the AR(2)-t-LMEC model, considering each combination of sample size $n\in \{50,100,200,400,600\}$ and censoring level $l\% \in \{0\% ,5\% ,20\% ,40\% \}$. For each combination, the AR(2)-t-LMEC and AR(2)-N-LMEC models were fitted. For both models, we computed the bias ($Bias$) and mean squared error ($MSE$) of the estimates obtained. These measures are defined by:
\[\begin{aligned}{}Bias(\widehat{{\theta _{i}}})& =\frac{1}{R}{\sum \limits_{j=1}^{R}}\big({\widehat{{\theta _{i}}}^{(j)}}-{\theta _{i}}\big)\hspace{2.5pt}\hspace{2.5pt}\hspace{2.5pt}\text{and}\hspace{2.5pt}\hspace{2.5pt}\hspace{2.5pt}\\ {} MSE(\widehat{{\theta _{i}}})& =\frac{1}{R}{\sum \limits_{j=1}^{R}}{\big({\widehat{{\theta _{i}}}^{(j)}}-{\theta _{i}}\big)^{2}},\end{aligned}\]
where ${\widehat{{\theta _{i}}}^{(j)}}$ denotes the ML estimate of ${\theta _{i}}$, for the j-th replication, $j=1,\dots ,R$.
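The two Monte Carlo summaries above can be computed directly; a small Python helper (the function name is ours, and the example values are illustrative):

```python
import numpy as np

def bias_mse(estimates, theta):
    """Monte Carlo bias and MSE of a parameter over R replications.

    estimates: array of theta-hat^(j), j = 1..R; theta: true value.
    """
    estimates = np.asarray(estimates, dtype=float)
    bias = np.mean(estimates - theta)
    mse = np.mean((estimates - theta) ** 2)
    return bias, mse

# Illustrative check: estimates symmetric around the truth have zero bias
# but a positive MSE.
b, m = bias_mse([0.9, 1.1, 1.0], theta=1.0)
```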
Figure 1
Simulation study 1. Average Bias of parameter estimates under the AR(2)-t-LMEC model.
Figure 2
Simulation study 1. Average MSE of parameter estimates under the AR(2)-t-LMEC model.
Figure 3
Simulation study 1. Average Bias of parameter estimates under the AR(2)-N-LMEC model.
Figure 4
Simulation study 1. Average MSE of parameter estimates under the AR(2)-N-LMEC model.
From Figures 1–2, we observe, as a general rule, that when the distributional assumption is correct, the $Bias$ and $MSE$ tend to zero as the sample size increases, indicating that the ML estimates based on the proposed EM-type algorithm have good asymptotic properties under the t-LMEC model. However, from Figures 3–4, we note that when the distributional assumption is incorrect, the ML estimates do not exhibit good asymptotic properties; for example, the $Bias$ of ${\widehat{\phi }_{1}}$ under the AR(2)-N-LMEC model departs from zero as the sample size increases.

4.2 Simulation Study 2

In this simulation study, we consider sample sizes $n\in \{40,100\}$ and generate $R=100$ Monte Carlo samples from the AR(2)-N-LMEC and AR(2)-t-LMEC models, with censoring levels $l\% \in \{0\% ,5\% ,15\% ,25\% ,40\% \}$. We analyze the empirical standard errors of the $\boldsymbol{\beta }=({\beta _{1}},{\beta _{2}},{\beta _{3}})$ estimates (MC-SE), the average values of the standard errors computed using the empirical information matrix (MC-IM-SE) described in Section 3.5, and the percentage of times, over the R samples, that the $95\% $ confidence interval contained the true parameter values (MC-COV), assuming asymptotic normality. These measures are defined by:
\[ \text{MC-IM-SE}(\widehat{{\boldsymbol{\beta }_{i}}})=\frac{1}{R}{\sum \limits_{j=1}^{R}}{\widehat{SE({\boldsymbol{\beta }_{i}})}^{(j)}}\]
and
\[ \text{MC-SE}(\widehat{{\boldsymbol{\beta }_{i}}})=\sqrt{\frac{1}{R-1}\Bigg({\sum \limits_{j=1}^{R}}{\big({\widehat{{\boldsymbol{\beta }_{i}}}^{(j)}}\big)^{2}}-R{\Bigg(\frac{1}{R}{\sum \limits_{j=1}^{R}}{\widehat{{\boldsymbol{\beta }_{i}}}^{(j)}}\Bigg)^{2}}\Bigg)},\]
where, for $j=1,\dots ,R$, ${\widehat{{\boldsymbol{\beta }_{i}}}^{(j)}}$ and ${\widehat{SE({\boldsymbol{\beta }_{i}})}^{(j)}}$ represent the ML estimate of parameter ${\boldsymbol{\beta }_{i}}$ and the SE estimate of ${\widehat{{\boldsymbol{\beta }_{i}}}^{(j)}}$, respectively.
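A sketch of how the three summaries are computed from the R replications, assuming normal-theory 95% intervals $\widehat{\beta }\pm 1.96\,\widehat{SE}$ (Python helper; the function name and values are illustrative):

```python
import numpy as np

def mc_summaries(beta_hats, se_hats, beta_true):
    """MC-SE, MC-IM-SE and MC-COV over R Monte Carlo replications.

    beta_hats, se_hats: length-R arrays of estimates and their IM-based SEs.
    """
    beta_hats = np.asarray(beta_hats, dtype=float)
    se_hats = np.asarray(se_hats, dtype=float)
    mc_se = np.std(beta_hats, ddof=1)     # empirical SD of the estimates
    mc_im_se = np.mean(se_hats)           # average IM-based SE
    lo = beta_hats - 1.96 * se_hats       # normal-theory 95% intervals
    hi = beta_hats + 1.96 * se_hats
    mc_cov = np.mean((lo <= beta_true) & (beta_true <= hi))
    return mc_se, mc_im_se, mc_cov

# Three toy replications with a common IM-based SE of 0.2:
mc_se, mc_im_se, mc_cov = mc_summaries([1.0, 1.2, 0.8], [0.2, 0.2, 0.2], 1.0)
```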
Table 1
Standard errors of parameter estimates (MC-SE), average values of the standard errors (MC-IM-SE) and MC-COV, under the AR(2)-N-LMEC and AR(2)-t-LMEC models with $n=40$.
Censoring AR(2)-N-LMEC
Levels Criteria ${\beta _{1}}$ ${\beta _{2}}$ ${\beta _{3}}$
0% MC-IM-SE 0.2106 0.2298 0.2160
MC-SE 0.1648 0.1791 0.1604
MC-COV 0.98 0.97 0.98
5% MC-IM-SE 0.2220 0.2151 0.2066
MC-SE 0.1840 0.1601 0.1770
MC-COV 0.97 0.96 0.96
15% MC-IM-SE 0.2136 0.2155 0.2100
MC-SE 0.1538 0.1661 0.1611
MC-COV 1 0.96 0.97
25% MC-IM-SE 0.2177 0.2161 0.2076
MC-SE 0.2009 0.1708 0.1840
MC-COV 0.96 0.98 0.93
40% MC-IM-SE 0.2569 0.2559 0.2551
MC-SE 0.2050 0.2440 0.2124
MC-COV 0.96 0.95 0.96
Censoring AR(2)-t-LMEC
Levels Criteria ${\beta _{1}}$ ${\beta _{2}}$ ${\beta _{3}}$
0% MC-IM-SE 0.1469 0.1600 0.1467
MC-SE 0.1366 0.1450 0.1338
MC-COV 0.95 0.98 0.96
5% MC-IM-SE 0.1654 0.1575 0.1567
MC-SE 0.1543 0.1418 0.1492
MC-COV 0.94 0.95 0.97
15% MC-IM-SE 0.1834 0.1811 0.1743
MC-SE 0.1443 0.1471 0.1315
MC-COV 0.98 0.97 0.98
25% MC-IM-SE 0.2053 0.1973 0.2004
MC-SE 0.1636 0.1451 0.1439
MC-COV 0.98 0.97 0.96
40% MC-IM-SE 0.2525 0.2545 0.2503
MC-SE 0.1690 0.1774 0.1781
MC-COV 0.98 0.94 0.93
Table 2
Standard errors of parameter estimates (MC-SE), average values of the standard errors (MC-IM-SE) and MC-COV, under the AR(2)-N-LMEC and AR(2)-t-LMEC models with $n=100$.
Censoring AR(2)-N-LMEC
Levels Criteria ${\beta _{1}}$ ${\beta _{2}}$ ${\beta _{3}}$
0% MC-IM-SE 0.1267 0.1296 0.1231
MC-SE 0.1159 0.1240 0.1242
MC-COV 0.94 0.96 0.95
5% MC-IM-SE 0.1144 0.1151 0.1241
MC-SE 0.1040 0.1063 0.1201
MC-COV 0.95 0.97 0.95
15% MC-IM-SE 0.1149 0.1240 0.1208
MC-SE 0.1003 0.1052 0.1062
MC-COV 0.97 0.97 0.99
25% MC-IM-SE 0.1259 0.1257 0.1245
MC-SE 0.1103 0.1315 0.1151
MC-COV 0.99 0.95 0.96
40% MC-IM-SE 0.1446 0.1412 0.1371
MC-SE 0.1144 0.1260 0.1271
MC-COV 0.97 0.98 0.96
Censoring AR(2)-t-LMEC
Levels Criteria ${\beta _{1}}$ ${\beta _{2}}$ ${\beta _{3}}$
0% MC-IM-SE 0.0915 0.0960 0.0913
MC-SE 0.0938 0.0923 0.0882
MC-COV 0.97 0.95 0.96
5% MC-IM-SE 0.1012 0.1006 0.1041
MC-SE 0.0938 0.0962 0.0939
MC-COV 0.96 0.97 0.97
15% MC-IM-SE 0.1096 0.1102 0.1145
MC-SE 0.1003 0.0978 0.0941
MC-COV 0.97 0.97 0.97
25% MC-IM-SE 0.1257 0.1272 0.1249
MC-SE 0.0932 0.1133 0.1021
MC-COV 0.99 0.96 0.96
40% MC-IM-SE 0.1495 0.1478 0.1554
MC-SE 0.0907 0.0968 0.1085
MC-COV 0.98 0.99 0.98
From Tables 1 and 2, we notice that the MC-IM-SE and MC-SE values for the $\boldsymbol{\beta }$ parameter vector from the t-LMEC model are smaller than the corresponding values from the N-LMEC model. Moreover, the MC-COV values under both models are similar.

5 Application

In this Section, we study the viral load data of patients infected with human immunodeficiency virus type 1 (HIV-1) from the clinical trial “A5055”, previously analyzed by [32]. This clinical trial consists of 44 HIV-1 infected patients who were treated with one of two potent antiretroviral therapies: (i) IDV 800 mg and RTV 200 mg, twice daily every 12 hours, and (ii) IDV 400 mg and RTV 400 mg, twice daily every 12 hours (see [33] for more details). The number of RNA copies in the blood plasma (viral load) and the immunological marker CD4 (cluster of differentiation 4) T-cell count were measured at different times during treatment. It is important to note that the viral loads were not necessarily measured at equally spaced intervals of days.
Following [33], we investigate the longitudinal evolution of RNA viral load (in log-base-10 scale) of the 44 patients, during antiretroviral treatment. We focus on analyzing only 42 HIV-1 infected patients; two patients were excluded from the analysis because both of them dropped out very early ($\mathrm{\# }\hspace{2.5pt}4$ and $\mathrm{\# }\hspace{2.5pt}8$ with 3 and 1 measurements, respectively).
The trajectories of the transformed RNA viral load, along with the detection limit (dotted red lines), are shown in Figure 5. The lower detection limit for RNA viral load is $50\hspace{2.5pt}\text{copies}/\text{ml}$; thus, 106 out of 312 viral load measurements below the detection limit are considered censored (approximately 34%). We start by presenting the results obtained by fitting the LMEC model defined in (3.1) and (3.3), as follows:
\[ {\mathbf{Y}_{i}}={\mathbf{X}_{i}}\boldsymbol{\beta }+{\mathbf{Z}_{i}}{\mathbf{b}_{i}}+{\boldsymbol{\epsilon }_{i}},\hspace{0.1667em}\hspace{0.1667em}\]
with
  • ${\mathbf{Y}_{i}}={({Y_{i1}},\dots ,{Y_{i{n_{i}}}})^{\top }}$,
  • ${\mathbf{X}_{i}}=({\mathbf{1}_{{n_{i}}}^{\top }},{\mathbf{t}_{i}^{\top }},{\mathrm{Treat}_{i}}\times {\mathbf{1}_{{n_{i}}}^{\top }},{\mathrm{CD}\mathrm{4}^{1/2}}\times {\mathbf{1}_{{n_{i}}}^{\top }},{\mathrm{Treat}_{i}}\times {\mathbf{t}_{i}^{\top }})$,
  • ${\mathbf{Z}_{i}}=({\mathbf{1}_{{n_{i}}}^{\top }},{\mathbf{t}_{i}^{\top }})$, $\boldsymbol{\beta }={({\beta _{0}},\dots ,{\beta _{4}})^{\top }}$, ${\mathbf{b}_{i}}={({b_{i0}},{b_{i1}})^{\top }}$,
  • ${\boldsymbol{\epsilon }_{i}}={({\epsilon _{i1}},\dots ,{\epsilon _{i{n_{i}}}})^{\top }}$, for $j=1,\dots ,{n_{i}}$ and $i=1,\dots ,42$,
where
  • • ${\mathbf{1}_{m}}$ represents a $1\times m$ vector of ones and ${\mathbf{t}_{i}}=({t_{i1}},\dots ,{t_{i{n_{i}}}})$.
  • • ${Y_{ij}}$ is the log10-transformation of the viral load of subject i at time ${t_{ij}}$ (days).
  • • ${\mathrm{Treat}_{i}}$ is a treatment indicator (i.e., 0 for treatment 1 and 1 for treatment 2).
  • • ${\mathrm{CD}\mathrm{4}_{ij}^{1/2}}$ represents the square root of CD4 counts of patient i, measured at time ${t_{ij}}$.
  • • The random intercept and random slope are represented by ${b_{i0}}$ and ${b_{i1}}$, respectively.
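The design matrices above can be assembled per patient as follows; the measurement days and CD4 counts in the example are hypothetical (Python sketch, whereas the analysis itself uses R):

```python
import numpy as np

def design_matrices(t_i, treat_i, cd4_i):
    """Build X_i and Z_i for one patient in the A5055 model (illustrative).

    t_i: measurement days; treat_i: 0/1 treatment indicator;
    cd4_i: CD4 counts at each measurement time (hypothetical values below).
    """
    t_i = np.asarray(t_i, dtype=float)
    ones = np.ones_like(t_i)
    X = np.column_stack([ones, t_i, treat_i * ones,
                         np.sqrt(cd4_i), treat_i * t_i])
    Z = np.column_stack([ones, t_i])   # random intercept and slope
    return X, Z

# One hypothetical patient with four unequally spaced visits:
X, Z = design_matrices([0, 7, 14, 28], treat_i=1, cd4_i=[169, 225, 196, 256])
```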
For comparison purposes, we fitted the N-LMEC model and its t-LMEC counterpart with the error dependence structures discussed in Section 3.2. Thus,
  • (1) Assuming the N-LMEC model.
    We made a preliminary analysis assuming unconditional independence (UNC) between the random effects and random errors, i.e., the model proposed by [30]. The scatter plot of the estimated random intercepts (${b_{0}}$ effect) and random slopes (${b_{1}}$ effect), presented in Figure 6(a), suggests a possible linear relationship between them, pointing to a dependence structure for the random effects. From Figure 6(b), we observe that the behavior of the marginal residuals is apparently not totally random, highlighting a trend. Finally, in Figure 6(c)–(e), we show the Q-Q plots for the ${b_{0}}$ and ${b_{1}}$ effects and the standard marginal residuals. These figures clearly indicate that the normality assumption for the random effects and within-subject errors may be unrealistic. Moreover, the variogram of the standard marginal residuals, presented in Figure 7, indicates a long-term autocorrelation, suggesting a possible intra-individual correlation over time.
    Table 3 presents the results for the “A5055” dataset under the N-LMEC model with different correlation structures. We observed that only the estimate of the ${\sigma ^{2}}$ parameter is affected by the choice of the correlation structure.
  • (2) Assuming the t-LMEC model.
    In order to address the possible serial correlation among within-subject errors, we fit several models: i) the unconditionally independent (UNC) case, ii) the damped exponential correlation (DEC) structure, and iii) the continuous-time autoregressive process (AR) of order $p\in \{1,2,3\}$. It is important to stress that the DEC structure contains different correlation structures as special cases, for instance, the symmetric (SYM) and DEC-AR(1) structures.
    From Table 4, we observe that there are no significant differences in the estimates of the fixed- and random-effects parameters under the different correlation structures. On the other hand, the estimates of the ν and ${\sigma ^{2}}$ parameters are affected by the choice of the correlation structure.
    Note from Figure 8(a)–(b) that the inclusion of a correlation structure improves the model fit. The Q-Q plots for the ${b_{0}}$ and ${b_{1}}$ effects and the standard marginal residuals under the t-LMEC model with DEC-AR(1) correlation structure, presented in Figure 8(c)–(e), show a significant improvement with respect to the fit obtained under the N-LMEC model. In Figure 9(a)–(b), we present the simulated envelope plots of the marginal residuals from fitting the DEC(AR)-N-LMEC and DEC(AR)-t-LMEC models. Some marginal residuals under the DEC(AR)-N-LMEC model fall outside the boundary of the envelope, indicating the inadequacy of this model, whereas the marginal residuals corresponding to the DEC(AR)-t-LMEC model are within the boundary of the envelope. This shows that even with the DEC(AR) correlation structure, a correct specification of the distributional assumption is essential. Finally, as suggested by [19], we performed the likelihood ratio test to examine the appropriateness of the DEC(AR) dependency structure versus UNC errors under the t-LMEC model, obtaining a p-value of 0.0058. As expected, this result indicates a strong dependence among within-patient errors across occasions under the t-LMEC model.
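The mechanics of such a likelihood ratio test can be sketched from the log-likelihoods reported in Table 5 for the t-LMEC model (UNC vs. DEC(AR)). Note that the degrees of freedom, taken here as 1, are an assumption of this sketch, and the reference distribution used in the paper may differ, so the resulting p-value is illustrative only:

```python
import math

# Log-likelihoods taken from Table 5 (t-LMEC model): UNC vs. DEC(AR) errors.
loglik_unc = -353.840
loglik_dec = -349.323
lr = 2.0 * (loglik_dec - loglik_unc)    # likelihood ratio statistic

# Chi-square(1) survival function via the complementary error function:
# P(chi2_1 > lr) = erfc(sqrt(lr / 2)); df = 1 is our assumption here.
p_value = math.erfc(math.sqrt(lr / 2.0))
```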
On the other hand, Table 5 shows the log-likelihood, AIC and BIC criteria for the t-LMEC and N-LMEC models under the different autocorrelation structures. We observe that the t-LMEC model consistently outperforms its normal counterpart in all cases. In particular, these criteria indicate a preference for the DEC(AR) correlation structure, i.e., the estimated correlation structure of the error is given by:
\[ {\widehat{\boldsymbol{\Omega }}_{i}}=\widehat{{\sigma ^{2}}}{\widehat{\mathbf{R}}_{i}}=\widehat{{\sigma ^{2}}}\big[{\widehat{\phi }_{1}^{|{t_{ij}}-{t_{ik}}|}}\big]=0.51\big[{0.899^{|{t_{ij}}-{t_{ik}}|}}\big].\]
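This estimated structure is easy to evaluate for a patient's unequally spaced measurement times; a small sketch with hypothetical days (Python/numpy, illustrative only):

```python
import numpy as np

def omega_hat(times, sigma2=0.51, phi1=0.899):
    """Estimated error covariance Omega_i = sigma^2 [phi1^{|t_ij - t_ik|}]."""
    t = np.asarray(times, dtype=float)
    lags = np.abs(t[:, None] - t[None, :])   # pairwise time lags |t_ij - t_ik|
    return sigma2 * phi1 ** lags

# Hypothetical measurement days for one patient:
Om = omega_hat([0, 7, 14, 28])
```

Because the exponent is the actual time lag, the correlation decays with the elapsed time between visits rather than with the visit index, which is what makes this structure suitable for the unequally spaced A5055 measurements.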
Thus, from the previous analysis, we conclude that the t-LMEC model with DEC(AR) correlation structure for the errors is the most appropriate model for this dataset.
Figure 5
A5055 Data. Trajectories of ${\log _{10}}(RNA)$ for 42 HIV-1 infected patients who were randomized to one of two treatment regimens.
Figure 6
A5055 Data. Scatter diagrams of the random effect and marginal errors and Q-Q plots of estimates of random intercepts and slopes and marginal residuals for log10(RNA) using the N-LMEC model with conditional independence in the random effects and the random errors.
Figure 7
A5055 Data. Variogram from the marginal residuals for log10(RNA) using the N-LMEC model with conditional independence in the random effects and the random errors.
Table 3
A5055 Data. Summary of parameter estimates and the standard errors of fixed effects (in parentheses) under the N-LMEC model, with different correlation structures for the errors.
Correlation structures
Parameter estimates Damped exponential AR(p)
UNC DEC DEC-AR(1) SYM AR(1) AR(2) AR(3)
${\beta _{0}}$ 4.383 4.379 4.378 4.388 4.384 4.382 4.287
(0.533) (0.002) (0.002) (0.010) (0.540) (0.553) (0.587)
${\beta _{1}}$ −0.004 −0.004 −0.004 −0.004 −0.004 −0.004 −0.003
(0.009) (0.008) (0.008) (0.008) (0.009) (0.009) (0.009)
${\beta _{2}}$ 0.305 0.304 0.303 0.304 0.307 0.308 0.361
(0.309) (0.058) (0.058) (0.013) (0.310) (0.309) (0.316)
${\beta _{3}}$ −0.111 −0.111 −0.111 −0.112 −0.111 −0.111 −0.113
(0.031) (0.010) (0.010) (0.008) (0.032) (0.032) (0.034)
${\beta _{4}}$ −0.003 −0.003 −0.003 −0.003 −0.003 −0.003 −0.003
(0.005) (0.004) (0.004) (0.004) (0.005) (0.005) (0.006)
${\sigma ^{2}}$ 0.733 0.733 0.732 0.765 0.427 0.417 0.193
(–) (–) (–) (–) (–) (–) (–)
${\alpha _{11}}$ 0.383 0.382 0.381 0.352 0.382 0.384 0.294
(–) (–) (–) (–) (–) (–) (–)
${\alpha _{12}}$ −0.004 −0.004 −0.004 −0.004 −0.004 −0.004 −0.003
(–) (–) (–) (–) (–) (–) (–)
${\alpha _{22}}$ >0.001 >0.001 >0.001 >0.001 >0.001 >0.001 >0.001
(–) (–) (–) (–) (–) (–) (–)
${\phi _{1}}$ – 0.998 1.000 0.000 0.652 0.762 0.273
(–) (–) (–) (–) (–) (–) (–)
${\phi _{2}}$ – 0.106 1 0 – −0.484 0.102
(–) (–) (–) (–) (–) (–) (–)
${\phi _{3}}$ – – – – – – −0.820
(–) (–) (–) (–) (–) (–) (–)
Table 4
A5055 Data. Summary of parameter estimates and the standard errors of fixed effects (in parentheses)under the t-LMEC model, with different correlation structures for the errors.
Correlation structures
Parameter estimates Damped exponential AR(p)
UNC DEC DEC-AR(1) SYM AR(1) AR(2) AR(3)
${\beta _{0}}$ 3.995 3.998 4.051 4.001 3.988 3.987 4.080
(0.604) (0.509) (0.531) (0.513) (0.607) (0.606) (0.643)
${\beta _{1}}$ −0.005 −0.005 −0.004 −0.005 −0.005 −0.005 −0.004
(0.008) (0.007) (0.007) (0.007) (0.008) (0.008) (0.009)
${\beta _{2}}$ 0.350 0.350 0.385 0.345 0.354 0.355 0.353
(0.309) (0.250) (0.264) (0.253) (0.308) (0.309) (0.323)
${\beta _{3}}$ −0.086 −0.086 −0.092 −0.086 −0.086 −0.086 −0.094
(0.020) (0.020) (0.020) (0.020) (0.020) (0.020) (0.021)
${\beta _{4}}$ −0.003 −0.003 −0.004 −0.004 −0.004 −0.004 −0.004
(0.005) (0.005) (0.005) (0.005) (0.005) (0.006) (0.006)
${\sigma ^{2}}$ 0.473 0.476 0.510 0.507 0.286 0.282 0.175
(–) (–) (–) (–) (–) (–) (–)
${\alpha _{11}}$ 0.220 0.222 0.210 0.194 0.218 0.218 0.222
(–) (–) (–) (–) (–) (–) (–)
${\alpha _{12}}$ −0.002 −0.002 −0.002 −0.002 −0.002 −0.002 −0.002
(–) (–) (–) (–) (–) (–) (–)
${\alpha _{22}}$ >0.001 >0.001 >0.001 >0.001 >0.001 >0.001 >0.001
(–) (–) (–) (–) (–) (–) (–)
${\phi _{1}}$ – 0.105 0.899 0.066 0.634 0.731 0.276
(–) (–) (–) (–) (–) (–) (–)
${\phi _{2}}$ – 0.998 1 0 – −0.421 0.109
(–) (–) (–) (–) (–) (–) (–)
${\phi _{3}}$ – – – – – – −0.789
(–) (–) (–) (–) (–) (–) (–)
ν 3.382 3.407 3.953 3.304 3.247 3.296 4.915
(–) (–) (–) (–) (–) (–) (–)
Table 5
A5055 Data. Summary of selection criteria for the t-LMEC and N-LMEC model under different correlation structures.
Correlation structures
Model Criteria Damped exponential AR(p)
UNC DEC DEC-AR(1) SYM AR(1) AR(2) AR(3)
t-LMEC loglik −353.840 −353.877 −349.323 −353.696 −353.439 −353.456 −352.021
AIC 729.681 731.755 720.646 729.393 728.878 730.913 730.042
BIC 770.854 776.671 761.819 770.566 770.052 775.829 778.701
N-LMEC loglik −362.571 −362.643 −362.671 −362.569 −362.343 −362.385 −355.407
AIC 743.142 747.285 745.342 745.139 744.686 746.771 734.815
BIC 776.829 788.458 782.773 782.569 782.116 787.944 779.731
All the computational procedures were carried out in a 64-bit Windows environment on a notebook with a 1.99 GHz Intel Core i7 processor and 12 GB of RAM; the EM algorithm took approximately 9.8 minutes to converge for each model.
Figure 8
A5055 Data. Scatter diagrams of the random effects and marginal errors and Q-Q plots of estimates of random effects and marginal residuals for log10(RNA) using the DEC(AR)-t-LMEC model.
Figure 9
A5055 Data. Simulated envelope of the Marginal residuals from the fitted models: (a) DEC(AR)-N-LMEC model and (b) DEC(AR)-t-LMEC model.

6 Conclusions

In this paper, we proposed a robust linear mixed-effects model to analyze longitudinal censored or missing data under the multivariate Student’s t-distribution with within-subject correlation, considering some useful dependence structures. In the practical implementation, an EM-type algorithm is used to obtain the ML estimates of the model parameters. The model proposed in this manuscript extends [21], [22] and [26] by developing additional tools for robust inference in practical data analysis.
Two simulation studies were performed to evaluate the proposed model. They validate the method’s performance and indicate an efficiency gain of the t-LMEC model over the N-LMEC model when the data present heavy tails. A real dataset, previously analyzed under a conditional independence structure, was reanalyzed. The results show that the t-LMEC model with the DEC-AR(1) correlation structure fits the data best. The proposed methods were implemented as part of the new R package ARpLMEC, which is available for download from the CRAN repository. Moreover, we have prepared an R Markdown file containing the code to reproduce the results of the application; it is available with this manuscript via GitHub at https://github.com/hlachos/tlmec/.
Although the t-LMEC model considered here has shown great flexibility for modeling symmetric data with tails heavier than the normal distribution’s, its robustness against outliers can be seriously affected by the presence of skewness. Thus, it is of interest to generalize the t-LMEC model by considering a more flexible family of distributions, such as the scale mixtures of skew-normal (SMSN) distributions [14], to accommodate censoring, skewness and heavy tails simultaneously. It is also of interest to develop effective algorithms for non-linear censored mixed-effects (NLMEC) models or semiparametric structures in order to accommodate substantially non-linear trends observed in the data.
Another possible extension is to study models that estimate the conditional heteroscedasticity of the data, such as ARCH [12] and GARCH [5, 6] models, among others. Finally, another interesting avenue for research is to propose methods, within a unified framework for LMEC/NLMEC models, for checking the random-effects distribution, following [1] for generalized linear mixed models and [10] for non-linear mixed models. An in-depth investigation of such extensions is beyond the scope of the present paper, but these are interesting topics for further research.

Appendix A p-variate Student’s-t

The following properties of the p-variate Student’s t-distribution are useful for our theoretical developments. We start with the marginal-conditional decomposition of a p-variate Student’s t random vector. The proofs of the following propositions can be found in [2].
Proposition 1.
Let $\mathbf{Y}\sim {t_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )$ partitioned as $\mathbf{Y}={({\mathbf{Y}_{1}^{\top }},{\mathbf{Y}_{2}^{\top }})^{\top }}$ with $dim({\mathbf{Y}_{1}})={p_{1}}$, $dim({\mathbf{Y}_{2}})={p_{2}}$, where ${p_{1}}+{p_{2}}=p$. Let $\boldsymbol{\mu }={({\boldsymbol{\mu }_{1}^{\top }},{\boldsymbol{\mu }_{2}^{\top }})^{\top }}$ and $\boldsymbol{\Sigma }=\left[\begin{array}{c@{\hskip10.0pt}c}{\boldsymbol{\Sigma }_{11}}& {\boldsymbol{\Sigma }_{12}}\\ {} {\boldsymbol{\Sigma }_{21}}& {\boldsymbol{\Sigma }_{22}}\end{array}\right]$ be the corresponding partitions of $\boldsymbol{\mu }$ and $\boldsymbol{\Sigma }$, respectively. Then, we have
  • $(i)$ ${\mathbf{Y}_{1}}\sim {t_{{p_{1}}}}({\boldsymbol{\mu }_{1}},{\boldsymbol{\Sigma }_{11}},\nu )$; and
  • $(ii)$ The conditional distribution of ${\mathbf{Y}_{2}}\mid ({\mathbf{Y}_{1}}={\mathbf{y}_{1}})$ is given by
    \[ {\mathbf{Y}_{2}}\mid ({\mathbf{Y}_{1}}={\mathbf{y}_{1}})\sim {t_{{p_{2}}}}({\boldsymbol{\mu }_{2.1}},{\widetilde{\boldsymbol{\Sigma }}_{22.1}},\nu +{p_{1}}),\]
    where ${\boldsymbol{\mu }_{2.1}}={\boldsymbol{\mu }_{2}}+{\boldsymbol{\Sigma }_{21}}{\boldsymbol{\Sigma }_{11}^{-1}}({\mathbf{y}_{1}}-{\boldsymbol{\mu }_{1}})$ and ${\widetilde{\boldsymbol{\Sigma }}_{22.1}}=(\frac{\nu +{\delta _{1}}}{\nu +{p_{1}}}){\boldsymbol{\Sigma }_{22.1}}$ with ${\delta _{1}}={({\mathbf{y}_{1}}-{\boldsymbol{\mu }_{1}})^{\top }}{\boldsymbol{\Sigma }_{11}^{-1}}({\mathbf{y}_{1}}-{\boldsymbol{\mu }_{1}})$ and ${\boldsymbol{\Sigma }_{22.1}}={\boldsymbol{\Sigma }_{22}}-{\boldsymbol{\Sigma }_{21}}{\boldsymbol{\Sigma }_{11}^{-1}}{\boldsymbol{\Sigma }_{12}}$.
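As a numerical illustration of Proposition 1 (with made-up values, not data from the paper), the conditional location and scale can be computed directly:

```python
import numpy as np

# Illustrative values: p1 = p2 = 1, nu = 4.
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
nu, p1 = 4.0, 1
y1 = np.array([1.0])

mu1, mu2 = mu[:1], mu[1:]
S11, S12 = Sigma[:1, :1], Sigma[:1, 1:]
S21, S22 = Sigma[1:, :1], Sigma[1:, 1:]

# Proposition 1(ii): Y2 | Y1 = y1 ~ t_{p2}(mu_{2.1}, Sigma~_{22.1}, nu + p1)
mu21 = mu2 + S21 @ np.linalg.solve(S11, y1 - mu1)               # 1 + 0.6*(1/2) = 1.3
delta1 = float((y1 - mu1) @ np.linalg.solve(S11, y1 - mu1))     # Mahalanobis term, 0.5
S221 = S22 - S21 @ np.linalg.solve(S11, S12)                    # Schur complement, 0.82
S221_tilde = (nu + delta1) / (nu + p1) * S221                   # scale inflation factor

print(mu21, S221_tilde)  # [1.3] [[0.738]]
```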
Proposition 2.
Let $\mathbf{Y}\sim {t_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )$. Then for any fixed vector $\mathbf{b}\in {\mathbb{R}^{m}}$ and matrix $\mathbf{A}\in {\mathbb{R}^{m\times p}}$ of full rank we get
\[ \mathbf{V}=\mathbf{b}+\mathbf{A}\mathbf{Y}\sim {t_{m}}\big(\mathbf{b}+\mathbf{A}\boldsymbol{\mu },\mathbf{A}\boldsymbol{\Sigma }{\mathbf{A}^{\top }},\nu \big).\]
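Proposition 2 states closure of the family under full-rank affine transformations; with illustrative values (again ours, not from the paper), the transformed location and scale are simply:

```python
import numpy as np

# Proposition 2: V = b + A Y ~ t_m(b + A mu, A Sigma A^T, nu), nu unchanged.
mu = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
A = np.array([[1.0, -1.0]])   # full-rank 1x2 contrast
b = np.array([0.5])

mu_V = b + A @ mu             # location of V: 0.5 + (0 - 1) = -0.5
Sigma_V = A @ Sigma @ A.T     # scale of V: 2 - 0.6 - 0.6 + 1 = 1.8
print(mu_V, Sigma_V)
```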

Appendix B TMVT Distributions

The following propositions concern the first two moments (marginal and conditional) of the TMVT distribution under a double truncation. The proofs are similar to those given in [21]. In what follows, we shall use the notation ${\mathbf{Y}^{(0)}}=1$, ${\mathbf{Y}^{(1)}}=\mathbf{Y}$, ${\mathbf{Y}^{(2)}}=\mathbf{Y}{\mathbf{Y}^{\top }}$, and $\mathbf{W}\sim T{t_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu ;(\mathbf{a},\mathbf{b}))$ stands for a p-variate doubly truncated Student’s t-distribution on $(\mathbf{a},\mathbf{b})\in {\mathbb{R}^{p}}$.
Proposition 3.
If $\mathbf{Y}\sim T{t_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu ;(\mathbf{a},\mathbf{b}))$ then it follows that
\[\begin{aligned}{}& \mathbb{E}\bigg[{\bigg(\frac{\nu +p}{\nu +\delta (\mathbf{Y})}\bigg)^{r}}{\mathbf{Y}^{(k)}}\bigg]\\ {} & \hspace{1em}={c_{p}}(\nu ,r)\frac{{L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },{\boldsymbol{\Sigma }^{\ast }},\nu +2r)}{{L_{p}}(\mathbf{a},\mathbf{b};\boldsymbol{\mu },\boldsymbol{\Sigma },\nu )}\mathbb{E}\big[{\mathbf{W}^{(k)}}\big],\end{aligned}\]
where
\[ {c_{p}}(\nu ,r)={\bigg(\frac{\nu +p}{\nu }\bigg)^{r}}\frac{\Gamma (\frac{p+\nu }{2})\Gamma (\frac{\nu +2r}{2})}{\Gamma (\frac{\nu }{2})\Gamma (\frac{p+\nu +2r}{2})},\]
${\boldsymbol{\Sigma }^{\ast }}=\frac{\nu }{\nu +2r}\boldsymbol{\Sigma }$ and $\nu +2r\gt 0$ with $\mathbf{W}\sim T{t_{p}}(\boldsymbol{\mu },{\boldsymbol{\Sigma }^{\ast }},\nu +2r;(\mathbf{a},\mathbf{b}))$.
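The constant ${c_{p}}(\nu ,r)$ is best evaluated on the log scale to avoid overflow of the gamma functions. A small sketch (the function name `c_p` is ours):

```python
import numpy as np
from scipy.special import gammaln

def c_p(p, nu, r):
    """Constant c_p(nu, r) from Proposition 3 (requires nu + 2r > 0),
    computed via log-gamma for numerical stability."""
    log_c = (r * np.log((nu + p) / nu)
             + gammaln((p + nu) / 2) + gammaln((nu + 2 * r) / 2)
             - gammaln(nu / 2) - gammaln((p + nu + 2 * r) / 2))
    return float(np.exp(log_c))

# Sanity check: r = 0 collapses every factor, so c_p(nu, 0) = 1.
print(c_p(p=3, nu=4, r=0))   # 1.0
print(c_p(p=3, nu=4, r=2))
```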
Having established the formula for the kth-order moment of Y, we now provide an explicit formula for the conditional moments with respect to a two-component partition of Y.
Proposition 4.
Let $\mathbf{Y}\sim {Tt_{p}}(\boldsymbol{\mu },\boldsymbol{\Sigma },\nu ;(\mathbf{a},\mathbf{b}))$. Consider the partition $\mathbf{Y}={({\mathbf{Y}_{1}^{\top }},{\mathbf{Y}_{2}^{\top }})^{\top }}$ with $\mathrm{dim}({\mathbf{Y}_{1}})={p_{1}}$, $\mathrm{dim}({\mathbf{Y}_{2}})={p_{2}}$, ${p_{1}}+{p_{2}}=p$, and the corresponding partitions: $\mathbf{a}={({\mathbf{a}_{1}^{\top }},{\mathbf{a}_{2}^{\top }})^{\top }}$, $\mathbf{b}={({\mathbf{b}_{1}^{\top }},{\mathbf{b}_{2}^{\top }})^{\top }}$, $\boldsymbol{\mu }={({\boldsymbol{\mu }_{1}^{\top }},{\boldsymbol{\mu }_{2}^{\top }})^{\top }}$ and $\boldsymbol{\Sigma }=\left[\begin{array}{c@{\hskip10.0pt}c}{\boldsymbol{\Sigma }_{11}}& {\boldsymbol{\Sigma }_{12}}\\ {} {\boldsymbol{\Sigma }_{21}}& {\boldsymbol{\Sigma }_{22}}\end{array}\right]$. Then,
\[\begin{aligned}{}& \mathbb{E}\bigg[{\bigg(\frac{\nu +p}{\nu +\delta (\mathbf{Y})}\bigg)^{r}}{\mathbf{Y}_{2}^{(k)}}\mid {\mathbf{Y}_{1}}\bigg]\\ {} & \hspace{1em}=\frac{{d_{p}}({p_{1}},\nu ,r)}{{(\nu +\delta ({\mathbf{Y}_{1}}))^{r}}}\\ {} & \hspace{2em}\times \frac{{L_{{p_{2}}}}({\mathbf{a}_{2}},{\mathbf{b}_{2}};{\boldsymbol{\mu }_{2.1}},{\widetilde{\boldsymbol{\Sigma }}_{22.1}^{\ast }},\nu +{p_{1}}+2r)}{{L_{{p_{2}}}}({\mathbf{a}_{2}},{\mathbf{b}_{2}};{\boldsymbol{\mu }_{2.1}},{\widetilde{\boldsymbol{\Sigma }}_{22.1}},\nu +{p_{1}})}\times \mathbb{E}\big[{\mathbf{W}_{2}^{(k)}}\big],\end{aligned}\]
for $\nu +{p_{1}}+2r\gt 0$, with $\delta ({\mathbf{Y}_{1}})=\delta ({\mathbf{Y}_{1}};{\boldsymbol{\mu }_{1}},{\boldsymbol{\Sigma }_{11}})$,
\[\begin{aligned}{}{\widetilde{\boldsymbol{\Sigma }}_{22.1}^{\ast }}& =\bigg(\frac{\nu +{\delta _{1}}}{\nu +2r+{p_{1}}}\bigg){\boldsymbol{\Sigma }_{22.1}}\hspace{1em}\textit{and}\\ {} {d_{p}}({p_{1}},\nu ,r)& ={(\nu +p)^{r}}\frac{\Gamma (\frac{p+\nu }{2})\Gamma (\frac{{p_{1}}+\nu +2r}{2})}{\Gamma (\frac{{p_{1}}+\nu }{2})\Gamma (\frac{p+\nu +2r}{2})},\end{aligned}\]
where ${\boldsymbol{\Sigma }_{22.1}}$ is defined as in Proposition 1. Moreover, ${\mathbf{W}_{2}}\sim \mathrm{T}{t_{{p_{2}}}}({\boldsymbol{\mu }_{2.1}},{\widetilde{\boldsymbol{\Sigma }}_{22.1}^{\ast }},\nu +{p_{1}}+2r;({\mathbf{a}_{2}},{\mathbf{b}_{2}}))$.
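The constant ${d_{p}}({p_{1}},\nu ,r)$ admits the same log-scale treatment; a sketch (function name `d_p` is ours) with the built-in check that $r=0$ gives $d_{p}=1$:

```python
import numpy as np
from scipy.special import gammaln

def d_p(p, p1, nu, r):
    """Constant d_p(p1, nu, r) from Proposition 4 (requires nu + p1 + 2r > 0),
    computed via log-gamma for numerical stability."""
    log_d = (r * np.log(nu + p)
             + gammaln((p + nu) / 2) + gammaln((p1 + nu + 2 * r) / 2)
             - gammaln((p1 + nu) / 2) - gammaln((p + nu + 2 * r) / 2))
    return float(np.exp(log_d))

# Sanity check: r = 0 makes the gamma ratios cancel, so d_p = 1.
print(d_p(p=3, p1=1, nu=4, r=0))   # 1.0
print(d_p(p=3, p1=1, nu=4, r=1))
```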

Acknowledgements

The authors thank the editor and anonymous referees for their valuable comments and suggestions, which significantly contributed to the improvement of this paper.

References

[1] 
Abad, A. A., Litière, S. and Molenberghs, G. (2010). Testing for misspecification in generalized linear mixed models. Biostatistics 11(4) 771–786.
[2] 
Arellano-Valle, R. B. and Bolfarine, H. (1995). On some characterizations of the t-distribution. Statistics & Probability Letters 25(1) 79–85. https://doi.org/10.1016/0167-7152(94)00208-P. MR1364821
[3] 
Barndorff-Nielsen, O. and Schou, G. (1973). On the parametrization of autoregressive models by partial autocorrelations. Journal of Multivariate Analysis 3(4) 408–419. https://doi.org/10.1016/0047-259X(73)90030-4. MR0343510
[4] 
Bartolucci, F. and Farcomeni, A. (2009). A multivariate extension of the dynamic logit model for longitudinal data based on a latent Markov heterogeneity structure. Journal of the American Statistical Association 104(486) 816–831. https://doi.org/10.1198/jasa.2009.0107. MR2751454
[5] 
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31(3) 307–327. https://doi.org/10.1016/0304-4076(86)90063-1. MR0853051
[6] 
Bollerslev, T. (1987). A conditionally heteroskedastic time series model for speculative prices and rates of return. The Review of Economics and Statistics 69(3) 542–547.
[7] 
Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco. MR0436499
[8] 
Box, G. E. P., Jenkins, G. M., Reinsel, G. C. and Ljung, G. M. (2015). Time Series Analysis: Forecasting and Control. John Wiley & Sons. MR3379415
[9] 
Chi, E. M. and Reinsel, G. C. (1989). Models for longitudinal data with random effects and AR(1) errors. Journal of the American Statistical Association 84(406) 452–459. MR1010333
[10] 
Drikvandi, R. (2020). Nonlinear mixed-effects models with misspecified random-effects distribution. Pharmaceutical Statistics 19(3) 187–201.
[11] 
Drikvandi, R., Verbeke, G. and Molenberghs, G. (2017). Diagnosing misspecification of the random-effects distribution in mixed models. Biometrics 73(1) 63–71. https://doi.org/10.1111/biom.12551. MR3632352
[12] 
Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50(4) 987–1007. https://doi.org/10.2307/1912773. MR0666121
[13] 
Garay, A. M., Castro, L. M., Leskow, J. and Lachos, V. H. (2017). Censored linear regression models for irregularly observed longitudinal data using the multivariate-t distribution. Statistical Methods in Medical Research 26(2) 542–566. https://doi.org/10.1177/0962280214551191. MR3635923
[14] 
Lachos, V. H., Ghosh, P. and Arellano-Valle, R. B. (2010). Likelihood based inference for skew-normal independent linear mixed models. Statistica Sinica 20(1) 303–322. MR2640696
[15] 
Lachos, V. H., Matos, L. A., Castro, L. M. and Chen, M.-H. (2019). Flexible longitudinal linear mixed models for multiple censored responses data. Statistics in Medicine 38(6) 1074–1102. https://doi.org/10.1002/sim.8017. MR3916716
[16] 
Lin, T. I. and Lee, J. C. (2007). Bayesian analysis of hierarchical linear mixed modeling using the multivariate t distribution. Journal of Statistical Planning and Inference 137(2) 484–495. https://doi.org/10.1016/j.jspi.2005.12.010. MR2298952
[17] 
Lin, T. I. and Lee, J. C. (2003). On modelling data from degradation sample paths over time. Australian & New Zealand Journal of Statistics 45(3) 257–270. https://doi.org/10.1111/1467-842X.00282. MR1999510
[18] 
Lin, T. I. and Lee, J. C. (2006). A robust approach to t linear mixed models applied to multiple sclerosis data. Statistics in Medicine 25(8) 1397–1412. https://doi.org/10.1002/sim.2384. MR2226794
[19] 
Lin, T.-I. and Wang, W.-L. (2020). Multivariate-t linear mixed models with censored responses, intermittent missing values and heavy tails. Statistical Methods in Medical Research 29(5) 1288–1304. https://doi.org/10.1177/0962280219857103. MR4097145
[20] 
Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological) 44(2) 226–233. MR0676213
[21] 
Matos, L. A., Prates, M. O., Chen, M. H. and Lachos, V. H. (2013). Likelihood-based inference for mixed-effects models with censored response using the multivariate-t distribution. Statistica Sinica 23 1323–1342. MR3114716
[22] 
Matos, L. A., Castro, L. M. and Lachos, V. H. (2016). Censored mixed-effects models for irregularly observed repeated measures with applications to HIV viral loads. Test 25(4) 627–653. https://doi.org/10.1007/s11749-016-0486-2. MR3554408
[23] 
Matos, L. A., Bandyopadhyay, D., Castro, L. M. and Lachos, V. H. (2015). Influence assessment in censored mixed-effects models using the multivariate Student’s t distribution. Journal of Multivariate Analysis 141 104–117. https://doi.org/10.1016/j.jmva.2015.06.014. MR3390061
[24] 
Meng, X. L. and Rubin, D. B. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 80(2) 267–278. https://doi.org/10.1093/biomet/80.2.267. MR1243503
[25] 
Muñoz, A., Carey, V., Schouten, J. P., Segal, M. and Rosner, B. (1992). A parametric family of correlation structures for the analysis of longitudinal data. Biometrics 48 733–742.
[26] 
Olivari, R. C., Garay, A. M., Lachos, V. H. and Matos, L. A. (2021). Mixed-effects models for censored data with autoregressive errors. Journal of Biopharmaceutical Statistics 31(3) 273–294.
[27] 
Olivari, R. C., Zhong, K., Garay, A. M. and Lachos, V. H. (2022). ARpLMEC: Fitting autoregressive censored mixed-effects models. R package version 2.3. https://CRAN.R-project.org/package=ARpLMEC.
[28] 
Pinheiro, J. C., Liu, C. H. and Wu, Y. N. (2001). Efficient algorithms for robust estimation in linear mixed-effects models using a multivariate t-distribution. Journal of Computational and Graphical Statistics 10 249–276. https://doi.org/10.1198/10618600152628059. MR1939700
[29] 
Schumacher, F. L., Lachos, V. H. and Dey, D. K. (2017). Censored regression models with autoregressive errors: A likelihood-based perspective. Canadian Journal of Statistics 45(4) 375–392. https://doi.org/10.1002/cjs.11338. MR3729976
[30] 
Vaida, F. and Liu, L. (2009). Fast implementation for normal mixed effects models with censored response. Journal of Computational and Graphical Statistics 18(4) 797–817. https://doi.org/10.1198/jcgs.2009.07130. MR2750442
[31] 
Valeriano, K., Matos, L. A. and Morales, C. G. (2022). relliptical: The truncated elliptical family of distributions. R package version 1.1.0. https://CRAN.R-project.org/package=relliptical.
[32] 
Wang, W.-L. (2013). Multivariate t linear mixed models for irregularly observed multiple repeated measures with missing outcomes. Biometrical Journal 55(4) 554–571. https://doi.org/10.1002/bimj.201200001. MR3079990
[33] 
Wang, W.-L., Lin, T.-I. and Lachos, V. H. (2018). Extending multivariate-t linear mixed models for multiple longitudinal data with censored responses and heavy tails. Statistical Methods in Medical Research 27(1) 48–64. https://doi.org/10.1177/0962280215620229. MR3745654