1 Introduction
Dementia is a general syndrome characterized by a long term and gradual decrease in the ability to think and remember with consequences on the person’s daily functioning. It represents a pressing public health problem and major research priority. Alzheimer disease (AD) is the most common form of dementia, representing 60% to 80% of the cases (Reitz and Mayeux, 2014). AD gradually affects multiple components long before any clinical diagnosis including brain atrophies, cognitive decline in various functions (memory, language, orientation in space and time, etc.) and loss of autonomy in the daily living activities. There is currently no cure for AD although many novel compounds are under development. Natural history of AD and its progression are still misunderstood; yet their understanding constitutes a crucial step for the assessment of compounds, and more generally AD research. Theoretical schemes have been proposed to understand the disease taken as a whole, highlighting the expected dynamic and multidimensional aspects (Jack et al., 2013). Jack’s scheme, which is now central in AD research, hypothesizes a sequence of alterations: mainly the accumulation of proteins in the brain (amyloid and tau proteins), the atrophy of brain regions (e.g., hippocampus) and finally clinical manifestations with cognitive and functional declines on which the diagnosis of dementia currently relies. However, because of its complexity, such theoretical scheme is very difficult to translate into a statistical model as it necessitates to combine the dynamic and multidimensional aspects, and to properly explore the temporal relationships between dimensions.
The dynamic aspect of AD has been apprehended mostly using the linear mixed model theory with models that analyzed one dimension according to time and covariates (Hall et al., 2000; Amieva et al., 2008; Donohue et al., 2014). One difficulty is that the dimension of interest is usually not directly observed but measured by a series of outcomes; for instance, cognition is usually measured by a battery of neuropsychological tests. ProustLima et al. (2006, 2013) considered latent process models which distinguish a structural model for analyzing the unobserved quantity over time and equations of observations for linking the unobserved quantity to the observed outcomes.
To account for the multidimensional aspect of AD and understand how dimensions are interrelated, many contributions explored predetermined relationships by examining change over time of one biomarker (e.g., cognitive measure) according to another observed biomarker (e.g., hippocampal initial volume or change) and assumed the latter was observed without measurement error (e.g. Landau et al. (2011); Han et al. (2012)). This approach quantifies temporal relationships but it relies on a specific a priori determined sequence and does not consider all biomarkers as stochastic processes, which limits the interpretation and may induce biases. Some authors also proposed to use bivariate mixed models to dynamically model two dimensions and account for their correlation through correlated random effects (e.g., (Mungas et al., 2005; Robitaille et al., 2012)). However such models remain descriptive; they do not rely on asymmetric temporal relationships between processes and thus do no allow causal interpretations.
Temporal asymmetric relationships between processes have usually been apprehended with Dynamic Bayesian Networks (DBN)
(Song et al., 2009). This approach extends the concept of Directed Acyclic Graphs (DAG) (Greenland, 2000) to the dynamic case by modelling constant or timevarying temporal relationships between successive states of a network of processes. However, firstly time is usually too grossly discretized so that spurious temporal associations might appear as showed by Aalen et al. (2016). Second, the trajectories of processes are not modelled over time. Finally, DBN quantify the association between successive levels of processes while a dynamic view of causality would seek local dependence structures linking the network of processes to its infinitesimal change over time (Aalen and Frigessi, 2007).Local dependence structures can be naturally investigated with mechanistic models which are dynamic models relating a system of processes over time using differential equations. Proposed in HIV studies (Prague et al., 2017) or cancer studies (Desmée et al., 2017), they allowed retrieving causal associations between disease components. Yet mechanistic models are numerically very demanding so that their application to complex diseases such as AD is compromised. In addition they require precise biological knowledge which lacks for AD.
The objective of this work is to propose a statistical model that simultaneously describes the dynamics of multiple dimensions involved in Alzheimer’s disease and assess their temporal relationships similarly as in a mechanistic model but with much less numerical complexity. We consider a system of latent processes, each one representing a latent dimension possibly observed through one or several longitudinal markers. In contrast with mechanistic models, we rely on a discretetime framework and define a set of difference equations to model the change over time of the system according to the previous state of the system. In addition, we account for individual differences through subjectspecific covariates and random effects. As discretization can distort the causal interpretations of temporal relationships compared to a model in continuous time, we specifically evaluate the impact of discretization on the temporal influence structure in a simulation study. The methodology is applied to the Alzheimer’s Disease Neuroimaging Initiative database to explore the temporal structure between cerebral, cognitive and functional dimensions at different stages of AD.
2 Motivating Data
Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a publicprivate partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). For uptodate information, see www.adniinfo.org. We focused in this work on the ADNI 1 phase which is a longitudinal multisite observational study that included approximately 800 individuals aged between 5590 enrolled in three different stages of progression to AD: normal aging (CN, N 200), mild cognitive impairment (MCI, N 400), and diagnosed Alzheimer’s Disease (dAD, N 200). The individuals were followedup every 6 months up to 3 years for CN and MCI groups and up to 2 years for the dAD group. At each visit, a lot of information was collected, including for the present work, volumes of brain regions measured by MRI, a battery of 19 cognitive tests and a functional assessment with the Functional Activities Questionnaire(FAQ) (Pfeffer et al., 1982). Further details on the ADNI 1 study can be found in Mueller et al. (2005).
3 Methodology
3.1 Structural model for the system of latent processes
Consider latent processes (with ) representing a system of dimensions (e.g. cerebral anatomy, cognitive ability and functional autonomy dimensions for Alzheimer’s disease) for individual with . We assume is defined at discrete times with , and a constant discretization step. Let us denote the change of the system between two successive times and the rate of change of the system. The model for the trajectories of the processes is split in two parts: (i) the level of the processes at baseline is modelled with a multivariate linear mixed model, and (ii) the rate of change of the system over time is modelled by difference equations combined with a multivariate linear mixed model:
(1) 
where is the matrix of covariates associated with the
vector of fixed effects
, and is the vector of individual random intercepts in the initial system . The matrix and the matrix include timedependent covariates associated with the vector of fixed effects and the vector of individual random effects , respectively. is the matrix of transition intensities.For each dimension , the vector of individual random effects
is assumed to have a multivariate normal distribution with variancecovariance matrix
where and are unstructured. The variances of the s are constrained to 1 so that the D processes have the same magnitude at baseline. Random effects are assumed independent between dimensions so that the entire vector of individual random effects has a multivariate normal distribution,
with the identity matrix, the Dblock diagonal matrix with block , and the matrix with row where is the row vector of zeros. In the estimation process, is replaced by its Cholesky decomposition: , where is a lower triangular matrix. Independence between random effects across dimensions is handled by fixing corresponding parameters to zero in , and unitvariances of random intercepts are handled by fixing the corresponding parameters to 1 in . In addition to the variances of random intercepts fixed at one, and without loss of generality, the matrix of covariates excludes intercepts for each process so that the D processes are standardized at baseline. These two constraints are compensated by parameters in the measurement models.
The temporal influences between processes are modelled through the matrix of timedependent transition intensities :
This matrix captures the directed temporal influences between latent processes at time and subsequent rates of change of latent processes between times and . Specifically, coefficient quantifies the temporal effect of process at time on process
. Each effect can be modelled according to time/covariates through a linear regression
where is a rvector of timedependent covariates associated with the rvector of regression coefficients .When the discretization step is not too large, the temporal influences intend to have the same causal interpretations as those of a model in continuous time (see Simulation Study 2 for the demonstration).
3.2 Measurement Models of the longitudinal markers
Consider K () continuous longitudinal markers that have been measured for subject at successive discrete times with . Following ProustLima et al. (2006, 2013), we assume that each latent process is the underlying common factor of markers () and we note the set of marker subscripts associated with latent process . We assume that a marker measures only one latent process.
The link between a marker and its underlying latent process is defined by a markerspecific measurement model. If marker is Gaussian, the measurement model is a linear equation:
(2) 
where the vector of transformation parameters is used to get the standardized form of the marker and are independent Gaussian errors with variance .
In the more general case of a continuous marker (possibly non Gaussian), one may consider a nonlinear equation of observation:
(3) 
where the link transformation comes from a family of monotonic increasing and continuous functions parameterized with . Again is the transformed marker and are independent Gaussian errors with variance . Following ProustLima et al. (2013), the link transformation can be defined from a basis of Isplines (which are integrated Msplines (Ramsay, 1988)) in association with positive coefficients, thus providing an increasing bijective flexible transformation. We used here a quadratic Isplines basis with internal knots, , so that
with the vector of parameters of the transformation.
In the following, we denote the diagonal variance matrix of the vector of errors and , the total vector of transformation parameters for the markers. The vector of transformed markers is mapped to the system of latent processes through a matrix with element equal to 1 if marker measures latent process :
(4) 
In practice the observation process may include intermittent missing observations for a subset of markers or for all the markers at a given occasion , so that markers are actually observed at occasion for subject . We assume that observations are missing at random and note the transformations of the actual vector of observed markers at occasion .
The model linking the system of processes to the transformed markers can be easily adapted to the presence of intermittent missing data by considering a observation matrix (for ) where element equals 1 if marker is the observed marker at occasion and 0 if not for and :
(5) 
with the vector of independent Gaussian errors with covariance matrix .
3.3 Estimation
The parameters are estimated in the maximum likelihood framework.
3.3.1 Distribution of the Latent Processes and transformed observations
We first derive the marginal distributions of the latent processes and of the transformed observations. By recurrence, the structural model (1) can be rewritten:
(6) 
where for and .
By introducing for and so that
(7) 
Equation (6) can be rewritten
(8) 
As a result, the vector has a multivariate normal distribution with expectation and variance covariance matrix and the vector has a multivariate normal distribution with expectation and variancecovariance matrix where
(9) 
and
(10) 
It can be easily deduced that the vector of incomplete and transformed data at occasion is multivariate Gaussian with expectation and variancecovariance matrix , and the total vector of incomplete and transformed data is multivariate Gaussian with expectation and variancecovariance matrix , a block matrix with the block.
3.3.2 Likelihood
As the subjects of the sample are independent, the loglikelihood of the model is with the individual contribution to the likelihood. Here, is the whole vector of parameters. Using the Jacobian of the link functions (), the individual contribution is:
(11) 
where denotes the density function of a multivariate Gaussian vector with expectation and variancecovariance matrix , and denotes the Jacobian of the link function used to transform , the observed marker at occasion for subject . For instance, with a linear link function, .
3.3.3 Optimisation Algorithm and Implementation
The maximum likelihood estimates are obtained using an extended LevenbergMarquardt algorithm (Marquardt, 1963) because of its robustness and good convergence rate. At each iteration , if necessary, the Hessian matrix is diagonalinflated to obtain a positive definite matrix which is used to update the parameters , with the gradient at iteration and the improvement control parameter. Convergence is reached when , and , with the total number of parameters. The latter criterion is by far the most stringent one and specifically targets maximum search. The variances of the estimators are obtained from the inverse of .
Given the possibly high number of parameters, we first estimate the parameters for each process taken separately, then we start the maximization of the likelihood of the multivariate model from these simple estimates, setting initial values of the interdimension parameters to zero. The model estimation is implemented in R (program available on request); it combines R and languages, and includes parallel computations.
3.4 Marginal and subjectspecific predictions
The goodnessoffit of the model can be assessed by comparing predictions with observations of the markers in their transformed scales. From notations defined in section 3.3.1, marginal and conditional distributions of the markers are:
(12) 
(13) 
where is the blockdiagonal matrix constituted of blocks.
The marginal () and subjectspecific () predictions in the transformed scales are then respectively obtained by taking the expectations of the marginal and conditional distributions of the transformed markers at the parameter estimates and at the predicted latent processes of given the observations : where is the covariance matrix between and .
Using these individual predictions, one can graphically compare either the marginal predictions or subjectspecific predictions averaged within time intervals to the observations averaged within the same time intervals. Marginal and subjectspecific predictions in the natural scale of the markers can also be derived from the marginal and conditional distributions by using a MonteCarlo approximation (ProustLima et al., 2013).
4 Simulations
We performed two series of simulations to evaluate the estimation program and the impact of time discretization on the interpretation of matrix.
4.1 Simulation Study 1: Validation of the estimation program
4.1.1 Design
To evaluate the estimation program, we generated a system of two Gaussian processes ( and ), each one measured by one longitudinal marker ( and ) and two covariates, one binary and one continuous . We considered two scenarios: a covariatespecific temporal influence structure (Scenario 1) and a timedependent temporal influence structure (Scenario 2). In Scenario 1, we assumed linear trajectories over time for the system of latent processes (random intercepts and simple effects of both covariates in the submodels for the initial level and the change over time) and a temporal influence structure different in each level of :
(14) 
where , and with such that the random effects are independent between dimensions, and ), .
In scenario 2, we assumed linear trajectories for the latent processes adjusted for and a transition matrix that evolved with time:
(15) 
where , and with such that the random effects are independent between dimensions, and ), . Each element of the transition matrix is defined from a basis of quadratic Bsplines with one internal knot at the median so that , .
The design of the simulations and the parameters were chosen to mimic the ADNI data. Dimensions 1 and 2 were cerebral anatomy and cognitive ability, respectively measured by a specific composite score. Covariate
corresponded to the two groups CN (for normal elderly, in reference) and MCI (for Mild Cognitive Impairment). It was generated according to a Bernoulli distribution with probability
. Covariatecorresponded to the standardized age at baseline and was generated according to a standard Gaussian distribution. We considered a discretization step
of 6 months. Markers observations were generated every 6 months up to 3 years. Thus a subject had 7 repeated measures at occasions . We also considered a design in which scheduled visits could be missed completely at random with a probability of 0.15, and when a visit was not missed, a marker could be missing with a probability of 0.07. For each design and scenario, we generated 100 samples of 512 subjects.4.1.2 Results
Table 4 and Table 1 provide the results of the simulations for scenarios 1 and 2 respectively. In both settings, all the parameters were correctly estimated with satisfying coverage rates in the absence of missing data (left part) and in the presence of missing data (right part).
without missing values (conv=100%)  with missing values (conv=98%)  
ESE  ASE  CR(%)  ESE  ASE  CR(%)  
1.661  1.655  0.104  0.112  96.0  1.653  0.111  0.113  98.0  
1.803  1.831  0.146  0.128  92.0  1.834  0.150  0.133  90.0  
0.278  0.281  0.043  0.046  96.0  0.285  0.045  0.050  98.0  
0.152  0.154  0.023  0.023  96.0  0.156  0.024  0.026  96.0  
0.117  0.121  0.080  0.080  94.0  0.111  0.093  0.086  91.0  
0.143  0.145  0.040  0.041  96.0  0.140  0.046  0.045  95.0  
L(3,1)  0.041  0.040  0.019  0.022  97.0  0.039  0.020  0.024  98.9 
L(4,2)  0.012  0.026  0.021  0.039  97.0  0.029  0.025  0.044  97.0 
L(3,3)  0.198  0.197  0.015  0.014  94.0  0.196  0.016  0.014  92.0 
L(4,4)  0.373  0.372  0.027  0.027  95.0  0.371  0.029  0.029  93.0 
0.112  0.109  0.040  0.044  96.0  0.107  0.046  0.048  97.0  
0.003  0.010  0.047  0.050  95.0  0.015  0.052  0.056  96.0  
0.033  0.050  0.070  0.070  97.0  0.056  0.070  0.079  98.0  
0.132  0.117  0.095  0.093  95.0  0.108  0.099  0.104  97.0  
0.168  0.170  0.039  0.037  93.0  0.172  0.044  0.041  92.0  
0.079  0.067  0.043  0.043  92.0  0.067  0.049  0.048  92.0  
0.143  0.160  0.062  0.057  92.0  0.161  0.065  0.064  92.0  
0.070  0.050  0.080  0.073  90.0  0.048  0.088  0.081  89.0  
0.238  0.225  0.071  0.068  95.0  0.220  0.076  0.073  92.0  
0.138  0.159  0.092  0.090  94.0  0.162  0.097  0.097  93.0  
0.359  0.331  0.128  0.126  94.0  0.330  0.134  0.135  95.0  
0.014  0.048  0.170  0.165  94.0  0.045  0.179  0.178  95.0  
0.108  0.108  0.077  0.074  91.0  0.102  0.085  0.081  89.0  
0.037  0.024  0.082  0.084  97.0  0.025  0.091  0.088  95.0  
0.067  0.046  0.105  0.107  90.0  0.049  0.116  0.113  91.0  
0.066  0.093  0.133  0.133  96.0  0.085  0.154  0.142  94.0  
0.387  0.391  0.015  0.015  93.0  0.391  0.015  0.016  96.0  
0.713  0.713  0.035  0.033  94.0  0.709  0.035  0.035  95.0  
3.769  3.725  0.192  0.197  95.0  3.719  0.199  0.199  93.0  
2.580  2.561  0.092  0.090  95.0  2.563  0.094  0.092  95.0  
2.489  2.511  0.138  0.120  92.0  2.515  0.144  0.124  91.0  
1.412  1.413  0.063  0.060  93.0  1.415  0.063  0.064  95.0 

Rate of convergence,

(15% missing occasions, 7% missing outcomes),

ASE is the asymptotic standard error, ESE is the empirical standard error and CR is the coverage rate of the 95% confidence interval.
4.2 Simulation Study 2: Impact of the discretization step on the temporal influence structure between processes
4.2.1 Design
To formally assess whether the interpretation under the discretized time was the same as the one obtained under continuous time, we assessed the typeI error rate associated with each non diagonal element of the transition matrix
under three discretization steps , and when data were actually generated under continuous time (approximated by a step =0.001). We considered for this a system of three latent processes (, , and ), each one measured by one Gaussian repeated marker (, , ). The trajectories were linear, with no adjustment for covariates and a transition matrix constant over time:(16) 
The simulation model mimicked the ADNI data with cerebral anatomy, cognitive ability and functional autonomy as dimensions, respectively measured by a specific composite score. We estimated the model on the ADNI data with =1 (6 months) and transformed the parameters to the scale =0.001 to generate the data in almost continuous time. We provide in the Appendix A: the formulas that we used to relate model components defined under =1 and =0.001. Elements of the transition matrix were set one by one to 0 in scale to evaluate the associated typeI error rate. Latent processes were generated with solver from dsolve package (Soetaert et al., 2010) and observations were derived every 6 months up to 3 years. For each design, we considered 200 samples of 512 subjects.
4.2.2 Results
Table 2 displays the typeI error rates (in percentage) associated with each non diagonal element of the transition matrix when estimated with discretization steps =1/3, = and =1. All the typeI error rates are close to the nominal 5% rate (expected rates in the 95% interval [1.98 , 8.02] with 200 replicates). This shows that with such discretization steps, the causal interpretation expected in continuous time is not altered.
Parameter  

5.5  4.5  6.5  
3.5  5.5  5.0  
7.0  7.5  8.0  
5.0  4.5  5.0  
4.0  3.5  5.5  
7.0  7.5  5.5 
5 Application
The application aimed to describe the decline over time of cerebral anatomy, cognitive ability and functional autonomy in three clinical stages of AD (normal aging (CN), Mild Cognitive Impairment (MCI) and diagnosed with Alzheimer’s Disease (dAD)) and to quantify the temporal influences between these dimensions by assessing especially whether the relationships differed according to the clinical stage.
5.1 The sample
We considered the nineteen available cognitive tests as outcomes of global cognitive ability (Crane et al., 2012), the volume of hippocampus relative to total intracranial volume and the cortical thickness of nine regions (Freesurfer version 4.4.0 for longitudinal data) as outcomes for cerebral anatomy following regions of interest identified by Dickerson et al. (2008), and the sumscore of the FAQ composed of 30 items (Pfeffer et al., 1982) to assess functional autonomy. For numerical simplicity, we considered a unique marker for each dimension by summarizing the cerebral and cognitive markers into two composite scores (ProustLima et al., 2017). Covariates were the age at entry (centered around 75.4 and indicated in decades), gender, educational level (low level if 12 years vs high level if 12 years), the APOE genotype (4 carrier vs 4 noncarrier ) and the 3 clinical stages (CN, MCI, dAD). We selected in the sample all the subjects who had no missing data for the covariates and had at least one measure for each dimension during the followup. The main analysis included 656 subjects (82% of the initial sample). The sample consisted of 190 (29%) subjects at healthy stage (CN), 322 (49%) subjects at mild cognitive impairment stage (MCI), and 144 (22%) subjects diagnosed with Alzheimer’s disease (dAD); 43% were females, 83% had a high educational level and 49% carried APOE 4 allele. The mean age at entry was 75.4 years old (sd=6.6). The mean number of visits was 5, 6 and 4 for CN, MCI and dAD subjects, respectively.
5.2 The dynamic multivariate model
The dynamic model applied to ADNI is summarized in Figure 1. The three latent processes (cerebral anatomy (), cognitive ability (), functional autonomy ()) were repeatedly measured by the three composite scores , and
Comments
There are no comments yet.