A joint model for multiple dynamic processes and clinical endpoints: application to Alzheimer's disease

03/27/2018 ∙ by Cécile Proust-Lima, et al. ∙ Inserm 0

Alzheimer's disease, the most frequent dementia in the elderly, is characterized by multiple progressive impairments in the brain structure and in clinical functions such as cognitive functioning and functional disability. Until recently, these components were mostly studied independently while they are fundamentally inter-related in the degradation process towards dementia. We propose a joint model to describe the dynamics of multiple correlated latent processes which represent various domains impaired in the Alzheimer's disease. Each process is measured by one or several markers, possibly non Gaussian. Rather than considering the associated time to dementia as in standard joint models, we assume dementia diagnosis corresponds to the passing above a covariate-specific threshold of a pathological process modelled as a combination of the domain-specific latent processes. This definition captures the clinical complexity of dementia diagnosis but also benefits from simplifications for the obtention of Maximum Likelihood Estimates. We show that the model and estimation procedure can also handle competing clinical endpoints such as death. The estimation procedure is validated by simulations and the method is illustrated on a large French population-based cohort of cerebral aging with a maximum follow-up of 25 years. We focused on three clinical manifestations (cognitive functioning, physical dependency and depressive symptoms) measured repeatedly by one or several markers, as well as repeated clinical-based diagnoses of dementia and competing death before dementia.



There are no comments yet.


page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Dementia is a syndrom which affects individuals after 60 years old. Alzheimer’s disease is the most frequent type of dementia and represents approximately 60% of the reported cases. Dementia is characterized by a very long degradation process before clinical diagnosis which lasts a few decades (Amieva et al., 2008). This degradation process has multiple anatomo-clinical components. As shown recently (Jack et al., 2013), long before diagnosis, dementia is characterized by an accumulation of biomarkers in the brain (-amyloid, protein), an alteration of the brain structure (atrophy of some regions such as the hippocampus), impaired clinical manifestations with the decline of several cognitive functions (e.g., executive functions, processing speed, episodic memory), an increase of functional limitations in the daily life (e.g. shopping, toiletting, transferring from bed to chair), and possibly increased depressive symptoms among other behavioral alterations. Although inter-related, these components were mostly studied independently with a large focus on cognitive decline for which repeated measures have been available for long in cerebral aging cohorts (e.g., Proust-Lima et al., 2016; Graham et al., 2011).

The limitation to a single component was also partly explained by the complexity of analyzing multiple longitudinal components in link with a clinical endpoint. It requires the joint modelling of multivariate longitudinal markers and a survival process for which a series of models were proposed (see Hickey et al., 2016 for a review, and examples in Albert and Shih, 2010; Andrinopoulou et al., 2014; Baghfalaki et al., 2014; Chi and Ibrahim, 2006; Choi et al., 2014; Lin et al., 2002; Pantazis et al., 2005) but not necessarily adapted to the dementia context. Numerical complexity is considerable in joint models involving multiple longitudinal components due notably to an integral over the random effects in the likelihood which can not be solved analytically. The integral computation becomes cumbersome with more than a couple of random effects (Albert and Shih, 2010; Rizopoulos et al., 2009) and not necessarily accurately solved (Ferrer et al., 2016). This entails that most applications focused on only two longitudinal markers (e.g. Lin et al., 2002; Li et al., 2012) and/or random intercepts only (Li et al., 2012; Tang and Tang, 2015) which is not feasible in dementia. Contributions also mostly relied on a Bayesian estimation to circumvent this numerical complexity (Andrinopoulou et al., 2014; Baghfalaki et al., 2014; Chi and Ibrahim, 2006; Choi et al., 2014; Tang and Tang, 2015) or a two-stage estimation (Albert and Shih, 2010).

An additional complexity in dementia is that each quantity of interest is not directly measured; it is approached by multivariate markers. For example, cognitive level is measured by a battery of psychometric tests to apprehend all the coexisting cognitive functions, and brain structure comprises various regional volumes including the hippocampus volume but not limited to (Dickerson et al., 2009). Joint models for multivariate longitudinal markers measuring the same underlying process were proposed but were limited to a univariate latent process (Proust-Lima et al., 2016; He and Luo, 2016; Luo, 2014). Finally, for the clinical manifestations at least, the usual assumption of normality for the continuous outcomes does not hold and normalizing transformations have to be incorporated in the model to be able to rely on standard linear mixed regressions (Proust-Lima et al., 2011).

This work aimed at developing a novel joint model for multivariate longitudinal markers (possibly non Gaussian) measuring several latent processes and one or several clinical events. The main originality is that we replaced the standard proportional hazard model by a latent process model for each clinical endpoint. In our context, dementia diagnosis is defined as a binary variable repeatedly collected at visits which becomes positive if its underlying continuous degradation process has passed above a threshold. Using this definition, clinically relevant for dementia diagnosis and many other clinical endpoints, the joint model has an exact likelihood which reduces the usual limitations in the number of longitudinal components. The replacement of the classical survival model by another model to provide exact likelihood had already been proposed in joint models with a univariate longitudinal Gaussian marker (

Barrett et al., 2015). In this work, the authors had opted for a sequential probit model defined in discrete time with which our approach has similarities.

Section 2 describes the joint model for multiple latent components and one clinical endpoint. Section 3 details the likelihood computation and Section 4 extends the model to other types of endpoints and competing endpoints. Section 5 validates the estimation procedure by simulations. Section 6 details an application to three clinical manifestations in dementia, cognitive functioning, functional dependency and depressive symptomatology, in link with dementia diagnosis and in the presence of competing risk of dementia-free death. Finally, we conclude in Section 7.

2 The joint model for multiple latent domains and a clinical endpoint

The joint model for multiple latent domains and a clinical endpoint is described in Figure 1 and formalized in Subsections 2.1, 2.2 and 2.3.

2.1 Latent domains measured by multivariate longitudinal markers

In a population of individuals, let consider latent domains (e.g., cognition, brain structure), each one defined for individual () as a latent process in continuous time with and . Each latent domain is measured through a battery of markers repeatedly measured over time (e.g., several cognitive tests, volumes of different brain regions). Let define the measure of marker () and individual for domain collected at time with .

Based on previous works (Proust-Lima et al., 2013), we assume that each marker, normalized by a parameterized link function, is a noisy measure of the underlying latent domain:



are centered independent Gaussian measurement errors with variance

and is the link function which transforms the marker into a Gaussian framework. This link function depends on parameters that are estimated on the data along with the other parameters. Any family of monotonic increasing parameterized transformations can be chosen for

, including the family of linear transformations to reduce to the standard Gaussian case. When departures from normality are suspected, we recommend the flexible and parsimonious family of linear combinations of quadratic I-splines

so that with the number of knots kept small for parsimony. I-splines are integrated M-splines which ensure the monotonicity of the link functions (Ramsay, 1988). Note that although not detailed here for sake of brievity, it is very easy to add marker-specific effects of covariates and/or random effects to this equation of observation (see Proust-Lima et al., 2013 for details).

2.2 Latent domains correlated trajectories

Each latent domain trajectory is described via a linear mixed model (

Laird and Ware, 1982):



is a vector of covariates associated with the vector of fixed effects

at the population level and is a vector of covariates associated with the vector of random effects at the individual level with . A zero-mean Gaussian process can be added to make the trajectory more flexible at the individual level; for example a Brownian motion with covariance structure is often relevant in aging studies (Proust et al., 2006; Ganiayre et al., 2008). In the following, the zero-mean Gaussian processes are kept independent from one latent domain to the other. The correlation between the latent domains is only captured by correlations between the domain-specific random effects so that each individual is characterized by an overall vector of random effects:


where is the covariance matrix between random effects of latent domains and (). To ensure the positiveness of , we did not consider the usual Cholesky transformation but focused on the variances parameterized by () and the correlation matrix of parameterized by non-diagonal generic element () for a correlation of .

2.3 Degradation process toward a clinical endpoint

We assume there exists a degradation process in continuous time for each individual denoted with . This degradation process is measured by repeated observations of the clinical status of interest, dementia diagnosis in our case, at time with the occasion () so that an individual has the positive clinical status if the degradation process with an additional noise has reached a specific threshold at the visit time:



is a zero-mean Gaussian independent random variable (

), is the threshold in the reference group and is a vector of covariates associated with parameters that can modulate the threshold defining the positive clinical status. Note that all observations are censored after a change to a positive clinical status.

The degradation process is defined as a linear combination of the latent domains under study:


where characterizes the intensity of latent domain contribution for individual . Such intensity can be contrasted according to covariates with in which is a vector of covariates (including the intercept) associated with the vector of parameters . For instance, in dementia, the contributions of latent domains may be modulated by factors linked to cognitive reserve.

3 Maximum Likelihood Estimation

The entire vector of parameters denoted comprises parameters , , , and all the parameters and constituting the matrix. It is estimated in the maximum likelihood framework.

3.1 Likelihood

Let denote the repeated observed diagnoses, the repeated and multivariate observed markers (with the vector of all the repeated and multivariate observations of latent domain ) and the set of latent processes at the observed diagnosis visits. The likelihood is with


where the vector of parameters is omitted for simplicity and

  • the marginal density of the repeated markers can be decomposed into the multivariate normal density of the transformed observations of the markers through the link functions times the Jacobian of the link functions . The multivariate normal density of the transformed observations has mean and variance with , and the D-block diagonal matrices with blocks and that comprise row vectors and for all the outcome-specific occasions , the covariance matrix of the Gaussian processes at the same visits and the diagonal variance matrix of measurement errors;

  • the conditional distribution of the repeated diagnoses is:

    • for subjects not diagnosed with dementia where is a

      -dimensional Gaussian cumulative distribution function with mean

      and variance computed in , , and is the identity matrix;

    • for those diagnosed with dementia; and are respectively and without the last row for ;

  • the conditional density is a multivariate normal density function with mean and variance where and are the D-block diagonal matrices with blocks and that comprise row vectors and at diagnosis visits (), the covariance matrix of the Gaussian processes at the same visits and the covariance matrix of the Gaussian processes between diagnosis visits and marker visits so that .

The joint density of the observations can thus be rewritten:


Instead of approximating the multivariate integral over

by numerical integration techniques (as mostly done in joint modelling area with generally Gaussian quadratures), we use the properties of the skewed normal variables which provide a closed form for the integrals involved in equation (

7) (Arnold, 2009). The joint density becomes:


3.2 Likelihood accounting for delayed entry

When necessary, delayed entry can be accounted for by dividing the likelihood by the probability of being at risk of the clinical endpoint at study entry that is

at time :




with the latent processes at entry, the threshold at entry, the vector of domain-specific contributions, and the D-block diagonal matrices with blocks and , and the variance of the Gaussian processes at entry for .

3.3 Likelihood Optimization and implementation

We chose to maximize the log-likelihood using a modified Marquardt algorithm, a Newton-like algorithm, with convergence criteria based on parameter and log-likelihood stability and derivatives size (see Proust-Lima et al., 2017 for details). The latter is defined at iteration as with and the gradient and the Hessian matrix of the log-likelihood at iteration , the length of , and the convergence threshold (fixed here at 0.001). This criterion is very stringent and ensures convergence toward a maximum. The program was implemented in C++ with an interface in R and parallel computations to fasten the estimation procedure. The multivariate normal cumulative distribution functions were computed by using Genz routines (Genz, 1992).

4 Extension to multiple clinical endpoints and continuous-time events

4.1 Multiple diagnoses at predefined visits

The definition of the degradation process and measurement model for a clinical event observed at specific visits such as diagnoses (Subsection 2.3) extends naturally to the case of two causes of diagnosis. The log-likelihood has the exact same structure except that the dimension of the cumulative distribution functions is augmented to the number of visits for each cause of clinical event, if the two diagnoses are made at the same times.

4.2 Event in continuous time

The model definition relies on discrete repeated visits for the clinical event and as such does not directly extend to clinical events that are intrinsequely defined in continuous time. With continuous-time events, a preliminary discretization of time is necessary: we partition time into intervals with midpoints for ( for ). A subject is followed from interval , the interval containing the exact entry time in the study, and is followed until interval : if the subject has the event, is such that the time of event is in ; if the subject is censored, is such that the censoring time is in . Without loss of generality, we focus on death and define the repeated death status in each interval () with for all except for those who die ().

Using the exact same definition as for other clinical diagnoses, we consider a latent degradation process defined as a linear combination of the latent domains:


We then define the probability of dying in interval as the probability that the noisy underlying degradation process is above a specific threshold at the midpoint of interval :


As previously, characterizes the intensity of latent domain contribution for individual and it can be contrasted according to covariates with in which is a vector of covariates (including the intercept) associated with the vector of parameters ; is a zero-mean Gaussian independent random variable (); is the threshold in the reference group and is a vector of covariates associated with parameters that can modulate the threshold defining the death status. In particular, may include time.

When this discretized event replaces clinical diagnosis, the likelihood has the same structure except that the dimension of the cumulative distribution function becomes . In case of delayed entry, the likelihood is divided by the probability to have survived until .

4.3 Clinical event in competition with death

When interested in a clinical diagnosis, death before diagnosis may be a relevant absorbing competing event. The methodology handles this by combining the two degradation processes of sections 2.3 and 4.2, and slightly change the observations for death. To focus on death before diagnosis, we only consider death in the x years following a negative diagnosis (x=3 years is realistic in the case of dementia). Otherwise, time to death is censored at the last visit .

With these changes, the joint log-likelihood has again the exact same structure as explained in section 3.1 except that the dimension of the cumulative distribution functions is augmented to the number of visits for the clinical event and the number of intervals for the discretized event . In case of delayed entry, the likelihood is divided by the probability to have survived until and be still at risk of clinical event at entry.

5 Validation of the estimation procedure by simulations

We validated the estimation procedure with a series of simulations for studying two longitudinal domains, cognition and functional dependency, measured by two psychometric tests and one scale, respectively. For the events, we considered two scenarios: clinical diagnosis of dementia (section 2.3), and discretized death (section 4.2).

5.1 Design and scenarios

We generated linear trajectories for cognition and function according to age and ajusted for binary educational level (0.5-probability Bernoulli) and their interaction for domain 1. Correlated random intercept and slope with age took into account the correlation within and between domains at the individual level. Entry in the cohort was generated from a normal distribution (mean 75, standard deviation 3). Observed visits were generated with a uniform distribution in [-1,1] years around the theoretical visits every 2.5 years up to 20 years. In scenario 1, dementia diagnosis was made at each visit. In scenario 2, death was discretized into 10 intervals with cutoffs at 65, 70, 75, 78, 81, 84, 87, 90, 93, 98, 105 years old. The two domains contributed to the clinical event model and the threshold was constant for diagnosis while it was a linear function of current age for death. For each scenario, longitudinal data were censored after the event. We considered Gaussian longitudinal markers with different ranges inspired by psychometric tests IST and DSST for cognition and IADL sumscore for function (see application Section

6). Parameters of the model roughly corresponded to those obtained on the application data.

5.2 Results

We considered 200 replicates of samples of 500 subjects. Table 1

summarizes the results when considering clinical diagnosis of dementia. In this scenario, the mean number of repeated markers and timepoints for dementia diagnosis was 6.7 and 299 subjects were diagnosed with dementia during the follow-up. The parameters were well estimated without bias and correct coverage rate which validated the estimation procedure. When considering death as the endpoint and a discretization in 10 intervals (mean number of repeated markers 5.9 and 418 deaths), parameters were again well estimated with negligeable bias and no departure from the expected 95% coverage rate of the 95% confidence interval (Table S1 in Supplementary Materials).

6 Application to clinical manifestations in dementia

Changes in various clinical measures such as cognitive tests, dependency scales or depressive symptomatology have been separately observed in prodromal dementia (Amieva et al., 2008) suggesting a possible concomittant role in the dementia process with a modulation of the intensity by educational level illustrating a potential compensatory mechanism. Our objective was to precisely investigate the role of cognition, dependency and depression in the degradation process toward dementia by jointly analyzing their trajectories and their determinants in link with dementia diagnosis and accounting for the competing death.

6.1 The PAQUID data

We relied on the data from the population-based PAQUID cohort which included 3777 individuals aged 65 years and older and living at home in southwestern France in 1989-1990. Individuals were then followed for up to 25 years with repeated neuropsychological evaluations and clinical diagnoses of dementia every 2 or 3 years (see Letenneur et al., 1994 for details) and death continuously recorded. We focused on a subsample of 646 individuals who were tested for ApoE4, the main genetic factor associated with aging. We excluded 22 individuals diagnosed with dementia at baseline and 31 individuals who did not have at least one observation of each clinical measure. The final sample consisted of 593 individuals among which 180 developed a dementia and 283 died in the three years following a negative dementia diagnosis. The sample comprised 332 (56%) women, 438 (74%) individuals with higher education level (EL+; individuals who graduated from primary school) and 130 (22%) carriers of at least one copy of the APOE4 allele. The mean age at entry was 73.6 (SD=6.1) years.

The three clinical manifestations were:

  • Cognitive impairment; it was assessed by four psychometric tests (inverted so that higher levels indicated higher impairment). The Mini-Mental State Examination (MMSE) provides an index of global cognitive performance, the Benton Visual Retention Test (BVRT) assesses visual memory, the Isaacs Set Test (IST) measures verbal semantic memory and speed, and the Digit Symbol Substitution Test (DSST) provides a global measure of executive functioning and processing speed.

  • Functional dependency; it was assessed by the French version of the Instrumental Activities of Daily Living (IADL). We summed the grades of dependency for four activities, telephone use, transportation, medication and domestic finances.

  • Depressive symptomatology; it was assessed by the sumscore of the Center for Epidemiologic Studies Depression Scale (CES-D).

Individuals had between 1 and 12 repeated measures of each marker with a median of 6 (Interquartile range IQR=4-8) for MMSE, IADL and dementia diagnosis, 5 (IQR=4-8) for CES-D, 5 (IQR=3-8) for BVRT, 4 (IQR=3-7) for IST and 4 (IQR=2-6) for DSST.

6.2 Model specification

The structure of the model is summarized in Figure 2 and specifics are given below.

Quadratic trajectories according to age were assumed for each domain with an adjustment for age at entry, EL+, ApoE4 and gender (and their interactions with age and age squared), and three individual random effects (on intercept, slope and slope squared) correlated within and between domains. An additional time dependent variable indicating the baseline evaluation was included due to evidence of a primo passation negative effect. The selection of covariates and interactions was determined in separate analyses by domain with a 20% significance level.

Marker-specific splines transformations

were used for normalizing the markers with 3 internal knots chosen at the quartiles of the marker distribution over follow-ups. The relevance of the splines transformations was checked in domain-specific analyses by comparing the Akaike criterion (AIC) with the AIC of the model assuming a linear transformation.

Degradation process toward dementia was modelled according to the three domains with an adjustment for age at entry, gender, ApoE4 carriers and EL+ and a potential change in the threshold of dementia after the 10-year follow-up visit (due to a new drug on the market that implied earlier diagnoses).

Degradation process toward dementia-free death was also modelled according to the three domains. We discretized death into 8 intervals with boundaries at 65, 70, 74, 77, 80, 83, 86, 90 and 104 years chosen according to the distribution of death times. The threshold defining the probability of dying was modelled according to age using natural cubic B-splines with knots at 70, 83, 90 and 100 years.

Estimation of the whole model was done step by step by fitting domain-specific mixed models separately then jointly and finally with dementia and death models to provide reasonable initial values and reduce computational time.

6.3 Results

Fixed effects of the multivariate mixed model for cognition, functional dependency and depressive symptoms are given in Table 2 and predicted trajectories by covariate profile are displayed in Figure S1 of Supplementary Materials. In summary, each domain had a quadratic trajectory with age characterized by an acceleration in older ages. Individuals included at an older age were systematically more impaired than those included at an younger age (whatever the current age) highlighting a cohort effect. Evaluations at baseline underestimated cognitive level and depressive symptoms while overestimating functional dependency. More educated individuals had better cognitive and functional levels. ApoE4 carriers had a faster increase of cognitive and functional impairments. Finally, men had a lower level of depressive symptoms at 65 years but the difference with women reduced when age increased. Men also had a slower increase of functional dependency.

Estimates of the submodel for the degradation process toward dementia are given in Table 3. When not adjusting for covariates (right panel), only cognitive and functional domains significantly contributed to the degradation process toward dementia with increased impairments associated with higher degradation of the dementia process. When adjusting the threshold of dementia and the contributions for covariates (left panel), cognitive and functional domains still highly contributed to the degradation process toward dementia with a higher weight of function for ApoE4 carriers than for non carriers. Depressive symptoms still did not significantly contribute to the degradation process toward dementia among individuals with a high educational level but it contributed significantly among those with a low educational level; in this group, higher depressive symptoms induced a lower level of the degradation process for the same level of cognition and function. This may be explained by the cognitive reserve linked with education. Among individuals who did not reach a sufficient educational level, having higher depressive symptoms alters the cognitive evaluation while individuals who reached a sufficient educational level are able to compensate and properly pass the cognitive evaluation. Finally, the threshold at which dementia is diagnosed differed according to covariates, with dementia diagnosed at a lower level of degradation for those with a higher educational level, a younger age or a diagnosis made after the 10year visit. No significant difference was found according to gender or ApoE4 status.

Figure 3 displays the mean predicted trajectory of the degradation process according to education, age at entry and ApoE4 status. Note that here, the degradation process was recentered by absorbing the covariate specific threshold so that dementia is diagnosed above 0. ApoE4 status constitutes the main modulating factor of the trajectory toward dementia with ApoE4 carriers diagnosed with a dementia 5 to 6 years before ApoE4 non carriers. In contrast, despite a different structure of the degradation process, limited differences were found according to education highlighting that the differential contribution of depressive symptoms mainly served as a compensating factor in the degradation process definition.

6.4 Supplementary results and analyses

The estimated splines transformations linking each marker to its underlying domain (Figure S2 in Supplementary Materials) exhibited clear nonlinear relationships except for IST and BVRT cognitive tests. This illustrates the varying sensitivity to change of scales observed in previous works (Proust-Lima et al., 2011) and the relevance of taking such nonlinear relationships into account. The correlation matrix between the 9 individual random effects also exhibited very high correlations between the level at 65 years, age and age within and between cognitive, functional and depression domains with correlations up to 0.90 between slopes of cognitive and functional domains (Figure S3 in Supplementary Materials). We checked the goodness-of-fit of the model by verifying that the individual predictions were closed enough to the observations in the multivariate mixed submodel (Figure S4 in Supplementary Materials). We also verified that the model for dementia-free death considering thresholds approximated by splines was flexible enough by comparing it with a model considering interval-specific probability of death (Figure S5 in Supplementary Materials).

7 Discussion

We proposed a novel joint model for multiple longitudinal dimensions and clinical endpoints. Initially motived by the study of one repeated binary clinical endpoint measured at predefined visits, the model also applies to continuous endpoints such as death (provided they can be discretized) and competing risk setting. The complexity in joint models, as induced for instance by multiple longitudinal markers, usually orientates the model development toward Bayesian approaches which better tackle numerical problems. An originality of this work is that the estimation in the frequentist framework was made possible thanks to properties of skewed gaussian distributions which avoided the limiting multiple numerical integration over the random effects usually encountered in joint shared random effect models (

Rizopoulos et al., 2009). As such, this approach can be applied to contexts where the number of random effects becomes substantial (such as 9 in the application) and/or correlated gaussian processes are added to relax the model. Being able to jointly model a substantial number of markers becomes indeed a real challenge with the growing availibility of dynamic markers in longitudinal health studies.

The properties of skewed gaussian distributions had already been exploited to simplify inference in joint models in the case of a single repeated biomarker and a single continuous time to event (Barrett et al., 2015). They had opted for a sequential probit model for the discretized time where we opted for a degradation process model (Section 4.2). Although different in their definition, the two approaches are numerically equivalent in the absence of delayed entry as shown in Appendix 1 of Supplementary Materials. Ganiayre et al. (2008) had already proposed to define a clinical endpoint as the passing above/below a latent process but not in the multivariate case as here and not profiting from an exact likelihood formulation.

The approach has however several limits. Although elegant, the likelihood technique considered here is limited to manageable dimensions of repeated clinical endpoints at the individual level. The likelihood calculation relies on algorithms to compute the multivariate normal cumulative distribution function. Their good precision may become questionable beyond very large dimensions which probably limits the methodology to no more than two competing events, as explored in the application. In our simulations with up to 10 intervals for death by individuals, inference remained very good and with a few dimensions more, we found that calculations remained accurate. Such limitation remains at the individual level, not at the population level which may still include a lot more intervals and/or diagnosis visits. In addition, we believe this is a reasonable concession in many applications where multivariate longitudinal dimensions with a large amount of random effects are to be modelled and could not by using usual numerical approaches. A second limitation is that the methodology does not apply to repeated binary, ordinal or count markers. However, by using parameterized link functions, we can handle continuous non Gaussian markers which permits to correctly analyze many asymmetric scales (Proust-Lima et al., 2011).

By handling multivariate repeated markers and clinical endpoint, this joint model opens to many applications in which identifying markers of clinical progression are of importance. The definition of the clinical endpoint differs from most joint model proposals but it is actually clinically relevant in many diseases characterized, as in neurodegenerative diseases, by a body of progressive impairments. We focused here only on the understanding of the degradation process but the model has also some potential for prediction since the computation of individual dynamic predictions will benefit from the same computational advantages.


Conflict of Interest: None declared.
This work was supported by project SMALA (ANR-15-CE37-0002) of the French National Research Agency (ANR) and project 2CFaAL of France Alzheimer Association. Computer time was provided by the computing facilities MCIA (Mésocentre de Calcul Intensif Aquitain) at the Université de Bordeaux and the Université de Pau et des Pays de l’Adour.


  • Albert and Shih (2010) Albert, P. S. and Shih, J. H. (2010). An approach for jointly modeling multivariate longitudinal measurements and discrete time-to-event data. The annals of applied statistics 4, 1517–1532.
  • Amieva et al. (2008) Amieva, H., Le Goff, M., Millet, X., Orgogozo, J.-M., Pérès, K., Barberger-Gateau, P., Jacqmin-Gadda, H., and Dartigues, J.-F. (2008). Prodromal Alzheimer’s disease: successive emergence of the clinical symptoms. Annals of neurology 64, 492–498.
  • Andrinopoulou et al. (2014) Andrinopoulou, E.-R., Rizopoulos, D., Takkenberg, J. J. M., and Lesaffre, E. (2014). Joint modeling of two longitudinal outcomes and competing risk data. Statistics in Medicine 33, 3167–3178.
  • Arnold (2009) Arnold, B. C. (2009). Flexible univariate and multivariate models based on hidden truncation. Journal of Statistical Planning and Inference 139, 3741–3749.
  • Baghfalaki et al. (2014) Baghfalaki, T., Ganjali, M., and Berridge, D. (2014). Joint modeling of multivariate longitudinal mixed measurements and time to event data using a Bayesian approach. Journal of Applied Statistics 41, 1934–1955.
  • Barrett et al. (2015) Barrett, J., Diggle, P., Henderson, R., and Taylor-Robinson, D. (2015). Joint modelling of repeated measurements and time-to-event outcomes: flexible model specification and exact likelihood inference. Journal of the Royal Statistical Society. Series B, Statistical Methodology 77, 131–148.
  • Chi and Ibrahim (2006) Chi, Y.-Y. and Ibrahim, J. G. (2006). Joint Models for Multivariate Longitudinal and Multivariate Survival Data. Biometrics 62, 432–445.
  • Choi et al. (2014) Choi, J., Anderson, S. J., Richards, T. J., and Thompson, W. K. (2014). Prediction of transplant-free survival in idiopathic pulmonary fibrosis patients using joint models for event times and mixed multivariate longitudinal data. Journal of applied statistics 41, 2192–2205.
  • Dickerson et al. (2009) Dickerson, B. C., Bakkour, A., Salat, D. H., Feczko, E., Pacheco, J., Greve, D. N., Grodstein, F., Wright, C. I., Blacker, D., Rosas, H. D., Sperling, R. A., Atri, A., Growdon, J. H., Hyman, B. T., Morris, J. C., Fischl, B., and Buckner, R. L. (2009). The cortical signature of Alzheimer’s disease: regionally specific cortical thinning relates to symptom severity in very mild to mild AD dementia and is detectable in asymptomatic amyloid-positive individuals. Cerebral Cortex (New York, N.Y.: 1991) 19, 497–510.
  • Ferrer et al. (2016) Ferrer, L., Rondeau, V., Dignam, J., Pickles, T., Jacqmin-Gadda, H., and Proust-Lima, C. (2016). Joint modelling of longitudinal and multi-state processes: application to clinical progressions in prostate cancer. Statistics in Medicine 35, 3933–3948.
  • Ganiayre et al. (2008) Ganiayre, J., Commenges, D., and Letenneur, L. (2008). A latent process model for dementia and psychometric tests. Lifetime data analysis 14, 115–133.
  • Genz (1992) Genz, A. (1992). Numerical computation of multivariate normal probabilities. Journal of Computational and Graphical Statistics 1, 141–149.
  • Graham et al. (2011) Graham, P. L., Ryan, L. M., and Luszcz, M. A. (2011).

    Joint modelling of survival and cognitive decline in the Australian Longitudinal Study of Ageing.

    Journal of the Royal Statistical Society: Series C (Applied Statistics) 60, 221–238.
  • He and Luo (2016) He, B. and Luo, S. (2016). Joint modeling of multivariate longitudinal measurements and survival data with applications to Parkinson’s disease. Statistical methods in medical research 25, 1346–1358.
  • Hickey et al. (2016) Hickey, G. L., Philipson, P., Jorgensen, A., and Kolamunnage-Dona, R. (2016). Joint modelling of time-to-event and multivariate longitudinal outcomes: recent developments and issues. BMC Medical Research Methodology 16,.
  • Jack et al. (2013) Jack, C. R., Knopman, D. S., Jagust, W. J., Petersen, R. C., Weiner, M. W., Aisen, P. S., Shaw, L. M., Vemuri, P., Wiste, H. J., Weigand, S. D., Lesnick, T. G., Pankratz, V. S., Donohue, M. C., and Trojanowski, J. Q. (2013). Tracking pathophysiological processes in Alzheimer’s disease: an updated hypothetical model of dynamic biomarkers. Lancet Neurology 12, 207–216.
  • Laird and Ware (1982) Laird, N. M. and Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974.
  • Letenneur et al. (1994) Letenneur, L., Commenges, D., Dartigues, J.-F., and Barberger-Gateau, P. (1994). Incidence of dementia and Alzheimer’s disease in elderly community residents of south-western France. Int J Epidemiol 23, 1256–61.
  • Li et al. (2012) Li, N., Elashoff, R. M., Li, G., and Tseng, C.-H. (2012). Joint analysis of bivariate longitudinal ordinal outcomes and competing risks survival times with nonparametric distributions for random effects. Statistics in medicine 31, 1707–1721.
  • Lin et al. (2002) Lin, H., McCulloch, C. E., and Mayne, S. T. (2002). Maximum likelihood estimation in the joint analysis of time-to-event and multiple longitudinal variables. Statistics in Medicine 21, 2369–2382.
  • Luo (2014) Luo, S. (2014). A Bayesian approach to joint analysis of multivariate longitudinal data and parametric accelerated failure time. Statistics in medicine 33, 580–594.
  • Pantazis et al. (2005) Pantazis, N., Touloumi, G., Walker, A. S., and Babiker, A. G. (2005). Bivariate modelling of longitudinal measurements of two human immunodeficiency type 1 disease progression markers in the presence of informative drop-outs. Journal of the Royal Statistical Society: Series C (Applied Statistics) 54, 405–423.
  • Proust et al. (2006) Proust, C., Jacqmin-Gadda, H., Taylor, J. M. G., Ganiayre, J., and Commenges, D. (2006). A nonlinear model with latent process for cognitive evolution using multivariate longitudinal data. Biometrics 62, 1014–1024.
  • Proust-Lima et al. (2013) Proust-Lima, C., Amieva, H., and Jacqmin-Gadda, H. (2013). Analysis of multivariate mixed longitudinal data: a flexible latent process approach. The British journal of mathematical and statistical psychology 66, 470–487.
  • Proust-Lima et al. (2011) Proust-Lima, C., Dartigues, J.-F., and Jacqmin-Gadda, H. (2011). Misuse of the linear mixed model when evaluating risk factors of cognitive decline. American journal of epidemiology 174, 1077–1088.
  • Proust-Lima et al. (2016) Proust-Lima, C., Dartigues, J.-F., and Jacqmin-Gadda, H. (2016). Joint modeling of repeated multivariate cognitive measures and competing risks of dementia and death: a latent process and latent class approach. Statistics in Medicine 35, 382–398.
  • Proust-Lima et al. (2017) Proust-Lima, C., Philipps, V., and Liquet, B. (2017). Estimation of Extended Mixed Models Using Latent Classes and Latent Processes: The R Package lcmm. Journal of Statistical Software, Articles 78, 1–56.
  • Ramsay (1988) Ramsay, J. O. (1988). Monotone Regression Splines in Action. Statistical Science 3, 425–441.
  • Rizopoulos et al. (2009) Rizopoulos, D., Verbeke, G., and Lesaffre, E. (2009). Fully exponential Laplace approximations for the joint modelling of survival and longitudinal data. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71, 637–654.
  • Tang and Tang (2015) Tang, A.-M. and Tang, N.-S. (2015).

    Semiparametric Bayesian inference on skew-normal joint modeling of multivariate longitudinal and survival data.

    Statistics in Medicine 34, 824–843.