Alzheimer’s disease (AD) is the most common cause of dementia in the aged population [12]. To slow disease progression and begin therapeutic treatment at the earliest stage, it is vital to identify AD-related pathological biomarkers of progression and to diagnose AD early. A considerable amount of research has been devoted to the use of structural magnetic resonance imaging (MRI) for early-stage AD diagnosis; see, e.g., [6, 7]. Structural MRI provides measures of cerebral atrophy, which have been shown to be closely coupled with clinical symptoms in AD.
Most work in the literature focuses on cross-sectional studies with MRI collected at a single time point; see, e.g., [1, 8, 20]. However, cross-sectional studies can be insensitive to early pathological changes. As an alternative, longitudinal analysis of structural abnormalities has recently attracted attention [2, 24, 26]. Most existing longitudinal studies focus on the atrophy of a few well-known biomarkers such as the hippocampus, entorhinal cortex, and ventricular cortex. However, these prespecified regions of interest (ROIs) may be insufficient to capture the full pattern of morphological abnormality in brain MRI. In addition, several challenges remain in longitudinal analysis. First, longitudinal scans are usually inconsistent across subjects: subjects can have different scanning times and different total numbers of scans. Second, the total number of ROIs in the brain is large compared with the number of subjects, which makes it difficult to select AD-related longitudinal biomarkers from the whole brain. Third, the rates of longitudinal change differ across ROIs, and this heterogeneity should be accounted for when modeling progression.
The goal of this paper is to identify important AD-related ROIs in the whole brain from longitudinal MRI data and to use the selected ROIs for AD prediction. Specifically, we use the varying coefficient model [3] to characterize the heterogeneous changes of different ROIs in structural MRI. This model also allows a nonlinear functional relationship between MRI and clinical cognitive function. We propose a novel feature selection method that combines smoothing splines with an $\ell_1$-penalty and can simultaneously select AD-related ROIs and estimate their coefficients. We provide an efficient algorithm to implement the proposed method. Prediction is then performed using the selected longitudinal features and the estimated varying coefficients. Our method is robust to inconsistency among longitudinal scans and adapts to the heterogeneity of changes in different ROIs. The use of varying coefficient models is motivated by the hypothetical dynamic biomarker curves for AD proposed in [6, 7], whose principle is that the rates of change over time of MRI and clinical cognitive function are temporally ordered. Hence, the functional relationship between MRI atrophy and the change in clinical cognitive function is expected to be nonlinear in time.
To evaluate our method, we perform experiments using data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). We predict future clinical changes of mild cognitive impairment (MCI) subjects from brain MRI data. MCI is a prodromal stage of AD. Predicting clinical changes helps to determine whether an MCI subject will convert to AD at a future time point, which is vital for the early diagnosis of AD.
Different feature selection. We propose a novel feature selection method that combines smoothing splines with an $\ell_1$-penalty, which allows features to be selected and estimated simultaneously. This differs from the two-step method in , which performs selection and estimation separately, and from [2, 24], which use only pre-selected features.
The varying coefficient model [3] can describe time-dependent covariate effects on the responses. Given scaled time , the response functional is related to covariates through
where the centered noise process is independent of s. The model (2.1) allows a nonlinear relationship between s and by letting the coefficients s vary on . On the other hand, (2.1) has an additive structure in the covariates s, which enables efficient estimation of the coefficients s.
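To make the structure of such a model concrete, the following minimal sketch simulates one subject from a varying coefficient model of the generic form $y(t) = \sum_j \beta_j(t)\, x_j(t) + \varepsilon(t)$, in the sense of [3]. The symbols, the coefficient functions `beta_1` and `beta_2`, the covariate values, and the noise level are illustrative assumptions and are not taken from this paper.

```python
import math
import random

# Hypothetical coefficient functions on scaled time t in [0, 1].
def beta_1(t):
    return math.sin(math.pi * t)      # effect peaks mid-course

def beta_2(t):
    return t ** 2                     # effect grows late

BETAS = [beta_1, beta_2]

def simulate_response(times, covariates, noise_sd=0.0, seed=0):
    """Return y(t) = sum_j beta_j(t) * x_j(t) + eps(t) at each visit time.

    covariates[k][j] is the value of covariate j at visit k.
    """
    rng = random.Random(seed)
    responses = []
    for t, x in zip(times, covariates):
        signal = sum(b(t) * xj for b, xj in zip(BETAS, x))
        responses.append(signal + rng.gauss(0.0, noise_sd))
    return responses

# One subject with three visits and two covariates held constant over time.
times = [0.0, 0.5, 1.0]
covariates = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
y = simulate_response(times, covariates)
print(y)
```

Although the covariates in this toy example are constant across visits, the noise-free response still changes with time, which a model with constant coefficients cannot reproduce.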
In practice, data are obtained for subject at time , where , and . Note that and s are allowed to be different for different subjects . Denote and let be the response for subject at time , then (2.1) implies
The structure of heterogeneous longitudinal data is illustrated in Figure 1, where some subjects may have missing feature values at certain time points. The number of covariates in (2.2) can be larger than the sample size , in which case (2.2) becomes a high-dimensional model. Since some covariates may be irrelevant to the response, we want to select the important covariates s based on data (2.2) and use the selected covariates for prediction.
We propose a new method to simultaneously select covariates and estimate their corresponding varying coefficients as follows. Assume that varying coefficients reside in a reproducing kernel Hilbert space (RKHS) with the reproducing kernel . Find and to minimize
where and is the RKHS norm. The first term in (2.3) measures the goodness of fit to the data, and the second term induces selection through the $\ell_1$-like penalty . We first provide the following theorem, which establishes the existence of a minimizer of (2.3).
Theorem 2.1. There exists a minimizer of (2.3) that is in the domain and .
The proof of this theorem is given in Appendix C. The variable selection method (2.3) is new in the literature, and it is efficient to optimize because it is convex in s and has only one tuning parameter . We provide an algorithm in Appendix D.
is the same as the smoothing spline criterion in nonparametric statistics [22], and the last term
is the same as the Lasso penalty [17] for the weights s.
3 Experiment Results
|Male/Female||44 / 30||61 / 37|
|Age (years)||73.03 ± 6.65||74.35 ± 7.47|
|Edu. (years)||15.51 ± 3.05||15.59 ± 3.07|
In this section, we predict future clinical changes of MCI subjects with real data from the ADNI database. A detailed description of the ADNI database is given in Appendix A. MCI is a prodromal stage of AD. Generally, some MCI subjects will convert to AD after a certain time (MCI converters, MCI-C for short), while others will not (MCI non-converters, MCI-NC for short). Predicting the clinical change of an MCI subject helps to determine whether the subject will convert to AD at a future time point, which is a central task in the early diagnosis of AD. We summarize the baseline demographic information of the ADNI subjects studied here in Table 1.
The preprocessing steps for the brain MR images used here are described in Appendix B. Specifically, we have a total of 324 ROIs for each image. For MCI subjects, MRI scans were scheduled at baseline (bl), 6 months (M06), one year (M12), 18 months (M18), two years (M24), three years (M36), and four years (M48). However, some subjects missed a few visits and hence do not have MRI scans at those time points. We choose the MCI subjects who have M48 imaging data. Table 2 lists the distribution of visit times for these 172 MCI subjects; for example, 6 of the MCI-C subjects made at most 3 of the six scheduled visits (bl, M06, M12, M18, M24, M36) and therefore have at most 3 longitudinal MRI scans.
Our goal is to use longitudinal information (from bl up to M36) to predict the clinical changes of MCI subjects at M48. Since empirical evidence suggests that the rates of change over time of structural MRI and clinical cognitive function are temporally ordered (see, e.g., [6, 7]), a nonlinear model is needed for the functional relationship between MRI atrophy and the change in clinical cognitive function. Hence, the varying coefficient model (2.1) is used. We choose the Alzheimer’s Disease Assessment Scale – Cognitive Subscale (ADAS-Cog) as the response clinical cognitive test score; it ranges from 70 (severe cognitive impairment) to 0 (no cognitive impairment). The ADAS-Cog measures disturbances of memory, language, and other cognitive abilities. The covariates s include 324 MR imaging ROIs and 3 demographic covariates: age, gender, and years of education. The index in (2.1) should be identifiable, and we let be the scaled time relative to the time at which the subject entered the ADNI study. Figure 2 gives the flowchart of our method.
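As a small illustration of a scaled time index, the sketch below maps ADNI visit codes to months since study entry and rescales them to the unit interval. The 48-month horizon used for the rescaling and the names `VISIT_MONTHS` and `scaled_time` are assumptions made for this example; the paper does not state the exact scaling.

```python
# Months since study entry for each scheduled visit code.
VISIT_MONTHS = {"bl": 0, "M06": 6, "M12": 12, "M18": 18,
                "M24": 24, "M36": 36, "M48": 48}

def scaled_time(visit, horizon_months=48):
    """Map a visit code to time since study entry, rescaled to [0, 1].

    horizon_months is an assumed normalizing constant (the last visit).
    """
    return VISIT_MONTHS[visit] / horizon_months

print([scaled_time(v) for v in ("bl", "M12", "M48")])   # → [0.0, 0.25, 1.0]
```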
We build six models by using six different levels of longitudinal information:
Model 1: bl.
Model 2: bl+M06 (including subjects with a missing scan at bl).
Model 3: bl+M06+M12 (including subjects with missing scans at bl or M06).
Model 4: bl+M06+M12+M18 (including subjects with missing scans at bl, M06, or M12).
Model 5: bl+M06+M12+M18+M24 (including subjects with missing scans at bl, M06, M12, or M18).
Model 6: bl+M06+M12+M18+M24+M36 (including subjects with missing scans at bl, M06, M12, M18, or M24).
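The nested structure of the six models can be sketched as follows. The dictionary-based subject record and the function names `visits_for_model` and `training_scans` are hypothetical conveniences for this illustration; only the visit codes come from the text.

```python
VISITS = ["bl", "M06", "M12", "M18", "M24", "M36"]

def visits_for_model(level):
    """Visits available to Model `level` (1 to 6): the first `level` scheduled visits."""
    if not 1 <= level <= len(VISITS):
        raise ValueError("level must be between 1 and 6")
    return VISITS[:level]

def training_scans(subject_scans, level):
    """Keep a subject's scans at the visits used by Model `level`.

    subject_scans maps a visit code to that visit's feature vector; visits
    the subject missed are simply absent, so no imputation is performed.
    """
    allowed = visits_for_model(level)
    return {v: subject_scans[v] for v in allowed if v in subject_scans}

# Hypothetical subject who missed bl, M12, and M24.
subject = {"M06": [0.41, 1.30], "M18": [0.39, 1.27], "M36": [0.35, 1.21]}
print(training_scans(subject, 4))   # scans at M06 and M18 only
```

Under this representation a subject contributes whatever scans exist within the window of a given model, which is how subjects with missing visits can still be included.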
Following the flowchart in Figure 2, we first apply the feature selection method (2.3) for each of the six models. In each experiment, we randomly leave out half of the samples in both MCI-C and MCI-NC for prediction. For the training of each model, 10-fold cross-validation is performed to select the tuning parameter in (2.3). The experiments are repeated 100 times. We summarize the prediction results in Figure 3. Figure 3 shows that longitudinal data improve the prediction results compared with using baseline information only, and that prediction improves as more longitudinal data are included. We also observe that the prediction results for MCI-NC are slightly better than those for MCI-C, which can be explained by the fact that MCI-NC subjects have a more stable clinical status and less varied clinical scores.
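The resampling protocol described above (hold out half of each group, choose the tuning parameter by 10-fold cross-validation on the remainder, repeat 100 times) can be sketched as follows. The cohort, its class sizes, and the function names are placeholders for illustration, and the model fitting itself is omitted.

```python
import random

def stratified_half_split(labels, rng):
    """Hold out half of the subjects in each class for prediction.

    labels maps subject id -> class label (e.g. "MCI-C" or "MCI-NC").
    Returns (train_ids, test_ids).
    """
    train, test = [], []
    for cls in sorted(set(labels.values())):
        ids = sorted(s for s, c in labels.items() if c == cls)
        rng.shuffle(ids)
        half = len(ids) // 2
        test.extend(ids[:half])
        train.extend(ids[half:])
    return train, test

def k_fold_indices(n, k, rng):
    """Partition range(n) into k folds of near-equal size."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]

rng = random.Random(0)
# Hypothetical cohort; the class sizes are placeholders.
labels = {f"s{i:03d}": ("MCI-C" if i < 8 else "MCI-NC") for i in range(20)}
for replicate in range(100):
    train_ids, test_ids = stratified_half_split(labels, rng)
    folds = k_fold_indices(len(train_ids), 10, rng)
    # Fit on train_ids, choosing the tuning parameter over `folds`,
    # then score predictions on test_ids (model fitting omitted here).
```

Splitting within each class keeps the proportion of converters and non-converters the same in the training and held-out halves.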
We give examples of selected features in Figure 4. These are four ROIs that are consistently selected in Model 6 across the 100 experiments. Figure 4 shows the varying coefficients of these ROIs. Specifically, gender is an important factor, and different ROIs have different functional relationships with clinical function (i.e., the maximum effect of each biomarker occurs at a different point in the course of disease progression). This is consistent with the evidence and hypothesis in [15, 16] that atrophy does not affect all regions of the brain simultaneously, but perhaps in a sequential manner.
Now we compare our method (2.3) with two other state-of-the-art methods:
The longitudinal analysis in [2], which uses only the hippocampal volume shrinkage rate as the feature.
Since the methods in [2, 26] require the same scanning times and the same number of scans across samples, we run Models 1–6 for AD prediction on samples with no missing visits. In each experiment, we randomly leave out half of the samples in both MCI-C and MCI-NC for prediction. For the training of each model, 10-fold cross-validation is performed to select the tuning parameters in (2.3) and in [2, 26]. The experiments are repeated 100 times. The prediction comparison results for MCI-C are summarized in Figure 5 and those for MCI-NC in Figure 6. Our proposed method consistently achieves better prediction performance for both MCI-C and MCI-NC. We attribute this to the modeling of the nonlinear progression of longitudinal features and to the selection of important features from the whole brain, instead of relying on a prespecified feature for prediction.
We study a framework that integrates longitudinal features from structural MR images for AD prediction based on varying coefficient models. We propose a novel variable selection method that combines smoothing splines and the Lasso, which enables simultaneous selection and estimation and adapts to heterogeneous longitudinal data. To illustrate the effectiveness of the proposed method, we conduct experiments with the ADNI dataset and show that the proposed method outperforms state-of-the-art longitudinal analysis methods.
To our knowledge, our work is the first to model nonlinear progressions of longitudinal features together with a variable selection method for the high-dimensional setting. The method shows superior performance in AD prediction with real data, and it can be readily applied to other longitudinal data analysis problems.
There are many interesting future directions. For example, we use only MR images for AD prediction in this paper. It is of interest to apply the proposed method to integrate multi-modal data including MRI, PET, and functional MRI. We expect that integrating multi-modal information would further improve the accuracy of AD prediction.
Appendix A ADNI Database Description
The Alzheimer’s Disease Neuroimaging Initiative (ADNI) was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and non-profit organizations, as a $60 million, five-year public-private partnership.
The Principal Investigator of ADNI is Michael W. Weiner, MD, VA Medical Center and University of California – San Francisco. ADNI is the result of efforts of many coinvestigators from a broad range of academic institutions and private corporations. ADNI recruited from over 50 sites across the U.S. and Canada. The initial phase of ADNI recruited 800 adults, aged 55 to 90 and having a study partner able to provide an independent evaluation of functioning, to participate in the research. Among them, there are approximately 200 healthy control older individuals to be followed for 3 years, 400 people with MCI to be followed for 3 years, and 200 people with early AD to be followed for 2 years. See www.adni-info.org for up-to-date information.
The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. Criteria for the different diagnostic groups are summarized in Table 3. Cognitively healthy control (HC) subjects must have no significant cognitive impairment or impaired activities of daily living. Clinically diagnosed Alzheimer’s disease (AD) patients must have mild AD and must meet the National Institute of Neurological and Communicative Disorders and Stroke–Alzheimer’s Disease and Related Disorders Association (NINCDS/ADRDA) criteria for probable AD [11]. Mild cognitive impairment (MCI) subjects should meet defined criteria for MCI but not the criteria in [11], and should have largely intact general cognition as well as functional performance. Study subjects should have given written informed consent at the time of enrollment for imaging and genetic sample collection and have completed questionnaires approved by each participating site’s Institutional Review Board (IRB).
|Delayed recall Logical||16 Edu:||16 Edu:||16 Edu:|
|Memory II subscale of||8-15 Edu:||8-15 Edu:||8-15 Edu:|
|WMSR||0-7 Edu:||0-7 Edu:||0-7 Edu:|
Appendix B Preprocessing of the brain MRI used here
The structural MRI data used in this study are cortical gray matter volumes processed with the longitudinal image processing framework of FreeSurfer version 4.4 (https://surfer.nmr.mgh.harvard.edu/) (the “ucsffsl” file). This dataset has been used in, for example, [9, 18, 19]. Specifically, subjects with a 1.5-T MRI were included in the dataset, where the scans were preprocessed by correction methods including gradwarp, B1 calibration, N3 correction, and skull-stripping (see, e.g., [4] for details), and FreeSurfer 4.4 implements the symmetric registration [13] and unbiased robust template estimation [14]. Only MRIs that passed quality control for all areas were included in our study. FreeSurfer 4.4 creates a total of 393 ROIs for each brain MRI, consisting of volumes of brain regions obtained after cortical parcellation and white matter parcellation, surface areas of the brain regions, and cortical thicknesses of the brain regions. However, some ROIs have a high proportion of missing values across samples because of the preprocessing. In Section 3 of the paper, we use the 324 ROIs whose proportion of missing values across the preprocessed samples does not exceed a fixed threshold.
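The ROI screening step can be sketched as a simple filter on the fraction of missing values per ROI. The threshold `max_missing_frac`, the use of `None` to mark a missing value, and the toy data are assumptions for this illustration; the actual cutoff used in the study is not restated here.

```python
def keep_rois(samples, max_missing_frac):
    """Return indices of ROIs whose missing fraction is <= max_missing_frac.

    samples is a list of feature vectors; None marks a missing ROI value.
    """
    n = len(samples)
    n_rois = len(samples[0])
    kept = []
    for j in range(n_rois):
        n_missing = sum(1 for row in samples if row[j] is None)
        if n_missing / n <= max_missing_frac:
            kept.append(j)
    return kept

# Toy data: 4 scans, 3 ROIs; ROI 1 is missing in 3 of 4 scans.
scans = [
    [1.2, None, 0.7],
    [1.1, None, 0.8],
    [1.3, 2.0, None],
    [1.0, None, 0.9],
]
print(keep_rois(scans, max_missing_frac=0.25))   # → [0, 2]
```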
Appendix C Proof of Theorem 2.1
Denote by the functional to be minimized in (2.3). It is clear that is convex and continuous in s. Denote by , and without loss of generality, we assume . Denote by and . By the Cauchy-Schwarz inequality, for any , ,
Denote . Consider the set
Since is a closed, convex, and bounded set, there exists a minimizer for (2.3) in . Denote the minimizer by . Then, . On the other hand, for any satisfying . It is clear that . For any with and , (C.1) implies that for any , ,
Hence, . Therefore, for any , we have that , where is the minimizer of (2.3). This completes the proof.
Appendix D Algorithm
This algorithm is based on Theorem 2.2, whose proof is given in Appendix E. Consider for any fixed . If for some , then in the optimization (2.4). Without loss of generality, let and (2.4) is equivalent to the smoothing spline type problem: find to minimize
By the representer lemma , have a closed form expression:
Define a matrix by
and let be a () matrix where the th matrix is . Define kernel matrix by
Let the unknown coefficient vector be
Write the response vector as
Let be the column vector consisting of ’s. Then (D.1) becomes
which has the unique solution given as follows:
Note that when are fixed, (2.4) is equivalent to finding to minimize
The minimizer of (D.3) is
where and are given by (D.2).
On the other hand, consider when is fixed, then the minimization of (2.4) is equivalent to
which can be written as
for some .
Therefore, we propose an algorithm that iterates between (D.3) and (D.4) to obtain the minimizer of (2.4). We observe in simulations that the objective function in (2.4) decreases quickly in the first iteration, after which it is already close to its value at convergence. This motivates the following one-step update algorithm:
1. Initialization: fix for .
2. Solve for and in (D.3) and tune according to generalized cross-validation (GCV). Fix at the chosen value in all later steps.
3. For and obtained in step 2, solve for in (D.4) with a fixed .
4. With obtained in step 3, solve for and in (D.3).
We choose the best in step 3 by five-fold cross-validation. In the simulations we find that when is fixed according to step 2, the optimal appears to be close to the number of important components. This gives a range within which to tune .
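The weight update in step 3 has the flavor of a Lasso problem with nonnegative weights. As a minimal sketch, assume that this step can be written in the penalized form of minimizing $\|z - G\theta\|^2 + \lambda \sum_j \theta_j$ subject to $\theta_j \ge 0$; the symbols `G`, `z`, `lam`, and `theta`, and the choice of cyclic coordinate descent as the solver, are assumptions for this illustration and not necessarily the form or solver used for (D.4).

```python
def nonneg_lasso(G, z, lam, n_sweeps=200):
    """Minimize ||z - G theta||^2 + lam * sum(theta) subject to theta >= 0
    by cyclic coordinate descent. G is a list of rows; z is a list."""
    n, p = len(G), len(G[0])
    theta = [0.0] * p
    residual = list(z)                      # equals z - G theta at theta = 0
    col_sq = [sum(G[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual that
            # excludes coordinate j's current contribution.
            rho = sum(G[i][j] * residual[i] for i in range(n)) + col_sq[j] * theta[j]
            new = max(0.0, (rho - lam / 2.0) / col_sq[j])
            delta = new - theta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= G[i][j] * delta
                theta[j] = new
    return theta

# With orthonormal columns each weight is shifted down by lam / 2 and clipped at zero.
print(nonneg_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.2], lam=1.0))   # → [2.5, 0.0]
```

A weight that is driven exactly to zero removes the corresponding covariate from the fitted model, which is the mechanism by which a penalty of this type performs selection; a larger `lam` zeroes out more weights.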
Appendix E Proof of Theorem 2.2
- [1] C. Aguilar, E. Westman, J.S. Muehlboeck, P. Mecocci, B. Vellas, M. Tsolaki, I. Kloszewska, H. Soininen, S. Lovestone, C. Spenger, and A. Simmons, Different multivariate techniques for automated classification of MRI data in Alzheimer’s disease and mild cognitive impairment, Psychiatry Research: Neuroimaging, 212 (2013), pp. 89–98.
- [2] A. Chincarini, F. Sensi, L. Rei, G. Gemme, S. Squarcia, R. Longo, F. Brun, S. Tangaro, R. Bellotti, N. Amoroso, and M. Bocchetta, Integrating longitudinal information in hippocampal volume measurements for the early detection of Alzheimer’s disease, NeuroImage, 125 (2016), pp. 834–847.
- [3] T. Hastie and R. Tibshirani, Varying-coefficient models, Journal of the Royal Statistical Society. Series B (Methodological), (1993), pp. 757–796.
- [4] C.R. Jack, M.A. Bernstein, N.C. Fox, P. Thompson, G. Alexander, D. Harvey, B. Borowski, P.J. Britson, J.L. Whitwell, and C. Ward, The Alzheimer’s disease neuroimaging initiative (ADNI): MRI methods, Journal of Magnetic Resonance Imaging, 27 (2008), pp. 685–691.
- [5] C.R. Jack, M.A. Bernstein, N.C. Fox, P. Thompson, G. Alexander, D. Harvey, B. Borowski, P.J. Britson, J.L. Whitwell, and C. Ward, Serial PIB and MRI in normal, mild cognitive impairment and Alzheimer’s disease: implications for sequence of pathological events in Alzheimer’s disease, Brain, 132 (2009), pp. 1355–1365.
- [6] C.R. Jack, D.S. Knopman, W.J. Jagust, L.M. Shaw, P.S. Aisen, M.W. Weiner, R.C. Petersen, and J.Q. Trojanowski, Hypothetical model of dynamic biomarkers of the Alzheimer’s pathological cascade, The Lancet Neurology, 9 (2010), pp. 119–128.
- [7] C.R. Jack, D.S. Knopman, W.J. Jagust, R.C. Petersen, M.W. Weiner, P.S. Aisen, L.M. Shaw, P. Vemuri, H.J. Wiste, and S.D. Weigand, Tracking pathophysiological processes in Alzheimer’s disease: an updated hypothetical model of dynamic biomarkers, The Lancet Neurology, 12 (2013), pp. 207–216.
- [8] M. Liu, D. Zhang, and D. Shen, Relationship induced multi-template learning for diagnosis of Alzheimer’s disease and mild cognitive impairment, IEEE Transactions on Medical Imaging, 35 (2016), pp. 1463–1474.
- [9] L. Mah, M.A. Binns, D.C. Steffens, and Alzheimer’s Disease Neuroimaging Initiative, Anxiety symptoms in amnestic mild cognitive impairment are associated with medial temporal atrophy and predict conversion to Alzheimer disease, The American Journal of Geriatric Psychiatry, 23 (2015), pp. 466–476.
- [10] L.K. McEvoy, D. Holland, D.J. Hagler, C. Fennema-Notestine, J.B. Brewer, and A.M. Dale, Mild cognitive impairment: baseline and longitudinal structural MR imaging measures improve predictive prognosis, Radiology, 259 (2011), pp. 834–843.
- [11] G. McKhann, D. Drachman, M. Folstein, R. Katzman, D. Price, and E.M. Stadlan, Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease, Neurology, 34 (1984), p. 939.
- [12] M. Prince, R. Bryce, E. Albanese, A. Wimo, W. Ribeiro, and C.P. Ferri, The global prevalence of dementia: a systematic review and metaanalysis, Alzheimer’s & Dementia, 9 (2013), pp. 63–75.
- [13] M. Reuter, H.D. Rosas, and B. Fischl, Highly accurate inverse consistent registration: a robust approach, NeuroImage, 53 (2010), pp. 1181–1196.
- [14] M. Reuter, N.J. Schmansky, H.D. Rosas, and B. Fischl, Within-subject template estimation for unbiased longitudinal image analysis, NeuroImage, 61 (2012), pp. 1402–1418.
- [15] M.R. Sabuncu, R.S. Desikan, J. Sepulcre, B.T.T. Yeo, H. Liu, N.J. Schmansky, M. Reuter, M.W. Weiner, R.L. Buckner, and R.A. Sperling, The dynamics of cortical and hippocampal atrophy in Alzheimer disease, Archives of Neurology, 68 (2011), pp. 1040–1048.
- [16] N. Schuff, D. Tosun, P.S. Insel, G.C. Chiang, D. Truran, P.S. Aisen, C.R. Jack, M.W. Weiner, and Alzheimer’s Disease Neuroimaging Initiative, Nonlinear time course of brain volume loss in cognitively normal and impaired elders, Neurobiology of Aging, 33 (2012), pp. 845–855.
- [17] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B (Methodological), (1996), pp. 267–288.
- [18] J.B. Toledo, X. Da, M.W. Weiner, D.A. Wolk, S.X. Xie, S.E. Arnold, C. Davatzikos, L.M. Shaw, J.Q. Trojanowski, and Alzheimer’s Disease Neuroimaging Initiative, CSF Apo-E levels associate with cognitive decline and MRI changes, Acta Neuropathologica, 127 (2014), pp. 621–632.
- [19] D. Tosun, N. Schuff, L.M. Shaw, J.Q. Trojanowski, and M.W. Weiner, Relationship between CSF biomarkers of Alzheimer’s disease and rates of regional cortical thinning in ADNI data, Journal of Alzheimer’s Disease, 26 (2011), pp. 77–90.
- [20] N. Tzourio-Mazoyer, B. Landeau, D. Papathanassiou, F. Crivello, O. Etard, N. Delcroix, B. Mazoyer, and M. Joliot, Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain, NeuroImage, 15 (2002), pp. 273–289.
- [21] P. Vemuri, H. Wiste, S. Weigand, L. Shaw, J. Trojanowski, M. Weiner, D.S. Knopman, R.C. Petersen, and C.R. Jack, MRI and CSF biomarkers in normal, MCI, and AD subjects: predicting future clinical change, Neurology, 73 (2009), pp. 294–301.
- [22] G. Wahba, Spline models for observational data, SIAM, 1990.
- [23] J.L. Whitwell, S.A. Przybelski, S.D. Weigand, D.S. Knopman, B.F. Boeve, R.C. Petersen, and C.R. Jack, 3D maps from multiple MRI illustrate changing atrophy patterns as subjects progress from mild cognitive impairment to Alzheimer’s disease, Brain, 130 (2007), pp. 1777–1786.
- [24] W.Y.W. Yau, D.L. Tudorascu, E.M. McDade, S. Ikonomovic, J.A. James, D. Minhas, W. Mowrey, L.K. Sheu, B.E. Snitz, and L. Weissfeld, Longitudinal assessment of neuroimaging and clinical markers in autosomal dominant Alzheimer’s disease: a prospective cohort study, The Lancet Neurology, 14 (2015), pp. 804–813.
- [25] M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68 (2006), pp. 49–67.
- [26] D. Zhang, D. Shen, and Alzheimer’s Disease Neuroimaging Initiative, Predicting future clinical changes of MCI patients using longitudinal and multimodal biomarkers, PLoS ONE, 7 (2012), e33182.