Multi-scale Graph-based Grading for Alzheimer's Disease Prediction

07/15/2019 ∙ by Kilian Hett, et al. ∙ 4

The prediction of subjects with mild cognitive impairment (MCI) who will progress to Alzheimer's disease (AD) is clinically relevant, and may above all have a significant impact on accelerating the development of new treatments. In this paper, we present a new MRI-based biomarker that enables us to accurately predict conversion of MCI subjects to AD. In order to better capture the AD signature, we introduce two main contributions. First, we present a new graph-based grading framework to combine inter-subject similarity features and intra-subject variability features. This framework involves patch-based grading of anatomical structures and graph-based modeling of structure alteration relationships. Second, we propose an innovative multiscale brain analysis to capture alterations caused by AD at different anatomical levels. Based on a cascade of classifiers, this multiscale approach enables the analysis of alterations of whole brain structures and hippocampus subfields at the same time. During our experiments using the ADNI-1 dataset, the proposed multiscale graph-based grading method obtained an area under the curve (AUC) of 81% for predicting conversion of MCI subjects to AD within three years. Moreover, when combined with cognitive scores, the proposed method obtained an AUC of 85%. These results are competitive in comparison to state-of-the-art methods evaluated on the same dataset.




1 Introduction

Alzheimer’s disease (AD) is the most prevalent dementia affecting elderly people (petrella2003neuroimaging). According to the World Health Organization, the number of patients with AD will double in 20 years. AD is a serious condition characterized by an irreversible neurodegenerative process that causes mental dysfunctions such as long-term memory loss, language impairment, disorientation, change in personality, and finally death (alzheimer20152015). This disease is characterized by an accumulation of beta-amyloid plaques and neurofibrillary tangles composed of tau amyloid fibrils (hardy2006alzheimer), leading to synapse and neuronal losses. To date, no known therapy has been able to stop or slow down the progression of AD. Moreover, neuroimaging studies have revealed that brain changes occur decades before the diagnosis is established (coupe2015detection; coupe2019lifespan). Thus, when the diagnosis is made, the pathological load is already high (decarli2003mild).

Indeed, before the diagnosis is established the patient is already suffering from amnesic mild cognitive impairment (MCI). MCI is considered a prodromal phase of AD. The clinical symptoms of MCI are slight, but a decrease in cognitive abilities is measurable. Previous studies have suggested that approximately 12% of subjects suffering from MCI progress to AD in the four years following the first symptoms (petersen1999mild). Therefore, although MCI subjects present a high risk of developing AD, they can also remain stable (i.e., not convert to AD) or even recover normal cognitive status. The early prediction of which subjects with MCI symptoms will convert to AD is thus crucial. It can improve the effectiveness of future therapies by limiting the brain changes that occur before therapy starts. Also, the prediction of conversion to AD can accelerate the development of new therapies by making subject selection more accurate. This would decrease the cost of clinical trials and enable more accurate clinical studies.

With the improvement of medical imaging techniques such as magnetic resonance imaging (MRI), many methods have been developed to increase the ability of computer-aided diagnosis systems to help early AD detection (arbabshirani2017single; rathore2017review). These methods can be grouped into two categories related to how they analyze AD alterations.

On the one hand, methods have been developed to study the inter-subject similarity between individuals from different groups that represent specific disease severities. Among these approaches, a popular method to estimate similarity at the voxel scale is voxel-based morphometry (VBM) (ashburner2000voxel; bron2015standardized; moradi2015machine). Methods based on regions of interest (ROI) have also been proposed. A widely used approach is based on a volumetric measurement of gray matter within brain structures (bron2015standardized; ledig2018structural). Other ROI-based methods such as thickness measurement have been developed to capture the variations of gray matter along the cerebral cortex (wolz2011multi; wee2013prediction). Among advanced methods, the patch-based grading (PBG) framework has been proposed to capture subtler alterations caused by the pathology. Indeed, the PBG method has demonstrated state-of-the-art performance in detecting alterations of the hippocampus (coupe2012scoring; hett2018adaptive). This framework has also been extended to perform a whole brain analysis (tong2017novel). This extension has shown competitive performance for AD prediction, especially compared to other approaches based on deep-learning architectures (basaia2018automated; lian2018hierarchical).

On the other hand, several methods have been proposed to capture the intra-subject variability. Such methods assume that AD does not occur at isolated areas but in several inter-related regions. Although similarity-based biomarkers provide helpful tools for detecting the first signs of AD, the structural alterations leading to cognitive decline are not homogeneous within a given subject. Therefore, intra-subject variability features could encode relevant information. Some methods capture the relationship of spread cortical atrophy with a network-based framework (wee2013prediction). Other approaches estimate inter-regional correlation of brain tissue volumes (zhou2011hierarchical). A study has proposed a generic framework that embeds spatial and anatomical priors within a graph model. This method extracts inter-subject variability from different features (for instance, voxel-based and cortical thickness) and various MRI modalities (cuingnet2013spatial). Recently, convolutional neural networks (CNN) have been used to capture relationships between anatomical structure volumes (suk2017deep) and cortical thickness (wee2019cortical). It is interesting to note that methods based on inter-subject similarities and intra-subject variability have performed similarly for AD prediction.

All these elements indicate that inter-subject similarity and intra-subject variability features provide important information for predicting the subject’s conversion. Consequently, our first contribution is the development of a new method that efficiently combines inter-subject similarities estimated with a patch-based grading approach and intra-subject variability modeled by a graph-based approach. We applied our new method to two different anatomical scales: hippocampal subfields and whole brain structures. The experiments carried out show an increase in prediction performance for both anatomical scales. This demonstrates the generic nature of our new method. The second contribution presented in this paper is the development of a novel method based on a cascade of classifiers to efficiently and simultaneously combine information related to alterations of hippocampal subfields and whole brain structures. Our multi-scale graph-based grading method demonstrates competitive performance with an area under the curve (AUC) of 81% for AD prediction. Moreover, when combined with cognitive scores, the proposed method obtained an AUC of 85%.

2 Materials

2.1 Dataset

Data used in this work were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset. ADNI is a North American campaign launched in 2003 that aims to provide MRI, positron emission tomography scans, clinical neurological measures and other biomarkers. We use the baseline T1-weighted (T1w) MRI of the ADNI1 phase, as proposed in (tong2017novel). This dataset includes AD patients, subjects with mild cognitive impairment (MCI) and cognitively normal (CN) subjects. MCI is a presymptomatic phase of AD composed of subjects who have abnormal memory dysfunctions. In our experiments we consider two groups of MCI. The first group is composed of patients who have stable MCI (sMCI) and the second one is composed of patients who had MCI symptoms at baseline and then converted to AD in the following 36 months. This group is named progressive MCI (pMCI). Information about the dataset used in our work is summarized in Table 1. Moreover, the dataset and the code developed during this study are available from the corresponding author on reasonable request.

Characteristic   CN            sMCI          pMCI          AD            P value
Number           213           90            126           130
Ages (years)                                                             p = 0.63
Sex (M/F)                                                                χ² = 5.29, p = 0.15
MMSE                                                                     p < 0.01
CDR-SB           3.5 ± 2.7     4.5 ± 2.3     4.8 ± 2.1     4.7 ± 1.9     p < 0.01
RAVLT            45.4 ± 9.7    35.5 ± 10.2   27.7 ± 8.9    24.6 ± 7.0    p < 0.01
FAQ              8.4 ± 4.4     13.3 ± 5.4    20.2 ± 6.7    30.0 ± 9.0    p < 0.01
ADAS11           5.2 ± 3.0     8.1 ± 3.6     12.5 ± 4.9    20.2 ± 7.6    p < 0.01
ADAS13           0.2 ± 0.9     2.3 ± 3.7     4.3 ± 4.8     14.6 ± 6.6    p < 0.01

Notes: Significant at p < 0.05. Chi-square test (df = 3). Kruskal–Wallis test (df = 3).

Table 1: Description of the dataset used in this work. Data are provided by ADNI.

2.2 Preprocessing

The data are preprocessed using the following steps: (1) denoising using a spatially adaptive non-local means filter (manjon2010adaptive), (2) inhomogeneity correction using the N4 method (tustison2010n4itk), (3) low-dimensional non-linear registration to MNI152 space using the ANTS software (avants2011reproducible), and (4) intensity standardization. All the following experiments have been carried out with images in the MNI space.

3 Method

3.1 Method overview

As illustrated in Figure 1, our graph of structure grading method, which combines inter-subject similarities and intra-subject variability, is composed of several steps. First, a segmentation of the structures of interest is conducted on the input images. Then, a patch-based grading (PBG) approach is carried out over every segmented structure (e.g., hippocampal subfields and brain structures). Two different alterations impacting the brain structures are captured with PBG methods: the changes caused by normal aging (koikkalainen2012improved) and the alterations caused by the progression of AD. Therefore, at each voxel, the grading values are age-corrected to avoid bias due to normal aging. After the patch-based grading maps are age-corrected, we construct an undirected graph to model the topology of alterations caused by Alzheimer’s disease. This results in a high-dimensional feature vector. Consequently, to reduce the dimensionality of the feature vector computed by our graph-based method, we use an elastic net that provides a sparse representation of the most discriminative elements of our graph (i.e., edges and vertices). We use only the most discriminative features of our graph as the input to a random forest method which predicts the subject’s conversion.

Figure 1: Pipeline of the proposed graph-based grading method. PBG is computed using CN and AD training groups. The CN group is also used to correct the bias related to age. Then, this estimation is applied to AD and MCI subjects. Afterwards, the graph is constructed, and the feature selection is trained on CN and AD and then applied to CN, AD and MCI. Finally, the classifier is trained with CN and AD.

3.2 Segmentation

First, to enable analysis of the alterations that occur over different brain structures, segmentation using a non-local label fusion (giraud2016optimized) and a systematic error correction (wang2011learning) is performed at two different anatomical scales: the hippocampal subfields and the whole brain structures.

Segmentation of hippocampal subfields was performed with HIPS, which is a method based on a combination of non-linear registration and patch-based label fusion (romero2017hips). This method uses a training library based on a dataset composed of high-resolution T1w images manually labeled according to the protocol proposed in (winterburn2013novel). To perform the segmentation, the ADNI images were up-sampled with a local adaptive super-resolution method to match the training image resolution (coupe2013collaborative). The method provides automatic segmentation of hippocampal subfields grouped into 5 labels: Subiculum, CA1SP, CA1SR-L-M, CA2-3 and CA4/DG. Afterwards, the segmentation maps obtained on the up-sampled T1w images were down-sampled to match the previous MNI space resolution.

Whole brain structures have been labeled with a patch-based multi-template segmentation (manjon2016volbrain). This method has been performed using 35 images manually labeled by Neuromorphometrics, Inc. following the brain-COLOR labeling protocol composed of 133 structures.

Finally, visual quality control was conducted to remove all incorrect segmentations from the dataset. Moreover, to prevent any bias in the dataset, the pathological status of each subject was hidden during the entire quality control process.

3.3 Patch-based grading

Once the images are segmented, a patch-based grading of the entire brain is performed using the method described in (hett2018adaptive). Such a method was first proposed to detect hippocampus structural alterations with a new scale of analysis (coupe2012simultaneous; coupe2012scoring). The patch-based grading approach provides the probability that the disease has impacted the underlying structure at each voxel. This probability is estimated via an inter-subject similarity measurement derived from a non-local approach.

The method begins with building a training library $T$ from two datasets of images: one with images from CN subjects and the other one from AD patients. Then, for each voxel $x_i$ of the region of interest in the considered subject $x$, the PBG method produces a weak classifier denoted $g_{x_i}$. This weak classifier provides a surrogate for the pathological grading at the considered position $i$. The weak classifier is computed using a measurement of the similarity between the patch $P_{x_i}$ surrounding the voxel $x_i$ belonging to the image under study and a set $K_{x_i}$ of the closest patches $P_{t_j}$, surrounding the voxel $t_j$, extracted from the template $t \in T$. The grading value $g_{x_i}$ at $x_i$ is defined as:

$$g_{x_i} = \frac{\sum_{t_j \in K_{x_i}} w(x_i, t_j)\, p_t}{\sum_{t_j \in K_{x_i}} w(x_i, t_j)}$$

where $w(x_i, t_j)$ is the weight assigned to the pathological status $p_t$ of the training image $t$. We estimate $w$ as:

$$w(x_i, t_j) = \exp\left(-\frac{\|P_{x_i} - P_{t_j}\|_2^2}{h^2 + \epsilon}\right)$$

where $h^2 = \min_{t_j} \|P_{x_i} - P_{t_j}\|_2^2$ and $\epsilon \rightarrow 0$. The pathological status $p_t$ is set to $-1$ for patches extracted from AD patients and to $1$ for patches extracted from CN subjects. Therefore, the PBG method provides a score representing an estimate of the alterations caused by AD at each voxel. Consequently, cerebral tissues strongly altered by AD have scores close to $-1$ while healthy tissues have scores close to $1$.
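The grading of a single voxel can be illustrated with a minimal sketch. It assumes the closest template patches have already been retrieved, that patches are flattened lists of intensities, and that the pathological status is coded as -1 (AD) and +1 (CN); the function names and the `eps` value are hypothetical choices made for illustration only.

```python
import math

def patch_distance_sq(p, q):
    """Squared L2 distance between two flattened patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def grading_value(patch, library, eps=1e-8):
    """Weighted average of the pathological status of the closest patches.

    `library` is a list of (template_patch, status) pairs, with status -1
    for AD templates and +1 for CN templates (assumed coding). The list is
    assumed to already contain only the closest patches.
    """
    dists = [patch_distance_sq(patch, tp) for tp, _ in library]
    h2 = min(dists)  # the smallest distance sets the decay scale
    weights = [math.exp(-d / (h2 + eps)) for d in dists]
    num = sum(w * status for w, (_, status) in zip(weights, library))
    return num / sum(weights)
```

A patch that matches a CN template and is far from every AD template receives a grading close to +1, while a patch equally distant from both groups receives a grading close to 0.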

3.4 Graph construction

Once structure alterations are estimated using patch-based grading, we can model intra-subject variability for each subject using a graph to better capture the AD signature. Indeed, within the last decade, graph modeling has been widely used for its ability to capture the patterns of different diseases. This is achieved by encoding the relationships of abnormalities between different structures in the edges of the graph. Most of the proposed methods estimate the degree of correlation between two different structures for each edge of the graph. Furthermore, graph modeling can also model inter-subject similarity, by independently encoding the abnormality of each structure in the vertex measurements. Consequently, we propose a graph-based grading approach that uses a graph model to combine inter-subject similarities computed with the PBG and intra-subject variability computed from the distributions of the grading values within each structure.

In our graph-based grading method, the segmentation maps are used to fuse grading values into each ROI, and to build our graph. We define an undirected graph $G = (V, E, \gamma, \omega)$, where $V = \{v_1, \ldots, v_N\}$ is the set of vertices for the $N$ considered brain structures and $E = V \times V$ is the set of edges. In our work, the vertices are the mean of the grading values for a given structure while the edges are based on grading distribution distances between two structures.

To this end, the probability distributions of PBG values are estimated with a histogram $H_v$ for each structure $v$. The number of bins is computed with Sturges' rule (sturges1926choice). For each vertex we assign a function $\gamma : V \rightarrow \mathbb{R}$ defined as $\gamma(v) = \mu_{H_v}$, where $\mu_{H_v}$ is the mean of $H_v$. For each edge we assign a weight given by the function $\omega : E \rightarrow \mathbb{R}$ defined as follows:

$$\omega(v_i, v_j) = W(H_{v_i}, H_{v_j})$$

where $W$ is the Wasserstein distance with the $L_1$ norm (rubner2000earth), which showed the best performance during our experiments. The Wasserstein distance between two histograms is defined as the minimization of the following equation,

$$W(H_{v_i}, H_{v_j}) = \min_{F} \sum_{a \in A} \sum_{b \in A} f_{a,b}\, d_{a,b}$$

subject to,

$$f_{a,b} \geq 0, \qquad \sum_{b \in A} f_{a,b} = H_{v_i}(a), \qquad \sum_{a \in A} f_{a,b} = H_{v_j}(b)$$

where $A$ is the index set for bins, and $H_{v_i}$ and $H_{v_j}$ are the two normalized histograms. $F = \{f_{a,b}\}$ is the set of flows, and $d_{a,b}$ is the ground distance defined by a norm. As described above, in our experiments we used the $L_1$-norm.
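The graph construction can be sketched as follows. This is only an illustration under stated assumptions: grading values are taken to lie in [-1, 1], all structures share the same bins (here sized by Sturges' rule on the smallest structure, a choice made for the example), and the ground distance is the absolute difference between bin indices, for which the one-dimensional Wasserstein distance reduces to the sum of absolute differences between cumulative histograms. All function names are hypothetical.

```python
import math

def sturges_bins(n):
    """Sturges' rule: number of histogram bins for n samples."""
    return int(math.ceil(math.log2(n))) + 1

def histogram(values, n_bins, lo=-1.0, hi=1.0):
    """Normalized histogram of grading values over the fixed range [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, n_bins - 1))] += 1  # clamp to valid bins
    return [c / float(len(values)) for c in counts]

def wasserstein_l1(p, q):
    """1-D Wasserstein distance between histograms defined on the same bins.

    With an L1 ground distance between bin indices, this equals the sum of
    absolute differences between the two cumulative distributions.
    """
    cum_p = cum_q = dist = 0.0
    for a, b in zip(p, q):
        cum_p += a
        cum_q += b
        dist += abs(cum_p - cum_q)
    return dist

def build_graph(gradings):
    """gradings: dict mapping a structure name to its list of grading values."""
    n_bins = sturges_bins(min(len(v) for v in gradings.values()))
    hists = {s: histogram(v, n_bins) for s, v in gradings.items()}
    vertices = {s: sum(v) / len(v) for s, v in gradings.items()}  # mean grading
    names = sorted(gradings)
    edges = {(a, b): wasserstein_l1(hists[a], hists[b])
             for i, a in enumerate(names) for b in names[i + 1:]}
    return vertices, edges
```

Concatenating the vertex values and the edge weights gives one feature vector per subject, which is the input expected by a subsequent feature selection step.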

3.5 Selection of discriminant graph components

Completion of the previous step results in a high-dimensional feature vector. Moreover, not all the components computed from the graph-based grading method have the same significance. For instance, some structures and some alteration relationships are not discriminant for detecting Alzheimer’s disease.

Consequently, in this work we used the elastic net regression method, which provides a sparse representation of the most discriminating edges and vertices, thus reducing the feature dimensionality by capturing the key structures and the key relationships between the different brain structures (see Figure 1). Indeed, it has been demonstrated that combining the $L_1$ and $L_2$ norms takes into account possible inter-feature correlation while imposing sparsity (zou2005regularization). Finally, after normalization, the resulting feature vector is given as the input of the feature selection, defined as the minimization of the following equation:

$$\hat{\beta} = \underset{\beta}{\arg\min}\; \frac{1}{2} \|X\beta - y\|_2^2 + \rho \|\beta\|_2^2 + \lambda \|\beta\|_1$$

where $\hat{\beta}$ is a sparse vector that represents the regression coefficients and $X$ is a matrix with rows corresponding to the subjects and columns corresponding to the features, including: the vertices, the edges or a concatenation of both for the full graph of grading feature vector. $\lambda$ and $\rho$ are the regularization hyper-parameters set to balance the sparsity and the inter-feature correlation. Finally, $y$ represents the pathological status of each patient.
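To make the selection step concrete, the sketch below minimizes an elastic net objective of the form 0.5*||Xb - y||^2 + rho*||b||_2^2 + lam*||b||_1 by plain coordinate descent and keeps the features with non-zero coefficients. It is an illustrative stand-in and not the SLEP solver; the scaling of the penalty terms, the iteration count and the function names are assumptions.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator associated with the L1 penalty."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def elastic_net(X, y, lam, rho, n_iter=200):
    """Coordinate descent for 0.5*||X b - y||^2 + rho*||b||_2^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, with beta = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + 2.0 * rho
            if denom == 0.0:
                continue  # empty column and no ridge term: nothing to update
            # correlation of feature j with the residual, feature j added back
            z = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(z, lam) / denom
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def selected_features(beta, tol=1e-10):
    """Indices of the features kept by the sparse solution."""
    return [j for j, b in enumerate(beta) if abs(b) > tol]
```

With two features, one aligned with the labels and one uncorrelated with them, only the first index is returned, which mirrors how uninformative vertices and edges are discarded.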

3.6 Application to different anatomical scales

In our experiments two different anatomical scales have been considered. First, as presented in (hett2018ghsg), we applied our graph of structure grading method to the hippocampal subfields. A histogram is computed to estimate the probability distribution of the grading values for each hippocampal subfield. Thus, the resulting graph represents the graph of hippocampal subfields grading. The vertices represent the alteration of hippocampal subfields measured with patch-based grading, and the edges represent the relationship between hippocampal subfield alterations embedded in graph modeling.

Second, we applied our graph-based approach to a whole brain parcellation. Here, the histograms are computed to estimate the probability distribution of the grading values within each brain structure as proposed in (hett2018gbsg). Thus, for this second anatomical scale of analysis, the resulting graph represents the graph of brain structure grading. The vertices represent measures of alteration of brain structures and the edges represent the alteration relationship between two brain structures.

3.7 Multi-scale graph-based grading

To combine multiple anatomical scales (for instance, brain structures and hippocampal subfields), we developed a multi-scale graph-based grading (MSGG) approach.

This method is based on a cascade of classifiers. In this approach, a graph of brain structures and a graph of hippocampal subfields are computed separately as described in the previous sections (see Figure 2). The elastic net regression method is used to select the most discriminating features of each graph. Afterward, a first layer of RF classifiers is used to compute both a posteriori probabilities, $p(y \mid x_{B})$ and $p(y \mid x_{H})$, for whole brain and hippocampal subfields, respectively. Here, $y$ represents the pathological status of the subject under study, while $x_{B}$ and $x_{H}$ represent the selected features of the brain structure and hippocampal subfield graph models, respectively. Finally, these a posteriori probabilities are used as the input of a linear classifier to make the final decision.

In addition to this new method, we also propose a straightforward extension of our graph-based grading method. This approach consists of concatenating the hippocampal subfield and brain structure graph features into a single feature vector before the feature selection step.

Figure 2: Schema of the proposed multi-scale graph-based grading method. First, the segmentation maps are used to aggregate grading values. Our method computes a histogram for each structure/subfield. Once the graphs are built, an elastic net is computed to select the most discriminating graph features for each anatomical scale. A first layer of random forest classifiers is computed to estimate a posteriori probabilities. Finally, a linear classifier is trained with the a posteriori probabilities from each anatomical scale to compute the final decision. A random forest classifier replaces the linear classifier for the multimodal experiments to deal with the feature heterogeneity resulting from the concatenation of a posteriori probabilities and cognitive scores.
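The second stage of the cascade can be sketched as a two-class linear discriminant fitted on the pair of a posteriori probabilities produced by the first layer. The sketch below takes those probabilities as given (the first-layer random forests are not reproduced), uses a pooled covariance and class priors estimated from the training counts, and assumes the covariance is non-singular; the function names and the 0/1 label coding are hypothetical.

```python
import math

def fit_lda_2d(points, labels):
    """Two-class LDA on 2-D inputs with a pooled covariance; returns (w, b)."""
    groups = {0: [], 1: []}
    for pt, lab in zip(points, labels):
        groups[lab].append(pt)
    means = {c: [sum(p[k] for p in g) / len(g) for k in range(2)]
             for c, g in groups.items()}
    # pooled within-class scatter, then covariance (assumed non-singular)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for c, g in groups.items():
        for p in g:
            d = [p[0] - means[c][0], p[1] - means[c][1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = len(points) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [means[1][k] - means[0][k] for k in range(2)]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(means[0][k] + means[1][k]) / 2.0 for k in range(2)]
    prior = math.log(len(groups[1]) / float(len(groups[0])))
    return w, -(w[0] * mid[0] + w[1] * mid[1]) + prior

def cascade_decision(p_brain, p_hipp, w, b):
    """Final decision from the two first-layer posteriors (1 = pathological)."""
    return 1 if w[0] * p_brain + w[1] * p_hipp + b > 0 else 0
```

Fitting the second stage on posteriors rather than on raw features keeps its input two-dimensional regardless of how many graph features each anatomical scale retains.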

3.8 Combination with cognitive tests

It has been shown in previous works that MRI-based biomarkers can be complementary to cognitive assessments used in clinical routines (tong2017novel; samper2019reproducible).

Therefore, in addition to studying the efficiency of our novel imaging-based biomarkers, a study of the complementarity of our proposed method with cognitive scores has also been conducted. In this work, we have considered different cognitive scores: the MMSE, CDR-SB, RAVLT, FAQ, ADAS11, and ADAS13 cognitive tests. The cognitive scores are concatenated into a vector of cognitive features and normalized by a z-score. Finally, a concatenation of normalized cognitive scores and graph-based features is used as the input of the final classifier, as illustrated in Figure 2.
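The normalization and concatenation described here can be sketched as follows. The sketch assumes the z-score statistics are estimated on a training set, uses the population standard deviation, and treats the imaging input as the pair of first-layer a posteriori probabilities; these choices and the function names are assumptions made for illustration.

```python
import math

def zscore_params(rows):
    """Per-column mean and standard deviation of a list of feature rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
            for j in range(p)]
    return means, stds

def zscore(row, means, stds):
    """Normalize one row; constant columns are mapped to 0."""
    return [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]

def multimodal_features(imaging_probs, cognitive_row, means, stds):
    """Concatenate imaging-derived posteriors with z-scored cognitive scores."""
    return list(imaging_probs) + zscore(cognitive_row, means, stds)
```

For example, with training scores `[[28.0, 1.0], [24.0, 3.0]]`, a subject scoring `[28.0, 1.0]` is mapped to `[1.0, -1.0]` before being appended to the imaging probabilities.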


3.9 Details of implementation

The most similar patches were extracted with a patch-match method (giraud2016optimized). We used the grading method proposed in (hett2018adaptive), with the same parameters (including the size of the patches). The effect of age has been corrected using a linear regression estimated on the CN population (koikkalainen2012improved).
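A minimal sketch of such an age correction is given below. It fits an ordinary least-squares line of grading against age on CN subjects and removes the fitted age trend relative to a reference age; the exact form of the correction (subtracting the slope times the age offset) and the function names are assumptions, not a description of the cited method.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def age_correct(grading, age, slope, ref_age):
    """Remove the age trend estimated on CN subjects, relative to ref_age."""
    return grading - slope * (age - ref_age)
```

With CN gradings of 0.9, 0.8 and 0.7 at ages 60, 70 and 80, the fitted slope is -0.01 per year, so a grading of 0.7 observed at age 80 is corrected to 0.8 at the reference age of 70.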

The elastic net feature selection has been computed with the SLEP package (liu2009slep). Its two regularization parameters were set with a grid search. The classifications were obtained using a random forest (RF). In our experiments, we used the Gini index as the impurity criterion. RF also has two parameters: the number of trees and the number of randomly selected features. A linear discriminant analysis (LDA) classifier has been used to compute the final decision for the multi-scale graph-based grading approach. However, a random forest classifier replaces the linear classifier for the multimodal experiments to deal with the feature heterogeneity resulting from the concatenation of a posteriori probabilities and cognitive scores. All features were normalized using z-scores before the selection and classification steps.

In our experiments, we performed sMCI versus pMCI and CN versus AD classifications. For sMCI versus pMCI classification, the elastic net feature selection and the classifiers were trained with CN and AD. Indeed, as shown in (tong2017novel), the use of CN and AD to train the feature selection method and the classifier enables better discrimination between sMCI and pMCI subjects. Furthermore, this technique also limits bias and the overfitting problem and does not require a cross-validation step. Finally, to estimate the inner variability of the RF, 100 runs were performed. A stratified 10-fold cross-validation procedure has been conducted for the comparison of CN versus AD. Mean area under the curve (AUC), accuracy (ACC), balanced accuracy (BACC), sensitivity (SEN), and specificity (SPE) are provided for each experiment.
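For reference, the reported measures can be computed as in the sketch below. It assumes binary labels with 1 for the positive class (for instance pMCI or AD) and 0 for the negative class, and computes the AUC as the fraction of positive/negative pairs ranked correctly, with ties counted as one half; the function names are hypothetical.

```python
def confusion(y_true, y_pred):
    """Counts of true positives, true negatives, false positives, false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def scores(y_true, y_pred):
    """ACC, SEN, SPE and BACC from hard predictions."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    sen = tp / float(tp + fn)
    spe = tn / float(tn + fp)
    acc = (tp + tn) / float(len(y_true))
    return {"ACC": acc, "SEN": sen, "SPE": spe, "BACC": (sen + spe) / 2.0}

def auc(y_true, y_score):
    """Probability that a positive subject is scored above a negative one."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Balanced accuracy averages sensitivity and specificity, which makes it less sensitive than plain accuracy to the unequal sizes of the sMCI and pMCI groups.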

4 Results

In this section, to evaluate the performance of the graph-based grading method, we first compare the prediction and detection accuracy of the different graph components. Afterwards, we apply our method within the hippocampal subfields and the whole brain structures (see Table 2). Moreover, we evaluate the proposed approach to combine different anatomical scales (see Table 3). Then, we evaluate the complementarity of our image-based biomarker and the cognitive scores that are usually used in clinical routines (see Table 4). Finally, we compare the performance of our method with state-of-the-art methods for early detection of Alzheimer’s disease (see Tables 5 and 6).

4.1 Graph of hippocampal subfields

Figure 3: Representation of the most selected structures. The brain structures and hippocampal subfields are selected separately with the elastic net method. Frequently selected structures are colored red. (A) the most frequently selected brain structures are the temporal lobe, the postcentral gyrus, the anterior cingulate gyrus, the hippocampus and the precuneus. (B) the most frequently selected hippocampal subfields are the CA1-SP, the CA1-SRLM, and the subiculum.

First, we compared each element of our graph of structure grading within the hippocampal subfields (see Table 2). As previously proposed in (hett2018adaptive), the PBG applied within the whole hippocampus is used as the baseline for this experiment.

PBG based on the whole hippocampus structure obtains 76.8% AUC and 70.3% ACC, and is more specific than sensitive. Although PBG values of all hippocampal subfields (see “all” in Table 2) do not improve prediction performance, PBG values within selected vertices (i.e., subiculum, CA1-SP, and CA1-SRLM) obtain 77.1% AUC and 71.1% ACC (see “selected” in Table 2), and improve the specificity in comparison to hippocampus grading. Thus, the use of hippocampal subfields selected with the elastic net method slightly increases the prediction performance of AD compared to the union of all subfields or the whole hippocampus. Furthermore, the edges selected by the elastic net do not improve the prediction performance compared to other hippocampal features. Finally, the proposed method combining edges and vertices improves the AUC by 1.4 percentage points and the accuracy by 4.4 percentage points compared to the global hippocampus grading. Our graph-based method also improves the AUC by 1.1 percentage points and the accuracy by 3.6 percentage points when compared to the use of the most discriminant hippocampal subfields. Moreover, in both cases, our proposed graph-based method has a higher sensitivity.

Figure 3-B illustrates the contribution (i.e., the number of selections by the elastic net) of each hippocampal subfield to the graph-based feature vectors after the feature selection step. The experiments have shown that the most discriminant hippocampal subfields selected are the subiculum and the two subfields representing the CA1. This is interesting because the hippocampal subfields selected by the elastic net (EN) method are in line with previous studies, which have shown that the CA1 and subiculum are the subfields with the most significant atrophy in late stages of AD (kerchner2012hippocampal; trujillo2014early).

Methods                          AUC          ACC          BACC         SEN          SPE
Hippocampus PBG                  76.8 ± 0.2   70.3 ± 0.0   70.6 ± 0.0   69.0 ± 0.0   72.2 ± 0.0
Hipp. all vertices               73.9 ± 0.2   67.1 ± 0.0   67.9 ± 0.0   72.2 ± 0.0   63.5 ± 0.0
Hipp. selected vertices          77.1 ± 0.2   71.1 ± 0.4   71.4 ± 0.4   69.5 ± 0.6   73.2 ± 0.5
Hipp. all edges                  66.7 ± 0.2   61.1 ± 0.4   62.0 ± 0.4   68.0 ± 0.4   56.0 ± 0.4
Hipp. selected edges             67.9 ± 0.2   63.0 ± 0.4   63.8 ± 0.4   68.9 ± 0.4   58.7 ± 0.4
Graph of hippocampal subfields   78.2 ± 0.2   74.7 ± 0.4   74.3 ± 0.5   77.1 ± 0.5   71.4 ± 0.9
Brain all vertices               68.2 ± 0.2   65.3 ± 0.4   66.7 ± 0.5   68.6 ± 0.5   62.2 ± 0.5
Brain selected vertices          77.2 ± 0.2   70.1 ± 0.4   71.1 ± 0.5   77.8 ± 0.5   64.4 ± 0.5
Brain all edges                  67.1 ± 0.2   65.7 ± 0.2   64.8 ± 0.7   69.4 ± 0.2   60.5 ± 0.2
Brain selected edges             76.9 ± 0.2   72.2 ± 0.4   71.9 ± 0.5   73.8 ± 0.5   70.0 ± 0.5
Graph of brain structures        79.4 ± 0.2   75.5 ± 0.4   75.1 ± 0.5   77.6 ± 0.5   72.6 ± 0.5

Table 2: Classification of sMCI versus pMCI. Results obtained by inter-subject similarity features (i.e., vertices), intra-subject variability features (i.e., edges) and a combination of both. The patch-based grading applied on the hippocampus is used as the baseline. The results demonstrate the genericity of our method, which obtains the best performance within both the hippocampal subfields and the whole brain structures. Moreover, the experiments also show a slight superiority of the whole brain structures for AD prediction.

4.2 Graph of brain structures

As done for hippocampal subfields, an evaluation of structure grading over the whole brain has also been conducted. We estimated the performance obtained by each feature separately (see Table 2). The use of all vertices (i.e., the averages of PBG values computed within each brain structure) decreases the prediction performance compared to the use of only the hippocampus (65.3% compared to 70.3% accuracy). A selection of the most discriminating vertices obtains results similar to those of the hippocampus only, with an accuracy of 70.1%. Contrary to the hippocampal subfields, where vertices were more efficient than edges, the use of edge features performs similarly to the vertices.

Finally, as shown with the hippocampal subfields, the combination of both features, edges and vertices, which capture the inter-subject similarities and intra-subject variability, enables an important increase in prediction performance. Our method applied to the brain structures obtains 75.5% accuracy and 79.4% AUC. Moreover, the experiments also show a sensitivity similar to that of using only selected vertices and a higher specificity than using only selected edges.

Figure 3-A illustrates the most selected brain structures during the feature selection step. The experiments have shown that the most frequently selected brain structures are the temporal lobe, the postcentral gyrus, the anterior cingulate gyrus, the hippocampus and the precuneus. As with the results obtained for the hippocampal subfields, it is interesting that the most selected brain structures are in line with clinical studies that relate AD to the atrophy of specific brain structures.

Methods                                   AUC          ACC          BACC         SEN          SPE
Hippocampus PBG                           76.8 ± 0.2   70.3 ± 0.0   70.6 ± 0.0   69.0 ± 0.0   72.2 ± 0.0
Graph of hippocampal subfields            78.2 ± 0.2   74.7 ± 0.4   74.3 ± 0.4   77.1 ± 0.4   71.4 ± 0.4
Graph of brain structures                 79.4 ± 0.2   75.5 ± 0.4   75.2 ± 0.4   77.6 ± 0.4   72.6 ± 0.4
Graph of hipp. sub. + brain               79.6 ± 0.2   74.5 ± 0.4   73.9 ± 0.4   77.3 ± 0.4   70.6 ± 0.4
Multi-scale graph-based grading (MSGG)*   80.6 ± 0.2   76.0 ± 0.4   75.7 ± 0.4   77.8 ± 0.4   73.6 ± 0.4

* Method illustrated in Figure 2

Table 3: Comparisons of the different PBG approaches for the task of classifying sMCI versus pMCI. PBG computed over the hippocampus is provided as a baseline. The results show that the MSGG approach improves performance in terms of AUC, ACC, BACC, SEN and SPE. All results are expressed as percentages.

4.3 Multiscale graph-based grading

A comparison of prediction performances obtained with our graph-based grading method applied in each anatomical scale independently and the combination of both is provided in Table 3. In this experiment, two approaches have been investigated.

First, the results of this comparison confirm that for sMCI versus pMCI classification, whole-brain analysis enables better performance than analysis of the hippocampus subfields. Indeed, (whole brain) obtains 79.4% of AUC and 75.5% of accuracy while (hippocampus subfields) obtains 78.2% of AUC and 74.7% of accuracy.

Second, we compare the two approaches for combining both anatomical scales (i.e., simple concatenation or cascade of classifiers). The results suggest that the straightforward concatenation of the feature vectors from the graph of hippocampal subfields and the graph of brain structures does not clearly improve performance compared to using either graph alone. Indeed, the concatenation of the feature vectors obtains an AUC of 79.6% and an accuracy of 74.5%; the AUC is only marginally higher, and the accuracy lower, than the results obtained with whole brain structures alone (79.4% and 75.5%). However, the multi-scale graph-based grading (MSGG) method (see Figure 2) shows increased performance for each considered classification measure. This last method obtains an AUC of 80.6% and an accuracy of 76.0%. This result indicates that the analyses of hippocampal subfields and whole brain structures are complementary. Therefore, in the rest of the experiments, we only consider the MSGG method.

4.4 Complementarity with cognitive tests

Table 4 presents a comparison of the results obtained using features derived from cognitive tests, our imaging-based method, and the combination of both. This comparison shows that our imaging-based method obtains a higher AUC and accuracy than cognitive scores alone. Indeed, MSGG improves the sMCI versus pMCI classification by 1.8 percentage points of AUC and 1.5 percentage points of accuracy compared to using cognitive scores only.

Moreover, the results of the experiment indicate the complementarity of imaging-based and cognitive assessments for AD prediction. The combination of cognitive scores and MSGG features obtains 85.5% AUC and 80.6% accuracy, which improves AUC by 4.9 percentage points and accuracy by 4.6 percentage points compared to the MSGG method alone.

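    Methods | AUC | ACC | BACC | SEN | SPE
    Cognitive score | 78.8 ± 0.2 | 74.5 ± 0.4 | 72.4 ± 0.4 | 84.9 ± 0.4 | 60.0 ± 0.4
    MSGG | 80.6 ± 0.2 | 76.0 ± 0.4 | 75.7 ± 0.4 | 77.8 ± 0.4 | 73.6 ± 0.4
    MSGG + Cognitive score | 85.5 ± 0.2 | 80.6 ± 0.4 | 79.2 ± 0.4 | 87.3 ± 0.4 | 71.1 ± 0.4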
Table 4: Comparison of our graph-based approach with cognitive test scores and combination of both for AD prediction. Although our MSGG obtains better results in terms of AUC, ACC, BACC, and SPE, the results of this comparison demonstrate the complementarity of our imaging-based method with cognitive scores. All results are expressed as percentage.

4.5 Comparison with state-of-the-art methods

The comparison with state-of-the-art methods is divided into two parts. First, a comparison of MRI-based methods using similar ADNI datasets is provided in Table 5. Second, a comparison of multi-modal methods was conducted. Besides cognitive assessments, the presented methods involve cerebrospinal fluid (CSF) biomarkers, positron emission tomography (PET), and fluorodeoxyglucose PET (FDG-PET). The results of this comparison are provided in Table 6.

  Methods | Subjects | CN vs. AD | sMCI vs. pMCI
  Patch-based grading (coupe2012scoring, ) | 231 238 167 198 | 88.0 (87.5) | 71.0 (71.0)
  Sparse ensemble grading (liu2012ensemble, ) | 229 198 | 90.8 (90.5) | n.a.
  Voxel-based morphometry (moradi2015machine, ) | 231 100 164 200 | n.a. | 74.7 (70.2)
  Sparse-based grading (tong2017novel, ) | 229 129 171 191 | n.a. | 75.0 (n.a.)
  Multiple ensemble learning (tong2014multiple, ) | 231 238 167 198 | 89.0 (89.5) | 70.4 (71.5)
  Deep ensemble learning (suk2017deep, ) | 226 226 167 186 | 91.0 (91.3) | 74.8 (74.9)
  Hierarchical convolutional network (lian2018hierarchical, ) | 229 226 167 199 | 90.3 (89.4) | 80.9 (69.0)
  Deep neural network (basaia2018automated, ) | 352 510 253 295 | 99.2 (99.2) | 75.1 (75.0)
  Cortical graph neural network (wee2019cortical, ) | 242 355 | 85.8 (85.5) | n.a.
  Proposed method | 213 90 126 130 | 91.6 (91.4) | 76.0 (75.7)
Table 5: Comparison with state-of-the-art methods for Alzheimer’s disease classification using a similar ADNI1 dataset. In addition to sMCI versus pMCI, we provide results for CN versus AD classification. All results are expressed as percentages of accuracy (ACC), with balanced accuracy (BACC) in parentheses. Best balanced accuracy for each comparison is presented in bold font.

The first comparison, provided in Table 5, sets MSGG against state-of-the-art methods using a similar ADNI1 dataset. Our graph-based method is compared with the original PBG method (coupe2012scoring, ), a graph-based grading method (tong2014multiple, ), an ensemble grading method (liu2012ensemble, ), a sparse-based grading method (tong2017novel, ), a VBM method (moradi2015machine, ) and advanced approaches based on deep learning techniques (suk2017deep, ; lian2018hierarchical, ; basaia2018automated, ; wee2019cortical, ).

The results of these comparisons demonstrate the competitive performance of our method for CN versus AD and sMCI versus pMCI classifications. Indeed, our method obtains an accuracy of 91.6% for CN versus AD, which is comparable to the most recent methods based on deep-learning techniques. Furthermore, our method obtains an accuracy of 76.0% for sMCI versus pMCI classification. These results are competitive with recent approaches based on deep-learning methods (suk2017deep, ; lian2018hierarchical, ; basaia2018automated, ; wee2019cortical, ). Moreover, our multi-scale graph-based method improves accuracy over the original PBG method (coupe2012scoring, ) by 3.6 and 5 percentage points for CN versus AD and sMCI versus pMCI classification, respectively.

Moreover, as presented in Table 6, our combination of MSGG and cognitive scores has been compared with state-of-the-art multimodal approaches. This comparison includes a method combining structural MRI and cognitive scores that obtains 80.7% accuracy tong2017novel , a method combining MRI, PET scans and CSF that obtains 83.3% accuracy suk2015latent , a voxel-wise approach combining MRI, FDG-PET and cognitive scores that obtains 80.9% accuracy samper2019reproducible , and a recent multimodal deep-learning approach combining MRI, CSF and cognitive scores that obtains 76.0% accuracy lee2019predicting . With 80.6% accuracy, our graph-based approach obtains competitive results using only MRI-based and cognitive score features.

    Methods | Source | AUC | ACC
    Latent feature representation suk2015latent | MRI + PET + CSF | n.a. | 83.3
    Combined sparse-based grading tong2017novel | MRI + Cognitive scores | 87.0 | 80.7
    Voxel-wise approach samper2019reproducible | MRI + FDG-PET + Cognitive score | 88.5 | 80.9
    Multimodal deep learning approach lee2019predicting | MRI + CSF + Cognitive scores | n.a. | 76.0
    Proposed | MRI + Cognitive scores | 85.5 | 80.6

Gender, MMSE, Education level, CDR-sb, RAVLT, ADAS

Table 6: Comparison of different combinations of imaging biomarkers, CSF, and cognitive scores used in clinical routines for the prediction of MCI conversion (i.e., sMCI versus pMCI comparison). All results are expressed as percentages. Best AUC is expressed in bold font.

5 Discussion

The first contribution of this paper is the development of a new graph-based grading approach that combines inter-subject similarities and intra-subject variability efficiently. We validated this new method with two different anatomical scales: the hippocampal subfields and the whole brain structures. The second contribution is the development of an anatomical scale fusion based on a cascade of classifiers approach. We applied this multi-scale graph-based grading framework to the hippocampal subfields and a parcellation of the entire brain structures. To validate our new multi-scale graph-based grading framework, we compared each component of our graph at each anatomical scale. Then, we compared the results obtained in our experiments with the results of state-of-the-art methods proposed in the literature. Finally, we compared the results obtained with our imaging-based biomarker with a bank of cognitive scores that are used in clinical routines.

5.1 Graph of hippocampal subfields

Postmortem and in-vivo studies have suggested that the first regions of the brain to change in typical disease progression are the entorhinal cortex (EC) and the hippocampus (jack1992mr, ; braak1995staging, ; bobinski1999mri, ). Moreover, neuroimaging studies have shown that the hippocampus is the brain structure with the most significant alterations at the early stage of AD (frisoni2010clinical, ; schwarz2016large, ). However, recent methods applied to the hippocampus have shown limited performance for AD prediction (hett2017adaptive, ; tong2017novel, ). This limitation could come from the global analysis of the hippocampus. Indeed, the hippocampus is subdivided into several subfields. The terminology differs across segmentation protocols (yushkevich2015quantitative, ) but the most recognized definition (lorente1934studies, ) mainly divides the hippocampus into the subiculum, the cornu ammonis (CA1/2/3/4), and the dentate gyrus (DG). Furthermore, studies showed that hippocampal subfields are not equally impacted by AD (braak1997alzheimer, ; braak2006staging, ; apostolova2006conversion, ; la2013hippocampal, ; kerchner2010hippocampal, ; kerchner2012hippocampal, ). Indeed, postmortem, animal-based and recent in-vivo imaging studies showed that CA1 and the subiculum are the subfields with the most discriminant atrophy in the last stage of AD (apostolova2006conversion, ; la2013hippocampal, ; kerchner2012hippocampal, ; li2013discriminative, ; trujillo2014early, ). This indicates that analyzing the hippocampus with a global measure could limit prediction performance, and that better modeling of the structural alterations within the hippocampal subfields could improve it.

Consequently, we proposed to better model hippocampus alterations with the application of our novel graph-based framework within the hippocampal subfields. First, we studied the efficiency of a straightforward approach that computes the average of grading values in each hippocampus subfield separately instead of the whole hippocampus structures as is usually done. This results in poorer performance compared to the average of grading values within the whole hippocampus. However, the grading values within the most discriminant hippocampal subfields (i.e., subiculum and the two definitions of CA1) obtain similar performances to the average of grading values within the whole hippocampus. This might be due to the fact that the subiculum and CA1 represent the major part of the hippocampus.

The related hippocampal subfield features selected by the elastic net are consistent with previous in-vivo imaging studies, which are based on 3T MRI and ultra-high field MRI at 7T. These studies analyzed the atrophy of each hippocampal subfield at an advanced stage of AD. These studies showed that CA1 is the subfield with the most severe atrophy (apostolova2006conversion, ; mueller2007measurement, ; la2013hippocampal, ; carlesimo2015atrophy, ), and also indicate that CA1SR-L-M is the subfield with the greatest atrophy at advanced stages of AD kerchner2010hippocampal ; kerchner2012hippocampal . It is interesting to note that the results of our experiments are also in accordance with previous postmortem, animal-based, and in-vivo studies combining volume and diffusivity MRI. These last studies demonstrated that the subiculum is the earliest hippocampal region affected by AD (trujillo2014early, ; li2013discriminative, ).

Finally, the improvement obtained with the combination of inter-subject similarities and intra-subject variability shows that this information is complementary. It also confirms that this combination yields results similar to those of methods based on whole brain analysis while using only the hippocampus.

5.2 Graph of brain structures

Next, we investigated our method at the whole brain scale. The comparison of hippocampus PBG and the most discriminant vertices indicates that the straightforward combination of other discriminant brain structures does not increase the prediction performance compared to using only the hippocampus. Moreover, when the edges and the vertices are combined, our experiments show that the edges are the most discriminant selected elements.

Our experiments indicate that the most selected brain structures are the temporal lobe, the postcentral gyrus, the anterior cingulate gyrus, the hippocampus, and the precuneus (see Figure 3). These results are in line with the literature. First, the automatic selection of the most discriminant features shows the importance of the temporal lobe and the hippocampus. Studies have shown a significant loss of gray matter within the temporal lobe (killiany1993temporal, ; busatto2003voxel, ), while the hippocampus has long been known as the structure with the earliest alterations hyman1984alzheimer ; west1994differences ; braak1995staging ; ledig2018structural . Second, VBM and perfusion studies have shown that the precuneus suffers from noticeable atrophy and a bilateral decrease of regional cerebral blood flow compared to control subjects kogure2000longitudinal ; karas2007precuneus . Third, studies have shown a significant reduction in the volume of the anterior cingulate gyrus compared to control subjects frisoni2002detection ; jones2006differential . Moreover, a study showed that the volume of the anterior cingulate gyrus is correlated with apathy, which is symptomatic of AD apostolova2007structural . However, the importance of the postcentral gyrus was unexpected, since this structure has been reported to seem unaffected by the AD process halliday2003identifying . These elements seem to indicate that the structural pattern of AD is composed of both highly impacted and healthy brain structures.

Finally, the good performance of the graph-based grading method demonstrates that the combination of both features enables a better discrimination of subjects who convert to dementia in the years following their first visits. Moreover, the good results within the hippocampal subfield and brain structure parcellations indicate the generic nature of our new method, which can be applied to different anatomical representations.

5.3 Multi-scale graph-based grading

Afterwards, we compared the results of our multi-scale (MSGG) approach with the previously described graph of hippocampal subfields and graph of brain structures. First, the conducted experiments show that our graph of structure grading applied within hippocampal subfields improves prediction of conversion to Alzheimer’s disease compared to the PBG applied within the hippocampus.

The straightforward extension of the graph of structure grading to combine whole brain structures and hippocampal subfields did not improve AD conversion prediction compared to using the graph of hippocampal subfields or the graph of brain structures alone. The main limitation might come from the fact that the straightforward combination of different anatomical representations suffers from a substantial increase in feature dimensionality. To address this limitation, we proposed the MSGG method, which is based on a cascade of classifiers. This method alleviates the dimensionality issue by estimating an intermediate conversion probability for each anatomical scale considered. This results in an increase in AD prediction performance compared to both single-scale graphs.
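The cascade idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors’ implementation: plain logistic regression fitted by gradient descent stands in for whatever classifiers are actually used, feature selection is omitted, and the function names (`fit_logistic`, `fit_cascade`, `predict_cascade`) and the synthetic data are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, X):
    """Probability of the positive class for each row of X."""
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def stack_probabilities(models, scales):
    """One intermediate probability per scale, gathered into one row per subject."""
    probs = [predict_proba(m, X) for m, X in zip(models, scales)]
    return [list(row) for row in zip(*probs)]


def fit_cascade(scales, y):
    """First stage: one classifier per anatomical scale. Second stage: a
    classifier trained on the low-dimensional vector of stage-one probabilities."""
    first_stage = [fit_logistic(X, y) for X in scales]
    final = fit_logistic(stack_probabilities(first_stage, scales), y)
    return first_stage, final


def predict_cascade(cascade, scales):
    first_stage, final = cascade
    return predict_proba(final, stack_probabilities(first_stage, scales))


# Synthetic data (illustrative only): 40 subjects, two feature sets of
# different sizes standing in for the two anatomical scales.
random.seed(0)
y = [i % 2 for i in range(40)]
subfield_features = [[random.gauss(label, 1.0) for _ in range(3)] for label in y]
brain_features = [[random.gauss(label, 1.0) for _ in range(5)] for label in y]

cascade = fit_cascade([subfield_features, brain_features], y)
probabilities = predict_cascade(cascade, [subfield_features, brain_features])
```

Whatever the number of features at each scale, the final classifier only sees one value per scale, which is how a cascade sidesteps the dimensionality growth of plain concatenation. In practice the intermediate probabilities used to train the second stage would normally come from held-out predictions; the sketch reuses the training data for brevity.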

5.4 Comparison with state-of-the-art methods

In the last decade, many improvements in computer-aided diagnosis methods have been proposed to better capture structural alterations using anatomical MRI (see rathore2017review for a review). Two main approaches have been proposed: methods based on inter-subject similarity coupe2012scoring ; moradi2015machine ; tong2017novel and methods based on intra-subject variability tong2014multiple ; suk2014hierarchical . Consequently, the first contribution of our work was to combine inter-subject similarity, using the PBG framework, with intra-subject variability, through the integration of PBG into a graph-based model. A strength of our method resides in its generic nature. Indeed, our graph-based grading can obtain competitive results with different anatomical representations.
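The way a graph can hold both kinds of information may be pictured with a small sketch. It assumes, purely for illustration, that each vertex is the mean grading value of a structure (the inter-subject part) and that each edge is the absolute difference between two vertex values (the intra-subject part); the edge measure actually used by the method may differ, and the function name `graph_features` and the grading values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def graph_features(grading):
    """Build vertex and edge features from per-structure grading values.

    grading: dict mapping a structure name to a list of grading values.
    Vertices: mean grading per structure.
    Edges: absolute difference between the vertex values of two structures
    (an assumed, simplified measure of intra-subject variability).
    """
    vertices = {name: mean(values) for name, values in grading.items()}
    edges = {
        (a, b): abs(vertices[a] - vertices[b])
        for a, b in combinations(sorted(vertices), 2)
    }
    return vertices, edges


# Hypothetical grading values for three hippocampal subfields of one subject.
subject_grading = {
    "CA1": [-0.4, -0.2, -0.3],
    "subiculum": [-0.1, 0.1, 0.0],
    "DG": [0.2, 0.4, 0.3],
}
vertices, edges = graph_features(subject_grading)
feature_vector = list(vertices.values()) + list(edges.values())
```

With n structures this yields n vertex features and n(n-1)/2 edge features, which is one reason a feature selection step is useful before classification.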

Another difference with state-of-the-art methods comes from the proposal of a multi-anatomical scale analysis of AD alterations. In contrast to previous methods, which analyzed changes at a single anatomical scale (e.g., cortex, whole brain structures, hippocampus, or hippocampal subfields), we proposed combining a parcellation of whole brain structures with a representation of hippocampal subfields. This combination results in performance competitive with state-of-the-art methods.

Finally, the comparison with state-of-the-art approaches using a similar ADNI1 subset has shown that our multi-scale graph-based grading method obtains competitive results for both AD detection and prediction. The high performance of the methods proposed in basaia2018automated and lian2018hierarchical should be interpreted with caution. First, it is unclear how (basaia2018automated, ) obtains such high accuracy for AD detection since misdiagnosis of AD in ADNI dataset is around . Second, the excellent result of lian2018hierarchical for AD prediction mainly benefits from unbalanced results: once the accuracy is corrected for class imbalance (see the balanced accuracy in Table 5), this approach obtains lower prediction performance.
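The gap between accuracy and balanced accuracy discussed above can be made concrete with a short sketch. It assumes the standard definition of balanced accuracy as the mean of sensitivity and specificity; the first check uses the multi-scale row of Table 3 (SEN = 77.8, SPE = 73.6), while the class sizes and rates in the second example are hypothetical.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity (standard definition, assumed here)."""
    return (sensitivity + specificity) / 2.0


def accuracy(sensitivity, specificity, n_positive, n_negative):
    """Plain accuracy implied by the per-class rates and the class sizes."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Consistency check with Table 3: (77.8 + 73.6) / 2 matches the reported BACC of 75.7.
bacc_table3 = balanced_accuracy(77.8, 73.6)

# Hypothetical unbalanced cohort: 100 converters, 300 non-converters, and a
# classifier that favours the majority class.
acc_unbalanced = accuracy(0.50, 0.95, n_positive=100, n_negative=300)
bacc_unbalanced = balanced_accuracy(0.50, 0.95)
```

In the hypothetical cohort the plain accuracy is 0.8375 while the balanced accuracy is only 0.725, which is the kind of discrepancy that makes balanced accuracy the fairer basis for comparing methods evaluated on unbalanced groups.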

5.5 Complementarity with cognitive tests

Finally, an analysis of the complementarity of our imaging-based method with scores resulting from cognitive assessments has been carried out. These experiments enabled the comparison of the performance of cognitive scores and our imaging biomarker for AD prediction.

First, the conducted experiments demonstrate that our graph-based grading approach using T1-weighted MRI obtains better results for the prediction of AD than cognitive scores alone. Moreover, as shown in many works listed in Table 6 suk2015latent ; tong2017novel ; samper2019reproducible ; lee2019predicting , MRI-based biomarkers and cognitive assessments are complementary: their combination improves classification performance. Accordingly, the combination of our graph-based grading technique and cognitive assessments demonstrates a clear improvement in performance compared to the use of each method separately. This improvement is comparable to that of studies based on multimodal frameworks, which rely on biomarkers that are more expensive (e.g., PET, FDG-PET), less reproducible (often due to sample storage issues), or highly invasive (e.g., CSF).

6 Conclusion

Improved modeling of AD alterations is a major challenge that could lead to earlier predictions of conversion. Therefore, in this work, we developed a new method to better model the AD signature. Our proposed method models the pattern of AD alterations by combining inter-subject similarity and intra-subject variability. The conducted experiments have shown the generic nature of our new framework. Consequently, we proposed a multi-anatomical scale graph-based grading method to combine the alterations at different anatomical scales. In addition, we conducted the first joint analysis of the hippocampal subfields and brain structure changes in the same framework. The results show state-of-the-art performance, confirming the complementarity of hippocampal subfields and whole brain analysis, and the complementarity of inter-subject similarity and intra-subject variability.

7 Acknowledgement

This work benefited from the support of the project DeepvolBrain of the French National Research Agency (ANR-18-CE45-0013). This study was achieved within the context of the Laboratory of Excellence TRAIL ANR-10-LABX-57 for the BigDataBrain project. Moreover, we thank the Investments for the future Program IdEx Bordeaux (ANR-10-IDEX-03-02, HL-MRI Project), Cluster of excellence CPU and the CNRS.

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffman-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Pharmaceutical Research & Development LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute of Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

8 Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


  • [1] Association Alzheimer’s. 2015 Alzheimer’s disease facts and figures. Alzheimer’s & dementia: the journal of the Alzheimer’s Association, 11(3):332, 2015.
  • [2] Liana G Apostolova, Gohar G Akopyan, Negar Partiali, Calen A Steiner, Rebecca A Dutton, Kiralee M Hayashi, Ivo D Dinov, Arthur W Toga, Jeffrey L Cummings, and Paul M Thompson. Structural correlates of apathy in alzheimer’s disease. Dementia and geriatric cognitive disorders, 24(2):91, 2007.
  • [3] Liana G Apostolova, Rebecca A Dutton, Ivo D Dinov, Kiralee M Hayashi, Arthur W Toga, Jeffrey L Cummings, and Paul M Thompson. Conversion of mild cognitive impairment to Alzheimer disease predicted by hippocampal atrophy maps. Archives of neurology, 63(5):693–699, 2006.
  • [4] Mohammad R Arbabshirani, Sergey Plis, Jing Sui, and Vince D Calhoun. Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls. NeuroImage, 145:137–165, 2017.
  • [5] John Ashburner and Karl J Friston. Voxel-based morphometry—the methods. Neuroimage, 11(6):805–821, 2000.
  • [6] Brian B Avants, Nicholas J Tustison, Gang Song, Philip A Cook, Arno Klein, and James C Gee. A reproducible evaluation of ANTs similarity metric performance in brain image registration. Neuroimage, 54(3):2033–2044, 2011.
  • [7] Silvia Basaia, Federica Agosta, Luca Wagner, Elisa Canu, Giuseppe Magnani, Roberto Santangelo, Massimo Filippi, Alzheimer’s Disease Neuroimaging Initiative, et al. Automated classification of alzheimer’s disease and mild cognitive impairment using a single mri and deep neural networks. NeuroImage: Clinical, page 101645, 2018.
  • [8] Maciek Bobinski, Mony J De Leon, Antonio Convit, Susan De Santi, Jerzy Wegiel, Chaim Y Tarshish, LA Saint Louis, and Henryk M Wisniewski. Mri of entorhinal cortex in mild alzheimer’s disease. The Lancet, 353(9146):38–40, 1999.
  • [9] E Braak and H Braak. Alzheimer’s disease: transiently developing dendritic changes in pyramidal cells of sector CA1 of the ammon’s horn. Acta neuropathologica, 93(4):323–325, 1997.
  • [10] Heiko Braak, Irina Alafuzoff, Thomas Arzberger, Hans Kretzschmar, and Kelly Del Tredici. Staging of Alzheimer disease-associated neurofibrillary pathology using paraffin sections and immunocytochemistry. Acta neuropathologica, 112(4):389–404, 2006.
  • [11] Heiko Braak and Eva Braak. Staging of Alzheimer’s disease-related neurofibrillary changes. Neurobiology of aging, 16(3):271–278, 1995.
  • [12] Esther E Bron, Marion Smits, Wiesje M Van Der Flier, Hugo Vrenken, Frederik Barkhof, Philip Scheltens, Janne M Papma, Rebecca ME Steketee, Carolina Méndez Orellana, and Rozanna Meijboom. Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge. NeuroImage, 111:562–579, 2015.
  • [13] Geraldo F Busatto, Griselda EJ Garrido, Osvaldo P Almeida, Claudio C Castro, Cândida HP Camargo, Carla G Cid, Carlos A Buchpiguel, Sergio Furuie, and Cassio M Bottino. A voxel-based morphometry study of temporal lobe gray matter reductions in alzheimer’s disease. Neurobiology of aging, 24(2):221–231, 2003.
  • [14] Giovanni A Carlesimo, Fabrizio Piras, Maria Donata Orfei, Mariangela Iorio, Carlo Caltagirone, and Gianfranco Spalletta. Atrophy of presubiculum and subiculum is the earliest hippocampal anatomical marker of Alzheimer’s disease. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring, 1(1):24–32, 2015.
  • [15] Pierrick Coupé, Simon F Eskildsen, José V Manjón, Vladimir S Fonov, D Louis Collins, Alzheimer’s disease Neuroimaging Initiative, et al. Simultaneous segmentation and grading of anatomical structures for patient’s classification: application to alzheimer’s disease. NeuroImage, 59(4):3736–3747, 2012.
  • [16] Pierrick Coupé, Simon F Eskildsen, José V Manjón, Vladimir S Fonov, Jens C Pruessner, Michèle Allard, D Louis Collins, and Alzheimer’s Disease Neuroimaging Initiative. Scoring by nonlocal image patch estimator for early detection of Alzheimer’s disease. NeuroImage: clinical, 1(1):141–152, 2012.
  • [17] Pierrick Coupé, Vladimir S Fonov, Charlotte Bernard, Azar Zandifar, Simon F Eskildsen, Catherine Helmer, José V Manjón, Hélène Amieva, Jean-François Dartigues, and Michèle Allard. Detection of Alzheimer’s disease signature in MR images seven years before conversion to dementia: Toward an early individual prognosis. Human brain mapping, 36(12):4758–4770, 2015.
  • [18] Pierrick Coupé, José V Manjón, Maxime Chamberland, Maxime Descoteaux, and Bassem Hiba. Collaborative patch-based super-resolution for diffusion-weighted images. NeuroImage, 83:245–261, 2013.
  • [19] Pierrick Coupé, José Vicente Manjón, Enrique Lanuza, and Gwenaelle Catheline. Lifespan Changes of the Human Brain In Alzheimer’s Disease. Scientific reports, 9(1):3998, 2019.
  • [20] Rémi Cuingnet, Joan Alexis Glaunès, Marie Chupin, Habib Benali, and Olivier Colliot. Spatial and anatomical regularization of svm: a general framework for neuroimaging data. IEEE transactions on pattern analysis and machine intelligence, 35(3):682–696, 2013.
  • [21] Charles DeCarli. Mild cognitive impairment: prevalence, prognosis, aetiology, and treatment. The Lancet Neurology, 2(1):15–21, 2003.
  • [22] GB Frisoni, C Testa, A Zorzan, F Sabattoli, A Beltramello, H Soininen, and MP Laakso. Detection of grey matter loss in mild alzheimer’s disease with voxel based morphometry. Journal of Neurology, Neurosurgery & Psychiatry, 73(6):657–664, 2002.
  • [23] Giovanni B Frisoni, Nick C Fox, Clifford R Jack, Philip Scheltens, and Paul M Thompson. The clinical use of structural MRI in Alzheimer disease. Nature Reviews Neurology, 6(2):67–77, 2010.
  • [24] Rémi Giraud, Vinh-Thong Ta, Nicolas Papadakis, José V Manjón, D Louis Collins, Pierrick Coupé, and Alzheimer’s Disease Neuroimaging Initiative. An optimized patchmatch for multi-scale and multi-feature label fusion. NeuroImage, 124:770–782, 2016.
  • [25] GM Halliday, KL Double, V Macdonald, and JJ Kril. Identifying severely atrophic cortical subregions in alzheimer’s disease. Neurobiology of aging, 24(6):797–806, 2003.
  • [26] John Hardy. Alzheimer’s disease: the amyloid cascade hypothesis: an update and reappraisal. Journal of Alzheimer’s disease, 9(s3):151–153, 2006.
  • [27] Kilian Hett, Vinh-Thong Ta, José V Manjón, and Pierrick Coupé. Graph of hippocampal subfields grading for alzheimer’s disease prediction. In International Workshop on Machine Learning in Medical Imaging, pages 259–266. Springer, 2018.
  • [28] Kilian Hett, Vinh-Thong Ta, José V Manjón, Pierrick Coupé, and Alzheimer’s Disease Neuroimaging Initiative. Adaptive fusion of texture-based grading: Application to Alzheimer’s disease detection. In International Workshop on Patch-based Techniques in Medical Imaging, pages 82–89. Springer, 2017.
  • [29] Kilian Hett, Vinh-Thong Ta, José V Manjón, Pierrick Coupé, Alzheimer’s Disease Neuroimaging Initiative, et al. Graph of brain structures grading for early detection of alzheimer’s disease. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 429–436. Springer, 2018.
  • [30] Kilian Hett, TA Vinh-Thong, José V Manjón, and Pierrick Coupé. Adaptive fusion of texture-based grading for alzheimer’s disease classification. Computerized Medical Imaging and Graphics, 2018.
  • [31] Bradley T Hyman, Gary W Van Hoesen, Antonio R Damasio, and Clifford L Barnes. Alzheimer’s disease: cell-specific pathology isolates the hippocampal formation. Science, 225(4667):1168–1170, 1984.
  • [32] Clifford R Jack, Ronald C Petersen, Peter C O’brien, and Eric G Tangalos. Mr-based hippocampal volumetry in the diagnosis of Alzheimer’s disease. Neurology, 42(1):183–183, 1992.
  • [33] Bethany F Jones, Josephine Barnes, Harry BM Uylings, Nick C Fox, Chris Frost, Menno P Witter, and Philip Scheltens. Differential regional atrophy of the cingulate gyrus in alzheimer disease: a volumetric mri study. Cerebral Cortex, 16(12):1701–1708, 2006.
  • [34] Giorgos Karas, Philip Scheltens, Serge Rombouts, Ronald Van Schijndel, Martin Klein, Bethany Jones, Wiesje Van Der Flier, Hugo Vrenken, and Frederik Barkhof. Precuneus atrophy in early-onset alzheimer’s disease: a morphometric structural mri study. Neuroradiology, 49(12):967–976, 2007.
  • [35] GA Kerchner, CP Hess, KE Hammond-Rosenbluth, D Xu, GD Rabinovici, DAC Kelley, DB Vigneron, SJ Nelson, and BL Miller. Hippocampal CA1 apical neuropil atrophy in mild Alzheimer disease visualized with 7-T MRI. Neurology, 75(15):1381–1387, 2010.
  • [36] Geoffrey A Kerchner, Gayle K Deutsch, Michael Zeineh, Robert F Dougherty, Manojkumar Saranathan, and Brian K Rutt. Hippocampal CA1 apical neuropil atrophy and memory performance in Alzheimer’s disease. Neuroimage, 63(1):194–202, 2012.
  • [37] Ronald J Killiany, Mark B Moss, Marilyn S Albert, Tamas Sandor, James Tieman, and Ferenc Jolesz. Temporal lobe regions on magnetic resonance imaging identify patients with early alzheimer’s disease. Archives of neurology, 50(9):949–954, 1993.
  • [38] Daiji Kogure, Hiroshi Matsuda, Takashi Ohnishi, Takashi Asada, Masatake Uno, Toshiyuki Kunihiro, Seigo Nakano, and Masaru Takasaki. Longitudinal evaluation of early alzheimer’s disease using brain perfusion spect. Journal of nuclear medicine, 41(7):1155–1162, 2000.
  • [39] Juha Koikkalainen, Harri Pölönen, Jussi Mattila, Mark Van Gils, Hilkka Soininen, Jyrki Lötjönen, Alzheimer’s Disease Neuroimaging Initiative, et al. Improved classification of alzheimer’s disease data via removal of nuisance variability. PloS one, 7(2):e31112, 2012.
  • [40] Renaud La Joie, Audrey Perrotin, Vincent De La Sayette, Stéphanie Egret, Loïc Doeuvre, Serge Belliard, Francis Eustache, Béatrice Desgranges, and Gaël Chételat. Hippocampal subfield volumetry in mild cognitive impairment, Alzheimer’s disease and semantic dementia. NeuroImage: Clinical, 3:155–162, 2013.
  • [41] Christian Ledig, Andreas Schuh, Ricardo Guerrero, Rolf A Heckemann, and Daniel Rueckert. Structural brain imaging in alzheimer’s disease and mild cognitive impairment: biomarker analysis and shared morphometry database. Scientific reports, 8(1):11258, 2018.
  • [42] Garam Lee, Kwangsik Nho, Byungkon Kang, Kyung-Ah Sohn, and Dokyoon Kim. Predicting alzheimer’s disease progression using multi-modal deep learning approach. Scientific reports, 9(1):1952, 2019.
  • [43] Ya-Di Li, Hai-Bo Dong, Guo-Ming Xie, and Ling-jun Zhang. Discriminative analysis of mild Alzheimer’s disease and normal aging using volume of hippocampal subfields and hippocampal mean diffusivity: an in vivo magnetic resonance imaging study. American Journal of Alzheimer’s Disease & Other Dementias, 28(6):627–633, 2013.
  • [44] Chunfeng Lian, Mingxia Liu, Jun Zhang, and Dinggang Shen. Hierarchical fully convolutional network for joint atrophy localization and alzheimer’s disease diagnosis using structural mri. IEEE transactions on pattern analysis and machine intelligence, 2018.
  • [45] Jun Liu, Shuiwang Ji, Jieping Ye, et al. SLEP: Sparse learning with efficient projections. Arizona State University, 6(491):7, 2009.
  • [46] Manhua Liu, Daoqiang Zhang, Dinggang Shen, and Alzheimer’s Disease Neuroimaging Initiative. Ensemble sparse classification of Alzheimer’s disease. NeuroImage, 60(2):1106–1116, 2012.
  • [47] Rafael Lorente de Nó. Studies on the structure of the cerebral cortex. ii. continuation of the study of the ammonic system. Journal für Psychologie und Neurologie, 1934.
  • [48] José V Manjón and Pierrick Coupé. volBrain: An online MRI brain volumetry system. Frontiers in neuroinformatics, 10, 2016.
  • [49] José V Manjón, Pierrick Coupé, Luis Martí-Bonmatí, D Louis Collins, and Montserrat Robles. Adaptive non-local means denoising of MR images with spatially varying noise levels. Journal of Magnetic Resonance Imaging, 31(1):192–203, 2010.
  • [50] Elaheh Moradi, Antonietta Pepe, Christian Gaser, Heikki Huttunen, Jussi Tohka, Alzheimer’s Disease Neuroimaging Initiative, et al. Machine learning framework for early MRI-based Alzheimer’s conversion prediction in MCI subjects. Neuroimage, 104:398–412, 2015.
  • [51] SG Mueller, L Stables, AT Du, N Schuff, D Truran, N Cashdollar, and MW Weiner. Measurement of hippocampal subfields and age-related changes with high resolution MRI at 4T. Neurobiology of aging, 28(5):719–726, 2007.
  • [52] Ronald C Petersen, Glenn E Smith, Stephen C Waring, Robert J Ivnik, Eric G Tangalos, and Emre Kokmen. Mild cognitive impairment: clinical characterization and outcome. Archives of neurology, 56(3):303–308, 1999.
  • [53] Jeffrey R Petrella, R Edward Coleman, and P Murali Doraiswamy. Neuroimaging and early diagnosis of Alzheimer disease: a look to the future. Radiology, 226(2):315–336, 2003.
  • [54] Saima Rathore, Mohamad Habes, Muhammad Aksam Iftikhar, Amanda Shacklett, and Christos Davatzikos. A review on neuroimaging-based classification studies and associated feature extraction methods for Alzheimer’s disease and its prodromal stages. NeuroImage, 155:530–548, 2017.
  • [55] José E Romero, Pierrick Coupé, and José V Manjón. HIPS: A new hippocampus subfield segmentation method. NeuroImage, 163:286–295, 2017.
  • [56] Yossi Rubner, Carlo Tomasi, and Leonidas J Guibas. The earth mover’s distance as a metric for image retrieval. International journal of computer vision, 40(2):99–121, 2000.
  • [57] Jorge Samper-Gonzalez, Ninon Burgos, Simona Bottani, Marie-Odile Habert, Theodoros Evgeniou, Stephane Epelbaum, and Olivier Colliot. Reproducible evaluation of methods for predicting progression to Alzheimer’s disease from clinical and neuroimaging data. In SPIE Medical Imaging 2019, 2019.
  • [58] Christopher G Schwarz, Jeffrey L Gunter, Heather J Wiste, Scott A Przybelski, Stephen D Weigand, Chadwick P Ward, Matthew L Senjem, Prashanthi Vemuri, Melissa E Murray, Dennis W Dickson, et al. A large-scale comparison of cortical thickness and volume methods for measuring Alzheimer’s disease severity. NeuroImage: Clinical, 11:802–812, 2016.
  • [59] Herbert A Sturges. The choice of a class interval. Journal of the american statistical association, 21(153):65–66, 1926.
  • [60] Heung-Il Suk, Seong-Whan Lee, and Dinggang Shen. Deep ensemble learning of sparse regression models for brain disease diagnosis. Medical image analysis, 37:101–113, 2017.
  • [61] Heung-Il Suk, Seong-Whan Lee, Dinggang Shen, Alzheimer’s Disease Neuroimaging Initiative, et al. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage, 101:569–582, 2014.
  • [62] Heung-Il Suk, Seong-Whan Lee, Dinggang Shen, Alzheimer’s Disease Neuroimaging Initiative, et al. Latent feature representation with stacked auto-encoder for AD/MCI diagnosis. Brain Structure and Function, 220(2):841–859, 2015.
  • [63] Tong Tong, Qinquan Gao, Ricardo Guerrero, Christian Ledig, Liang Chen, Daniel Rueckert, and Alzheimer’s Disease Neuroimaging Initiative. A novel grading biomarker for the prediction of conversion from mild cognitive impairment to Alzheimer’s disease. IEEE Transactions on Biomedical Engineering, 64(1):155–165, 2017.
  • [64] Tong Tong, Robin Wolz, Qinquan Gao, Ricardo Guerrero, Joseph V Hajnal, Daniel Rueckert, and Alzheimer’s Disease Neuroimaging Initiative. Multiple instance learning for classification of dementia in brain MRI. Medical image analysis, 18(5):808–818, 2014.
  • [65] Laura Trujillo-Estrada, José Carlos Dávila, Elisabeth Sánchez-Mejias, Raquel Sánchez-Varo, Angela Gomez-Arboledas, Marisa Vizuete, Javier Vitorica, and Antonia Gutiérrez. Early neuronal loss and axonal/presynaptic damage is associated with accelerated amyloid-β accumulation in APP/PS1 Alzheimer’s disease mice subiculum. Journal of Alzheimer’s Disease, 42(2):521–541, 2014.
  • [66] Nicholas J Tustison, Brian B Avants, Philip A Cook, Yuanjie Zheng, Alexander Egan, Paul A Yushkevich, and James C Gee. N4ITK: improved N3 bias correction. IEEE transactions on medical imaging, 29(6):1310–1320, 2010.
  • [67] Hongzhi Wang, Sandhitsu R Das, Jung Wook Suh, Murat Altinay, John Pluta, Caryne Craige, Brian Avants, Paul A Yushkevich, Alzheimer’s Disease Neuroimaging Initiative, et al. A learning-based wrapper method to correct systematic errors in automatic image segmentation: consistently improved performance in hippocampus, cortex and brain segmentation. NeuroImage, 55(3):968–985, 2011.
  • [68] Chong-Yaw Wee, Chaoqiang Liu, Annie Lee, Joann S Poh, Hui Ji, Anqi Qiu, Alzheimer’s Disease Neuroimage Initiative, et al. Cortical graph neural network for AD and MCI diagnosis and transfer learning across populations. NeuroImage: Clinical, page 101929, 2019.
  • [69] Chong-Yaw Wee, Pew-Thian Yap, Dinggang Shen, and Alzheimer’s Disease Neuroimaging Initiative. Prediction of Alzheimer’s disease and mild cognitive impairment using cortical morphological patterns. Human brain mapping, 34(12):3411–3425, 2013.
  • [70] Mark J West, Paul D Coleman, Dorothy G Flood, and Juan C Troncoso. Differences in the pattern of hippocampal neuronal loss in normal ageing and Alzheimer’s disease. The Lancet, 344(8925):769–772, 1994.
  • [71] Julie L Winterburn, Jens C Pruessner, Sofia Chavez, Mark M Schira, Nancy J Lobaugh, Aristotle N Voineskos, and M Mallar Chakravarty. A novel in vivo atlas of human hippocampal subfields using high-resolution 3T magnetic resonance imaging. Neuroimage, 74:254–265, 2013.
  • [72] Robin Wolz, Valtteri Julkunen, Juha Koikkalainen, Eini Niskanen, Dong Ping Zhang, Daniel Rueckert, Hilkka Soininen, Jyrki Lötjönen, and Alzheimer’s Disease Neuroimaging Initiative. Multi-method analysis of MRI images in early diagnostics of Alzheimer’s disease. PloS one, 6(10):e25446, 2011.
  • [73] Paul A Yushkevich, Robert SC Amaral, Jean C Augustinack, Andrew R Bender, Jeffrey D Bernstein, Marina Boccardi, Martina Bocchetta, Alison C Burggren, Valerie A Carr, and M Mallar Chakravarty. Quantitative comparison of 21 protocols for labeling hippocampal subfields and parahippocampal subregions in in vivo MRI: towards a harmonized segmentation protocol. Neuroimage, 111:526–541, 2015.
  • [74] Luping Zhou, Yaping Wang, Yang Li, Pew-Thian Yap, Dinggang Shen, Alzheimer’s Disease Neuroimaging Initiative (ADNI), et al. Hierarchical anatomical brain networks for MCI prediction: revisiting volumetric measures. PloS one, 6(7):e21935, 2011.
  • [75] Hui Zou and Trevor Hastie. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2):301–320, 2005.