NormVAE: Normative Modeling on Neuroimaging Data using Variational Autoencoders

Normative modeling is an emerging method for understanding the heterogeneous biology underlying neuropsychiatric and neurodegenerative disorders at the level of the individual participant. Deep autoencoders have been implemented as normative models, where patient-level deviations are modelled as the squared difference between the actual and reconstructed input without any uncertainty estimates in the deviations. In this study, we assessed NormVAE, a novel normative modeling based variational autoencoder (VAE) which calculates subject-level normative abnormality maps (NAM) for quantifying uncertainty in the deviations. Our experiments on brain neuroimaging data of Alzheimer's Disease (AD) patients demonstrated that the NormVAE-generated patient-level abnormality maps exhibit increased sensitivity to disease staging compared to a baseline VAE, which generates deterministic subject-level deviations without any uncertainty estimates.




1 Introduction

Traditional case-control analyses assume that a single pattern distinguishes the two contrasted groups and focus on first-order statistics (group means) to estimate it, effectively ignoring disease heterogeneity marquand2019conceptualizing. In contrast, normative modeling explicitly models disease heterogeneity by quantifying how each patient deviates from the expected normative pattern learned from a healthy control distribution marquand2016understanding. As a consequence, normative modeling has been increasingly used to dissect heterogeneity in neurodegenerative huizinga2018spatio ; ziegler2014individualized and neuropsychiatric disorders kia2019neural ; bethlehem2020normative ; wolfers2018mapping.

Traditional normative modeling implementations marquand2016understanding ; ziegler2014individualized ; wolfers2018mapping ; wolfers2020individual ; zabihi2019dissecting have typically relied on Gaussian process regression (GPR) to derive independent models for each brain locus that account for epistemic uncertainty hofer2002approximate. On the other hand, deep learning techniques, particularly autoencoders, can learn multiple levels of representation across all brain loci. However, recent studies using autoencoders for normative modeling on neuroimaging data have modelled subject-level deviations from the healthy controls in terms of the difference between the actual and predicted brain loci value, which is deterministic without any kind of uncertainty estimates in the deviations pinaya2019using ; pinaya2020normative. To address this limitation, we propose a novel normative modeling approach termed NormVAE, which uses a variational autoencoder (VAE) kingma2014auto with a Monte-Carlo sampling-based probabilistic reconstruction scheme. NormVAE generates subject-level Normative Abnormality Maps (NAM) that provide uncertainty quantification in how individual disease patients deviate from the normative healthy control population. We hypothesize that the NormVAE-generated NAMs are more sensitive in capturing disease stages, compared to maps derived by focusing on differences between true values and predictions from a vanilla VAE.

2 Methodology

Figure 1: Our proposed approach which consists of 3 steps: (A) generating the normative model based on healthy controls, (B) estimating the probabilistic abnormality or deviation maps of individual patients and (C) producing the Normative Abnormality Maps (NAMs) by identifying the brain areas that exhibit statistically significant deviations after False Discovery Rate (FDR) correction.

Our goal is to dissect heterogeneity amongst Alzheimer’s Disease (AD) patients using deep learning based normative models that provide uncertainty estimates in the patient-level deviations. To achieve this, we first learn the normative model by training a VAE on the neuroimaging data of the healthy controls. The resulting model learns to encode the healthy patterns into a latent distribution and then uses the encoded representation to reconstruct the input data as closely as possible. In the second step, we assess the predictive performance of the trained normative model on AD patients and estimate the extent to which they differ from the normative trend. For each subject ($i$), the probabilistic deviation at the $j$-th brain region can be calculated as $D_{ij} = \frac{y_{ij} - \mu_{ij}}{\sqrt{\sigma_{ij}^2 + \sigma_{nj}^2}}$, where $y_{ij}$ is the true volume value, $\mu_{ij}$ and $\sigma_{ij}^2$ are the predictive mean and variance estimates of the model, and $\sigma_{nj}^2$ denotes the variance of the deviations corresponding to healthy controls (normative distribution).
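As a rough illustration of the probabilistic deviation described above, the sketch below assumes the standard normative z-score form, in which the residual between the true volume and the predictive mean is divided by the square root of the summed predictive and normative variances. The function name, argument names and numeric values are hypothetical and chosen only for illustration.

```python
import math


def probabilistic_deviation(y_true, pred_mean, pred_var, norm_var):
    """Z-score-like deviation for one subject and one brain region.

    y_true    : observed regional volume
    pred_mean : predictive mean from the normative model
    pred_var  : predictive variance from the normative model
    norm_var  : variance of the deviations among healthy controls
    """
    total_var = pred_var + norm_var
    if total_var <= 0:
        raise ValueError("combined variance must be positive")
    return (y_true - pred_mean) / math.sqrt(total_var)


# Hypothetical values for one patient and one region (illustration only).
deviation = probabilistic_deviation(
    y_true=0.80, pred_mean=1.00, pred_var=0.03, norm_var=0.01
)
print(round(deviation, 3))
```

Under this form, a larger predictive variance shrinks the deviation, so regions where the model is uncertain are less likely to be flagged as abnormal.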

The predictive mean and variance for a single test input (e.g., brain region volume of a single AD patient) can be estimated by a Monte-Carlo sampling approach edupuganti2020uncertainty as follows: (a) the VAE encoder takes the input and produces the mean and variance of the latent code representation, (b) samples are drawn from the latent distribution and passed through the decoder to produce one reconstruction per sample, (c) the reconstructions are aggregated over the samples to give the predictive mean and variance, which provide the uncertainty estimates in the deviations. The final step of our pipeline involves identifying the brain regions of each patient whose deviations are significantly different from those of the controls. Since the probabilistic deviations are estimated independently for each brain region for every patient, FDR (False Discovery Rate) correction can be applied to control the Type 1 error rate, yielding the final Normative Abnormality Maps (NAMs).
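The sampling and correction steps above can be sketched in plain Python. This is a minimal sketch under several assumptions not stated in the text: the decoder is a hypothetical toy linear function standing in for a trained VAE decoder, p-values are taken as two-sided tails of a standard normal, the FDR procedure is Benjamini-Hochberg, and the level `alpha=0.05` is illustrative.

```python
import math
import random


def mc_predict(latent_mean, latent_var, decoder, n_samples=100, seed=0):
    """Monte-Carlo predictive mean and variance for each output region."""
    rng = random.Random(seed)
    recons = []
    for _ in range(n_samples):
        # Reparameterised draw: z = mu + sigma * eps, with eps ~ N(0, 1).
        z = [m + math.sqrt(v) * rng.gauss(0.0, 1.0)
             for m, v in zip(latent_mean, latent_var)]
        recons.append(decoder(z))
    n_regions = len(recons[0])
    means = [sum(r[j] for r in recons) / n_samples for j in range(n_regions)]
    variances = [
        sum((r[j] - means[j]) ** 2 for r in recons) / (n_samples - 1)
        for j in range(n_regions)
    ]
    return means, variances


def two_sided_p(z):
    """Two-sided p-value of a z-score under a standard normal reference."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= k/m * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


def toy_decoder(z):
    """Hypothetical linear decoder standing in for a trained VAE decoder."""
    return [z[0] + z[1], z[0] - z[1], 2.0 * z[0]]


means, variances = mc_predict([1.0, 0.5], [0.04, 0.01], toy_decoder, n_samples=500)

# Hypothetical per-region deviations for one patient -> p-values -> FDR mask.
deviations = [-3.5, 0.4, 2.9, -1.0]
p_values = [two_sided_p(d) for d in deviations]
abnormality_map = benjamini_hochberg(p_values, alpha=0.05)
```

Here `abnormality_map` plays the role of a binary abnormality map for one patient: an entry is `True` only where the deviation survives the correction.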

3 Experiments and Results

We tested our methodology on cross-sectional Magnetic Resonance Imaging (MRI) neuroimaging data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI 1, 2, 3, GO) petersen2010alzheimer. We used the FreeSurfer 5.1 software fischl2002whole on T1-weighted images to estimate brain region volumes (cortical, subcortical and hippocampal) for each subject. Our dataset included cognitively normal controls and patients in four clinical categories: Significant Memory Concerns (SMC), Early Mild Cognitive Impairment (EMCI), Late Mild Cognitive Impairment (LMCI) and Alzheimer’s Disease (AD). In the feature preprocessing step, we normalized the brain region volumes of each subject by their Intracranial Volume (ICV).
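The ICV normalization step can be illustrated with a short sketch. It assumes the normalization is a simple division of each regional volume by the subject's ICV; the region names and volumes are hypothetical.

```python
def normalize_by_icv(region_volumes, icv):
    """Divide each regional volume by the subject's intracranial volume."""
    if icv <= 0:
        raise ValueError("ICV must be positive")
    return {region: volume / icv for region, volume in region_volumes.items()}


# Hypothetical volumes in mm^3 for one subject (illustration only).
subject_volumes = {"hippocampus_left": 3500.0, "amygdala_left": 1500.0}
normalized = normalize_by_icv(subject_volumes, icv=1_500_000.0)
```

Dividing by ICV expresses each region as a fraction of head size, which reduces the influence of overall head size on the learned normative pattern.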

To validate our hypothesis that NormVAE-generated NAMs are more sensitive to disease staging, we compared our proposed model to a baseline VAE with no uncertainty estimates. In the baseline model, the deviations for every patient at each brain region are modelled as the standard VAE reconstruction error, z-transformed by the variance of the deviations corresponding to healthy controls, $D_{ij} = \frac{y_{ij} - \hat{y}_{ij}}{\sigma_{nj}}$, where $y_{ij}$ is the true volume of subject $i$ at brain region $j$, $\hat{y}_{ij}$ represents the predicted volume of brain loci (decoder output) and $\sigma_{nj}^2$ is the variance of the healthy-control deviations. We also conditioned both the models on the age of patients to ensure that the neuroanatomical deviations reflect disease pathology and not differences in age. We trained both NormVAE and the baseline model using the Adam optimizer, with the same settings for the learning rate, batch size, latent dimension, dense layer size and number of dense layers in each of the encoder and decoder.
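For contrast with the probabilistic deviation, the baseline deviation can be sketched as follows. The sketch assumes a signed residual scaled by the standard deviation of the healthy-control residuals for the same region; the Figure 2 caption mentions a squared reconstruction error, in which case the residual would be squared before scaling. All names and values are hypothetical.

```python
import statistics


def control_residual_std(control_true, control_pred):
    """Sample standard deviation of reconstruction errors across healthy controls."""
    residuals = [y - y_hat for y, y_hat in zip(control_true, control_pred)]
    return statistics.stdev(residuals)


def baseline_deviation(y_true, y_pred, sigma_controls):
    """Deterministic deviation: residual scaled by the control-group spread."""
    if sigma_controls <= 0:
        raise ValueError("control standard deviation must be positive")
    return (y_true - y_pred) / sigma_controls


# Hypothetical values for one brain region (illustration only).
control_true = [1.00, 1.10, 0.90, 1.05, 0.95]
control_pred = [1.00, 1.00, 1.00, 1.00, 1.00]
sigma_n = control_residual_std(control_true, control_pred)

patient_deviation = baseline_deviation(y_true=0.80, y_pred=1.00, sigma_controls=sigma_n)
```

Unlike the probabilistic variant, this score uses a single decoder output, so the model's confidence in its own reconstruction does not enter the deviation.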

Figure 2: A, B: Mean deviation (NAM for NormVAE and z-transformed squared reconstruction error for the baseline) across brain regions, stratified by clinical status. The slope value (*) is obtained by fitting a linear model across the mean deviations of each category. Statistical significance between the deviations in each disease category is also annotated (**** - p < 0.05, ns - not significant). C: Correlation between the trained SVM (linear kernel) weights and the count of each region being statistically significant from controls. D: Number of times each sub-cortical brain region volume is significantly different between healthy controls and patients. E: Number of brain regions of each patient that are significantly different compared to healthy controls.

The main idea of the normative approach is that, since the VAE only learns how to reconstruct the brain region volumes of healthy controls, it will be less precise in reconstructing those of AD patients, who differ due to the disease pathology. The NAMs measure the neuroanatomical alteration in the brain due to AD progression and should ideally include more regions with increasing severity of the disease stage. For both NormVAE and the baseline model, the patients exhibited increasing abnormality with increasing severity of their condition from SMC to AD. The higher magnitude and slope of deviation of NormVAE compared to the baseline suggest that our proposed model is more sensitive to disease staging (Figure 2A, B). The NAMs generated from NormVAE result in a higher number of brain regions for each patient whose volumes are significantly different compared to those of the controls (Figure 2E). We also analyzed the frequency with which each of the sub-cortical regions deviated significantly from the controls (Figure 2D). Similar experiments were performed for the cortical and hippocampal regions as well. To examine whether the brain regions found to be statistically significant by the models reflect disease pathology, we trained a Support Vector Machine (SVM) cortes1995support ; pedregosa2011scikit on the brain volumes to distinguish between Cognitively Normal (CN) and AD patients and analyzed the correlation between the trained SVM weights and the significance count corresponding to each brain region (Figure 2C). The correlation was higher for NormVAE than for the baseline model. This is because NormVAE identified more regions that deviated significantly from the model and were also assigned high SVM weights. On the other hand, there were several brain regions that were never found to deviate significantly by the baseline method but were assigned high weights by the SVM, thus leading to low correlation.
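The two summary statistics used in this comparison, the slope of mean deviation across disease stages and the correlation between SVM weights and significance counts, can be sketched as below. The sketch assumes the stages are coded as equally spaced integers, that the correlation is Pearson's, and that absolute SVM weights are used; none of these choices is confirmed by the text, and all numbers are hypothetical.

```python
import math


def linear_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical mean deviations for SMC, EMCI, LMCI and AD, coded 0..3.
stage_codes = [0, 1, 2, 3]
mean_deviations = [0.2, 0.4, 0.7, 1.1]
slope = linear_slope(stage_codes, mean_deviations)

# Hypothetical absolute SVM weights and per-region significance counts.
svm_weights = [0.9, 0.1, 0.5, 0.3]
significance_counts = [40, 5, 22, 15]
correlation = pearson_r(svm_weights, significance_counts)
```

A steeper slope indicates that the deviation measure separates the disease stages more strongly, and a higher correlation indicates that frequently flagged regions are also the ones a CN-versus-AD classifier relies on.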

4 Conclusion and Future Work

We present NormVAE, a normative modeling based VAE that produces probabilistic subject-level abnormality maps with aleatoric uncertainty estimates, quantifying how much AD patients deviate from the healthy control population. The NormVAE-generated abnormality maps are more sensitive to disease staging in AD than those of a vanilla VAE with no uncertainty estimates in the generated deviations. As part of future work, we propose to further validate our model, including comparisons with Gaussian Process Regression and other state-of-the-art normative modeling approaches, and to use the subject-level abnormality maps to investigate heterogeneity of neurodegenerative and neuropsychiatric disorders.

5 Negative Societal Impact

Our work, NormVAE, takes a first step towards addressing heterogeneity in a neurodegenerative disorder like Alzheimer’s Disease. However, one potential limitation of our work is that the ADNI dataset used here consists only of subjects from the United States. If NormVAE were deployed as part of a computer-aided diagnostic system, our model may not generalize successfully when applied to different populations from other countries. A lack of generalizability to different types of neuroimaging datasets could lead to false positives, which can further result in misdiagnosis and inappropriate treatment.


  • (1) Richard AI Bethlehem, Jakob Seidlitz, Rafael Romero-Garcia, Stavros Trakoshis, Guillaume Dumas, and Michael V Lombardo. A normative modelling approach reveals age-atypical cortical thickness in a subgroup of males with autism spectrum disorder. Communications biology, 3(1):1–10, 2020.
  • (2) Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine learning, 20(3):273–297, 1995.
  • (3) Vineet Edupuganti, Morteza Mardani, Shreyas Vasanawala, and John Pauly. Uncertainty quantification in deep mri reconstruction. IEEE Transactions on Medical Imaging, 40(1):239–250, 2020.
  • (4) Bruce Fischl, David H Salat, Evelina Busa, Marilyn Albert, Megan Dieterich, Christian Haselgrove, Andre Van Der Kouwe, Ron Killiany, David Kennedy, Shuna Klaveness, et al. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron, 33(3):341–355, 2002.
  • (5) Eduard Hofer, Martina Kloos, Bernard Krzykacz-Hausmann, Jörg Peschke, and Martin Woltereck. An approximate epistemic uncertainty analysis approach in the presence of epistemic and aleatory uncertainties. Reliability Engineering & System Safety, 77(3):229–238, 2002.
  • (6) Wyke Huizinga, Dirk HJ Poot, Meike W Vernooij, Gennady V Roshchupkin, Esther E Bron, Mohammad Arfan Ikram, Daniel Rueckert, Wiro J Niessen, Stefan Klein, Alzheimer’s Disease Neuroimaging Initiative, et al. A spatio-temporal reference model of the aging brain. NeuroImage, 169:11–22, 2018.
  • (7) Seyed Mostafa Kia and Andre F Marquand. Neural processes mixed-effect models for deep normative modeling of clinical neuroimaging data. In International Conference on Medical Imaging with Deep Learning, pages 297–314. PMLR, 2019.
  • (8) Diederik P Kingma and Max Welling. Auto-encoding variational bayes. In 2nd International Conference on Learning Representations, ICLR 2014, Conference Track Proceedings, 2014.
  • (9) Andre F Marquand, Seyed Mostafa Kia, Mariam Zabihi, Thomas Wolfers, Jan K Buitelaar, and Christian F Beckmann. Conceptualizing mental disorders as deviations from normative functioning. Molecular psychiatry, 24(10):1415–1424, 2019.
  • (10) Andre F Marquand, Iead Rezek, Jan Buitelaar, and Christian F Beckmann. Understanding heterogeneity in clinical cohorts using normative models: beyond case-control studies. Biological psychiatry, 80(7):552–561, 2016.
  • (11) Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in python. the Journal of machine Learning research, 12:2825–2830, 2011.
  • (12) Ronald Carl Petersen, PS Aisen, Laurel A Beckett, MC Donohue, AC Gamst, Danielle J Harvey, CR Jack, WJ Jagust, LM Shaw, AW Toga, et al. Alzheimer’s disease neuroimaging initiative (adni): clinical characterization. Neurology, 74(3):201–209, 2010.
  • (13) Walter HL Pinaya, Andrea Mechelli, and João R Sato. Using deep autoencoders to identify abnormal brain structural patterns in neuropsychiatric disorders: A large-scale multi-sample study. Human brain mapping, 40(3):944–954, 2019.
  • (14) Walter HL Pinaya, Cristina Scarpazza, Rafael Garcia-Dias, Sandra Vieira, Lea Baecker, Pedro F da Costa, Alberto Redolfi, Giovanni B Frisoni, Michela Pievani, Vince D Calhoun, et al. Normative modelling using deep autoencoders: a multi-cohort study on mild cognitive impairment and alzheimer’s disease. bioRxiv, 2020.
  • (15) Thomas Wolfers, Christian F Beckmann, Martine Hoogman, Jan K Buitelaar, Barbara Franke, and Andre F Marquand. Individual differences v. the average patient: mapping the heterogeneity in adhd using normative models. Psychological Medicine, 50(2):314–323, 2020.
  • (16) Thomas Wolfers, Nhat Trung Doan, Tobias Kaufmann, Dag Alnæs, Torgeir Moberget, Ingrid Agartz, Jan K Buitelaar, Torill Ueland, Ingrid Melle, Barbara Franke, et al. Mapping the heterogeneous phenotype of schizophrenia and bipolar disorder using normative models. JAMA psychiatry, 75(11):1146–1155, 2018.
  • (17) Mariam Zabihi, Marianne Oldehinkel, Thomas Wolfers, Vincent Frouin, David Goyard, Eva Loth, Tony Charman, Julian Tillmann, Tobias Banaschewski, Guillaume Dumas, et al. Dissecting the heterogeneous cortical anatomy of autism spectrum disorder using normative models. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 4(6):567–578, 2019.
  • (18) Gabriel Ziegler, Gerard R Ridgway, Robert Dahnke, Christian Gaser, Alzheimer’s Disease Neuroimaging Initiative, et al. Individualized gaussian process-based prediction and detection of local and global gray matter abnormalities in elderly subjects. NeuroImage, 97:333–348, 2014.