High dimensional Single Index Bayesian Modeling of the Brain Atrophy over time

12/19/2017 ∙ by Arkaprava Roy, et al. ∙ Duke University NC State University 0

We study the effects of gender, APOE genes, age, genetic variation and Alzheimer's disease on the atrophy of the brain regions. In the real data analysis section, we add a subject specific random effect to capture subject inhomogeneity. A nonparametric single index Bayesian model is proposed to study the data with B-spline series prior on the unknown functions and Dirichlet process scale mixture of zero mean normal prior on the distributions of the random effects. Posterior consistency of the proposed model without the random effect is established for a fixed number of regions and time points with increasing sample size. A new Bayesian estimation procedure for high dimensional single index model is introduced in this paper. Performance of the proposed Bayesian method is compared with the corresponding least square estimator in the linear model with LASSO and SCAD penalization on the high dimensional covariates. The proposed Bayesian method is applied on a dataset of 748 individuals with 620,901 SNPs and 6 other covariates for each individual.



There are no comments yet.


page 2

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Alzheimer’s disease (AD) is a progressive neurodegenerative disease that affects approximately 5.5 million people in the United States and about 30 million people worldwide. It is believed to have a prolonged preclinical phase initially characterized by the development of silent pathologic changes when patients appear to be clinically normal, followed by mild cognitive impairment (MCI) and then dementia (AD) (Petrella (2013)). Apart from its manifestation in the impairment of cognitive abilities, disease progression also produces a number of structural changes in the human brain, which includes the deposition of amyloid protein and the shrinkage or atrophy for certain regions of the brain over time (Thompson et al. (2003)). Previous studies have shown that the rate of brain atrophy is significantly modulated by a number of factors, such as gender, age, baseline cognitive status and most markedly, allelic variants in the Apolipoprotein E (APOE) gene (Hostage et al. (2014)). In this paper, we examine if any other genes are also implicated in modulating the rate of brain atrophy along with examining effects of the low dimensional covariate on the rate of atrophy using the data, collected by Alzheimer’s Disease Neuroimaging Initiative (ADNI).

Aging has a significant effect on cerebral atrophy (Nagata et al., 1987). There are some research works on the association of brain atrophy with Alzheimer’s disease (Devanand et al., 2007; Kopelman, 1989; Sabuncu et al., 2011). McDonald et al. (2009) studied longitudinal magnetic resonance imaging (MRI) data to evaluate the effect of age and Alzheimer’s disease on the brain atrophy. Apart from age, genetic variations also have significant effect on brain atrophy (Schott et al., 2016; Gregory et al., 2006; Hostage et al., 2014).

The data, we are using here, have not been extracted directly from the magnetic resonance (MR) images by us. We collect this data directly from ADNI. We have volumetric measurements over six visits of thirteen disjoint brain regions and a total brain measure which is a summary measure of total brain parenchyma, including the Cerebral-Cortex, Cerebellum-Cortex, Thalamus-Proper, CaudatePutamen, Pallidum, Hippocampus, Amygdala, Accumbens-area, VentralDC, Cerebral-White-Matter, Cerebellum-White-Matter, and WM-hypointensities. These visits are roughly around six months apart. Thus the subjects are scanned roughly for three years. The anatomic structures of the brain also differ across different individuals and are assumed to be dependent on subject-specific covariates like genetic variations, gender, age etc. In this paper, we propose a model to study the effects of these different covariates along with Alzheimer’s disease state on brain atrophy in different brain regions. This analysis represents a technical challenge because the genomic data is high dimensional and needs to be incorporated in a model for longitudinal progression of brain volumes measured in multiple parts of the brain in a non-parametric setup. A schematic of the regions, we studied here, are depicted in the Figure 1. This image is obtained from Ahveninen et al. (2012). Although our analysis does not have any association with this paper, an image from it is used to show the brain regions used for the analysis in our paper. We have not collected the data directly from the magnetic resonance (MR) images according to some brain parcellation atlas. We got the data in a preprocessed form from ADNI itself. (Hostage et al., 2014) had also used a similar set of regions of interest.

Figure 1: Anatomic parcellation of cortical surface from different angles showing brain regions used for analysis.

We consider two separate sets of unknown functions of covariates and to model the volumetric measurements of the first visit and rates of changes for different regions. These functions have two inputs. The first consists of high dimensional single-nucleotide polymorphism (SNP) of each individual, and the other consists of low-dimensional covariates like gender, age, disease state, APOE gene status of each individual. These functions {} for the volumetric measurements of the first visit and {} for the rate of change in the th region and the coefficients and

are unit vectors of appropriate dimensions. A finite random series prior is put on the functions based on tensor products of B-splines with appropriate prior distribution on the coefficients. To take care of the high dimensionality of

, the coefficient is assumed to be sparse. We reparametrize in polar coordinates to incorporate sparsity in its prior. We incorporate the effect of time in the modeling by an increasing function which is estimated nonparametrically also using a finite random series of on B-splines with appropriate prior on the coefficients.

Apart from proposing a sophisticated model for studying brain atrophy, the proposed method develops new estimation scheme for a general high dimensional single index model. Estimation for high dimensional single index model is addressed in Zhu and Zhu (2009), Yu and Ruppert (2002), Wang et al. (2012), Peng and Huang (2011), Radchenko (2015) and Luo and Ghosal (2016). All of them used the -penalty and used optimization techniques to get the estimates. In a Bayesian framework, Antoniadis et al. (2004) used the Fisher–von Mises prior to the directional vector. This cannot be easily modified for a high dimensional covariate as then we shall need a prior which favors many zeros in the unit vector. Another paper addressing sparse Bayesian single index model estimation is Alquier and Biau (2013). Even though their method is theoretically attractive, it is difficult to implement for high dimensional covariate due to its high computational complexity. Wang (2009)

developed a Bayesian method for sparse single index model using reversible jump Markov chain Monte Carlo (MCMC) technique which is computationally expensive, especially in the high dimensional situation. We introduce a new way of incorporating sparsity on a unit vector by a spike and slab prior to the polar form. The computation scheme is based on a stochastic search variable selection technique. To make our method more accessible to users, we provide an R package for estimation in single index model with inputs both in high and low dimensional setup.

The rest of the paper is organized in the following manner. The next section discusses the dataset and modeling in more detail. In Section 3, we describe the prior on different parameters of the model. Section 4 describes posterior computation in this setup. We study the posterior rate of contraction in the model in Section 5 under the asymptotic regime that the number of individuals goes to infinity but the number of time points where measurements are taken and the number of regions is fixed. We present a simulation study comparing the proposed Bayesian procedure with its linear counterpart in Section 6. The concentration of the posterior justifies the use of the proposed Bayesian procedure from a frequentist perspective. In Section 7, we present conclusions from the ADNI data on brain atrophy described above using our proposed method. Section 8, concludes the paper with some further remarks.

2 Data description and modeling

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to predict the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD). In the ADNI dataset, the grey matter part of the brain is divided into thirteen disjoint regions. The volume of these regions and the summary measure of the whole brain are recorded over time for individuals. The number of visits is not uniform across individuals but varies between 1 to 6. The volume data of regions which include thirteen brain regions and the summary measure of the whole brain over many time points for the th individual, for is collected where .

In ADNI, the subjects were genotyped using the Human 610-Quad BeadChip (Illumina, Inc., San Diego, CA), yielding a set of 620,901 SNP and copy number variation (CNV) markers. The APOE gene has been the most significant gene in GWAS of Alzheimer’s disease. The corresponding SNPs, rs429358 and rs7412, are not on the Human 610-Quad Bead-Chip. At the time of participant enrollment, APOE genotyping was performed and included in the ADNI database. The two SNPs (rs429358, rs7412) define the epsilon 2, 3, and 4 alleles and therefore were not genotyped using DNA extracted by Cogenics from a 3 mL aliquot of EDTA blood. These alleles are considered separately in the study.

Thus apart from the volumetric measurements, we also have high dimensional SNP data and data on some other covariates for each individual. The other covariates are gender, disease state, age and allele 2 and 4 of the APOE gene. Except for the covariate age, all other low-dimensional covariates are categorical. To represent the categories, we use binary dummy variables. Since the disease status has three states— NC (no cognitive impairment), MCI (mild cognitive impairment) and AD (Alzheimer’s disease), we consider two dummy variables

and respectively standing for the onset of MCI and AD, setting NC at the baseline. Similarly, the dummy variable indicating male gender is introduced setting females at the baseline. Also, we introduce , standing for Alleles 2 and 4 for the two alleles APOEallele2 and APOEallele4 together setting Allele 3 as a baseline for each of the two cases. We consider the age corresponding to the initial visit as a covariate as well. Let stand for the whole vector of covariates. The continuous variable, Age, is standardized.

With time, different brain regions change differently. We study the effects of different attributes to these changes. For every individual, the volume of a brain region on a particular visit should primarily depend on the volume of that region at the zeroth visit and the rate of change of volume for that region with time. Neither of these is uniform across individuals or regions. Hence, it is logical to consider the baseline volume and the rate of change as functions on the subject and brain region. As the geometry of brain structure is complicated, we do not assume a form of standard spatial dependence between measurements across brain regions. Thus we need two sets {} of functions for modeling volume at the initial visit and the rate of change for the th region. These functions are unknown and are modeled nonparametrically. For nonparametric regression problems, single-index models provide a lot of flexibility in estimation and interpretation of the results. Hence we adopt the bivariate single-index model with two inputs for high-dimensional and low-dimensional covariates separately for easy interpretation and computational efficiency. The effect of time is captured through an unknown increasing function , which is bounded in . This is also modeled nonparametrically.

Let is the volume of the th brain region for the th individual at the th time point in the logarithmic scale, is high-dimensional SNP expression of length for the th individual, , and . Then the data generating process can be represented through the following specification


where are all unknown continuous, monotone increasing functions from to and ,

, and N stands for a normal distribution. For identifiability of the functions along with the parameters

and , we assume that , ; here denotes -norm of a vector. We also normalize the covariates for each individual i.e. and for each such that and are one. This is to ensure that and are bounded between for each . For nonparametric function estimation, bounded domain is important for uniform approximation using the basis expansion. The biggest challenge for estimation in this model is the high dimensionality of . To identify important SNPs from , we need to perform variable selection. To do that, we propose a sparse estimation scheme for . First we reparametrize the two unit vector and to their respective polar forms which allow us to work with Euclidean spaces. In this setup, only the directions of and are identifiable. Note that and have the same directions. In the polar setup, for , , and where is the polar angle corresponding to the unit vector . Here for and . Similarly, let be the polar angle corresponding to . Then for , and .

3 Prior specification

In the nonparametric Bayesian setup described above, we induce prior distributions on the smooth functions and in (2.1) through basis expansions in tensor products of B-splines and suitable prior for the corresponding coefficients. Given other parameters in this setup, a normal prior distribution on the coefficients of the tensor products of B-splines will lead to conjugacy and faster sampling via Gibbs sampling scheme. An inverse gamma prior on is an obvious choice due to conjugacy and faster sampling. We also put a B-spline series prior on the smooth increasing function of time . The coefficients for this function would be increasing in the index of the basis functions and lie in (0,1]. To put a prior on an increasing sequence, we introduce a set of latent variables of size equal to one less than the number of B-spline coefficients. Then the B-spline coefficients would be normalized cumulative sum of those latent variables. Other two parameters and are reparametrized into their polar coordinate system. The parameter space of the polar angles will be a hyper-rectangle. It will be easier to put prior on the polar angles. To estimate using the sparsity of , we need to carefully put a shrinkage prior to the polar angles. A polar angle of will ensure that the corresponding coordinate in the unit vector equals to zero. When there is sparsity in the unit vector, most of the polar angles will be . Thus a spike and slab prior to polar angle with a spike of should be able to capture sparsity in the corresponding unit vector. The last polar angle has spike both at 0 and , due to the special structure of the last and the penultimate co-ordinates of a unit vector in the polar form. Since only the directions of and are identifiable, the intercept and slope functions are modeled as even functions i.e. symmetric around zero.

Now we describe the prior in details. Let denote the lowest integer greater than or equal to .

  • Intercept and slope functions: Let , , with , for , and , , for some chosen . Further,

    is given a prior with probability mass function of the form

    , with and .

  • The function of time: Let , with . We put a prior on by reparameterizing as , and putting the prior for

    , where Un stands for the uniform distribution. Further,

    is given a prior with probability mass function of the form with and .

  • Error variance: We put

    , where Ga stands for the gamma distribution.

  • Polar angles of : We let , , and , independently.

  • Polar angles of : We put a spike and slab prior on the polar angles that has spike at for the first polar angle and all the multiples of for the last polar angle. Then the spike distribution would look like Figure 2. The spike and slab prior for , , has density given by

    for and the distribution of is given by

    here Be

    denotes the beta distribution with shape parameters

    and , supported within the interval , and .

The spike distribution on looks like Figure 2. The first plot is for the first angles and the second plot is for the last polar angle.

Note that either geometric or Poisson distribution (respectively

and ) can be chosen as prior on , the number of terms to be used in the B-spline series for the growth function . For , the square, which is the number of terms in the tensor product series representation, can have a geometric or Poisson-like tail.

Model selection

: Polar angles with a posterior probability of selection in the model more than 0.5 are considered in the model.

Figure 2: Spike distribution with and .

4 Computation

Introduce a latent variable for the indicator of the Un(0,1) component of the distribution of , . Now the conditional log-likelihood is given by


involves only the hyperparameters

and the observations but not parameter of the model.

All the B-spline coefficients and are updated using the conjugacy structure. All the posterior updates are discussed in the supplementary materials.

In our computation of the model, we are not using the priors on and i.e. the number of B-spline basis functions as it will require reversible jump MCMC strategy which is computationally expensive. Given the data, we choose these values by following a particular strategy and are kept fixed in the MCMC scheme. These are chosen by minimizing the Akaike Information Criterion (AIC) after fitting the model over a grid of a number of B-spline basis functions for randomly generated 10 different choices of and . We generate 10 different choices for and and then fit the non-linear model for different choices of a number of B-spline basis functions. After taking an average overall 10 AIC values for each case, we take the number as optimal that has the least AIC value.

5 Large-sample Properties

In this section, we examine large sample properties of the proposed Bayesian procedure for the model (2.1). We have observations for fixed number of regions and many time points. We show posterior consistency in the asymptotic regime of increasing sample size and increasing dimension of SNPs.

We study posterior contraction rate with respect to the empirical -distance on the regression function, which is defined as follows. For two sets of parameters and , the empirical -distance is given by

and , .

Since is a high-dimensional parameter, sensible estimation is possible only if it has sparsity, which must be picked up by the prior. In the setting of a spike-and-slab prior for polar coordinates described in Section 3, we need to ensure sufficient concentration near by choosing a large value of the second parameter in the beta spike distribution (depending on the sample size and the dimension ) and a small value of the probability of slab . More precisely, we choose fixed, and . Then the contraction rate will be determined by the smoothnesses of the underlying true functions and , the sparsity of true high dimensional regression coefficient and mildly on the parameter in the prior distribution for the number of basis elements used in the B-spline bases, as shown by the following result.

Theorem 1.

Assume that the true function belongs to the Hölder class of regularity level on and the true coefficient functions are Hölder smooth function of regularity level on . Let the true regression coefficient for the high dimensional covariate have non-zero co-ordinates. Then the posterior contraction rate with respect to the distance is

In the above result, since the observation points are not dense over the domain, posterior contraction on the function is based on its distance with the true function only at the observation points.

The proof of the theorem uses the general theory of posterior contraction (see Ghosal and van der Vaart (2017)) for independent non-identically distributed observations and some estimates for finite random series based on B-splines. The proof is given in the supplementary materials part.

6 Simulation

We compare our method with some common penalization methods based on the following simplified linear model


, , . In the above model, is a sparse vector and all other parameters are unpenalized. The performance of these methods is compared based on MSE values on a test set under the scenarios the linear model is correct and the linear model is false. For sample sizes 200, 500 and 1000, we gather the mean squared error (MSE) values corresponding to those models both for well-specified and misspecified cases. We use half of the sample for training and the remaining half for testing. Among other parameters, we consider thirteen regions in total, five time points and vary the value of as 5000, 10000 and 20000. We include one case for ultra high dimension of and sample size . We set and tune for different cases to ensure a good acceptance rate and desired model size (sum of ’s) across MCMC samples. The results are summarized below for replications and post burn-in samples. The number of basis functions for spline is different across different sample sizes. For the ultra-high dimensional case with and , we consider 10 replications.

Data generation for the non-linear case:

  • Generate a data matrix

    with elements coming from Bernoulli distribution with success probability of the

    th row as .

  • Generate , , from the standard uniform distribution.

  • Generate all the elements of the matrix from N.

  • Generate the sparse vector with 5% elements non-zero. Positions for non-zero elements are chosen first at random by sampling elements from total positions, where is the length of . The non zero elements are generated from mixture distribution of two normals N and N.

  • Set the value of to .

  • Normalize each row of and along with are to the unit norm.

We let the true functions be

, and . After generating the true functional values, the data is generated from N.

Data generation for the linear case:

In this case, the true model is the one given in (6.1). All the steps for generating , , and are similar as above. The coefficients and , , are generated from N. After generating the design matrix, the data is generated from N

For the data generation, we set the error standard deviation


We compare the prediction MSEs across all the methods. We split the whole data into two equal parts for training and testing. We compare the performance of our method with LASSO (Tibshirani (1996)) and SCAD (Fan and Li (2001)) on testing dataset based estimates of parameters from the training dataset. We use the R package glmnet for LASSO and ncvreg for SCAD. To fit our model we vary the the number of B-spline basis function as 8 for , 11 for , and 14 for . These numbers are chosen according to the strategy, described in the Section 4.

Total Dimension SIM LASSO SCAD
sample size of () MSE MSE MSE
200 5000 3.43 11.93 21.32
500 5000 3.53 12.11 20.03
1000 5000 3.30 11.36 22.08
200 10000 3.52 11.32 22.16
500 10000 3.09 11.69 22.80
1000 10000 3.12 12.01 22.13
200 20000 4.42 13.88 22.75
500 20000 3.51 11.54 23.20
1000 20000 3.25 12.12 23.03
Table 1: For non-linear case
Total Dimension SIM LASSO SCAD
sample size of () MSE MSE MSE
200 5000 1.87 1.00 1.01
500 5000 1.28 1.01 1.02
1000 5000 2.20 1.00 1.02
200 10000 1.81 1.02 1.01
500 10000 1.58 1.01 1.02
1000 10000 2.11 1.01 1.02
200 20000 1.24 1.03 1.01
500 20000 1.44 1.02 1.01
1000 20000 1.55 1.01 1.02
Table 2: For linear case

From Table 1, we infer that the performance of the proposed Bayesian method based on the high dimensional single index model is always much better than the LASSO and the SCAD for the non-linear case. For the linear case in Table 2, it is competitive with linearity based methods like the LASSO or the SCAD. This is natural as the LASSO or the SCAD use more precise modeling information which the semiparametric methods cannot use.

7 Real-data analysis

7.1 Modification of the model for real data application

7.1.1 Incorporating random effect and region wise varying effect

As the data are longitudinal time series, it is reasonable to add subject specific random effect () in the model and vary the effect of the low dimensional covariates region-wise. The new modified model will then become


with , , .

Prior on the random effects

We put a Dirichlet process scale mixture of normal prior on the random effect distribution.

7.1.2 Region-wise varying effect with no SNP

To compare the nonlinear model with the linear model of Hostage et al. (2014), we also fit the following model without the SNPs,


with , , .

7.1.3 Corresponding Linear Model

We compare the performance of our above non-linear models with following linear model,

where with , , , and


We have volumetric measurement data for the total thirteen brain regions along with the summary measure of the whole brain over time for 748 individuals. For each individual, we have covariate information which is summarized in Table 3. The standard deviations for age in each group are mentioned in the bracket. The baseline subject for our analysis is a female individual with average age and no cognitive impairment.

Female Male
No cognitive impairment (NC) 99 114
Alzheimer’s disease (AD) 84 94
Mild cognitive impairment (MCI) 122 235
APOEgene allele2 29 29
APOEgene allele4 150 223
Average age 73.51 74.60
(6.67) (6.80)
Table 3: Demographic table

We first fit the model in  (7.2) and the following linear model in (7.3) in accordance with Hostage et al. (2014) with same set of covariates and interactions between APOE and disease states. Then we compare the prediction MSE. Prediction error gives us the predictive performance and fitted relative MSE helps to judge the reliability of inference. We consider 17 basis B-spline functions for univariate and basis functions for bivariate cases.

Linear model gives the prediction error 3.83 whereas that in our non-linear model hugely improves to 0.07. The model in (7.1) with SNPs betters the prediction error to 0.06 which is around 14% improvement. For the prediction error, we divide the whole dataset into training (Tr) and testing (Te). We use stratified sampling using each subject-region pair as stratum so that training will have all the individuals that belong to the testing set. This is important for prediction with a random effect in the model. The formula for prediction error will be ; here denotes total number of elements in test set Te.

After selecting the significant genes, we calculate the Bayesian information criterion (BIC) of models leaving out one of the low dimensional covariates every time with all the genes to compare the significance. The table below gives an ordered list of the significance of low dimensional covariates for different regions in Tables 4 to 6 (broken down into three tables). In Table 7, we show the estimates from the linear model in (7.3) for the whole brain.

Total Brain Ventricles Left Hippocampus Right Hippocampus LINFLATVEN
Age MCI APOE-allele4 MCI APOE-allele4
APOE-allele4 APOE-allele4 Gender APOE-allele4 Gender
APOE-allele2 Gender MCI APOE-allele2 AD
Gender AD APOE-allele2 Gender MCI
MCI APOE-allele2 AD AD APOE-allele2
AD Age Age Age Age
Table 4: Order of significance of low dimensional covariates of the model in 7.2 with selected genes for the first set of regions.
RINFLATVEN Left Medial Right Medial Left Inferior Right Inferior
Temporal Temporal Temporal Temporal
APOE-allele4 MCI Gender APOE-allele4 APOE-allele4
AD Age APOE-allele4 APOE-allele2 MCI
MCI APOE-allele4 APOE-allele2 MCI APOE-allele2
Gender APOE-allele2 Age Age AD
APOE-allele2 Gender MCI Gender Age
Age AD AD AD Gender
Table 5: Order of significance of low dimensional covariates of the model in 7.2 with selected genes for the second set of regions.
Left Fusiform Right Fusiform Left Entorhin Right Entorhin
Gender APOE-allele4 APOE-allele4 APOE-allele4
MCI Age Gender Gender
APOE-allele4 APOE-allele2 Age MCI
APOE-allele2 Gender AD AD
AD AD APOE-allele2 APOE-allele2
Table 6: Order of significance of low dimensional covariates of the model in 7.2 with selected genes for the third set of regions.

We map the significant SNPs from our analysis to corresponding genes using the R package rsnps. We tune the parameter in the model to select 20 most significant SNPs. Among those, we identify 11 of those. The significant genes from our analysis are mentioned in Section 8 along with some previous studies that found the corresponding gene significant for the AD and/or cerebral atrophy.

Value Std.Error DF t-value p-value
time 0.013 0.001 2155 13.208 0.000
APOEallele4:time 0.000 0.001 2155 0.047 0.962
APOEallele2:time 0.001 0.002 2155 0.249 0.804
Gender:time 0.003 0.001 2155 3.077 0.002
MCI:time 0.006 0.001 2155 4.811 0.000
AD:time 0.014 0.002 2155 6.969 0.000
Age:time 0.002 0.000 2155 4.663 0.000
APOEallele4:MCI:time 0.004 0.002 2155 2.689 0.007
APOEallele4:AD:time 0.004 0.002 2155 2.105 0.035
APOEallele2:MCI:time 0.001 0.003 2155 0.292 0.770
APOEallele2:AD:time 0.003 0.006 2155 0.530 0.596
Table 7: Estimates of covariates for Total Brain for slope from linear model.

8 Conclusions and discussion

We fit a bivariate single index model to capture the volumetric change of different cortical regions in the human brain. There are both high and low-dimensional covariates as input to the unknown functions determining initial configuration and rate of change of different regions. To tackle the high dimensional covariate within a single index model, we provide a new technique to assign sparse prior in this paper. Posterior consistency results are also established. An ‘R’ package is attached with this paper as a supplementary material (https://github.com/royarkaprava/SIMBayes). This can be used to fit high and low dimensional single index models. This package can fit a single index model with only one input or two inputs with at most one in shrinkage consideration. The function betaupdate of this package is used to generate MCMC samples for polar angles of and of our model.

In our results on the real dataset, we find that allele 4 of APOE gene is always among the top three significant covariates for almost all the cases in Table 4 to 6. The fact that allele 4 of the APOE gene is significant was established in Hostage et al. (2014). They used a linear model, similar to the model in (7.3). But in our estimates of parameters for the linear case in Table 7, APOE allele 4 is not significant. Such is the case for several other regions in the linear case. Since allele 4 is not found to be significant here in the linear model, for this dataset linear model is not suitable. We identify 11 significant genes. There are some previous studies that also noted the significant genes from our analysis as possible candidates for the AD and/or cerebral atrophy. The genes along with associated future study citing that gene in connection with AD and/or cerebral atrophy are mentioned here SLC6A1 (Carvill et al., 2015), KCNIP4 (Himes et al., 2013), ADGRL3 (Orsini et al., 2016), SORBS2 (Zhang et al., 2016; Lee et al., 2014; Niceta et al., 2015), LPAR3 (Yung et al., 2015), SHROOM3 (Dickson et al., 2015; Freudenberg-Hua et al., 2016), SORCS3 (Breiderhoff et al., 2013; Lane et al., 2012), NPY2R (Lin et al., 2010; Schriemer et al., 2016), CWF19L2 (Lin et al., 2013), PALLD (Nho et al., 2015) and KCNMA1 (Burns et al., 2011; Tabarki et al., 2016)

9 Acknowledgments

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer’s Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

The first author would like to thank Mr. Kushal Kumar Dey for helping him how to build an ‘R’ package.

We are grateful to the anonymous reviewer and the editor for their valuable comments that have greatly helped improve the manuscript.


  • Ahveninen et al. (2012) Ahveninen, J., Jääskeläinen, I. P., Belliveau, J. W., Hämäläinen, M., Lin, F., and Raij, T. (2012). “Dissociable influences of auditory object vs. spatial attention on visual system oscillatory activity.” Public Library of Science One.
  • Alquier and Biau (2013) Alquier, P. and Biau, G. (2013). “Sparse single-index model.”

    Journal of Machine Learning Research

    , 14: 243–280.
  • Antoniadis et al. (2004) Antoniadis, A., Grégoire, G., and McKeague, I. W. (2004). “Bayesian estimation In single-index models.” Statistica Sinica, 14: 1147–1164.
  • Breiderhoff et al. (2013) Breiderhoff, T., Christiansen, G. B., Pallesen, L. T., Vaegter, C., Nykjaer, A., Holm, M. M., Glerup, S., and Willnow, T. E. (2013). “Sortilin-related receptor SORCS3 is a postsynaptic modulator of synaptic depression and fear extinction.” PLoS One, 8(9): e75006.
  • Burke (2014) Burke, J. V. (2014). https://sites.math.washington.edu/ burke/crs/408/lectures/L3-Multivariable-Calc-Review.pdf.”
  • Burns et al. (2011) Burns, L., Minster, R., Demirci, F., Barmada, M., Ganguli, M., Lopez, O., DeKosky, S., and Kamboh, M. (2011). “Replication study of genome-wide associated SNPs with late-onset Alzheimer’s disease.” American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 156(4): 507–512.
  • Carvill et al. (2015) Carvill, G. L., McMahon, J. M., Schneider, A., Zemel, M., Myers, C. T., Saykally, J., Nguyen, J., Robbiano, A., Zara, F., Specchio, N., et al. (2015). “Mutations in the GABA transporter SLC6A1 cause epilepsy with myoclonic-atonic seizures.” The American Journal of Human Genetics, 96(5): 808–815.
  • Devanand et al. (2007) Devanand, D., Pradhaban, G., Liu, X., Khandji, A., De Santi, S., Segal, S., Rusinek, H., Pelton, G., Honig, L., Mayeux, R., et al. (2007). “Hippocampal and entorhinal atrophy in mild cognitive impairment prediction of Alzheimer disease.” Neurology, 68(11): 828–836.
  • Dickson et al. (2015) Dickson, H. M., Wilbur, A., Reinke, A. A., Young, M. A., and Vojtek, A. B. (2015). “Targeted inhibition of the Shroom3–Rho kinase protein–protein interaction circumvents Nogo66 to promote axon outgrowth.” BMC neuroscience, 16(1): 34.
  • Fan and Li (2001) Fan, J. and Li, R. (2001). “Variable selection via nonconcave penalized likelihood and its Oracle properties.” Journal of the American Statistical Association, 96(1).
  • Freudenberg-Hua et al. (2016) Freudenberg-Hua, Y., Li, W., Abhyankar, A., Vacic, V., Cortes, V., Ben-Avraham, D., Koppel, J., Greenwald, B., Germer, S., Consortium, T.-G., et al. (2016). “Differential burden of rare protein truncating variants in Alzheimer’s disease patients compared to centenarians.” Human molecular genetics, 25(14): 3096–3105.
  • Ghosal and van der Vaart (2017) Ghosal, S. and van der Vaart, A. (2017).

    Fundamentals of Nonparametric Bayesian Inference

    Cambridge University Press, Cambridge.
  • Gregory et al. (2006) Gregory, G. C., Macdonald, V., Schofield, P. R., Kril, J. J., and Halliday, G. M. (2006). “Differences in regional brain atrophy in genetic forms of Alzheimer’s disease.” Neurobiology of aging, 27(3): 387–393.
  • Himes et al. (2013) Himes, B. E., Sheppard, K., Berndt, A., Leme, A. S., Myers, R. A., Gignoux, C. R., Levin, A. M., Gauderman, W. J., Yang, J. J., Mathias, R. A., et al. (2013). “Integration of mouse and human genome-wide association data identifies KCNIP4 as an asthma gene.” PloS one, 8(2): e56179.
  • Hostage et al. (2014) Hostage, C. A., Choudhury, K. R., P.M., D., and J.R., P. (2014). “Mapping the effect of the Apolipoprotein E Genotype on 4-Year Atrophy Rates in an Alzheimer Disease–related Brain Network.” Radiology, 271(1).
  • Jones (2008) Jones, G. L. (2008). http://users.stat.umn.edu/ galin/icsprar.pdf.”
  • Kopelman (1989) Kopelman, M. D. (1989). “Remote and autobiographical memory, temporal context memory and frontal atrophy in Korsakoff and Alzheimer patients.” Neuropsychologia, 27(4): 437–460.
  • Lane et al. (2012) Lane, R. F., St George-Hyslop, P., Hempstead, B. L., Small, S. A., Strittmatter, S. M., and Gandy, S. (2012). “Vps10 family proteins and the retromer complex in aging-related neurodegeneration and diabetes.” Journal of Neuroscience, 32(41): 14080–14086.
  • Lee et al. (2014) Lee, J. H., Cheng, R., Vardarajan, B. N., Lantigua, R. A., Reyes-Dumeyer, D., Ortmann, W., Graham, R., Bhangale, T., Behrens, T., Medrano, M., et al. (2014). “SORBS2, SH3RF3, and NPHP1 modify age at onset in carriers of the G206A mutation in PSEN1 with familial Alzheimer’s disease.” Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, 10(4): P632.
  • Lin et al. (2010) Lin, J., Li, X., Yuan, F., Lin, L., Cook, C. L., Rao, C. V., and Lei, Z. (2010). “Genetic ablation of luteinizing hormone receptor improves the amyloid pathology in a mouse model of Alzheimer disease.” Journal of Neuropathology & Experimental Neurology, 69(3): 253–261.
  • Lin et al. (2013) Lin, P.-I., Kuo, P.-H., Chen, C.-H., Wu, J.-Y., Gau, S. S., Wu, Y.-Y., and Liu, S.-K. (2013). “Runs of homozygosity associated with speech delay in autism in a taiwanese han population: evidence for the recessive model.” PLoS One, 8(8): e72056.
  • Luo and Ghosal (2016) Luo, S. and Ghosal, S. (2016). “Forward selection and estimation in high dimensional single index models.” Stat Methodology, 33: 172–179.
  • McDonald et al. (2009) McDonald, C., McEvoy, L., Gharapetian, L., Fennema-Notestine, C., Hagler, D., Holland, D., Koyama, A., Brewer, J., Dale, A., Initiative, A. D. N., et al. (2009). “Regional rates of neocortical atrophy from normal aging to early Alzheimer disease.” Neurology, 73(6): 457–465.
  • Nagata et al. (1987) Nagata, K., Basugi, N., Fukushima, T., Tango, T., Suzuki, I., Kaminuma, T., and Kurashina, S. (1987). “A quantitative study of physiological cerebral atrophy with aging.” Neuroradiology, 29(4): 327–332.
  • Nho et al. (2015) Nho, K., Kim, S., Risacher, S. L., Ramanan, V. K., Shen, L., Foroud, T. M., Gibbons, L. E., Crane, P. K., Weiner, M. W., Green, R. C., et al. (2015). “Genome-wide rare variant analysis identifies candidate genes significantly associated with composite scores for memory.” Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, 11(7): P251–P252.
  • Niceta et al. (2015) Niceta, M., Stellacci, E., Gripp, K. W., Zampino, G., Kousi, M., Anselmi, M., Traversa, A., Ciolfi, A., Stabley, D., Bruselles, A., et al. (2015). “Mutations impairing GSK3-mediated MAF phosphorylation cause cataract, deafness, intellectual disability, seizures, and a Down syndrome-like facies.” The American Journal of Human Genetics, 96(5): 816–825.
  • Orsini et al. (2016) Orsini, C. A., Setlow, B., DeJesus, M., Galaviz, S., Loesch, K., Ioerger, T., and Wallis, D. (2016). “Behavioral and transcriptomic profiling of mice null for Lphn3, a gene implicated in ADHD and addiction.” Molecular genetics & genomic medicine, 4(3): 322–343.
  • Peng and Huang (2011) Peng, H. and Huang, T. (2011). “Penalized least squares for single index models.” J. Statist. Plann. Inference, 141: 1362–1379.
  • Petrella (2013) Petrella, J. (2013). “Neuroimaging and the search for a cure for Alzheimer disease.” Radiology, 269: 671–691.
  • Radchenko (2015) Radchenko, P. (2015). “High dimensional single index models.”

    Journal of Multivariate Analysis

    , 139: 266–282.
  • Sabuncu et al. (2011) Sabuncu, M. R., Desikan, R. S., Sepulcre, J., Yeo, B. T. T., Liu, H., Schmansky, N. J., Reuter, M., Weiner, M. W., Buckner, R. L., Sperling, R. A., et al. (2011). “The dynamics of cortical and hippocampal atrophy in Alzheimer disease.” Archives of neurology, 68(8): 1040–1048.
  • Schott et al. (2016) Schott, J. M., Crutch, S. J., Carrasquillo, M. M., Uphill, J., Shakespeare, T. J., Ryan, N. S., Yong, K. X., Lehmann, M., Ertekin-Taner, N., Graff-Radford, N. R., et al. (2016). “Genetic risk factors for the posterior cortical atrophy variant of Alzheimer’s disease.” Alzheimer’s & dementia: the journal of the Alzheimer’s Association, 12(8): 862–871.
  • Schriemer et al. (2016) Schriemer, D., Sribudiani, Y., IJpma, A., Natarajan, D., MacKenzie, K. C., Metzger, M., Binder, E., Burns, A. J., Thapar, N., Hofstra, R. M., et al. (2016). “Regulators of gene expression in Enteric Neural Crest Cells are putative Hirschsprung disease genes.” Developmental biology, 416(1): 255–265.
  • Tabarki et al. (2016) Tabarki, B., AlMajhad, N., AlHashem, A., Shaheen, R., and Alkuraya, F. S. (2016). “Homozygous KCNMA1 mutation as a cause of cerebellar atrophy, developmental delay and seizures.” Human genetics, 135(11): 1295–1298.
  • Thompson et al. (2003) Thompson, P., Hayashi, K., and deZubicaray G. (2003). “Dynamics of gray matter loss in Alzheimer’s disease.” Journal of Neuroscience, 23: 994–1005.
  • Tibshirani (1996) Tibshirani, R. (1996). “Regression shrinkage and selection via the Lasso.” Journal of the Royal Statistical Society B, 58: 267–288.
  • Wang (2009) Wang, H. (2009). “Bayesian estimation and variable selection for single index models.” Computational Statistics and Data Analysis, 53: 2617–2627.
  • Wang et al. (2012) Wang, T., Xu, P., and Zhu, L. (2012). “Non-convex penalized estimation in high-dimensional models with single-index structure.” The Journal of Multivariate Analysis, 109: 221–235.
  • Yu and Ruppert (2002) Yu, Y. and Ruppert, D. (2002). “Penalized spline estimation for partially linear single index models.” J. Amer. Statist. Assoc., 97: 1042–1054.
  • Yung et al. (2015) Yung, Y. C., Stoddard, N. C., Mirendil, H., and Chun, J. (2015). “Lysophosphatidic acid signaling in the nervous system.” Neuron, 85(4): 669–682.
  • Zhang et al. (2016) Zhang, Q., Gao, X., Li, C., Feliciano, C., Wang, D., Zhou, D., Mei, Y., Monteiro, P., Anand, M., Itohara, S., et al. (2016). “Impaired dendritic development and memory in Sorbs2 knock-out mice.” Journal of Neuroscience, 36(7): 2247–2260.
  • Zhu and Zhu (2009) Zhu, L. and Zhu, L. (2009). “Nonconcave penalized inverse regression in single-index models with high dimensional predictors.” The Journal of Multivariate Analysis, 100(5): 862–875.