I Introduction
Alzheimer’s disease (AD) is one of the most common neurodegenerative diseases among the elderly [1]. It is reported that the number of people living with AD worldwide will rise drastically from 50 million in 2018 to 152 million in 2050 [2]. AD has become a major social problem that endangers public health and hinders economic development [3]. However, the exact cause of the disease remains unclear, and no treatments or effective drugs have been reported to cure AD [4, 5, 6]. Therefore, much attention is paid to the early diagnosis of AD [7, 8, 9, 10, 11, 12], so that timely intervention can be taken to slow down its progression.
Neuroscience studies [13, 14, 15] on the brain have shown characteristic changes in brain morphology and in structural or functional connections at the early stage of AD. Brain networks are well suited to describing these characteristics using different imaging modalities, such as T1-weighted magnetic resonance imaging (MRI), positron emission tomography (PET), resting-state functional magnetic resonance imaging (fMRI), and diffusion tensor imaging (DTI). Moreover, a brain network can be appropriately characterized by graph theory, where the nodes represent spatially distributed regions of interest (ROIs) and the edges represent the links between ROIs or subjects
[16]. Previous works based on graph convolutional networks (GCNs) [17] extracted features for each ROI or subject and then built a classifier for AD diagnosis using either unimodal or bimodal imaging data. They achieve good classification performance and can analyze the pathological brain regions associated with cognitive disease. However, those works neglect the underlying interactions among multiple disease-related ROIs in the brain. Besides, hypergraph-based methods [18] have been used for cognitive disease analysis to improve classification performance and produce discriminative connections by considering high-order relations among multiple ROIs from unimodal imaging, or among multiple subjects from multimodal imaging. The features extracted by these methods still neglect the high-order relations of multiple ROIs both within and between multimodal images, which are essential for cognitive disease analysis.

Recently, multimodal image fusion has attracted much attention in disease diagnosis, because it can provide complementary information about the disease and thus improve detection performance [19, 20, 21, 22]
. Besides, machine learning is widely used in medical image processing [23, 24, 25, 26]. Generative adversarial networks (GANs) based on variational inference [27, 28, 29, 30] have strong generalization and distribution-fitting abilities in image analysis [31, 32, 33] and graph representation learning [34], and their robustness has been further enhanced by incorporating a normal distribution [35]. Considering that the neurological activities associated with cognition and behavior result from the joint mutual interaction among multiple brain regions [36, 37], a hypergraph is more appropriate for characterizing the high-order relations among multiple ROIs.

Inspired by the above observations, in this paper we propose a novel prior guided adversarial representation learning and hypergraph perceptual network (PGARL-HPN) for predicting abnormal brain connections at different cognitive disease stages. The proposed model can automatically learn robust representations and produce discriminative united connectivity representing the overall brain network of multimodal images. Based on anatomical knowledge, specific disease-related ROIs are applied to the graph data to estimate a prior distribution. A bidirectional adversarial mechanism is introduced to stabilize the representation learning and speed up convergence for multimodal representation learning by incorporating the estimated prior distribution. Paired samples from the data space and the representation space are sent to the designed pairwise collaborative discriminator to preserve sample consistency and distribution consistency. Also, the reconstruction and classification modules are utilized to make the representations class-discriminative and robust. Besides, a hypergraph-based network is employed to generate hyperedges for each imaging modality and then produce a fused representation by perceptual convolution, which captures the high-order relations among multiple ROIs within single-modality imaging and bridges the high-order relations between multimodal images. As a result, united connectivity-based features are extracted to retain disease-related complementary information and improve abnormal connection prediction performance. The main contributions of this framework are as follows:
A Prior Guided Adversarial Representation Learning (PGARL) module is designed to learn latent representations from multimodal images. It uses the estimated prior distribution to guide the bidirectional adversarial network in an optimal manner, thus stabilizing the representation learning and speeding up convergence.

The Pairwise Collaborative Discriminator (PCD) is introduced to align the marginal and joint distributions of the input data and the latent representations, which minimizes the difference between the representation distributions of the multimodal images.

The Hypergraph Perceptual Network (HPN) is developed to establish high-order relations between and within multimodal images, improving the fusion of morphology-structure-function information and enhancing the discrimination of the united connectivity-based features.
The remaining parts of this paper are organized as follows. Section II presents the related work. Section III describes the details of the proposed method. Section IV first introduces the experimental settings and competing methods, and then presents results on the public database. Section V discusses the reliability of our results and the limitations of the current study. Section VI concludes this paper and outlines future work.
II Related Work
Current research on AD diagnosis using the GCN approach can be divided into two categories: the group-based approach and the individual-based approach. The first approach constructs one graph for all the subjects by treating every subject as one node. For instance, Parisot et al. [38] extracted node features by applying convolutional neural networks (CNNs) to the MRI of each subject and built node connections using non-image data (e.g., sex, age); the constructed graph is then utilized to refine node features for AD diagnosis through semi-supervised learning. In order to make use of complementary information from multimodal images, Yu et al. [39] presented a multi-scale enhanced GCN combining fMRI, DTI, and non-image data for AD study. Another work [40] established a novel framework that adds extra label information to build a graph, which effectively improves the performance of Alzheimer’s disease prediction. In the second approach, each subject is modeled as one graph over ROIs predefined by a specific brain atlas. Yu et al. [41] constructed a brain network model with weighted graph regularized sparse constraints, which noticeably improves accuracy in Mild Cognitive Impairment (MCI) classification. Also, adding more imaging modalities can improve disease diagnosis accuracy. Lei et al. [42] built a self-calibrated brain network combining fMRI and DTI to learn functional and structural complementary features. Xing et al. [43] trained a GCN model to study the tissue-functional complementary characteristics of brain networks using MRI and fMRI. However, while the existing data-driven methods accomplish the task of sample classification and prediction, they may ignore accurate evaluation of the varied characteristics of brain networks and lack biological explanations.

In the field of cognitive disease analysis, hypergraph-based methods can be separated into two groups: the unimodal approach and the multimodal approach. In the first approach, researchers explore high-order relations among brain regions by constructing a hypergraph from single-modality imaging. For example, Jie et al. [44] utilized fMRI to generate hyperconnectivity features, which significantly improve the performance of MCI diagnosis and further discover valuable biomarkers. The work in [45] improved the construction of the hypergraph from fMRI by applying sparse constraints and hyperedge weighting, providing additional information about the underlying biomarkers for cognitive study.

In the second approach, multimodal images are jointly used to construct multiple hypergraphs for modeling high-order relationships among subjects. In order to overcome the problem of modal incompleteness, Liu et al. [46] presented a hypergraph-based method to bridge relationships among subjects from MRI, PET, and cerebrospinal fluid (CSF) for automatic brain disease diagnosis. Zhu et al. [47] adopted an iterative scheme in hypergraph learning using MRI and PET to identify the diagnostic labels and predict clinical scores of degenerative disease. Li et al. [48] built multimodal hypernetworks from two modalities of fMRI, achieving good performance and generating discriminative connections between MCI and normal control (NC) subjects. Nevertheless, the primary deficiency of the above methods is that they may not consider the high-order relations both within and between multimodal images, and thus do not fully explore the potential complementary information in the data.
III Method
III-A Overview
Assuming that subjects have three modal images (i.e., fMRI, DTI, and MRI), our goal is to learn a complex nonlinear mapping network that fuses multimodal images to predict abnormal brain connections at different stages of Alzheimer’s disease. As illustrated in Fig. 1, our method consists of three components: 1) prior distribution estimation, 2) adversarial representation learning, and 3) the hypergraph perceptual network. First, a kernel density estimation (KDE) method is introduced to estimate the prior distribution from the graph data in terms of subject labels. Second, the estimated distribution is incorporated into a bidirectional adversarial learning network for multimodal representation learning. It should be stressed that the designed pairwise collaborative discriminator is used to constrain the representations in a joint embedding space. Then, a hypergraph-based network is developed to fuse the learned representations and produce united connectivity-based features. Finally, our network is trained with the following objective functions: the adversarial loss, the reconstruction loss, the classification loss, and the sparse regularization loss. The goal of these loss functions is to improve the representation learning ability and the effect of multimodal image fusion. Details of the architecture and the hybrid objective functions are described in Sections III-B and III-C.

III-B Architectures
III-B1 Prior Distribution Estimation
Suppose an undirected graph is formed over the brain ROIs, consisting of a set of nodes and a set of edges. Specifically, the node features are derived from fMRI, and the edge connectivity is computed from DTI. An element of the adjacency matrix is set to 1 if a connection exists between the corresponding pair of regions, and to 0 otherwise.
A normal distribution cannot properly represent the distribution of graph data, and an appropriate prior can help the adversarial network boost its representation learning ability. Since no prior information is available other than the given graph data, it is possible to estimate a prior distribution from the graph data itself. The heterogeneity between the node features and the edge connectivity makes it difficult to obtain the prior directly, but the two are structurally consistent. A cross-domain prototype method can eliminate the gap between the node domain and the edge domain. Thus the prior can be estimated over a subset of prototypes selected by a prototype index set.
A Determinantal Point Process (DPP) [49] based prototype learning method is adopted to select a diversified prototype subset. Specifically, based on certain ROIs that have been verified to be closely related to the disease, a set of nodes is chosen from the connectivity matrix. Given the subset size, the sampling probability of the subset is written as:

(1)

where det(·) denotes the determinant of a square matrix. Hence, maximizing this probability outputs a subset of nodes that reflects the main features of the graph.
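Since exact DPP sampling is costly, the diversified subset of Eq. (1) is typically found by greedily maximizing the determinant. The sketch below, assuming a similarity kernel built from inner products of connectivity profiles (our choice for illustration; the paper's kernel is not specified), shows the idea:

```python
import numpy as np

def greedy_dpp_prototypes(A, candidates, k):
    """Greedily pick k prototype nodes from `candidates` so that the
    determinant of the kernel submatrix is approximately maximized,
    in the spirit of DPP MAP inference for Eq. (1)."""
    # Illustrative kernel: inner products of connectivity profiles
    # plus a small ridge to keep the matrix positive definite.
    L = A @ A.T + 1e-6 * np.eye(A.shape[0])
    selected = []
    for _ in range(k):
        best, best_det = None, -np.inf
        for c in candidates:
            if c in selected:
                continue
            idx = selected + [c]
            # Determinant of the principal submatrix for this candidate set.
            det = np.linalg.det(L[np.ix_(idx, idx)])
            if det > best_det:
                best, best_det = c, det
        selected.append(best)
    return selected
```

The greedy step adds, at each iteration, the candidate that most increases the determinant, which trades exactness for a simple and fast approximation.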
According to the prototype index set, the corresponding node features are sampled to form a feature matrix. Then the feature dimension is reduced by Principal Component Analysis (PCA) to the dimension of the latent representation space. A non-parametric estimation method (i.e., KDE) is introduced to estimate the prior distribution. Given a latent representation for each node, the density is defined as

(2)

where a multidimensional Gaussian kernel function is used and the bandwidth determines the smoothness of the estimated prior distribution.

At last, the approximation of the prior distribution is obtained by the following formula:

(3)
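The KDE prior of Eqs. (2)-(3) can be sketched as follows. This is a minimal numpy version assuming isotropic Gaussian kernels over the PCA-reduced prototype representations; sampling from the mixture (pick a prototype uniformly, then perturb it) is how the adversarial phase can draw prior samples:

```python
import numpy as np

def kde_log_density(z, prototypes, h):
    """Eq. (2)-style kernel density estimate at point z, using a
    multidimensional Gaussian kernel with bandwidth h over the
    PCA-reduced prototype representations (rows of `prototypes`)."""
    d = prototypes.shape[1]
    diffs = (z - prototypes) / h                       # (m, d)
    log_k = (-0.5 * np.sum(diffs ** 2, axis=1)
             - 0.5 * d * np.log(2 * np.pi) - d * np.log(h))
    # Log-mean-exp over the m kernel contributions.
    return np.logaddexp.reduce(log_k) - np.log(len(prototypes))

def sample_prior(prototypes, h, n, rng):
    """Draw n samples from the estimated prior: choose a prototype
    uniformly, then add isotropic Gaussian noise of scale h."""
    idx = rng.integers(0, len(prototypes), size=n)
    return prototypes[idx] + h * rng.standard_normal((n, prototypes.shape[1]))
```

The bandwidth h plays the smoothing role described in the text: a larger h spreads each kernel wider and yields a smoother estimated prior.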
III-B2 Adversarial Representation Learning
The multimodal data is limited, so it is challenging to learn discriminative latent representations for disease prediction. To solve this problem, a bidirectional adversarial mechanism is introduced that incorporates the estimated distribution to augment samples, which enhances the ability of multimodal representation learning. Specifically, we design a pairwise collaborative discriminator for learning latent representations, and keep their unimodal features by applying reconstruction and classification modules. This part has three phases: the adversarial phase, the reconstruction phase, and the classification phase.
In the adversarial phase, the generator takes the node features and the adjacency matrix as inputs and outputs a latent representation matrix. Meanwhile, the encoder takes the semantic features as input and outputs another latent representation. In addition, data sampled from the prior distribution is sent to the generator to obtain a prior-based representation. After that, three pairs of data are input into the discriminator for adversarial training, where the prior-based pairs are treated as positive samples and the generated pairs as negative samples. In particular, two-layer GCNs are used as the generators and the encoder.
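A two-layer GCN generator of the kind described above can be sketched as follows. This is a minimal numpy forward pass; the symmetric adjacency normalization is standard GCN practice rather than stated in the text, and the tanh activations follow the choice reported for the generators in the experimental settings:

```python
import numpy as np

def normalize_adj(A):
    """Symmetrically normalized adjacency with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2} (standard GCN preprocessing)."""
    A = A + np.eye(A.shape[0])
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A @ D_inv_sqrt

def gcn_generator(X, A, W1, W2):
    """Two-layer GCN mapping node features X and adjacency A to a
    latent representation matrix."""
    A_hat = normalize_adj(A)
    H = np.tanh(A_hat @ X @ W1)      # first graph-convolution layer
    return np.tanh(A_hat @ H @ W2)   # second layer yields the latent matrix
```

The encoder for the semantic features would have the same two-layer structure with its own weight matrices.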
The structure of the pairwise collaborative discriminator is shown in Fig. 2(a). The developed discriminator consists of two separate subnetworks (upper and lower) and a joint subnetwork (middle), each with three layers. The number of filters is 16, 16, and 1 for the three layers of each subnetwork. In the last layer of each subnetwork, an activation function is used to constrain the output values to a fixed range. The separate subnetworks supervise the generators to learn robust representations, while the added joint subnetwork significantly improves the optimization efficiency.
The reconstruction phase is used to stabilize the representation learning. One latent representation is fed to a decoder to rebuild the node features, and the adjacency matrix is reconstructed through a matrix inner-product operation; the other latent representation is fed to its decoder to reconstruct the semantic features. Here, the decoders are two-layer GCNs with different parameters in the hidden layer.
In the classification phase, to obtain more class-discriminative features, a classifier is designed to constrain the latent representations. The classifier network is defined as in [51]: a representation is first averaged along the dimension direction and then sent to a two-layer Multi-Layer Perceptron (MLP) for classification.
III-B3 Hypergraph Perceptual Network
Compared with a conventional graph, which is characterized by pairwise relationships between nodes, a hypergraph is more appropriate for capturing high-order relations among multiple brain regions. Hence, a hypergraph-based network combined with an MLP and convolution is utilized to fuse the learned representations. The obtained united connectivity matrix is sent to the classifier for task learning. The details of the network are illustrated in Fig. 2(b).
First, a hypergraph is constructed for each learned representation. By defining a hyperedge as one centered node connecting multiple other nodes, we can construct two hypergraphs. Specifically, we use the K-nearest neighbor (KNN) method to select the nodes of each hyperedge based on Euclidean distance. We thus obtain one incidence matrix from each of the two representations. Given the hyperedge set, the incidence matrix can be represented by

(4)
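The KNN-based incidence matrix of Eq. (4) can be sketched as follows, assuming one hyperedge per node (the centroid plus its k nearest neighbors in the latent space), which matches the "one centered node connecting multiple other nodes" definition above:

```python
import numpy as np

def knn_incidence(Z, k):
    """Build a hypergraph incidence matrix H: column e is the hyperedge
    centered at node e, marking that node and its k nearest neighbors
    (Euclidean distance over the latent representation Z) with 1."""
    n = Z.shape[0]
    # Pairwise squared Euclidean distances between latent node features.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    H = np.zeros((n, n))
    for e in range(n):
        # The centroid itself has distance 0, so it is always included.
        nearest = np.argsort(d2[e])[:k + 1]
        H[nearest, e] = 1.0
    return H
```

Each column of H then sums to k+1, and the diagonal is all ones because every hyperedge contains its own centroid.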
Then, we obtain the hyperedge features by vertex aggregation. The hypergraph convolution is split into vertex convolution and hyperedge convolution. The former is a parameter-free operation, which is illustrated by the following formulas:

(5)

(6)

where the normalization terms are the hyperedge degrees of the two incidence matrices, respectively.
Next, we design a novel network that combines a multilayer perceptron and graph convolution to capture the high-order relations between and within multimodal images. The concatenated hyperedge features of the two modalities are fused by the MLP network, and the graph convolution network transforms the fused hyperedges into vertex features. Both the MLP and the GCN used here are one-layer networks. This process can be described as follows:

(7)

where the normalization term is the node degree, the concatenation operator joins two feature matrices, and the remaining parameters belong to the MLP layer.
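The aggregation and fusion of Eqs. (5)-(7) can be sketched in numpy as below. Several details are our assumptions for illustration: the hyperedge features of the two modalities are concatenated along the feature axis, the fused hyperedges are carried back to vertices through the first modality's incidence matrix, and tanh activations are used:

```python
import numpy as np

def hyperedge_features(Z, H):
    """Vertex aggregation (Eqs. (5)-(6) style): average the vertex
    features of each hyperedge, normalized by the hyperedge degree."""
    De_inv = np.diag(1.0 / H.sum(axis=0))   # inverse hyperedge degrees
    return De_inv @ H.T @ Z                 # one feature row per hyperedge

def fuse(Z1, H1, Z2, H2, W_mlp, W_gcn):
    """Eq. (7)-style fusion sketch: concatenate the hyperedge features
    of the two modalities, mix them with a one-layer MLP, then map the
    fused hyperedges back to vertex features with a degree-normalized
    one-layer graph convolution."""
    E = np.concatenate([hyperedge_features(Z1, H1),
                        hyperedge_features(Z2, H2)], axis=1)
    E_fused = np.tanh(E @ W_mlp)            # one-layer MLP over hyperedges
    Dv_inv = np.diag(1.0 / H1.sum(axis=1))  # inverse node degrees
    return np.tanh(Dv_inv @ H1 @ E_fused @ W_gcn)
```

The MLP mixes information across the two modalities at the hyperedge level (between-modality high-order relations), while the incidence-matrix product redistributes the fused features over the vertices of each brain network (within-modality relations).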
Then, the fused features are used to construct the brain united connectivity in the following form:

(8)

Finally, the obtained united connectivity is flattened and sent to a classifier, a two-layer MLP network, for the classification task.
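A minimal sketch of this last step, assuming Eq. (8) takes the common form of an inner product of the fused vertex features squashed by a sigmoid (the exact form is our reading, not stated explicitly):

```python
import numpy as np

def united_connectivity(F):
    """Eq. (8)-style sketch: build the brain united connectivity matrix
    from the fused vertex features via an inner product, squashed to
    (0, 1) with a sigmoid."""
    S = F @ F.T
    return 1.0 / (1.0 + np.exp(-S))

def flatten_uc(U):
    """Flatten the upper triangle (the symmetric matrix's unique
    entries) for the downstream two-layer MLP classifier."""
    iu = np.triu_indices_from(U, k=1)
    return U[iu]
```

Flattening only the strict upper triangle avoids feeding the classifier each connection twice.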
III-C Hybrid loss function
Effectively fusing multimodal images requires joint representation learning. The outputs of the generator and encoder should have the same distribution despite the heterogeneity of the input data. Besides, the learned representations and the fused representation must be discriminative with respect to the class labels, and the united connectivity matrix output by the HPN is expected to be sparse. A comprehensive objective function is designed to optimize the network toward this goal. There are four types of loss function: the adversarial loss, the reconstruction loss, the classification loss, and the sparse regularization loss. They are described in detail below.
The adversarial loss is used to keep distribution-level consistency between the input data and the latent representations. Given the node features, the node adjacency matrix, the semantic features, and the estimated prior distribution, the loss objective can be expressed as follows:
(9) 
(10) 
(11) 
Reconstruction loss functions are added to the adversarial learning network to retain the unimodal imaging information in the learned representations. The reconstruction loss can be defined as follows:

(12)

where the reconstructed data are compared against the inputs using the binary cross-entropy function.
To ensure that the learned representations are discriminative with respect to the corresponding labels, the classification loss is defined by the cross-entropy operation:

(13)

where the true label is used in the cross-entropy computation.
In addition, a sparsity penalty is introduced to make the united connectivity matrix sparse. It is given below:

(14)

where the corresponding norm induces the desired sparsity.
In conclusion, the total loss of the proposed framework is:

(15)

where the hyperparameter determines the relative importance of the sparse loss term.
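The four terms can be assembled as in the sketch below. The binary cross-entropy and cross-entropy forms follow Eqs. (12)-(13); reading Eq. (14) as an L1 penalty on the united connectivity matrix is our assumption:

```python
import numpy as np

def bce(x, x_hat, eps=1e-7):
    """Binary cross-entropy reconstruction term (Eq. (12) style)."""
    x_hat = np.clip(x_hat, eps, 1 - eps)
    return -np.mean(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat))

def cross_entropy(y_onehot, p, eps=1e-7):
    """Classification term (Eq. (13) style) over one-hot labels."""
    return -np.mean(np.sum(y_onehot * np.log(np.clip(p, eps, 1.0)), axis=1))

def total_loss(adv, rec, cls, U, lam):
    """Eq. (15)-style hybrid objective: adversarial + reconstruction +
    classification + lambda-weighted L1 sparsity on the united
    connectivity matrix U."""
    sparse = np.abs(U).sum()
    return adv + rec + cls + lam * sparse
```

During training, the hyperparameter `lam` trades prediction quality against the sparsity of the recovered brain connections, matching the role described in the text.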
IV Experiments
IV-A Data description and preprocessing
In this study, a total of 300 subjects are collected from the Alzheimer’s Disease Neuroimaging Initiative (ADNI3) database (http://adni.loni.usc.edu/), including early mild cognitive impairment (EMCI), late mild cognitive impairment (LMCI), Alzheimer’s disease (AD), and normal control (NC) subjects. The collected subjects with complete three-modality images (i.e., fMRI, DTI, T1-weighted MRI) are scanned by 3T MRI scanners at different sites. For fMRI data, the image resolution in the X and Y dimensions ranges from 2.5mm to 3.75mm, and the slice thickness ranges from 2.5mm to 3.4mm. The TR (repetition time) ranges from 0.607s to 3.0s, and the TE (echo time) is in the range of 30ms to 32ms. The total scan time is 10 minutes. For DTI data, the image resolution in the X and Y dimensions ranges from 0.9mm to 2.7mm, and the slice thickness is 2.0mm. The TR value is between 3.4s and 17.5s, and the TE value is in the range of 56ms to 105ms. The number of gradient directions of the DTI data is between 6 and 126. For T1-weighted MRI, the image resolution in the X and Y dimensions ranges from 1.0mm to 1.06mm, and the slice thickness ranges from 1.0mm to 1.2mm. The TR value is 2.3s, and the TE value is in the range of 2.94ms to 2.98ms. The detailed information is shown in Table I.
Group  NC (78)  EMCI (82)  LMCI (76)  AD (64)
Male/Female  39M/39F  40M/42F  43M/33F  39M/25F
Age (mean±SD)  76.0±8.0  75.9±7.5  75.8±6.4  74.7±7.6
For fMRI data preprocessing, we adopt the standard procedures of the GRETNA toolbox [52] to filter the functional time-series signal. The main steps include magnetization equilibrium calibration, head-motion artifact correction, spatial normalization, and band-pass filtering between 0.01Hz and 0.08Hz. Based on the Automated Anatomical Labeling (AAL) atlas [53], a total of 90 non-overlapping ROIs are mapped to segment the brain. After that, we normalize the time-series signals to the same length and obtain one matrix per subject, which is the input feature matrix of our model.
For the DTI structural brain network, the PANDA toolbox [54] is used to perform the preprocessing for the determination of brain fractional anisotropy. There are five main steps with default settings, including resampling, skull removal, gap cropping, and head-movement and eddy-current correction. By setting the tracking conditions, network nodes, and tracking stopping conditions, it generates a structural network matrix based on the deterministic fiber tracking method. The obtained matrix is the input adjacency matrix of our model.
The downloaded T1-weighted MRI images are preprocessed with the following steps: the Brain Extraction Tool (FSL-BET) [55] strips non-brain tissue from the whole head, and FSL-FLIRT [56] aligns the images to the standardized template. The output image in Neuroimaging Informatics Technology Initiative (NIfTI) file format is then input into a designed 40-layer DenseNet [57] model to extract a semantic vector, which is the input semantic feature of our model.
Modality  Method  AD vs. NC  LMCI vs. NC  EMCI vs. NC
    Acc  Sen  Spec  Auc  Acc  Sen  Spec  Auc  Acc  Sen  Spec  Auc
fMRI  N2EN  79.57  70.31  81.81  87.68  74.02  72.36  74.32  80.75  72.50  68.29  75.67  79.57
fMRI  Ours  80.98  76.56  80.33  88.92  74.68  76.32  73.41  82.79  73.12  78.04  71.91  81.31
fMRI & DTI  MPCA  80.28  70.31  88.64  85.37  75.97  73.68  78.20  80.58  74.37  79.27  69.23  83.22
fMRI & DTI  DCNN  84.51  87.50  82.05  89.40  79.87  77.63  82.05  84.29  76.25  76.83  75.64  82.68
fMRI & DTI  Ours  88.73  84.37  92.31  97.48  84.42  84.21  84.62  94.01  82.50  82.93  82.05  91.65
fMRI & DTI & MRI  SPMRM  94.37  95.31  93.59  97.02  90.91  96.05  85.90  89.05  83.75  90.24  76.92  87.74
fMRI & DTI & MRI  Ours  96.47  98.43  94.87  99.59  92.20  96.05  88.46  94.83  87.50  92.68  82.05  92.72
IV-B Experimental settings
In this study, we use three binary classification tasks, i.e., (1) EMCI vs. NC, (2) LMCI vs. NC, and (3) AD vs. NC. 10-fold cross-validation is used for task learning. To demonstrate the superiority of our proposed model, we introduce previous methods for comparison: (1) fMRI-based methods, including the non-negative elastic-net based method (N2EN) [50] and ours; (2) fMRI&DTI-based methods, including Multilinear Principal Component Analysis (MPCA) [58], diffusion convolutional neural networks (DCNN) [51], and ours; (3) fMRI&DTI&MRI-based methods: self-paced sample weighting based multimodal rank minimization (SPMRM) [59] and ours. In the SPMRM method, we replace the PET and CSF data with structural connectivity and functional time series, respectively.

In the experiments, the proposed model is trained on the three modal images with the model parameters set as follows. Four specific ROIs (i.e., the left and right hippocampus, and the left and right parahippocampal gyrus) are included in the estimation of the prior distribution. Tanh and sigmoid activation functions are used for the generators and decoders, respectively. The batch size of the model is set to 8. In the training process, TensorFlow (http://www.tensorflow.org/) is utilized on an NVIDIA TITAN RTX2080 GPU device. It takes about 8 hours to train our model on each cross-validation fold with a total of 500 epochs. The initial learning rate of the generators, encoder, decoder, and classifiers is decreased at 100 epochs, while the learning rate of the discriminator is kept constant at 0.0001. The learning rate of the hypergraph perceptual network is set to 0 for the first 100 epochs and then decayed multiplicatively along with the iterations from an initial value of 0.001. The momentum method with a coefficient of 0.9 is used to optimize the learning process.

For prediction performance evaluation, we use four metrics to quantitatively evaluate the diagnosis performance: accuracy (ACC), sensitivity (SEN), specificity (SPE), and the area under the receiver operating characteristic (ROC) curve (AUC), which comprehensively measures classifier performance. Note that an AUC value of 0.5 indicates a random classifier.
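The four evaluation metrics can be computed from thresholded prediction scores as sketched below; the rank-based AUC (equivalent to the Mann-Whitney U statistic) assumes untied continuous scores:

```python
import numpy as np

def binary_metrics(y_true, y_score, thr=0.5):
    """ACC / SEN / SPE from thresholded scores, plus a rank-based AUC."""
    y_pred = (y_score >= thr).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    acc = (tp + tn) / len(y_true)
    sen = tp / (tp + fn)   # true positive rate
    spe = tn / (tn + fp)   # true negative rate
    # AUC via the average rank of positive-class scores.
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    auc = (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return acc, sen, spe, auc
```

An AUC of 0.5 from this formula corresponds to positive and negative scores being interleaved at random, matching the random-classifier baseline noted above.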
IV-C Effect analysis of prior distribution
Prior information is an essential factor in helping diagnose disease. An advantage of the proposed network is the integration of the estimated prior distribution with the adversarial network for representation learning. Considering the structure of our network, we investigate the effect of three conditions on the final prediction accuracy: 1) no prior distribution, in which case the adversarial network is removed from the proposed model; 2) a normal prior distribution; and 3) the estimated prior distribution. The results on our three prediction tasks are shown in Fig. 3. It can be observed that the network with a prior distribution behaves better than the network without one. Besides, estimating the prior distribution from anatomical knowledge yields considerably higher accuracy than using a normal prior distribution.
IV-D Effect analysis of discriminator structure
The discriminator is significant for the whole adversarial training process. To demonstrate the advantages of our designed discriminator, we split it into two individual discriminators for comparison; in this situation, there is no concatenation or summation in Fig. 2(a). After the training stage, the same test set of three modal images is used. As shown in Fig. 4, adding the collaborative network to the discriminator achieves better classification performance in terms of AUC, ACC, SPE, and SEN. Overall, the proposed discriminator network adds the joint distribution variability of the input features and learned representations to the adversarial optimization process, which in turn brings better performance.
IV-E Effect analysis of hypergraph parameters
In this section, the effects of the hypergraph parameters on the prediction results are investigated to evaluate the performance of the proposed HPN module. One parameter controls the number of interacting vertices in hypergraph construction, and the other determines the sparsity of the brain united connectivity. As these two parameters both have a great impact on the final results, we evaluated the prediction performance by varying their values within their respective ranges. In particular, a value of 0 for the vertex parameter means that the HPN module degrades to graph fusion with a normal two-layer GCN. Fig. 5 shows the accuracy of the proposed HPN for different combinations of values. The hypergraph method with optimal parameters performs better than the graph method in all three classification tasks. The optimal pairs of hypergraph parameters are (8, ) for EMCI vs. NC, (10, ) for LMCI vs. NC, and (10, ) for AD vs. NC.
IV-F Prediction results
The experimental prediction results of all models are displayed in Table II. It can be seen that the proposed model using fMRI&DTI&MRI achieves the highest mean accuracies of 96.47%, 92.20%, and 87.50% on the prediction of AD vs. NC, LMCI vs. NC, and EMCI vs. NC, respectively. It can be concluded that our model performs better than the other methods on unimodal, bimodal, and triple-modality medical images. In addition, the results of the triple-modality imaging-based methods are superior to those of the bimodality imaging-based methods. To analyze the prediction results, the fMRI&DTI-based methods are selected to compare the learned latent features. Fig. 6 shows the projection of the learned latent features onto a two-dimensional plane using the t-SNE tool [60]. As can be seen in this figure, the features obtained by our model give the most discriminative map among the three methods.
IV-G Quantitative analysis of important brain regions
To evaluate the influence of different brain regions on the prediction tasks, we shield one brain region at a time and compute an importance score for that region using a 10-fold cross-validation strategy. The importance score is defined as one minus the mean accuracy; hence, a large importance score indicates a critical brain region. Over the 90 ROIs of the AAL atlas, we sort the importance scores and obtain the ten most important brain regions for each prediction task. Fig. 7, Fig. 8, and Fig. 9 show the top 10 ROIs by importance score for EMCI vs. NC, LMCI vs. NC, and AD vs. NC, respectively. The 10 important brain regions for EMCI vs. NC are PoCG.L, SPG.R, FFG.L, PCUN.L, SPG.L, PCG.R, ROL.L, SFGmed.R, SMA.R, and ITG.L. The ten important brain regions for LMCI vs. NC are AMYG.L, PoCG.L, PCL.L, PreCG.L, PHG.R, SPG.R, IPL.L, THA.L, TPOmid.L, and HIP.L. The 10 important brain regions for AD vs. NC classification are SFGmed.L, CAU.R, ORBinf.L, SFGmed.R, IPL.R, AMYG.L, ACG.L, ORBsup.L, THA.L, and IFGtriang.R.
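The occlusion-style scoring described above can be sketched as follows, with `accuracy_fn` standing in (as an assumption) for the full 10-fold cross-validated pipeline run with one ROI shielded:

```python
import numpy as np

def region_importance(accuracy_fn, n_rois=90):
    """Shield one ROI at a time, re-evaluate mean cross-validated
    accuracy via the caller-supplied `accuracy_fn(masked_roi)`, and
    score each ROI as 1 - accuracy (larger score = more critical)."""
    scores = np.array([1.0 - accuracy_fn(r) for r in range(n_rois)])
    top10 = np.argsort(scores)[::-1][:10]   # most important ROIs first
    return scores, top10
```

An ROI whose removal drops accuracy sharply receives a high score, which is exactly how the top-10 lists per task are obtained.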
IV-H Quantitative analysis of abnormal brain connections
To uncover the mechanism of cognitive disease, we investigate the obtained united connectivity (UC) using statistical tests between groups. In particular, we construct a UC for each subject using fMRI&DTI&MRI and then estimate the significance of each connection between two groups using the standard two-sample t-test. Fig. 10, Fig. 11, and Fig. 12 show the p-values of the connections between each pair of ROIs for EMCI vs. NC, LMCI vs. NC, and AD vs. NC, respectively. For ease of visualization, a threshold of 0.05 is chosen to display the significant connections. For each prediction task, we count the significant connections for each ROI and then obtain the 10 most discriminative ROIs with the highest occurrence frequency. The significant connections and ROIs of EMCI vs. NC are mainly located in SMA.R, OLF.R, ORBinf.R, PHG.R, ITG.R, PCL.R, ROL.L, SFGmed.R, PoCG.L, and ANG.R. The significant brain regions for the task of LMCI vs. NC are OLF.R, AMYG.L, ITG.R, ORBinf.R, PHG.R, PCL.L, HIP.L, PCL.R, FFG.R, and DCG.R. The significant brain regions for the task of AD vs. NC are AMYG.L, ORBinf.R, FFG.R, DCG.R, SPG.R, OLF.R, SFGmed.R, PHG.L, LING.R, and SMG.R. The listed regions partly overlap with the results in the previous section, and these findings are partly consistent with previous studies [39, 48]. To visualize the important abnormal connections at different stages, we choose p-values smaller than 0.001 to select the most significant connections between two groups. These important abnormal connections are displayed in Fig. 13 using the BrainNet Viewer [61].

Based on the significant connections, we calculate the mean connection strength for each group to analyze the characteristics of subjects at different stages. We average the UCs within each group and then subtract the mean UC of the NC group from the mean UC of each patient group (i.e., EMCI, LMCI, AD). The resulting connectivity represents the altered connections. As displayed in Fig. 14, the altered connections differ across stages. Specifically, the number of reduced connections rises from the EMCI to the AD stage, while the number of increased connections rises from the EMCI stage to the LMCI stage and then drops at the AD stage.
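The edge-wise two-sample t-test and the per-ROI occurrence counting can be sketched as below, assuming each group's UCs are stacked into a subjects x ROIs x ROIs array:

```python
import numpy as np
from scipy.stats import ttest_ind

def significant_connections(uc_group1, uc_group2, alpha=0.05):
    """Edge-wise two-sample t-test between the stacked united
    connectivity matrices of two groups; returns a boolean mask of
    connections with p < alpha and the 10 ROIs touched by the most
    significant connections."""
    _, p = ttest_ind(uc_group1, uc_group2, axis=0)
    mask = p < alpha
    # Occurrence frequency per ROI: significant edges in its row + column.
    freq = mask.sum(axis=0) + mask.sum(axis=1)
    top10 = np.argsort(freq)[::-1][:10]
    return mask, top10
```

Lowering `alpha` to 0.001, as in the visualization step above, keeps only the strongest group differences. In practice one would also consider a multiple-comparison correction across the thousands of edges, which the sketch omits.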
In addition, most of the altered connections are inter-network connections, in terms of both connection number and connection strength. Fig. 15 displays the normalized connection strength of inter-network and intra-network connections in different prediction tasks. The reduction in connection strength at the AD stage is the largest among the three stages, while the largest increase in connection strength emerges at the LMCI stage.
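The group-level comparison of altered connections can be sketched in the same spirit; the network labels below are random placeholders standing in for the actual functional network assignment:

```python
import numpy as np

rng = np.random.default_rng(1)
n_roi = 90
# Placeholder assignment of each ROI to one of 8 functional networks
network = rng.integers(0, 8, n_roi)

# Hypothetical united connectivity (UC) matrices: (subjects, ROI, ROI)
uc_patient = rng.normal(0.5, 0.1, (30, n_roi, n_roi))
uc_control = rng.normal(0.5, 0.1, (40, n_roi, n_roi))

# Altered connections: mean UC of the patient group minus mean UC of NC
diff = uc_patient.mean(axis=0) - uc_control.mean(axis=0)
n_reduced = int((diff < 0).sum())
n_increased = int((diff > 0).sum())

# Split the absolute altered strength into intra- vs inter-network parts
intra_mask = network[:, None] == network[None, :]
intra_strength = np.abs(diff[intra_mask]).sum()
inter_strength = np.abs(diff[~intra_mask]).sum()

# Normalized strengths are comparable across prediction tasks
total = intra_strength + inter_strength
print(n_reduced, n_increased, round(inter_strength / total, 3))
```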
V Discussion
In the work of abnormal connection prediction, the characteristic changes in connection strength suggest a compensatory mechanism [62, 63] at the MCI stage and a deteriorating progression at the AD stage. As shown in Fig. 15, at the early stage of AD, some brain connections are damaged while other connections emerge or are strengthened to compensate for functional activities. With the progression of the disease, more extensive and severe network damage occurs at the late stage of AD, which leads to the degeneration of whole-brain function and, finally, severe cognitive decline. Most of the reduced connections occur between networks, whereas the increased connections are likely to emerge within networks. Furthermore, part of our discovered abnormal connections at the MCI stage is consistent with neuroscience findings [64]. Relative to NC, the left hippocampus at the LMCI stage loses connections to the left parahippocampal gyrus and the left angular gyrus while making new connections to the left posterior cingulate gyrus and the left inferior occipital gyrus. Other reduced connections can be identified between the right parahippocampal gyrus and the right temporal pole: superior temporal gyrus, as well as the right temporal pole: middle temporal gyrus. The remaining identified increased connections are between the right hippocampus and the right posterior cingulate gyrus, and between the left parahippocampal gyrus and the left medial superior frontal gyrus. Fig. 13 also suggests that reduced connections tend to span longer distances than increased connections, which is consistent with the findings in [65, 66].
There are still two limitations in the current study. One is that we introduce only four known, closely disease-related ROIs to estimate the prior distribution. Some disease-unrelated ROIs will be considered in future work. The other is that the data used in this study are mainly neuroimaging data. Since the genetic modality is correlated with cognitive disease [67], it would be interesting to add single nucleotide polymorphism (SNP) data for abnormal connection prediction in future studies.
VI Conclusion
In this paper, we proposed PGARL-HPN for predicting abnormal brain connections at different stages of AD, which effectively integrates fMRI, DTI, and MRI. As an attempt to make use of the prior distribution estimated from anatomical knowledge, a bidirectional adversarial mechanism with a novel pairwise collaborative discriminator was designed to help the generator learn joint representations while keeping the learned representations in the same distribution. Also, the HPN module was developed to effectively fuse the learned representations and capture the complementary multimodal information. Quantitative experimental results suggest that our proposed model performs better than other related methods in analyzing and predicting Alzheimer's disease progression. Moreover, the proposed model can evaluate the characteristics of abnormal brain connections at different stages of Alzheimer's disease, where part of the identified abnormal connections is consistent with previous neuroscience discoveries. The identified abnormal connections may help us understand the underlying mechanisms of neurodegenerative diseases and provide biomarkers for early cognitive disease treatment.
Acknowledgment
This work was supported by the National Natural Science Foundation of China under Grant 61872351 and the International Science and Technology Cooperation Projects of Guangdong under Grant 2019A050510030.
References
 [1] Alzheimer’s Association, “2019 Alzheimer’s disease facts and figures,” Alzheimer’s & Dementia, vol. 15, no. 3, pp. 321-387, 2019.
 [2] C. Patterson, “The state of the art of dementia research: New frontiers,” World Alzheimer Report 2018, 2018.
 [3] B. Lei, M. Yang, P. Yang, F. Zhou, W. Hou, W. Zou, X. Li, T. Wang, X. Xiao, and S. Wang, “Deep and joint learning of longitudinal data for Alzheimer’s disease prediction,” Pattern Recognition, vol. 102, pp. 107247, 2020.

 [4] X. Hao, Y. Bao, Y. Guo, M. Yu, D. Zhang, S.L. Risacher, A.J. Saykin, X. Yao, L. Shen, and Alzheimer’s Disease Neuroimaging Initiative, “Multimodal neuroimaging feature selection with consistent metric constraint for diagnosis of Alzheimer’s disease,” Medical Image Analysis, vol. 60, pp. 101625, 2020.
 [5] W. Yu, B. Lei, M.K. Ng, A.C. Cheung, Y. Shen, and S. Wang, “Tensorizing GAN with high-order pooling for Alzheimer’s disease assessment,” IEEE Transactions on Neural Networks and Learning Systems, doi: 10.1109/TNNLS.2021.3063516, 2021.

 [6] B. Lei, E. Liang, M. Yang, P. Yang, F. Zhou, E.L. Tan, Y. Lei, C.M. Liu, T. Wang, X. Xiao, and S. Wang, “Predicting clinical scores for Alzheimer’s disease based on joint and deep learning,” Expert Systems with Applications, pp. 115966, 2021.
 [7] S. Wang, Y. Shen, W. Chen, T. Xiao, and J. Hu, “Automatic recognition of mild cognitive impairment from MRI images using expedited convolutional neural networks,” in International Conference on Artificial Neural Networks, 2017, pp. 373-380.
 [8] S. Wang, H. Wang, Y. Shen, and X. Wang, “Automatic recognition of mild cognitive impairment and Alzheimer’s disease using ensemble based 3D densely connected convolutional networks,” in 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), 2018, pp. 517-523.
 [9] X. Xing, Q. Li, H. Wei, M. Zhang, Y. Zhan, X.S. Zhou, Z. Xue, and F. Shi, “Dynamic spectral graph convolution networks with assistant task training for early MCI diagnosis,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2019, pp. 639-646.
 [10] M. Ghanbari, L.M. Hsu, Z. Zhou, A. Ghanbari, Z. Mo, P.T. Yap, H. Zhang, and D. Shen, “A new metric for characterizing dynamic redundancy of dense brain chronnectome and its application to early detection of Alzheimer’s disease,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2020, pp. 3-12.
 [11] S. Wang, H. Wang, A.C. Cheung, Y. Shen, and M. Gan, “Ensemble of 3D densely connected convolutional network for diagnosis of mild cognitive impairment and,” Deep Learning Applications, vol. 1098, pp. 53-73, 2020.

 [12] Y. Qiu, S. Yu, Y. Zhou, D. Liu, X. Song, T. Wang, and B. Lei, “Multi-channel sparse graph transformer network for early Alzheimer’s disease identification,” in 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI), 2021, pp. 1794-1797.
 [13] N. Schuff, N. Woerner, L. Boreta, T. Kornfield, L.M. Shaw, J.Q. Trojanowski, P.M. Thompson, C.R. Jack Jr, M.W. Weiner, and Alzheimer’s Disease Neuroimaging Initiative, “MRI of hippocampal volume loss in early Alzheimer’s disease in relation to ApoE genotype and biomarkers,” Brain, vol. 132, no. 4, pp. 1067-1077, 2009.
 [14] J.B. Pereira, D. Van Westen, E. Stomrud, T.O. Strandberg, G. Volpe, E. Westman, and O. Hansson, “Abnormal structural brain connectome in individuals with preclinical Alzheimer’s disease,” Cerebral Cortex, vol. 28, no. 10, pp. 3638-3649, 2018.
 [15] M. Tahmasian, L. Pasquini, M. Scherr, C. Meng, S. Forster, S.M. Bratec, K. Shi, I. Yakushev, M. Schwaiger, T. Grimmer, and J. Diehl-Schmid, “The lower hippocampus global connectivity, the higher its local metabolism in Alzheimer disease,” Neurology, vol. 84, no. 19, pp. 1956-1963, 2015.
 [16] E. Bullmore and O. Sporns, “Complex brain networks: Graph theoretical analysis of structural and functional systems,” Nature Reviews Neuroscience, vol. 10, no. 3, pp. 186-198, 2009.
 [17] T.N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” arXiv preprint arXiv:1609.02907, 2016.
 [18] Y. Feng, H. You, Z. Zhang, R. Ji, and Y. Gao, “Hypergraph neural networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2019, vol. 33, no. 01, pp. 3558-3565.
 [19] T. Zhou, K.H. Thung, X. Zhu, and D. Shen, “Effective feature learning and fusion of multimodality data using stage-wise deep neural network for dementia diagnosis,” Human Brain Mapping, vol. 40, no. 3, pp. 1001-1016, 2019.
 [20] S. Hu, J. Yuan, and S. Wang, “Cross-modality synthesis from MRI to PET using adversarial U-net with different normalization,” in 2019 International Conference on Medical Imaging Physics and Engineering (ICMIPE), 2019, pp. 1-5.
 [21] S. Hu, Y. Shen, S. Wang, and B. Lei, “Brain MR to PET synthesis via bidirectional generative adversarial network,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2020, pp. 698-707.
 [22] S. Hu, B. Lei, S. Wang, Y. Wang, Z. Feng, and Y. Shen, “Bidirectional Mapping Generative Adversarial Networks for Brain MR to PET Synthesis,” IEEE Transactions on Medical Imaging, doi: 10.1109/TMI.2021.3107013, 2021.
 [23] S.Q. Wang, X. Li, J.L. Cui, H.X. Li, K.D. Luk, and Y. Hu, “Prediction of myelopathic level in cervical spondylotic myelopathy using diffusion tensor imaging,” Journal of Magnetic Resonance Imaging, vol. 41, no. 6, pp. 1682-1688, 2015.
 [24] S. Wang, Y. Shen, C. Shi, P. Yin, Z. Wang, P.W.H. Cheung, J.P.Y. Cheung, K.D.K. Luk, and Y. Hu, “Skeletal maturity recognition using a fully automated system with convolutional neural networks,” IEEE Access, vol. 6, pp. 29979-29993, 2018.
 [25] S. Wang, Y. Hu, Y. Shen, and H. Li, “Classification of diffusion tensor metrics for the diagnosis of a myelopathic cord using machine learning,” International journal of neural systems, vol. 28, no. 02, pp. 1750036, 2018.
 [26] S. Wang, X. Wang, Y. Shen, B. He, X. Zhao, P.W.H. Cheung, J.P.Y. Cheung, K.D.K. Luk, and Y. Hu, “An ensemblebased denselyconnected deep learning system for assessment of skeletal maturity,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020.
 [27] S.Q. Wang and J.H. He, “Variational iteration method for solving integro-differential equations,” Physics Letters A, vol. 367, no. 3, pp. 188-191, 2007.
 [28] S.Q. Wang and J.H. He, “Variational iteration method for a nonlinear reaction-diffusion process,” International Journal of Chemical Reactor Engineering, vol. 6, no. 1, 2008.
 [29] L.F. Mo and S.Q. Wang, “A variational approach to nonlinear two-point boundary value problems,” Nonlinear Analysis: Theory, Methods & Applications, vol. 71, no. 12, pp. e834-e838, 2009.
 [30] S.Q. Wang, “A variational approach to nonlinear two-point boundary value problems,” Computers & Mathematics with Applications, vol. 58, no. 11-12, pp. 2452-2455, 2009.
 [31] S. Hu, W. Yu, Z. Chen, and S. Wang, “Medical image reconstruction using generative adversarial network for Alzheimer disease assessment with class-imbalance problem,” in 2020 IEEE 6th International Conference on Computer and Communications (ICCC), 2020, pp. 1323-1327.
 [32] S. Wang, X. Wang, Y. Hu, Y. Shen, Z. Yang, M. Gan, and B. Lei, “Diabetic retinopathy diagnosis using multichannel generative adversarial network with semisupervision,” IEEE Transactions on Automation Science and Engineering, vol. 18, no. 2, pp. 574-585, 2020.
 [33] B. Lei, Z. Xia, F. Jiang, X. Jiang, Z. Ge, Y. Xu, J. Qin, S. Chen, T. Wang, and S. Wang, “Skin lesion segmentation via generative adversarial networks with dual discriminators,” Medical Image Analysis, vol. 64, pp. 101716, 2020.
 [34] H. Wang, J. Wang, J. Wang, M. Zhao, W. Zhang, F. Zhang, X. Xie, and M. Guo, “GraphGAN: Graph representation learning with generative adversarial nets,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018, vol. 32, no. 1.
 [35] Q. Dai, Q. Li, J. Tang, and D. Wang, “Adversarial network embedding,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2018, vol. 32, no. 1.
 [36] S. Yu, H. Yang, H. Nakahara, G.S. Santos, D. Nikolić, and D. Plenz, “Higher-order interactions characterized in cortical activity,” Journal of Neuroscience, vol. 31, no. 48, pp. 17514-17526, 2011.
 [37] M. Wang, X. Hao, J. Huang, W. Shao, and D. Zhang, “Discovering network phenotype between genetic risk factors and disease status via diagnosis-aligned multimodality regression method in Alzheimer’s disease,” Bioinformatics, vol. 35, no. 11, pp. 1948-1957, 2019.
 [38] S. Parisot, S.I. Ktena, E. Ferrante, M. Lee, R. Guerrero, B. Glocker, and D. Rueckert, “Disease prediction using graph convolutional networks: application to autism spectrum disorder and Alzheimer’s disease,” Medical Image Analysis, vol. 48, pp. 117-130, 2018.
 [39] S. Yu, S. Wang, X. Xiao, J. Cao, G. Yue, D. Liu, T. Wang, Y. Xu, and B. Lei, “Multi-scale enhanced graph convolutional network for early mild cognitive impairment detection,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2020, pp. 228-237.
 [40] X. Song, F. Zhou, A.F. Frangi, J. Cao, X. Xiao, Y. Lei, T. Wang, and B. Lei, “Graph convolution network with similarity awareness and adaptive calibration for disease-induced deterioration prediction,” Medical Image Analysis, vol. 69, pp. 101947, 2021.
 [41] R. Yu, L. Qiao, M. Chen, S.W. Lee, X. Fei, and D. Shen, “Weighted graph regularized sparse brain network construction for MCI identification,” Pattern Recognition, vol. 90, pp. 220-231, 2019.
 [42] B. Lei, N. Cheng, A.F. Frangi, E.L. Tan, J. Cao, P. Yang, A. Elazab, J. Du, Y. Xu, and T. Wang, “Self-calibrated brain network estimation and joint non-convex multi-task learning for identification of early Alzheimer’s disease,” Medical Image Analysis, vol. 61, pp. 101652, 2020.
 [43] X. Xing, Q. Li, H. Wei, M. Zhang, Y. Zhan, X.S. Zhou, Z. Xue, and F. Shi, “Dynamic spectral graph convolution networks with assistant task training for early MCI diagnosis,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2019, pp. 639-646.
 [44] B. Jie, C.Y. Wee, D. Shen, and D. Zhang, “Hyper-connectivity of functional networks for brain disease diagnosis,” Medical Image Analysis, vol. 32, pp. 84-100, 2016.
 [45] L. Xiao, J. Wang, P.H. Kassani, Y. Zhang, Y. Bai, J.M. Stephen, T.W. Wilson, V.D. Calhoun, and Y.P. Wang, “Multi-hypergraph learning-based brain functional connectivity analysis in fMRI data,” IEEE Transactions on Medical Imaging, vol. 39, no. 5, pp. 1746-1758, 2019.
 [46] M. Liu, Y. Gao, P.T. Yap, and D. Shen, “Multi-hypergraph learning for incomplete multimodality data,” IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 4, pp. 1197-1208, 2017.
 [47] Y. Zhu, X. Zhu, M. Kim, J. Yan, D. Kaufer, and G. Wu, “Dynamic hypergraph inference framework for computer-assisted diagnosis of neurodegenerative diseases,” IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 608-616, 2018.
 [48] Y. Li, J. Liu, X. Gao, B. Jie, M. Kim, P.T. Yap, C.Y. Wee, and D. Shen, “Multimodal hyper-connectivity of functional networks using functionally-weighted LASSO for MCI classification,” Medical Image Analysis, vol. 52, pp. 80-96, 2019.
 [49] A. Kulesza and B. Taskar, “Fixed-size determinantal point processes,” in Proceedings of the 28th International Conference on Machine Learning, 2011, pp. 1193-1200.
 [50] Q. Zhu, J. Huang, and X. Xu, “Non-negative discriminative brain functional connectivity for identifying schizophrenia on resting-state fMRI,” Biomedical Engineering Online, vol. 17, no. 1, pp. 115, 2018.
 [51] J. Atwood and D. Towsley, “Diffusion-convolutional neural networks,” in Advances in Neural Information Processing Systems, 2016, pp. 1993-2001.
 [52] J. Wang, X. Wang, M. Xia, X. Liao, A. Evans, and Y. He, “GRETNA: a graph theoretical network analysis toolbox for imaging connectomics,” Frontiers in human neuroscience, vol. 9, pp. 386, 2015.
 [53] N. Tzourio-Mazoyer, B. Landeau, D. Papathanassiou, F. Crivello, O. Etard, N. Delcroix, B. Mazoyer, and M. Joliot, “Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain,” Neuroimage, vol. 15, no. 1, pp. 273-289, 2002.
 [54] Z. Cui, S. Zhong, P. Xu, G. Gong, and Y. He, “PANDA: a pipeline toolbox for analyzing brain diffusion images,” Frontiers in human neuroscience, vol. 7, pp. 42, 2013.
 [55] M.W. Woolrich, S. Jbabdi, B. Patenaude, M. Chappell, S. Makni, T. Behrens, C. Beckmann, M. Jenkinson, and S.M. Smith, “Bayesian analysis of neuroimaging data in FSL,” Neuroimage, vol. 45, no. 1, pp. S173-S186, 2009.
 [56] M. Jenkinson, P. Bannister, M. Brady, and S. Smith, “Improved optimization for the robust and accurate linear registration and motion correction of brain images,” Neuroimage, vol. 17, no. 2, pp. 825-841, 2002.

 [57] G. Huang, Z. Liu, L. Van Der Maaten, and K.Q. Weinberger, “Densely connected convolutional networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700-4708.
 [58] H. Lu, K.N. Plataniotis, and A.N. Venetsanopoulos, “MPCA: Multilinear principal component analysis of tensor objects,” IEEE Transactions on Neural Networks, vol. 19, no. 1, pp. 18-39, 2008.
 [59] Q. Zhu, N. Yuan, J. Huang, X. Hao, and D. Zhang, “Multimodal AD classification via self-paced latent correlation analysis,” Neurocomputing, vol. 355, pp. 143-154, 2019.
 [60] L. Van der Maaten and G. Hinton, “Visualizing data using t-SNE,” Journal of Machine Learning Research, vol. 9, no. 11, 2008.
 [61] M. Xia, J. Wang, and Y. He, “BrainNet Viewer: a network visualization tool for human brain connectomics,” PloS One, vol. 8, no. 7, pp. e68910, 2013.
 [62] Z. Qi, X. Wu, Z. Wang, N. Zhang, H. Dong, L. Yao, and K. Li, “Impairment and compensation coexist in amnestic MCI default mode network,” Neuroimage, vol. 50, no. 1, pp. 48-55, 2010.
 [63] M. Montembeault, I. Rouleau, J.S. Provost, and S.M. Brambati, “Altered gray matter structural covariance networks in early stages of Alzheimer’s disease,” Cerebral Cortex, vol. 26, no. 6, pp. 2650-2662, 2016.
 [64] D. Berron, D. van Westen, R. Ossenkoppele, O. Strandberg, and O. Hansson, “Medial temporal lobe connectivity and its associations with cognition in early Alzheimer’s disease,” Brain, vol. 143, no. 4, pp. 1233-1248, 2020.
 [65] A.L.W. Bokde, P. Lopez-Bayo, T. Meindl, S. Pechler, C. Born, F. Faltraco, S.J. Teipel, H.J. Möller, and H. Hampel, “Functional connectivity of the fusiform gyrus during a face-matching task in subjects with mild cognitive impairment,” Brain, vol. 129, no. 5, pp. 1113-1124, 2006.
 [66] Y. He, Z. Chen, and A. Evans, “Structural insights into aberrant topological patterns of large-scale cortical networks in Alzheimer’s disease,” Journal of Neuroscience, vol. 28, no. 18, pp. 4756-4766, 2008.
 [67] J.C. Bis, C. DeCarli, A. V. Smith, F. Van Der Lijn, F. Crivello, M. Fornage, S. Debette, J.M. Shulman, H. Schmidt, V. Srikanth, and M. Schuur, “Common variants at 12q14 and 12q24 are associated with hippocampal volume,” Nature genetics, vol. 44, no. 5, pp. 545, 2012.