Encoding Multi-Resolution Brain Networks Using Unsupervised Deep Learning

08/13/2017 ∙ by Arash Rahnama, et al. ∙ Middle East Technical University

The main goal of this study is to extract a set of brain networks in multiple time-resolutions to analyze the connectivity patterns among the anatomic regions for a given cognitive task. We suggest a deep architecture which learns the natural groupings of the connectivity patterns of the human brain in multiple time-resolutions. The suggested architecture is tested on the task data set of the Human Connectome Project (HCP), where we extract multi-resolution networks, each of which corresponds to a cognitive task. At the first level of this architecture, we decompose the fMRI signal into multiple sub-bands using wavelet decompositions. At the second level, for each sub-band, we estimate a brain network extracted from short time windows of the fMRI signal. At the third level, we feed the adjacency matrices of each mesh network at each time-resolution into an unsupervised deep learning algorithm, namely, a Stacked De-noising Auto-Encoder (SDAE). The outputs of the SDAE provide a compact connectivity representation for each time window at each sub-band of the fMRI signal. We concatenate the learned representations of all sub-bands at each window and cluster them by a hierarchical algorithm to find the natural groupings among the windows. We observe that each cluster represents a cognitive task, with a Rand Index of 93%. We visualize the mean values and the precisions of the networks at each component of the cluster mixture. The mean brain networks at cluster centers show the variations among cognitive tasks, and the precision of each cluster shows the within-cluster variability of networks across the subjects.




I Introduction

The data produced by functional Magnetic Resonance Imaging (fMRI) is high-dimensional and sometimes not suitable for analyzing the cognitive states [1]. Learning efficient low-dimensional features from high-dimensional complex input spaces is crucial for the decoding of cognitive processes. In this paper, we explore deep learning algorithms in order to i) find a compact representation of connectivity patterns embedded in fMRI signals, ii) detect natural groupings of these patterns and, iii) use these natural groups to extract brain networks to represent cognitive tasks.

Our framework is built upon our previous work in the area [2], where we decompose fMRI signals into various frequency sub-bands using their wavelet transforms. We further utilize the signals at different sub-bands to form multi-resolution brain networks. Recent studies have shown that brain networks formed by the correlation of voxel pairs in fMRI signals provide more information for brain decoding than the temporal information of single voxels [3, 4]. Moreover, there has been a shift in the literature toward brain decoding algorithms that are based on the connectivity patterns in the brain, motivated by the belief that these patterns provide more information about cognitive tasks than the isolated behavior of individual anatomic regions [5, 6, 7].

Contrary to the methods suggested in [3, 4], where supervised learning algorithms are employed for brain decoding, in this paper we investigate the common groupings in the HCP task data set to find out if these natural groups correspond to the cognitive tasks. This approach enables us to find shared network representations of a cognitive task together with its variations across the subjects. Additionally, the multi-resolution representation of the fMRI signals enables us to observe the variations of networks among different frequency sub-bands.

After constructing the brain networks representing the connectivity patterns among the anatomic regions of the brain at each sub-level, a Stacked De-noising Auto-Encoder (SDAE) algorithm is employed to learn shared connectivity features associated with a task based on the estimated mesh networks at different sub-bands. We concatenate the learned connectivity patterns from several wavelet sub-bands and utilize them in a hierarchical clustering algorithm with a distance matrix based on their correlations. The main reason behind concatenation of the feature matrices is that the detected patterns in the brain at different frequencies provide complementary information in regard to the overall cognitive state of the brain.

Our results show that the mesh network representation of cognitive tasks is superior to the fMRI time-series representation. We observe that the SDAE successfully learns a set of connectivity patterns which provide an increased clustering performance. The performance is further improved by fusing the learned representations from multiple time-resolutions. This shows that modeling the connectivity of the brain in multiple sub-bands of the fMRI signal leads to diverse mesh networks carrying complementary information for representing the cognitive tasks. The high Rand index obtained at the output of the clustering algorithm indicates the existence of natural groups with low within-cluster variances and high between-cluster variances among the tasks.

In order to analyze the similarities and distinctions among the network topologies of the fMRI signal, we visualize the networks and their precisions at the cluster centers. The cluster precisions indicate shared connectivities among the subjects, whereas the mesh network at each cluster center shows a representative network for the corresponding cognitive task. It is observed that there are high inter-subject variances in the mesh networks.

II Experimental Setup

We use the fMRI data from HCP for subjects performing specific tasks. Each subject performs the seven distinct cognitive tasks listed in Table I during the experiment [8]. Each task consists of scans of the brain volume representing the changes in the brain during the task (the underlying cognitive process). The duration and the number of scans are task-dependent but the same for all participants. The total number of scans is $N = 1940$ (the sum of the per-task scans in Table I), and we use the $R$ anatomical regions of the AAL atlas that remain after removing the anatomical regions in the Cerebellum and Vermis.

Task        Emotion   Gambling   Language   Motor   Relational   Social   WM
Scans       176       253        316        284     232          274      405
Duration    2:16      3:12       3:57       3:34    2:56         3:27     5:01
TABLE I: Scans per Task and the Duration for each Task (min:sec).

Representative time-series data points are attained by spatially averaging the signals $s_v(t)$ associated with the voxels ($v$) residing in the same region ($r$) in the brain,

$$x_r(t) = \frac{1}{V_r} \sum_{v \in r} s_v(t),$$

where $V_r$ represents the total number of voxels in region $r$.
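As a minimal sketch of this spatial averaging step, the snippet below computes one representative time series per region with NumPy. The function name `region_time_series`, the integer label array and the toy data are illustrative assumptions, not part of the original pipeline.

```python
import numpy as np

def region_time_series(voxel_signals, region_labels, n_regions):
    """Average the voxel time series that fall inside each region.

    voxel_signals: array of shape (n_voxels, n_scans)
    region_labels: integer array of shape (n_voxels,), values in [0, n_regions)
    returns:       array of shape (n_regions, n_scans)
    """
    voxel_signals = np.asarray(voxel_signals, dtype=float)
    region_labels = np.asarray(region_labels)
    out = np.zeros((n_regions, voxel_signals.shape[1]))
    for r in range(n_regions):
        members = voxel_signals[region_labels == r]
        if members.shape[0] > 0:          # leave empty regions at zero
            out[r] = members.mean(axis=0)
    return out

# Toy example (hypothetical data): three voxels, two regions, three scans.
voxels = np.array([[1.0, 2.0, 3.0],
                   [3.0, 4.0, 5.0],
                   [10.0, 10.0, 10.0]])
labels = np.array([0, 0, 1])
x = region_time_series(voxels, labels, n_regions=2)
```

Here region 0 receives the mean of the first two voxels and region 1 is represented by its single voxel.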

III Hierarchical Multi-Resolution Mesh Networks (HMMNs)

Fig. 1: An Overview of the Proposed Deep Learning Framework.

In our work, we utilize the representative time series obtained for each anatomic region to build a set of local meshes. The local meshes estimated around each anatomic region are ensembled to form a mesh network. This is motivated by the fact that the structure of the brain is highly interconnected and that neurons influence each other based on the strengths of their synaptic connections [9]. HMMNs model cognitive tasks by estimating a mesh network at each frequency sub-band. It is expected that the brain network at each sub-band provides supplementary information about the underlying brain activities. We will show that our modeling of the brain with HMMNs greatly enhances the brain decoding performance by allowing us to look at cognitive states of the brain regions in multiple time-resolutions.

As the first step, the representative time-series $x_r(t)$ of each anatomic region $r$ are decomposed into a set of signals in different time-resolutions. This allows us to estimate and analyze how the anatomical regions process information in different frequency resolutions [10]. We adopt the Discrete Wavelet Transform (DWT) as our main tool [11]. We apply the DWT to $x_r(t)$ for all brain regions to decompose the signals into $l$ sub-band levels indexed by $j$ ($j = 1, \dots, l$). At sub-band level $j$, we attain two sets of orthonormal components, namely the approximation coefficients $a_{j,k}$ and the detail coefficients $d_{j,k}$, where $k$ represents the location of the wavelet waveform in discrete time [11]. These coefficients may then be utilized to reconstruct the fMRI signals at each frequency level, yielding a total of $2l + 1$ fMRI time-series: the original signal $A_0$ together with an approximation signal $A_j$ and a detail signal $D_j$ at each level (Table III lists $A_0$ to $A_{11}$ and $D_1$ to $D_{11}$, i.e. $l = 11$). Formally, the representative time-series at sub-band level $j$ may be defined as,

$$x_r^{A_j}(t) = \sum_{k} a_{j,k}\, \phi_{j,k}(t), \qquad x_r^{D_j}(t) = \sum_{k} d_{j,k}\, \psi_{j,k}(t),$$

where $\psi$ and $\phi$ are called the mother wavelet and the father wavelet. More details on our approach are given in [2].
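The decomposition into approximation and detail signals can be illustrated with a small NumPy sketch. It assumes the Haar wavelet, which is chosen here only because its sub-band reconstructions reduce to block averages; the wavelet family used in the study is not stated in this text. The function name `haar_subbands` and the requirement that the signal length be a multiple of $2^{\text{levels}}$ are also assumptions of the sketch (real recordings would need padding or boundary handling).

```python
import numpy as np

def haar_subbands(x, levels):
    """Full-length Haar approximation (A_j) and detail (D_j) signals.

    For the Haar wavelet, the signal reconstructed from the level-j
    approximation coefficients is the mean of x over blocks of 2**j samples,
    and the level-j detail signal is A_{j-1} - A_j.
    """
    x = np.asarray(x, dtype=float)
    if x.shape[0] % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    bands = {"A0": x.copy()}              # A0 is the original signal
    prev = x
    for j in range(1, levels + 1):
        block = 2 ** j
        approx = np.repeat(x.reshape(-1, block).mean(axis=1), block)
        bands[f"A{j}"] = approx
        bands[f"D{j}"] = prev - approx
        prev = approx
    return bands

# Toy signal (hypothetical): 64 samples, 3 levels -> 2 * 3 + 1 = 7 sub-band signals.
rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
bands = haar_subbands(signal, levels=3)
recon = bands["A3"] + bands["D1"] + bands["D2"] + bands["D3"]
```

The coarsest approximation plus all detail signals adds back up to the original signal, which is the property that lets each sub-band be analyzed as a separate time series.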

Now, we can construct a mesh network at each sub-band to represent cognitive tasks in terms of the relationships among anatomic regions. The construction of these networks helps us analyze the topological properties of the brain and extract connectivity patterns associated with a cognitive process at each sub-band. In order to demonstrate the benefits of our approach, we propose an unsupervised clustering framework which can take advantage of the connectivity patterns and distinguish between different cognitive tasks at multiple sub-bands. For this purpose, we divide the entire experiment session ($N$ scans) of a subject into unlabeled windows, each consisting of a fixed number of discrete scans, so that the windows together span the entire experiment of that subject. The length of the window is determined empirically as the shortest time-interval which provides the highest Rand index at the output of clustering. It is important to note that the windows are unlabeled and may contain overlapping data points from different cognitive tasks.
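A minimal sketch of the windowing step is given below. It assumes non-overlapping windows and silently drops any trailing scans that do not fill a whole window; neither choice is confirmed by the text, and the function name `split_windows` and the toy window length are illustrative.

```python
import numpy as np

def split_windows(x, win_len):
    """Cut region time series into consecutive, non-overlapping windows.

    x:       array of shape (n_regions, n_scans)
    returns: array of shape (n_windows, n_regions, win_len)
    """
    x = np.asarray(x)
    n_windows = x.shape[1] // win_len
    trimmed = x[:, : n_windows * win_len]          # drop the incomplete tail
    return trimmed.reshape(x.shape[0], n_windows, win_len).transpose(1, 0, 2)

# Toy example (hypothetical): 2 regions, 10 scans, windows of 4 scans.
series = np.arange(20).reshape(2, 10)
windows = split_windows(series, win_len=4)
```

Because the windows carry no task labels, a window that straddles a task boundary simply mixes scans from both tasks.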

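The nodes of the mesh networks are connected to their $p$-nearest neighbors to form a star mesh around a region. The nearest neighbors of a node are the ones having the largest Pearson correlation coefficients with that node. For each mesh formed around an anatomic region $r$, the arc weights for the window $w$ are estimated at the sub-band $f$ using the following regularized linear model,

$$x_r^{w,f} = \sum_{q \in \eta_p[r]} a_{r,q}^{w,f}\, x_q^{w,f} + \lambda \left| a_{r,q}^{w,f} \right|^2 + \epsilon_r^{w,f}, \qquad (1)$$

where $\lambda$ is the regularization parameter. The mesh arc weights $a_{r,q}^{w,f}$, defined in the $p$-nearest neighborhood $\eta_p[r]$ of region $r$, are estimated by minimizing the error $\epsilon_r^{w,f}$. Here, $x_r^{w,f}$ is a vector representing the average voxel time-series in region $r$ at sub-band $f$ for the window $w$, such that,

$$x_r^{w,f} = \left[ x_r^{f}(t_1),\, x_r^{f}(t_2),\, \dots,\, x_r^{f}(t_L) \right],$$

where $t_1, \dots, t_L$ are the scans contained in window $w$.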

The relation defined in (1) is solved for each region with its neighbors separately. In other words, we obtain an independent local mesh around each region. After estimating all the mesh arc weights, we collect them in a vector, called the Mesh Arc Descriptors (MADs), which represents the ensemble of all local meshes. Lastly, the mesh networks are estimated for the original fMRI signal and for its approximation and detail parts at different resolutions. Consequently, we form a distinct mesh network for each of the frequency sub-bands listed in Table III ($A_0$, $A_1$, $D_1$, ..., $A_{11}$, $D_{11}$).
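The estimation of local meshes for a single window can be sketched as follows. The sketch selects the neighbors with the largest Pearson correlation and solves the standard ridge-regression normal equations for each region, which is one common way to fit such a regularized linear model and is an assumption here rather than the exact solver of the study. The names `mesh_adjacency`, `p` and `lam`, the toy data and the row-wise flattening into a descriptor vector are illustrative.

```python
import numpy as np

def mesh_adjacency(window, p, lam):
    """Estimate one local mesh per region for a single window.

    window:  array of shape (n_regions, win_len)
    returns: (n_regions, n_regions) matrix A, where A[r, q] is the arc weight
             from neighbor q to region r (zero if q is not a neighbor of r).
    """
    window = np.asarray(window, dtype=float)
    n_regions = window.shape[0]
    corr = np.corrcoef(window)
    np.fill_diagonal(corr, -np.inf)            # a region is never its own neighbor
    adjacency = np.zeros((n_regions, n_regions))
    for r in range(n_regions):
        neighbors = np.argsort(corr[r])[-p:]   # p largest correlations with r
        X = window[neighbors].T                # (win_len, p) neighbor signals
        y = window[r]                          # (win_len,) signal of region r
        # Ridge solution: (X^T X + lam * I)^(-1) X^T y
        weights = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
        adjacency[r, neighbors] = weights
    return adjacency

# Toy window (hypothetical): 6 regions, 20 scans.
rng = np.random.default_rng(0)
toy = rng.standard_normal((6, 20))
A = mesh_adjacency(toy, p=2, lam=0.1)
mad_vector = A.ravel()                          # arc weights of this window as one vector
```

Each row of `A` is solved independently, mirroring the statement that the local mesh around each region is estimated separately from the others.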

The multi-resolution mesh network for a subject is defined by a connectivity graph for each unlabeled window $w$ and for each sub-band $f$. The set of vertices corresponds to the ordered set of anatomic regions and is of size $R$, the number of regions. Vertex attributes are the time-series contained in the window $w$ at the sub-band $f$. The arc weights between regions $r$ and $q$, for each window $w$, are obtained from the local meshes of the representative time-series data points at sub-band $f$. This process results in distinct mesh networks, each represented by an adjacency matrix of size $R \times R$ made up of the arc weights, for each window $w$. We concatenate the arc weights of a window into a single vector and thereby embed the brain network for the window $w$ at sub-band $f$. This means that for each sub-band and each subject, we represent the entire experiment by a large unlabeled matrix that stacks the embedded vectors of all windows. Next, we introduce a deep learning algorithm which learns a set of compact connectivity patterns from the embedded brain networks and, consequently, clusters the windows of similar connectivity patterns. Each cluster of similar connectivity patterns represents a specific cognitive task (see Table I).

IV The Deep Learning Architecture

The embedded mesh networks model the connectivity among the anatomic regions at different sub-bands of the fMRI signal under each window for each subject. Next, we utilize a deep learning architecture to extract a set of compact connectivity patterns from the mesh networks. We will show that the learned connectivity patterns form natural clusters corresponding to cognitive states. To meet this goal, we design a multi-layer stacked de-noising sparse auto-encoder [12]. For each sub-band $f$, we train an SDAE that takes the windows of the embedded brain network of a subject, denoted $M^{f}$, as its input and produces a low-dimensional feature vector for each window. Recall that there are a total of seven cognitive tasks. The learned features $F^{f}$ represent the connectivity patterns at sub-band $f$ for the subject as follows,

$$F^{f} = h_{\theta}\left(M^{f}\right),$$

with the auto-encoder parameter set $\theta = \{W, b\}$, where $W$ is the collection of weights, $b$ is the collection of biases at each neuron and $\sigma$ represents the activation function. Our sparse auto-encoder design includes an input layer, three hidden layers and an output layer, together with a sparsity parameter. The output of each neuron may be represented as $\sigma\left(\sum_{i=1}^{n} W_i z_i + b\right)$, where $n$ and the $z_i$'s indicate the total number of neurons and the neurons’ outputs from the previous layer. The objective is to minimize the mean-squared loss function in the presence of an $L_2$ (ridge) regularization with parameter $\beta$, which adds stability and robustness to the learning process,

$$\min_{\theta}\; \frac{1}{N_w} \sum_{w=1}^{N_w} \left\| m^{w,f} - \hat{m}^{w,f} \right\|_2^2 + \beta \left\| W \right\|_2^2,$$

where $m^{w,f}$ is the embedded network of window $w$, $\hat{m}^{w,f}$ is its reconstruction by the auto-encoder and $N_w$ is the number of windows. In order to deal with the possible noise in the input data points, we follow a dropout training procedure in which a fixed fraction of the data points is removed at each learning epoch. This de-noising procedure has been reported to control over-fitting [13]. After training the above auto-encoder, one can extract the feature matrix $F^{f}$ for the subject at sub-band $f$. Our results will show that our proposed deep learning algorithm is capable of removing the large intra-variance amongst input data points and can give an effective representation of the brain networks in a low-dimensional space. This can be considered as a non-linear mapping model from a high-dimensional space to a low-dimensional space suited for clustering.
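To make the training objective concrete, the following NumPy sketch trains a single-hidden-layer de-noising auto-encoder with a mean-squared reconstruction loss and an L2 weight penalty. It is a deliberately reduced illustration: it is not stacked, has no sparsity term, uses a sigmoid activation and masks individual input entries as the corruption step, all of which are assumptions made for the sketch. The function name, hyper-parameter values and toy data are likewise hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_denoising_autoencoder(X, n_hidden, drop_frac=0.2, beta=1e-4,
                                lr=0.02, epochs=500, seed=0):
    """Full-batch gradient descent on MSE reconstruction loss + L2 penalty."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        # Corrupt the input: zero out a random fraction of the entries.
        mask = rng.random(X.shape) >= drop_frac
        X_noisy = X * mask
        H = sigmoid(X_noisy @ W1 + b1)          # encoder
        X_hat = H @ W2 + b2                     # linear decoder
        err = X_hat - X                         # compare with the clean input
        loss = np.mean(np.sum(err ** 2, axis=1)) \
            + beta * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
        losses.append(loss)
        # Back-propagation of the loss above.
        g_out = 2.0 * err / n
        gW2 = H.T @ g_out + 2.0 * beta * W2
        gb2 = g_out.sum(axis=0)
        g_hid = (g_out @ W2.T) * H * (1.0 - H)
        gW1 = X_noisy.T @ g_hid + 2.0 * beta * W1
        gb1 = g_hid.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    def encode(Z):
        """Map (uncorrupted) inputs to the learned low-dimensional features."""
        return sigmoid(np.asarray(Z, dtype=float) @ W1 + b1)

    return encode, losses

# Toy data (hypothetical): 40 "windows" of 12 features with 3 latent factors.
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 12))
X = (X - X.mean(axis=0)) / X.std(axis=0)
encode, losses = train_denoising_autoencoder(X, n_hidden=4)
features = encode(X)
```

After training, only the encoder is kept: `features` plays the role of the compact per-window representation that is passed on to clustering.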

V Hierarchical Clustering

The main objective behind our work is to design a data-driven cluster analysis that is suitable for discriminating between distinct connectivity patterns associated with given cognitive states at different frequency levels. We perform a hierarchical clustering on a combination of features from different frequency levels attained from the deep learning algorithm to show that the framework is capable of detecting the cognitive tasks (given in Table I) based on their connectivity manifestation in the brain networks and their learned features after the deep learning architecture.

The clustering algorithm clusters a subject’s brain feature matrix, which consists of the concatenation of the feature matrices from different frequency levels selected from the frequency sub-bands. This is to show that each frequency level carries complementary information in regard to the cognitive tasks performed during the experiment. Given the discrete-time windows, the clustering algorithm attempts to divide the data points into clusters by minimizing the following cost function,

where the distance matrix is based on the Pearson Correlation between data points which closely models the functional connectivity pattern in the brain from one task to another. The exact relation between the distance matrix and the correlation matrix is,

The entries of the correlation matrix indicate the degree to which window $i$ is correlated with window $j$. The above relation can capture the time-varying coupling between windows and consequently closely model the flow of change in brain features from one cognitive state to another [14]. Fig. 1 depicts the entire deep learning framework. After clustering the unlabeled windows into different clusters, in the next section we compare the resulting clusters with the labeled data points given in Table I in order to examine the performance of our proposed approach. We utilize the Rand Index (RI) and the Adjusted Rand Index (ARI) as performance measures for our algorithm [15].
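A correlation-based hierarchical clustering of window features can be sketched as below. Two choices in the sketch are assumptions, since they are not recoverable from this text: the distance is taken as one minus the Pearson correlation between windows, and clusters are merged by average linkage. The function names and the toy features are illustrative, and the quadratic search is written for clarity rather than speed.

```python
import numpy as np

def correlation_distance(F):
    """F: (n_windows, n_features). Distance = 1 - Pearson correlation of rows."""
    return 1.0 - np.corrcoef(F)

def agglomerative_average(D, n_clusters):
    """Naive average-linkage agglomerative clustering on a distance matrix D."""
    n = D.shape[0]
    clusters = [[i] for i in range(n)]         # start with one cluster per window
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    labels = np.empty(n, dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

# Toy features (hypothetical): two groups of four windows around two patterns.
rng = np.random.default_rng(0)
base_a, base_b = rng.standard_normal(30), rng.standard_normal(30)
F = np.vstack([base_a + 0.05 * rng.standard_normal(30) for _ in range(4)] +
              [base_b + 0.05 * rng.standard_normal(30) for _ in range(4)])
labels = agglomerative_average(correlation_distance(F), n_clusters=2)
```

In the setting described above, `F` would be the concatenation of the learned features from the selected sub-bands, with one row per window.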

VI Experimental Results

In this section, we test the validity of the suggested deep learning architecture in two groups of experiments. The first group measures cluster validity by clustering the fMRI signal and the mesh arc weights of single- and multi-resolution signals, using RI and ARI as the measures of performance. The second group visualizes the mesh networks obtained across subjects and cognitive tasks to observe the inter-task and inter-subject variabilities. We perform within-subject clustering analyses based on the fMRI signals collected from the HCP subjects (described in Section II). The design parameters are selected empirically through a cross validation process based on performance: we search over sets of candidate values and select the best-performing design parameters for the mesh networks (Section III) and for the SDAE design (Section IV). The RI and ARI values given in the tables for each experiment describe the average clustering performance over all subjects.

Table II gives a performance comparison between the clustering of the raw fMRI data (i.e. the representative time series of each anatomic region) and the clustering of the arc weights of mesh networks (MADs). Note that clustering the MADs increases the Rand index from 0.68 to 0.84. This substantial improvement shows that connectivity patterns are much more informative than the average voxel time-series.
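For reference, the two performance measures can be computed from the contingency table of true and predicted labels, as in the sketch below. The function name `rand_indices` and the toy labels are illustrative; the adjusted index is undefined (division by zero) in the degenerate case where both labelings are trivial, which the sketch does not handle.

```python
import numpy as np
from math import comb

def rand_indices(labels_true, labels_pred):
    """Return (Rand index, adjusted Rand index) for two labelings."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    n = labels_true.size
    _, t = np.unique(labels_true, return_inverse=True)
    _, p = np.unique(labels_pred, return_inverse=True)
    table = np.zeros((t.max() + 1, p.max() + 1), dtype=int)
    np.add.at(table, (t, p), 1)                       # contingency counts
    sum_cells = sum(comb(int(v), 2) for v in table.ravel())
    sum_rows = sum(comb(int(v), 2) for v in table.sum(axis=1))
    sum_cols = sum(comb(int(v), 2) for v in table.sum(axis=0))
    total = comb(n, 2)
    # Agreements = pairs together in both + pairs apart in both.
    ri = (total + 2 * sum_cells - sum_rows - sum_cols) / total
    expected = sum_rows * sum_cols / total            # chance-level agreement
    max_index = 0.5 * (sum_rows + sum_cols)
    ari = (sum_cells - expected) / (max_index - expected)
    return ri, ari

# Toy example: one true cluster is split in two by the prediction.
ri, ari = rand_indices([0, 0, 1, 1], [0, 0, 1, 2])
```

The adjusted index corrects the Rand index for chance agreement, which is why it can be close to zero or negative even when the Rand index looks moderately high, as for the raw fMRI data in Table II.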

Our next analysis involves the representation power of individual frequency sub-bands, where we examine the performance of each sub-band in detecting MADs among anatomic regions for the given tasks. This may also be interpreted as the amount of complementary information each sub-band carries in regard to the functional connectivity of the given cognitive states. For further comparison, we cluster the data after attaining the MADs at each sub-band (Section III) and also after the deep learning architecture (Section IV). The results are stated in Table III. The high Rand indices for all individual sub-bands confirm the benefits of analyzing the fMRI signals in multiple time-resolutions, as each sub-band carries important information in regard to the mesh network arc weights and the connectivity patterns learned at the output of the stacked de-noising auto-encoders. At the SDAE output, the Rand index ranges between 0.74 and 0.86 across all sub-bands. Note that the sub-bands A5 to A11, D5 to D7 and D9 to D11 show relatively higher performances, indicating that these frequency bands are more informative than the rest.

In order to boost the RI and ARI values given in Table III, we fuse the learned connectivity patterns of a combination of sub-bands to obtain a better representation. In our last set of clustering analyses, we examine the clustering performance obtained by ensembling multiple sub-bands. The RI and ARI values for the ensembled sub-bands, given in Table IV, point to a substantial increase from the best single sub-band Rand index of 0.86 at sub-band A10 to a Rand index of 0.93 for the fusion of all sub-bands. This shows not only that the brain networks constructed at multiple time-resolutions provide complementary information for the clustering algorithm, but also that the proposed deep architecture is capable of detecting distinct connectivity patterns in the brain for a given cognitive task, independent of subjects.

Combining the information from different sub-bands before clustering yields rather high ARI values (Table IV), which suggests that utilizing the complementary information gained from different time-resolutions results in clusters with relatively low within-cluster variances and high between-cluster variances. Further, we increase the number of subjects in our data set, fuse the brain networks obtained from all sub-bands and cluster the connectivity patterns extracted by the SDAE. This experiment shows that increasing the number of subjects does not decrease the clustering performance.

Fig. 2: Mesh Networks of 3 Subjects: the top row shows subject 29, the middle row shows subject 40, bottom row shows subject 56. The mesh networks at each row from left to right indicate tasks in the following order: emotion, gambling, language, motor, relational, social and working-memory.
Fig. 3: Precision of the Mesh Networks of a Subset of Subjects. The mesh networks from left to right indicate tasks in the following order: emotion, gambling, language, motor, relational, social and working-memory.
                 Rand    A. Rand
Raw fMRI Data    0.68    -0.07
MAD              0.84    0.37
TABLE II: Clustering Performance Comparison.
Sub-band   MAD Rand   MAD A. Rand   SDAE Rand   SDAE A. Rand
A0         0.84       0.37          0.78        0.11
A1         0.83       0.34          0.76        0.02
D1         0.81       0.28          0.75        -0.04
A2         0.77       0.15          0.74        -0.06
D2         0.86       0.47          0.76        0.11
A3         0.75       0.12          0.74        0.07
D3         0.72       0.15          0.74        -0.34
A4         0.68       0.06          0.77        0.06
D4         0.77       0.24          0.78        0.15
A5         0.68       0.08          0.80        0.17
D5         0.74       0.17          0.80        0.16
A6         0.75       0.18          0.81        0.20
D6         0.75       0.17          0.80        0.20
A7         0.87       0.50          0.80        0.21
D7         0.84       0.37          0.82        0.26
A8         0.85       0.37          0.80        0.16
D8         0.82       0.27          0.79        0.14
A9         0.85       0.39          0.83        0.30
D9         0.82       0.28          0.80        0.12
A10        0.82       0.29          0.86        0.41
D10        0.83       0.30          0.84        0.20
A11        0.79       0.20          0.82        0.25
D11        0.81       0.26          0.83        0.29
TABLE III: Clustering Performance for Individual Sub-bands.
Sub-bands         MAD Rand   MAD A. Rand   SDAE Rand   SDAE A. Rand
All Sub-bands     0.91       0.64          0.93        0.71
Sub-bands 7-9     0.92       0.66          0.90        0.59
Sub-bands 7-11    0.92       0.66          0.91        0.60
Sub-bands 3-8     0.89       0.57          0.91        0.64
Sub-bands 3-11    0.90       0.59          0.91        0.63
TABLE IV: Clustering Performance for Combinations of Sub-bands.

Finally, we visualize the mesh networks obtained from the original fMRI signal to observe the inter-task and inter-subject variability of the brain networks. The motivation behind performing within-subject clustering rather than across-subject clustering in this study is the well-known inter-subject variability, which may prevent the clustering algorithm from finding natural groupings in the data. In order to illustrate the inter-subject variability, we plot the mesh networks of subjects in Fig. 2 and Fig. 3 for each cognitive task (Fig. 2 shows three subjects). These subjects have high RI values, which indicates that the proposed model has successfully estimated the natural groupings for each one of them. The networks shown in the aforementioned figures represent the medoids of the clusters which correspond to each one of the different tasks. The mesh networks of each subject are pruned by eliminating the mesh arc weights with values less than a threshold, which makes the networks sparse and simplifies the visualization. A close analysis shows that the mesh networks corresponding to the same task show only small similarities across the subjects. This supports our prior claim on the existence of high inter-subject variabilities. To further investigate the inter-subject variability, we select a group of subjects with high Rand indices from the HCP task data set. We define the precision of the mesh network across the set of subjects as the inverse of variance and calculate this value for the selected subjects. Fig. 3 shows the pruned, sparse precision of the mesh networks of this set of subjects. The thickness and the colors of the edges are proportional to their corresponding precision values. One may observe from Fig. 3 that the majority of the edges are thin and blue, with only a few of them thick and red. This indicates that the majority of the mesh network connections have high standard deviations across subjects.
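The precision and pruning steps used for the visualizations can be sketched as follows. Precision is computed per edge as the inverse of the variance across subjects; the small constant `eps` that guards against division by zero, the magnitude-based pruning rule with a `keep_frac` parameter, the function names and the toy matrices are assumptions of the sketch.

```python
import numpy as np

def edge_precision(networks, eps=1e-12):
    """networks: (n_subjects, n_regions, n_regions).

    Returns the per-edge precision, i.e. 1 / variance across subjects
    (eps is added to the variance to avoid division by zero).
    """
    var = np.var(np.asarray(networks, dtype=float), axis=0)
    return 1.0 / (var + eps)

def prune_by_magnitude(matrix, keep_frac):
    """Keep the keep_frac largest-magnitude entries and zero out the rest."""
    matrix = np.asarray(matrix, dtype=float)
    flat = np.abs(matrix).ravel()
    k = max(1, int(round(keep_frac * flat.size)))
    threshold = np.sort(flat)[-k]
    return np.where(np.abs(matrix) >= threshold, matrix, 0.0)

# Toy example (hypothetical): three subjects, two regions.
nets = np.array([[[0.0, 1.0], [0.0, 0.0]],
                 [[0.0, 1.0], [1.0, 0.0]],
                 [[0.0, 1.0], [2.0, 0.0]]])
prec = edge_precision(nets)
pruned = prune_by_magnitude(np.array([[0.1, -5.0], [2.0, 0.3]]), keep_frac=0.5)
```

An edge whose weight is nearly identical across subjects gets a very large precision, while an edge that varies strongly across subjects gets a small one, which corresponds to the thick and thin edges described above.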

VII Conclusion

In this paper, we proposed a framework for constructing a set of brain networks in multiple time-resolutions in order to model the connectivity patterns amongst the anatomic regions for different cognitive states. We proposed an unsupervised deep learning architecture that utilizes these brain networks in multiple frequency sub-bands to learn the natural groupings of connectivity patterns in the human brain for given cognitive tasks. We showed that our suggested deep learning algorithm is capable of clustering the representative groupings into their associated cognitive states. We examined our suggested architecture on a task data set from HCP and achieved a Rand Index of 0.93 and an Adjusted Rand Index of 0.71 when all sub-bands are fused (Table IV). Lastly, we visualized the mean values and the precisions of the mesh networks at each component of the cluster mixture. We showed that the mean mesh networks at cluster centers have high inter-subject variabilities.


We would like to thank Dr. Itir Onal Ertugrul, Arman Afrasiabi and Omer Ekmekci of Middle East Technical University and Professor Mete Ozay of Tohoku University for supporting us throughout many fruitful discussions. The work is supported by TUBITAK (Scientific and Technological Research Council of Turkey) under the grant No: 116E091.


  • [1] O. Firat, E. Aksan, I. Oztekin, and F. T. Y. Vural, “Learning Deep Temporal Representations for fMRI Brain Decoding,” in Machine Learning Meets Medical Imaging. Springer, 2015, pp. 25–34.
  • [2] I. Onal, M. Ozay, and F. T. Y. Vural, “A Hierarchical Multi-Resolution Mesh Network for Brain Decoding,” arXiv preprint arXiv:1607.07695.
  • [3] M. A. Lindquist, “The Statistical Analysis of fMRI Data,” Statistical Science, pp. 439–464, 2008.
  • [4] J. Richiardi, S. Achard, H. Bunke, and D. Van De Ville, “Machine Learning with Brain Graphs: Predictive Modeling Approaches for Functional Imaging in Systems Neuroscience,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 58–70, 2013.
  • [5] W. Shirer, S. Ryali, E. Rykhlevskaia, V. Menon, and M. Greicius, “Decoding Subject-Driven Cognitive States with Whole-Brain Connectivity Patterns,” Cerebral Cortex, vol. 22, no. 1, pp. 158–165, 2012.
  • [6] M. Ekman, J. Derrfuss, M. Tittgemeyer, and C. J. Fiebach, “Predicting Errors from Reconfiguration Patterns in Human Brain Networks,” Proceedings of the National Academy of Sciences, vol. 109, no. 41, pp. 16 714–16 719, 2012.
  • [7] I. Onal, M. Ozay, E. Mizrak, I. Oztekin, and F. Yarman-Vural, “A New Representation of fMRI Signal by a Set of Local Meshes for Brain Decoding,” IEEE Transactions on Signal and Information Processing over Networks, 2017.
  • [8] D. M. Barch, G. C. Burgess, M. P. Harms, S. E. Petersen, B. L. Schlaggar, M. Corbetta, M. F. Glasser, S. Curtiss, S. Dixit, C. Feldt et al., “Function in the Human Connectome: Task-fMRI and Individual Differences in Behavior,” Neuroimage, vol. 80, pp. 169–189, 2013.
  • [9] S. P. Pantazatos, A. Talati, P. Pavlidis, and J. Hirsch, “Decoding Unattended Fearful Faces with Whole-Brain Correlations: An Approach to Identify Condition-Dependent Large-Scale Functional Connectivity,” PLoS Computational Biology, vol. 8, no. 3, p. e1002441, 2012.
  • [10] W. H. Thompson and P. Fransson, “The Frequency Dimension of fMRI Dynamic Connectivity: Network Connectivity, Functional Hubs and Integration in the Resting Brain,” NeuroImage, vol. 121, pp. 227–242, 2015.
  • [11] E. Bullmore, J. Fadili, V. Maxim, L. Şendur, B. Whitcher, J. Suckling, M. Brammer, and M. Breakspear, “Wavelets and functional Magnetic Resonance Imaging of the Human Brain,” Neuroimage, vol. 23, pp. S234–S249, 2004.
  • [12] C. Poultney, S. Chopra, Y. L. Cun et al., “Efficient Learning of Sparse Representations with an Energy-Based Model,” in Advances in neural information processing systems, 2007, pp. 1137–1144.
  • [13] S. Wager, S. Wang, and P. S. Liang, “Dropout Training as Adaptive Regularization,” in Advances in neural information processing systems, 2013, pp. 351–359.
  • [14] V. D. Calhoun, R. Miller, G. Pearlson, and T. Adalı, “The Chronnectome: Time-Varying Connectivity Networks as the Next Frontier in fMRI Data Discovery,” Neuron, vol. 84, no. 2, pp. 262–274, 2014.
  • [15] G. W. Milligan and M. C. Cooper, “A Study of the Comparability of External Criteria for Hierarchical Cluster Analysis,” Multivariate Behavioral Research, vol. 21, no. 4, pp. 441–458, 1986.