The data produced by functional Magnetic Resonance Imaging (fMRI) is high-dimensional and often ill-suited to the direct analysis of cognitive states. Learning efficient low-dimensional features from high-dimensional, complex input spaces is crucial for decoding cognitive processes. In this paper, we explore deep learning algorithms in order to i) find a compact representation of connectivity patterns embedded in fMRI signals, ii) detect natural groupings of these patterns, and iii) use these natural groups to extract brain networks that represent cognitive tasks.
Our framework is built upon our previous work in the area, where we decompose fMRI signals into various frequency sub-bands using their wavelet transforms. We further utilize the signals at different sub-bands to form multi-resolution brain networks. Recent studies have shown that brain networks formed by the correlation of voxel pairs in fMRI signals provide more information for brain decoding than the temporal information of single voxels [3, 4]. Moreover, there has been a shift in the literature toward brain decoding algorithms based on connectivity patterns in the brain, motivated by the belief that these patterns provide more information about cognitive tasks than the isolated behavior of individual anatomic regions [5, 6, 7].
In contrast to studies where supervised learning algorithms are employed for brain decoding, in this paper we investigate the common groupings in the HCP task data set to find out whether these natural groups correspond to the cognitive tasks. This approach enables us to find shared network representations of a cognitive task together with its variations across the subjects. Additionally, the multi-resolution representation of the fMRI signals enables us to observe the variations of the networks across different frequency sub-bands.
After constructing the brain networks representing the connectivity patterns among the anatomic regions of the brain at each sub-band, a Stacked De-noising Auto-Encoder (SDAE) is employed to learn shared connectivity features associated with a task, based on the mesh networks estimated at different sub-bands. We concatenate the learned connectivity patterns from several wavelet sub-bands and utilize them in a hierarchical clustering algorithm with a distance matrix based on their correlations. The main reason for concatenating the feature matrices is that the patterns detected in the brain at different frequencies provide complementary information about the overall cognitive state of the brain.
Our results show that the mesh network representation of cognitive tasks is superior to the fMRI time-series representation. We observe that the SDAE successfully learns a set of connectivity patterns which improve clustering performance. Performance is further improved by fusing the learned representations from multiple time-resolutions. This shows that modeling the connectivity of the brain in multiple sub-bands of the fMRI signal leads to diverse mesh networks carrying complementary information for representing the cognitive tasks. The high Rand Index obtained at the output of the clustering algorithm indicates the existence of natural groups with low within-cluster variances and high between-cluster variances among the tasks.
In order to analyze the similarities and distinctions among the network topologies of the fMRI signal, we visualize the networks and their precisions at the cluster centers. The cluster precisions indicate shared connectivities among the subjects, whereas the mesh networks at the cluster centers show a representative network for each cognitive task. We observe high inter-subject variances in the mesh networks.
II Experimental Setup
We use the fMRI data from HCP for subjects performing specific tasks. Each subject performs seven distinct cognitive tasks during the experiment, given in Table I. Each task consists of scans of the brain volume representing the changes in the brain during the task (the underlying cognitive process). The duration and the number of scans are task-dependent but the same for all participants. The total number of scans is and we use the anatomical regions of AAL after removing the anatomical regions in the Cerebellum and Vermis.
Representative time-series data points are attained by spatially averaging the signals associated with voxels () residing in the same region () in the brain,
where represents the total number of voxels in region .
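The spatial averaging step above can be sketched as follows. The array shapes, region labels, and toy values are illustrative assumptions, not the paper's actual data:

```python
import numpy as np

def region_average(voxel_ts, region_labels):
    """Average voxel time-series within each anatomic region.

    voxel_ts:      (n_voxels, n_scans) array of voxel time-series.
    region_labels: (n_voxels,) array; region_labels[v] = region id of voxel v.
    Returns an (n_regions, n_scans) array of representative time-series.
    """
    regions = np.unique(region_labels)
    reps = np.zeros((len(regions), voxel_ts.shape[1]))
    for i, r in enumerate(regions):
        # Spatial mean over all voxels residing in region r
        reps[i] = voxel_ts[region_labels == r].mean(axis=0)
    return reps

# Toy example: 4 voxels, 3 scans, 2 regions
ts = np.array([[1., 2., 3.],
               [3., 4., 5.],
               [0., 0., 0.],
               [2., 2., 2.]])
labels = np.array([0, 0, 1, 1])
reps = region_average(ts, labels)
```

Each row of `reps` then serves as the representative time-series of one anatomic region in the subsequent steps.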
III Hierarchical Multi-Resolution Mesh Networks (HMMNs)
In our work, we utilize the representative time series obtained for each anatomic region to build a set of local meshes. The local meshes estimated around each anatomic region are ensembled to form a mesh network. This is motivated by the fact that the structure of the brain is highly interconnected and that neurons influence each other based on the strengths of their synaptic connections. HMMNs model cognitive tasks by estimating a mesh network at each frequency sub-band. It is expected that the brain network at each sub-band provides supplementary information about the underlying brain activities. We will show that our modeling of the brain with HMMNs greatly enhances the brain decoding performance by allowing us to look at cognitive states of the brain regions in multiple time-resolutions.
As the first step, the representative time-series for each anatomic region are decomposed into a set of signals in different time-resolutions. This allows us to estimate and analyze how the anatomical regions process information at different frequency resolutions. We adopt the Discrete Wavelet Transform (DWT) as our main tool. We apply the DWT to for all brain regions to decompose the signals into sub-bands, where (). At sub-band level , we attain two sets of orthonormal components, called the approximation coefficients and the detail coefficients , where represents the location of the wavelet waveform in discrete time. These coefficients may then be utilized to reconstruct the fMRI signals at each frequency level, yielding a total of fMRI time-series. Formally, the representative time-series at sub-band () may be defined as,
where and are the mother wavelet and the father wavelet, respectively. More details on our approach are given in .
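As a minimal illustration of the decomposition, the sketch below uses a one-level Haar DWT (the paper's choice of wavelet and number of levels is not reproduced here; the toy signal is an assumption):

```python
import numpy as np

def haar_decompose(x):
    """One level of the Haar DWT: approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # approximation (low-pass)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail (high-pass)
    return a, d

def haar_reconstruct(a, d):
    """Inverse one-level Haar DWT."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

# Decompose a toy representative time-series, then rebuild the signal
# belonging to each sub-band by zeroing the other band's coefficients.
x = np.array([4., 6., 10., 12., 8., 8., 2., 0.])
a, d = haar_decompose(x)
approx_signal = haar_reconstruct(a, np.zeros_like(d))  # low-frequency part
detail_signal = haar_reconstruct(np.zeros_like(a), d)  # high-frequency part
```

The two reconstructed sub-band signals sum back to the original series, mirroring how the sub-band fMRI time-series are obtained from the approximation and detail coefficients.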
Now, we can construct a mesh network at each sub-band to represent cognitive tasks in terms of the relationships among anatomic regions. The construction of these networks helps us analyze the topological properties of the brain and extract connectivity patterns associated with a cognitive process at each sub-band. In order to demonstrate the benefits of our approach, we propose an unsupervised clustering framework which can successfully take advantage of the connectivity patterns and distinguish between different cognitive tasks at multiple sub-bands. For this purpose, we divide the entire experiment session ( scans) for a subject into unlabeled windows of length consisting of discrete scans, where for each subject for the entire experiment. The window length is determined empirically as the shortest time-interval which provides the highest Rand Index at the output of clustering. Note that the windows are unlabeled and may contain overlapping data points from different cognitive tasks.
The nodes of the mesh networks are connected to their -nearest neighbors to form a star mesh around a region. The nearest neighbors for a certain node are the ones having the largest Pearson correlation coefficients with the node. For each mesh formed around an anatomic region , the arc weights for the window are estimated at the sub-band using the following regularized linear model,
where the regularization parameter is . The mesh arc weights , defined in the neighborhood of region , are estimated by minimizing the error .
is a vector representing the average voxel time-series in region at sub-band for the window , such that,
The relation defined in (1) is solved for each region with its neighbors separately. In other words, we obtain an independent local mesh around each region . After estimating all the mesh arc weights, we put them under the vector , called Mesh Arc Descriptors (MADs). We represent as an ensemble of all local meshes. Lastly, the mesh networks are estimated for the original fMRI signal, and its approximation and detail parts of different resolutions. Consequently, we form distinct mesh networks for the frequency sub-bands .
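The local mesh estimation described above can be sketched as a ridge regression around each region. The neighborhood size `k` and regularization value `lam` below are illustrative stand-ins for the paper's empirically selected parameters, and the random data is a placeholder:

```python
import numpy as np

def local_mesh_weights(reps, r, k=3, lam=0.5):
    """Estimate mesh arc weights around region r (a hedged sketch).

    reps: (n_regions, T) representative time-series within one window.
    k:    number of nearest neighbors (by Pearson correlation).
    lam:  ridge regularization parameter (illustrative value).
    Returns (neighbor_indices, arc_weights).
    """
    corr = np.corrcoef(reps)            # Pearson correlations between regions
    corr[r, r] = -np.inf                # exclude the region itself
    nbrs = np.argsort(corr[r])[-k:]     # k most correlated regions
    X = reps[nbrs]                      # (k, T) neighbor time-series
    y = reps[r]                         # (T,) target region time-series
    # Ridge regression: minimize ||y - w X||^2 + lam * ||w||^2
    w = np.linalg.solve(X @ X.T + lam * np.eye(k), X @ y)
    return nbrs, w

rng = np.random.default_rng(0)
reps = rng.standard_normal((10, 40))    # 10 regions, window of 40 scans
nbrs, w = local_mesh_weights(reps, r=0)
```

Solving this independently for every region yields the full set of mesh arc weights (MADs) for one window at one sub-band.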
The multi-resolution mesh network for a subject is defined by a connectivity graph, , for each unlabeled window and for each sub-band . The set of vertices corresponds to the ordered set of anatomic regions and is of size . Vertex attributes are the time-series contained in the window , at the sub-band . The arc weights, between regions and , for each window are obtained from the local meshes of the representative time-series data points at sub-band . This process results in distinct mesh networks represented by an adjacency matrix of size made up of () for each window (). We concatenate the arc weights under a vector () of size and embed the brain network for the window at sub-band . This means that for each level and each subject, we represent the entire experiment by a large unlabeled matrix of size , i.e. . Next, we will introduce a deep learning algorithm which learns a set of compact connectivity patterns from the embedded brain networks and, consequently, clusters the windows of similar connectivity patterns. Each cluster of similar connectivity patterns represents a specific cognitive task (see Table I).
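The embedding of a mesh network into an arc-weight vector can be sketched as flattening the off-diagonal entries of the adjacency matrix. The 3x3 matrix below is a toy assumption:

```python
import numpy as np

def embed_mesh_network(A):
    """Flatten an R x R mesh adjacency matrix (zero diagonal) into a
    Mesh Arc Descriptor (MAD) vector of the R*(R-1) off-diagonal
    arc weights, in row-major order."""
    R = A.shape[0]
    mask = ~np.eye(R, dtype=bool)       # select only off-diagonal entries
    return A[mask]

# Toy 3-region mesh network; entry (i, j) is the arc weight from i to j
A = np.array([[0., .2, .5],
              [.1, 0., .3],
              [.4, .6, 0.]])
mad = embed_mesh_network(A)
```

Stacking one such vector per window produces the large unlabeled matrix that feeds the deep learning stage.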
IV The Deep Learning Architecture
The embedded mesh networks model the connectivity among the anatomic regions at different sub-bands of the fMRI signal under each window for each subject. Next, we utilize a deep learning architecture to extract a set of compact connectivity patterns from the mesh networks. We will show that the learned connectivity patterns form natural clusters corresponding to cognitive states. To meet this goal, we design a multi-layer stacked de-noising sparse auto-encoder. For each sub-band , we train an SDAE that takes the windows in the embedded brain network associated with subject , i.e. , as its input, and produces a vector of size . Recall that there are a total of cognitive tasks. The learned features represent the connectivity patterns at sub-band for subject as follows,
with the auto-encoder parameter set , where is the collection of weights, is the collection of biases at each neuron, and
represents the activation function. Our sparse auto-encoder design includes an input layer of size , three hidden layers, and an output layer of size , with sparsity parameter . The output of each neuron may be represented as , where and indicate the total number of neurons and the neurons' outputs from the previous layer, respectively. The objective is to minimize the mean-squared loss function in the presence of an -Ridge regularization with parameter , which adds stability and robustness to the learning process,
In order to deal with possible noise in the input data points, we follow a dropout training procedure in which, at each learning epoch, of the data points are removed. It has been shown that this de-noising procedure controls over-fitting, as reported in . After training the above auto-encoder, one can extract the feature matrices for subject at sub-band to attain . Our results will show that the proposed deep learning algorithm is capable of removing the large intra-variance among input data points and can give an effective representation of the brain networks in a low-dimensional space. This can be considered a non-linear mapping from a high-dimensional space to a low-dimensional space suited for clustering.
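A minimal single-hidden-layer denoising auto-encoder illustrating the ideas above (input corruption, ridge-regularized mean-squared reconstruction loss) can be sketched as follows. The paper's architecture has three hidden layers and empirically tuned hyper-parameters; the sizes, learning rate, and data here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy stand-ins for embedded mesh-network windows: n windows, d arc weights
n, d, h = 64, 20, 5                      # h: low-dimensional code size
X = rng.standard_normal((n, d))

# Parameters of a single-hidden-layer denoising auto-encoder
W1 = 0.1 * rng.standard_normal((d, h)); b1 = np.zeros(h)
W2 = 0.1 * rng.standard_normal((h, d)); b2 = np.zeros(d)
lr, lam, drop = 0.05, 1e-3, 0.2          # illustrative hyper-parameters

for epoch in range(200):
    # De-noising: randomly zero a fraction of the input entries
    Xc = X * (rng.random(X.shape) > drop)
    H = sigmoid(Xc @ W1 + b1)            # encoder
    Y = H @ W2 + b2                      # linear decoder
    err = Y - X                          # reconstruct the clean input
    # Gradients of the ridge-regularized mean-squared loss
    gW2 = H.T @ err / n + lam * W2
    gb2 = err.mean(axis=0)
    dH = err @ W2.T * H * (1 - H)
    gW1 = Xc.T @ dH / n + lam * W1
    gb1 = dH.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

codes = sigmoid(X @ W1 + b1)             # learned low-dimensional features
```

The rows of `codes` play the role of the learned connectivity features that are later concatenated across sub-bands and clustered.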
V Hierarchical Clustering
The main objective behind our work is to design a data-driven cluster analysis that is suitable for discriminating between distinct connectivity patterns associated with given cognitive states at different frequency levels. We perform hierarchical clustering on a combination of features from different frequency levels attained from the deep learning algorithm to show that the framework is capable of detecting the cognitive tasks (given in Table I) based on their connectivity manifestation in the brain networks and their features learned by the deep learning architecture.
The clustering algorithm clusters a subject’s brain feature matrix consisting of the concatenation of the feature matrices from different frequency levels selected from the frequency sub-bands . This is to show that each frequency level carries complementary information about the cognitive tasks performed during the experiment. Given the discrete-time windows, the clustering algorithm attempts to divide the data points into clusters (, ) by minimizing the following cost function,
where the distance matrix is based on the Pearson Correlation between data points which closely models the functional connectivity pattern in the brain from one task to another. The exact relation between the distance matrix and the correlation matrix is,
The entries of the correlation matrix indicate the degree to which window is correlated with window . The above relation can capture the time-varying coupling between windows and consequently closely model the flow of change in brain features from one cognitive state to another. Fig. 1 depicts the entire deep learning framework. After clustering the unlabeled windows into different clusters, in the next section, we will compare the resulting clusters with the labeled data points given in Table I in order to examine the performance of our proposed approach. We utilize the Rand Index (RI) and the Adjusted Rand Index (ARI) as performance measures for our algorithm.
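The clustering stage can be sketched as follows. Since the paper's exact mapping from correlation to distance is not reproduced above, the sketch uses the common choice d = 1 - r as an assumption, together with a naive average-linkage agglomeration and synthetic feature windows:

```python
import numpy as np

def correlation_distance(F):
    """Distance between windows: d_ij = 1 - r_ij, where r is the Pearson
    correlation (an assumed, common choice)."""
    return 1.0 - np.corrcoef(F)

def agglomerate(D, k):
    """Naive average-linkage hierarchical clustering on distance matrix D,
    stopping when k clusters remain. Returns a list of index lists."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > k:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters
                dist = np.mean([D[i][j] for i in clusters[a] for j in clusters[b]])
                if dist < best:
                    best, pair = dist, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)   # merge the closest pair
    return clusters

rng = np.random.default_rng(2)
# Two synthetic groups of windows with distinct connectivity patterns
F = np.vstack([rng.normal(0, .1, (5, 8)) + [1, 1, 1, 1, 0, 0, 0, 0],
               rng.normal(0, .1, (5, 8)) + [0, 0, 0, 0, 1, 1, 1, 1]])
clusters = agglomerate(correlation_distance(F), k=2)
```

With clearly separated patterns, the windows sharing a pattern end up in the same cluster, mirroring how windows of the same cognitive task are expected to group together.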
VI Experimental Results
In this section, we test the validity of the suggested deep learning architecture in two groups of experiments. The first set of experiments measures cluster validity by clustering the fMRI signal and the mesh arc weights of single- and multi-resolution signals, using RI and ARI as performance measures. The second group of experiments visualizes the mesh networks obtained across subjects and cognitive tasks to observe the inter-task and inter-subject variabilities. We perform within-subject clustering analyses based on the fMRI signals collected from subjects (described in Section II). The design parameters are selected empirically through cross-validation based on performance. We search for the optimal design parameters over the sets , , , and . We select the design parameters and for the mesh networks (Section III), and for the SDAE design (Section IV), as optimal values. The RI and ARI values given in the tables for each experiment describe the average clustering performance over all subjects.
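The Rand Index used as a performance measure above can be computed as the fraction of point pairs on which the found clustering and the task labels agree. A minimal sketch with toy labelings (the labels are illustrative, not the paper's data):

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Rand Index: fraction of point pairs on which two partitions agree
    (same cluster in both, or different clusters in both)."""
    agree = 0
    pairs = list(combinations(range(len(labels_a)), 2))
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
    return agree / len(pairs)

# Toy ground-truth task labels vs. a found clustering of six windows
truth = [0, 0, 0, 1, 1, 1]
found = [0, 0, 1, 1, 1, 1]
ri = rand_index(truth, found)
```

The Adjusted Rand Index additionally corrects this score for chance agreement, which is why it is the stricter of the two measures reported.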
Table II gives a performance comparison between the clustering of the raw fMRI data (i.e. the representative time-series of each anatomic region) and the clustering of the arc weights of the mesh networks (MADs). Note that clustering the MADs increases the Rand Index from to . This substantial improvement shows that connectivity patterns are much more informative than the average voxel time-series.
Our next analysis involves the representation power of individual frequency sub-bands, where we examine the performance of each sub-band in detecting MADs among anatomic regions for the given tasks. This can also be interpreted as the amount of complementary information each sub-band carries about the functional connectivity of the given cognitive states. For further comparison, we cluster the data both after attaining the MADs at each sub-band (Section III) and after the deep learning architecture (Section IV). The results are stated in Table III. The high Rand Indices for all individual sub-bands confirm the benefits of analyzing the fMRI signals in multiple time-resolutions, showing that each sub-band carries important information about the mesh network arc weights and the connectivity patterns learned at the output of the stacked de-noising auto-encoders. This leads to a clustering performance within the range of across all sub-bands. Note that the sub-bands A5 to A11, D5 to D7, and D9 to D11 show relatively higher performances, indicating that these frequency bands are more informative than the rest.
In order to boost the RI and ARI values given in Table III, we fuse the learned connectivity patterns from a combination of sub-bands to obtain a better representation. In our last set of clustering analyses, we examine the clustering performance when ensembling multiple sub-bands. The RI and ARI values for the ensembled sub-bands, given in Table IV, point to a substantial increase, from the best single sub-band clustering performance of at sub-band A10 to for the fusion of all sub-bands. This shows both that the brain networks constructed at multiple time-resolutions provide complementary information for the clustering algorithm and that the proposed deep architecture is capable of detecting distinct connectivity patterns in the brain for a given cognitive task, independent of subjects.
The rather high ARI values in Table IV confirm that utilizing the complementary information gained from different time-resolutions results in clusters with relatively low within-cluster variances and high between-cluster variances. Further, by increasing the number of subjects to in our data set, fusing the brain networks obtained from all sub-bands, and clustering the connectivity patterns extracted by the SDAE, we are able to achieve a performance of RI and ARI. This experiment shows that increasing the number of subjects does not decrease the clustering performance.
Finally, we visualize the mesh networks obtained from the original fMRI signal to observe the inter-task and inter-subject variability of the brain networks. The motivation behind performing within-subject clustering rather than across-subject clustering in this study is the well-known inter-subject variability, which may prevent the clustering algorithm from finding natural groupings in the data. In order to illustrate the inter-subject variability, we plot the mesh networks of subjects in Fig. 2 and Fig. 3 for each cognitive task. These subjects have an RI of , which indicates that the proposed model has successfully estimated the natural groupings for each of these subjects. The networks shown in the aforementioned figures represent the medoids of the clusters corresponding to each of the different tasks. The mesh networks of each subject are pruned for simplicity by eliminating the mesh arc weights with values less than a threshold, reaching sparsity. A close analysis of the mesh networks corresponding to each task shows that networks for the same task have little similarity across subjects. This validates our prior claim on the existence of high inter-subject variabilities. To further investigate the inter-subject variability, we select a group of subjects with Rand Indices higher than from the HCP task data set of individuals. We define the precision of the mesh network across a set of subjects as the inverse of the variance, and calculate this value for the selected subjects. Fig. 3 shows the pruned precision of the mesh networks of the aforementioned set of subjects at sparsity. The thickness and the colors of the edges are proportional to their corresponding precision values. One may observe from Fig. 3 that the majority of the edges are thin and blue, with only a few thick and red. This indicates that the majority of the mesh network connections have high standard deviations across subjects.
VII Conclusion
In this paper, we proposed a framework for constructing a set of brain networks in multiple time-resolutions in order to model the connectivity patterns among the anatomic regions for different cognitive states. We proposed an unsupervised deep learning architecture that utilizes these brain networks in multiple frequency sub-bands to learn the natural groupings of connectivity patterns in the human brain for given cognitive tasks. We showed that the suggested deep learning algorithm is capable of clustering the representative groupings into their associated cognitive states. We examined the suggested architecture on a task data set from HCP and achieved a clustering performance of Rand Index and Adjusted Rand Index for subjects. Lastly, we visualized the mean values and the precisions of the mesh networks at each component of the cluster mixture. We showed that the mean mesh networks at the cluster centers have high inter-subject variabilities.
We would like to thank Dr. Itir Onal Ertugrul, Arman Afrasiabi, and Omer Ekmekci of Middle East Technical University and Professor Mete Ozay of Tohoku University for supporting us throughout many fruitful discussions. The work is supported by TUBITAK (Scientific and Technological Research Council of Turkey) under grant No. 116E091.
-  O. Firat, E. Aksan, I. Oztekin, and F. T. Y. Vural, “Learning Deep Temporal Representations for fMRI Brain Decoding,” in Medical Learning Meets Medical Imaging. Springer, 2015, pp. 25–34.
-  I. Onal, M. Ozay, and F. T. Y. Vural, “A Hierarchical Multi-Resolution Mesh Network for Brain Decoding,” arXiv preprint arXiv:1607.07695.
-  M. A. Lindquist, “The Statistical Analysis of fMRI Data,” Statistical Science, pp. 439–464, 2008.
-  J. Richiardi, S. Achard, H. Bunke, and D. Van De Ville, “Machine Learning with Brain Graphs: Predictive Modeling Approaches for Functional Imaging in Systems Neuroscience,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 58–70, 2013.
-  W. Shirer, S. Ryali, E. Rykhlevskaia, V. Menon, and M. Greicius, “Decoding Subject-Driven Cognitive States with Whole-Brain Connectivity Patterns,” Cerebral Cortex, vol. 22, no. 1, pp. 158–165, 2012.
-  M. Ekman, J. Derrfuss, M. Tittgemeyer, and C. J. Fiebach, “Predicting Errors from Reconfiguration Patterns in Human Brain Networks,” Proceedings of the National Academy of Sciences, vol. 109, no. 41, pp. 16 714–16 719, 2012.
-  I. Onal, M. Ozay, E. Mizrak, I. Oztekin, and F. Yarman-Vural, “A New Representation of fMRI Signal by a Set of Local Meshes for Brain Decoding,” IEEE Transactions on Signal and Information Processing over Networks, 2017.
-  D. M. Barch, G. C. Burgess, M. P. Harms, S. E. Petersen, B. L. Schlaggar, M. Corbetta, M. F. Glasser, S. Curtiss, S. Dixit, C. Feldt et al., “Function in the Human Connectome: Task-fMRI and Individual Differences in Behavior,” Neuroimage, vol. 80, pp. 169–189, 2013.
-  S. P. Pantazatos, A. Talati, P. Pavlidis, and J. Hirsch, “Decoding Unattended Fearful Faces with Whole-Brain Correlations: An Approach to Identify Condition-Dependent Large-Scale Functional Connectivity,” PLoS Computational Biology, vol. 8, no. 3, p. e1002441, 2012.
-  W. H. Thompson and P. Fransson, “The Frequency Dimension of fMRI Dynamic Connectivity: Network Connectivity, Functional Hubs and Integration in the Resting Brain,” NeuroImage, vol. 121, pp. 227–242, 2015.
-  E. Bullmore, J. Fadili, V. Maxim, L. Şendur, B. Whitcher, J. Suckling, M. Brammer, and M. Breakspear, “Wavelets and functional Magnetic Resonance Imaging of the Human Brain,” Neuroimage, vol. 23, pp. S234–S249, 2004.
-  M. Ranzato, C. Poultney, S. Chopra, and Y. LeCun, “Efficient Learning of Sparse Representations with an Energy-Based Model,” in Advances in neural information processing systems, 2007, pp. 1137–1144.
-  S. Wager, S. Wang, and P. S. Liang, “Dropout Training as Adaptive Regularization,” in Advances in neural information processing systems, 2013, pp. 351–359.
-  V. D. Calhoun, R. Miller, G. Pearlson, and T. Adalı, “The Chronnectome: Time-Varying Connectivity Networks as the Next Frontier in fMRI Data Discovery,” Neuron, vol. 84, no. 2, pp. 262–274, 2014.
-  G. W. Milligan and M. C. Cooper, “A Study of the Comparability of External Criteria for Hierarchical Cluster Analysis,” Multivariate Behavioral Research, vol. 21, no. 4, pp. 441–458, 1986.