I Introduction
The data produced by functional Magnetic Resonance Imaging (fMRI) is high-dimensional and often not directly suitable for analyzing cognitive states [1]. Learning efficient low-dimensional features from high-dimensional, complex input spaces is crucial for decoding cognitive processes. In this paper, we explore deep learning algorithms in order to i) find a compact representation of the connectivity patterns embedded in fMRI signals, ii) detect natural groupings of these patterns, and iii) use these natural groups to extract brain networks that represent cognitive tasks.
Our framework is built upon our previous work in the area [2], where we decompose fMRI signals into various frequency subbands using their wavelet transforms. We further utilize the signals at different subbands to form multi-resolution brain networks. Recent studies have shown that brain networks formed by the correlations between pairs of voxels' fMRI signals provide more information for brain decoding than the temporal information of single voxels [3, 4]. Moreover, there has been a shift in the literature toward brain decoding algorithms based on connectivity patterns in the brain, motivated by the belief that these patterns provide more information about cognitive tasks than the isolated behavior of individual anatomic regions [5, 6, 7].
Contrary to the methods suggested in [3, 4], where supervised learning algorithms are employed for brain decoding, in this paper we investigate the common groupings in the HCP task data set to find out whether these natural groups correspond to the cognitive tasks. This approach enables us to find shared network representations of a cognitive task together with its variations across the subjects. Additionally, the multi-resolution representation of the fMRI signals enables us to observe the variations of the networks among different frequency subbands.
After constructing the brain networks, which represent the connectivity patterns among the anatomic regions of the brain at each subband, a Stacked Denoising AutoEncoder (SDAE) is employed to learn shared connectivity features associated with a task, based on the estimated mesh networks at different subbands. We concatenate the learned connectivity patterns from several wavelet subbands and feed them to a hierarchical clustering algorithm with a distance matrix based on their correlations. The main reason for concatenating the feature matrices is that the patterns detected in the brain at different frequencies provide complementary information about the overall cognitive state of the brain.
Our results show that the mesh network representation of cognitive tasks is superior to the fMRI time-series representation. We observe that the SDAE successfully learns a set of connectivity patterns which provide an increased clustering performance. The performance is further improved by fusing the learned representations from multiple time-resolutions. This shows that modeling the connectivity of the brain in multiple subbands of the fMRI signal leads to diverse mesh networks carrying complementary information for representing the cognitive tasks. The high Rand index obtained at the output of the clustering algorithm confirms the existence of natural groups with low within-cluster variances and high between-cluster variances among the tasks.
In order to analyze the similarities and distinctions among the network topologies of the fMRI signal, we visualize the networks and their precisions at the cluster centers. The cluster precisions indicate shared connectivities among the subjects, whereas the mesh networks at the cluster centers show a representative network for each cognitive task. It is observed that there are high inter-subject variances in the mesh networks.
II Experimental Setup
We use the fMRI data from the HCP for subjects performing specific tasks. A subject performs seven distinct cognitive tasks during the experiment, listed in Table I [8]. Each task consists of a sequence of brain-volume scans representing the changes in the brain during the task (the underlying cognitive process). The duration and the number of scans are task-dependent but the same for all participants. We use the anatomical regions of the AAL atlas after removing the anatomical regions in the Cerebellum and Vermis.
Representative time-series data points are attained by spatially averaging the signals associated with the voxels residing in the same region of the brain,

$$\bar{x}_r(t) = \frac{1}{n_r} \sum_{v=1}^{n_r} x_{r,v}(t),$$

where $x_{r,v}(t)$ is the signal of voxel $v$ in region $r$, and $n_r$ represents the total number of voxels in region $r$.
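The spatial averaging step above can be sketched as follows; this is an illustrative implementation with hypothetical variable names, assuming the voxel signals and a voxel-to-region labeling are available as NumPy arrays:

```python
import numpy as np

def region_time_series(voxel_signals, region_labels):
    """Spatially average voxel time-series within each anatomic region.

    voxel_signals : (n_voxels, n_scans) array of fMRI time-series.
    region_labels : (n_voxels,) array mapping each voxel to a region id.
    Returns a (n_regions, n_scans) array of representative time-series.
    """
    regions = np.unique(region_labels)
    return np.stack([voxel_signals[region_labels == r].mean(axis=0)
                     for r in regions])

# toy example: 6 voxels, 4 scans, 2 regions
signals = np.arange(24, dtype=float).reshape(6, 4)
labels = np.array([0, 0, 0, 1, 1, 1])
rep = region_time_series(signals, labels)
print(rep.shape)  # (2, 4)
```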
III Hierarchical Multi-Resolution Mesh Networks (HMMNs)
In our work, we utilize the representative time-series obtained for each anatomic region to build a set of local meshes. The local meshes estimated around each anatomic region are assembled to form a mesh network. This is motivated by the fact that the structure of the brain is highly interconnected and that neurons influence each other according to the strengths of their synaptic connections
[9]. HMMNs model cognitive tasks by estimating a mesh network at each frequency subband. It is expected that the brain network at each subband provides supplementary information about the underlying brain activities. We will show that our modeling of the brain with HMMNs greatly enhances the brain decoding performance by allowing us to examine the cognitive states of the brain regions in multiple time-resolutions. As the first step, the representative time-series of each anatomic region is decomposed into a set of signals in different time-resolutions. This allows us to estimate and analyze how the anatomical regions process information in different frequency resolutions [10]. We adopt the Discrete Wavelet Transform (DWT) as our main tool [11]. We apply the DWT to the representative time-series of all brain regions to decompose the signals into a set of frequency subbands. At each subband level, we attain two sets of orthonormal components, namely the approximation coefficients and the detail coefficients, indexed by the location of the wavelet waveform in discrete time [11]. These coefficients may then be utilized to reconstruct the fMRI signal at each frequency level, yielding one reconstructed fMRI time-series per subband. Formally, the representative time-series at each subband may be defined in terms of these coefficients,
where the detail and approximation components are generated by the mother wavelet and the father wavelet, respectively. More details on our approach are given in [2].
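As an illustration of the decomposition, the following sketch implements a single level of the Haar DWT — a minimal stand-in for the wavelet family actually used, which is not restated here. The approximation (low-pass) and detail (high-pass) coefficients are orthonormal components, and either set can be used alone to reconstruct the corresponding frequency part of the signal:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: split a signal (even length) into
    approximation (low-pass) and detail (high-pass) coefficients."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail coefficients
    return a, d

def haar_idwt(a, d):
    """Invert one Haar DWT level (perfect reconstruction)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

x = np.array([4.0, 2.0, 6.0, 8.0, 1.0, 3.0, 5.0, 7.0])
a, d = haar_dwt(x)
# reconstruct only the low-frequency (approximation) part of the signal
x_low = haar_idwt(a, np.zeros_like(d))
assert np.allclose(haar_idwt(a, d), x)    # perfect reconstruction
```

Applying the transform recursively to the approximation coefficients yields the deeper subband levels (A1, D1, A2, D2, ...) used to build the multi-resolution networks.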
Now we can construct a mesh network at each subband to represent cognitive tasks in terms of the relationships among anatomic regions. The construction of these networks helps us analyze the topological properties of the brain and extract the connectivity patterns associated with a cognitive process at each subband. In order to demonstrate the benefits of our approach, we propose an unsupervised clustering framework which can successfully take advantage of the connectivity patterns and distinguish between different cognitive tasks at multiple subbands. For this purpose, we divide the entire experiment session of a subject into unlabeled windows of fixed length, each consisting of a number of consecutive discrete scans. The length of the window is determined empirically as the shortest time-interval which provides the highest Rand index at the output of the clustering. It is important to note that the windows are unlabeled and may consist of overlapping data points from different cognitive tasks.
The nodes of the mesh networks are connected to their nearest neighbors to form a star mesh around each region. The nearest neighbors of a node are the ones having the largest Pearson correlation coefficients with that node. For each mesh formed around an anatomic region, the arc weights of each window are estimated at each subband using the following regularized linear model,
$$\bar{x}^{s}_{r,w}(t) = \sum_{j \in \mathcal{N}(r)} a^{s}_{r,j,w}\, \bar{x}^{s}_{j,w}(t) + \varepsilon^{s}_{r,w}(t), \qquad (1)$$
where the estimation uses a ridge regularization parameter $\lambda$. The mesh arc weights $a^{s}_{r,j,w}$, defined in the neighborhood $\mathcal{N}(r)$ of region $r$, are estimated by minimizing the squared error $\varepsilon^{s}_{r,w}$ together with the $\lambda$-weighted penalty on the arc weights. Here, $\bar{x}^{s}_{j,w}$ is the vector representing the average voxel time-series of region $j$ at subband $s$ for the window $w$. The relation defined in (1) is solved for each region and its neighbors separately; in other words, we obtain an independent local mesh around each region. After estimating all the mesh arc weights, we collect them in a vector called the Mesh Arc Descriptors (MADs), which represents the ensemble of all local meshes. Lastly, the mesh networks are estimated for the original fMRI signal and for its approximation and detail parts at different resolutions. Consequently, we form a distinct mesh network for each frequency subband.
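A minimal sketch of the mesh estimation follows, assuming the representative time-series of one window at one subband are given as a NumPy array. The neighborhood size `p` and the regularization parameter `lam` are hypothetical values, and the arc weights are obtained in closed form from the ridge-regularized normal equations:

```python
import numpy as np

def mesh_arc_weights(X, p=3, lam=0.1):
    """Estimate mesh arc weights (MADs) for one window at one subband.

    X   : (n_regions, window_len) representative time-series.
    p   : number of nearest neighbors (largest Pearson correlation).
    lam : ridge regularization parameter (hypothetical value).
    Returns an (n_regions, n_regions) adjacency matrix A, where A[r, j]
    is the arc weight from neighbor j to region r (0 for non-neighbors).
    """
    n = X.shape[0]
    corr = np.corrcoef(X)
    A = np.zeros((n, n))
    for r in range(n):
        # p nearest neighbors of region r (by correlation), excluding r
        order = np.argsort(-corr[r])
        nbrs = [j for j in order if j != r][:p]
        Xn = X[nbrs]                      # (p, window_len)
        # ridge regression: a = (Xn Xn^T + lam I)^{-1} Xn x_r
        G = Xn @ Xn.T + lam * np.eye(p)
        a = np.linalg.solve(G, Xn @ X[r])
        A[r, nbrs] = a
    return A

rng = np.random.default_rng(0)
A = mesh_arc_weights(rng.standard_normal((10, 40)), p=3, lam=0.1)
print(A.shape)   # (10, 10), with p nonzero arc weights per row
```

Solving each region's star mesh independently keeps the per-region problems small (p unknowns each) instead of one coupled system over all regions.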
The multi-resolution mesh network of a subject is defined by a connectivity graph for each unlabeled window and each subband. The set of vertices corresponds to the ordered set of anatomic regions. The vertex attributes are the time-series contained in the window at the given subband. The arc weights between pairs of regions for each window are obtained from the local meshes of the representative time-series at that subband. This process results in a distinct mesh network, represented by an adjacency matrix of arc weights, for each window. We concatenate the arc weights under a vector to embed the brain network of each window at each subband. This means that, for each subband and each subject, we represent the entire experiment by a large unlabeled matrix whose rows are the embedded windows. Next, we will introduce a deep learning algorithm which learns a set of compact connectivity patterns from the embedded brain networks and, consequently, clusters the windows with similar connectivity patterns. Each cluster of similar connectivity patterns represents a specific cognitive task (see Table I).
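The embedding step described above amounts to flattening each window's adjacency matrix and stacking the windows into one unlabeled matrix per subband; a small sketch with hypothetical dimensions:

```python
import numpy as np

def embed_networks(adjacency_list):
    """Stack per-window (R x R) adjacency matrices of mesh arc weights
    into an (n_windows, R*R) matrix of embedded brain networks."""
    return np.stack([A.ravel() for A in adjacency_list])

# 5 windows over 10 regions -> a 5 x 100 unlabeled data matrix
windows = [np.random.rand(10, 10) for _ in range(5)]
M = embed_networks(windows)
print(M.shape)   # (5, 100)
```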
IV The Deep Learning Architecture
The embedded mesh networks model the connectivity among the anatomic regions at different subbands of the fMRI signal, under each window, for each subject. Next, we utilize a deep learning architecture to extract a set of compact connectivity patterns from the mesh networks. We will show that the learned connectivity patterns form natural clusters corresponding to cognitive states. To meet this goal, we design a multi-layer stacked denoising sparse autoencoder [12]. For each subband, we train an SDAE that takes the windows of the embedded brain network associated with a subject as its input and produces a low-dimensional feature vector. Recall that there are a total of seven cognitive tasks. The learned features represent the connectivity patterns at the given subband for the subject as follows,
with the autoencoder parameter set consisting of the collection of weights, the collection of biases at each neuron, and the activation function. Our sparse autoencoder design includes an input layer, three hidden layers, an output layer, and a sparsity parameter. The output of neuron $i$ may be represented as $y_i = f\big(\sum_{j=1}^{n} w_{ij}\, y_j + b_i\big)$, where $n$ and $y_j$ indicate the total number of neurons and the neurons' outputs in the previous layer. The objective is to minimize the mean-squared loss function
in the presence of an $\ell_2$ (Ridge) regularization, whose parameter adds stability and robustness to the learning process. In order to deal with possible noise in the input data points, we follow a dropout training procedure in which, at each learning epoch, a fraction of the data points is removed. It has been shown that this denoising procedure controls overfitting, as reported in [13]. After training the above autoencoder, one can extract the feature matrices of each subject at each subband. Our results will show that the proposed deep learning algorithm is capable of removing the large intra-class variance among the input data points and can give an effective representation of the brain networks in a low-dimensional space. This can be considered a nonlinear mapping from a high-dimensional space to a low-dimensional space suited for clustering.

V Hierarchical Clustering
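Before turning to the clustering itself, it helps to make the preceding autoencoder concrete. The following is a minimal one-hidden-layer denoising autoencoder in NumPy; the actual SDAE stacks three hidden layers and includes a sparsity penalty, both omitted here, and all sizes, rates, and parameters are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

class DenoisingAutoencoder:
    """One-layer denoising autoencoder sketch: inputs are corrupted by
    randomly zeroing a fraction `noise` of the entries, and the network
    is trained to reconstruct the clean input under an L2 (ridge)
    weight penalty `lam`."""

    def __init__(self, d_in, d_hid, noise=0.2, lam=1e-4, lr=0.1):
        self.W1 = rng.standard_normal((d_in, d_hid)) * 0.1
        self.b1 = np.zeros(d_hid)
        self.W2 = rng.standard_normal((d_hid, d_in)) * 0.1
        self.b2 = np.zeros(d_in)
        self.noise, self.lam, self.lr = noise, lam, lr

    def encode(self, X):
        return sigmoid(X @ self.W1 + self.b1)

    def step(self, X):
        Xn = X * (rng.random(X.shape) > self.noise)   # corrupt the input
        H = self.encode(Xn)
        Y = H @ self.W2 + self.b2                     # linear decoder
        dY = (Y - X) / len(X)                         # d(MSE)/dY
        dW2 = H.T @ dY + self.lam * self.W2
        dH = dY @ self.W2.T
        dZ = dH * H * (1 - H)                         # sigmoid gradient
        dW1 = Xn.T @ dZ + self.lam * self.W1
        self.W2 -= self.lr * dW2; self.b2 -= self.lr * dY.sum(0)
        self.W1 -= self.lr * dW1; self.b1 -= self.lr * dZ.sum(0)
        return 0.5 * np.mean((Y - X) ** 2)

X = rng.random((64, 20))              # 64 windows, 20-dim embedded meshes
dae = DenoisingAutoencoder(20, 5)
losses = [dae.step(X) for _ in range(200)]
features = dae.encode(X)              # learned connectivity features
print(features.shape)                 # (64, 5)
```

In a stacked design, each layer is trained the same way on the previous layer's encoded output, and the final encoding serves as the feature matrix passed to the clustering stage.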
The main objective of our work is to design a data-driven cluster analysis that is suitable for discriminating between the distinct connectivity patterns associated with given cognitive states at different frequency levels. We perform hierarchical clustering on a combination of features from different frequency levels, attained from the deep learning algorithm, to show that the framework is capable of detecting the cognitive tasks (given in Table I) based on their connectivity manifestations in the brain networks and the features learned by the deep architecture. The clustering algorithm clusters a subject's brain feature matrix, consisting of the concatenation of the feature matrices from different frequency levels selected from the frequency subbands. This is to show that each frequency level carries complementary information about the cognitive tasks performed during the experiment. Given the discrete-time windows, the clustering algorithm attempts to divide the data points into clusters by minimizing a cost function over the pairwise distances,
where the distance matrix is based on the Pearson correlation between data points, which closely models the functional connectivity pattern in the brain from one task to another. The exact relation between the distance matrix $D$ and the correlation matrix $C$ is,

$$D_{w,w'} = 1 - C_{w,w'},$$
The entries of the correlation matrix indicate the degree to which one window is correlated with another. The above relation can capture the time-varying coupling between windows and, consequently, closely model the flow of change in brain features from one cognitive state to another [14]. Fig. 1 depicts the entire deep learning framework. After clustering the unlabeled windows, in the next section we compare the resulting clusters with the labeled data points given in Table I in order to examine the performance of our proposed approach. We utilize the Rand Index (RI) and the Adjusted Rand Index (ARI) as performance measures for our algorithm [15].
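The tail of the pipeline — correlation-based distances, hierarchical clustering, and Rand-index scoring — can be sketched as follows, using well-separated synthetic features in place of the learned connectivity patterns (SciPy and scikit-learn are assumed to be available; all dimensions are hypothetical):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
# toy stand-in for learned features: 3 groups of 30 windows each,
# built around three orthogonal mean patterns plus small noise
means = np.kron(np.eye(3), np.ones(8)) * 3.0          # (3, 24)
F = np.vstack([m + rng.standard_normal((30, 24)) * 0.3 for m in means])
true_labels = np.repeat([0, 1, 2], 30)

C = np.corrcoef(F)             # Pearson correlation between windows
D = 1.0 - C                    # correlation-based distance matrix
np.fill_diagonal(D, 0.0)
Z = linkage(squareform(D, checks=False), method='average')
pred = fcluster(Z, t=3, criterion='maxclust')
print(adjusted_rand_score(true_labels, pred))   # close to 1 on this toy data
```

`squareform` condenses the symmetric distance matrix into the vector form `linkage` expects, and `fcluster` with `criterion='maxclust'` cuts the dendrogram into the requested number of clusters.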
VI Experimental Results
In this section, we test the validity of the suggested deep learning architecture in two groups of experiments. The first group measures the cluster validity by clustering the fMRI signal and the mesh arc weights of single- and multi-resolution signals, using RI and ARI as performance measures. The second group visualizes the mesh networks obtained across subjects and cognitive tasks to observe the inter-task and inter-subject variabilities. We perform within-subject clustering analyses based on the fMRI signals collected from the subjects described in Section II. The design parameters are selected empirically through a cross-validation process based on performance: we search over candidate sets and select the optimal values for the mesh networks (Section III) and for the SDAE design (Section IV). The RI and ARI values given in the tables for each experiment describe the average clustering performance over all subjects.
Table II gives a performance comparison between the clustering of the raw fMRI data (i.e., the representative time-series of each anatomic region) and the clustering of the arc weights of the mesh networks (MADs). Note that clustering the MADs substantially increases the Rand index. This improvement shows that the connectivity patterns are much more informative than the average voxel time-series.
Our next analysis involves the representation power of the individual frequency subbands, where we examine the performance of each subband in detecting the MADs among anatomic regions for the given tasks. This may also be interpreted as the amount of complementary information each subband carries about the functional connectivity of the given cognitive states. For further comparison, we cluster the data both after attaining the MADs at each subband (Section III) and after the deep learning architecture (Section IV). The results are stated in Table III. The high Rand indices for all individual subbands confirm the benefits of analyzing the fMRI signals in multiple time-resolutions, as they show that each subband carries important information about the mesh network arc weights and the connectivity patterns learned at the output of the stacked denoising autoencoders. Note that the subbands A5 to A11, D5 to D7, and D9 to D11 show relatively higher performances, indicating that these frequency bands are more informative than the rest.
In order to boost the RI and ARI values given in Table III, we fuse the learned connectivity patterns from a combination of subbands to obtain a better representation. In our last set of clustering analyses, we examine the clustering performance obtained by ensembling multiple subbands. The RI and ARI values for the ensembled subbands, given in Table IV, point to a substantial increase from the best single-subband clustering performance, obtained at subband A10, to the performance of the fusion of all subbands. This shows not only that the brain networks constructed at multiple time-resolutions provide complementary information for the clustering algorithm, but also that the proposed deep architecture is capable of detecting distinct connectivity patterns in the brain for a given cognitive task, independent of the subjects.
The rather high ARI values in Table IV confirm that utilizing the complementary information gained from different time-resolutions results in clusters with relatively low within-cluster variances and high between-cluster variances. Further, by increasing the number of subjects in our data set, fusing the brain networks obtained from all the subbands, and clustering the connectivity patterns extracted by the SDAE, we preserve high RI and ARI values. This experiment shows that increasing the number of subjects does not decrease the clustering performance.
Finally, we visualize the mesh networks obtained from the original fMRI signal to observe the inter-task and inter-subject variability of the brain networks. The motivation for performing within-subject rather than across-subject clustering in this study is the well-known inter-subject variability, which may prevent the clustering algorithm from finding natural groupings in the data. In order to illustrate the inter-subject variability, we plot the mesh networks of several subjects in Fig. 2 and Fig. 3 for each cognitive task. These subjects have high RI values, which indicates that the proposed model has successfully estimated the natural groupings for each of them. The networks shown in the aforementioned figures represent the medoids of the clusters corresponding to the different tasks. The mesh networks of each subject are pruned by eliminating the mesh arc weights with values less than a threshold, enforcing sparsity for simplification. A close analysis of the mesh networks shows that the networks corresponding to the same task exhibit only small similarities across the subjects. This validates our earlier claim of high inter-subject variability. To investigate this further, we select a group of subjects with high Rand indices from the HCP task data set. We define the precision of the mesh network across a set of subjects as the inverse of the variance and calculate this value for the selected subjects. Fig. 3 shows the pruned precisions of the mesh networks of this set of subjects. The thickness and the colors of the edges are proportional to their corresponding precision values. One may observe from Fig. 3 that the majority of the edges are thin and blue, with only a few of them thick and red. This indicates that the majority of the mesh network connections have high standard deviations across the subjects.
VII Conclusion
In this paper, we proposed a framework for constructing a set of brain networks in multiple time-resolutions in order to model the connectivity patterns among the anatomic regions for different cognitive states. We proposed an unsupervised deep learning architecture that utilizes these brain networks in multiple frequency subbands to learn the natural groupings of connectivity patterns in the human brain for given cognitive tasks. We showed that the suggested deep learning algorithm is capable of clustering the representative groupings into their associated cognitive states. We examined the suggested architecture on a task data set from the HCP and achieved high Rand Index and Adjusted Rand Index values. Lastly, we visualized the mean values and the precisions of the mesh networks at each component of the cluster mixture. We showed that the mean mesh networks at the cluster centers have high inter-subject variabilities.
Acknowledgment
We would like to thank Dr. Itir Onal Ertugrul, Arman Afrasiabi, and Omer Ekmekci of Middle East Technical University, and Professor Mete Ozay of Tohoku University, for many fruitful discussions. This work is supported by TUBITAK (the Scientific and Technological Research Council of Turkey) under grant No. 116E091.
References
 [1] O. Firat, E. Aksan, I. Oztekin, and F. T. Y. Vural, “Learning Deep Temporal Representations for fMRI Brain Decoding,” in Machine Learning Meets Medical Imaging. Springer, 2015, pp. 25–34.
 [2] I. Onal, M. Ozay, and F. T. Y. Vural, “A Hierarchical Multi-Resolution Mesh Network for Brain Decoding,” arXiv preprint arXiv:1607.07695, 2016.
 [3] M. A. Lindquist, “The Statistical Analysis of fMRI Data,” Statistical Science, pp. 439–464, 2008.
 [4] J. Richiardi, S. Achard, H. Bunke, and D. Van De Ville, “Machine Learning with Brain Graphs: Predictive Modeling Approaches for Functional Imaging in Systems Neuroscience,” IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 58–70, 2013.
 [5] W. Shirer, S. Ryali, E. Rykhlevskaia, V. Menon, and M. Greicius, “Decoding Subject-Driven Cognitive States with Whole-Brain Connectivity Patterns,” Cerebral Cortex, vol. 22, no. 1, pp. 158–165, 2012.
 [6] M. Ekman, J. Derrfuss, M. Tittgemeyer, and C. J. Fiebach, “Predicting Errors from Reconfiguration Patterns in Human Brain Networks,” Proceedings of the National Academy of Sciences, vol. 109, no. 41, pp. 16714–16719, 2012.
 [7] I. Onal, M. Ozay, E. Mizrak, I. Oztekin, and F. Yarman-Vural, “A New Representation of fMRI Signal by a Set of Local Meshes for Brain Decoding,” IEEE Transactions on Signal and Information Processing over Networks, 2017.
 [8] D. M. Barch, G. C. Burgess, M. P. Harms, S. E. Petersen, B. L. Schlaggar, M. Corbetta, M. F. Glasser, S. Curtiss, S. Dixit, C. Feldt et al., “Function in the Human Connectome: TaskfMRI and Individual Differences in Behavior,” Neuroimage, vol. 80, pp. 169–189, 2013.
 [9] S. P. Pantazatos, A. Talati, P. Pavlidis, and J. Hirsch, “Decoding Unattended Fearful Faces with Whole-Brain Correlations: An Approach to Identify Condition-Dependent Large-Scale Functional Connectivity,” PLoS Computational Biology, vol. 8, no. 3, p. e1002441, 2012.
 [10] W. H. Thompson and P. Fransson, “The Frequency Dimension of fMRI Dynamic Connectivity: Network Connectivity, Functional Hubs and Integration in the Resting Brain,” NeuroImage, vol. 121, pp. 227–242, 2015.
 [11] E. Bullmore, J. Fadili, V. Maxim, L. Şendur, B. Whitcher, J. Suckling, M. Brammer, and M. Breakspear, “Wavelets and functional Magnetic Resonance Imaging of the Human Brain,” Neuroimage, vol. 23, pp. S234–S249, 2004.
 [12] C. Poultney, S. Chopra, Y. L. Cun et al., “Efficient Learning of Sparse Representations with an Energy-Based Model,” in Advances in Neural Information Processing Systems, 2007, pp. 1137–1144.
 [13] S. Wager, S. Wang, and P. S. Liang, “Dropout Training as Adaptive Regularization,” in Advances in Neural Information Processing Systems, 2013, pp. 351–359.
 [14] V. D. Calhoun, R. Miller, G. Pearlson, and T. Adalı, “The Chronnectome: TimeVarying Connectivity Networks as the Next Frontier in fMRI Data Discovery,” Neuron, vol. 84, no. 2, pp. 262–274, 2014.
 [15] G. W. Milligan and M. C. Cooper, “A Study of the Comparability of External Criteria for Hierarchical Cluster Analysis,” Multivariate Behavioral Research, vol. 21, no. 4, pp. 441–458, 1986.