1 Introduction
The cerebral cortex is essential to a wide range of cognitive functions. Automated algorithms for brain surface analysis thus play an important role in understanding the structure and function of this complex organ. Deep learning models such as convolutional neural networks (CNNs) have achieved state-of-the-art performance on most image analysis tasks, including image classification, registration and segmentation [1]. However, these models typically require large annotated datasets for training, which are often expensive to obtain in medical applications. This is especially true for the task of cortical segmentation, also known as parcellation, where generating ground truth data requires labelling possibly thousands of nodes on a highly-convoluted surface. This burden also explains why datasets for this task are relatively small. For instance, the largest publicly-available dataset for cortical parcellation, MindBoggle [14], contains only 101 manually-annotated brain surfaces. Moreover, another common problem of deep learning models is their lack of robustness to differences in the distribution of training and test data. Hence, a CNN model trained on data from a source domain usually fails to generalize to samples from other domains, i.e., the target domains.

Domain adaptation [18] has proven to be a powerful approach for making algorithms trained on source data generalize to data from a target domain, without having explicit labels for target samples. Generative adversarial networks (GANs) [9] leverage adversarial training to produce realistic images. In such an approach, a discriminator network classifies images produced by a generator network as real or fake, and the generator improves by learning to fool the discriminator. Following the success of GANs, adversarial techniques have been proposed to improve the learning capability of CNNs across different domains. In adversarial domain adaptation methods for segmentation [21, 23, 8, 20, 22, 12], two main tasks are considered: the first involves learning a fully supervised segmentator on the source domain and, in the second, a discriminator network forces the segmentator to produce similar predictions on both the source and the target domains. These adversarial techniques usually rely on either feature space adaptation or output space adaptation. Initial works [16, 7] focused on matching the distributions of features from source and target domains such that the learning generalizes across domains for classification tasks. As the output of CNNs for segmentation contains rich semantic information, [19] proposed a method that, instead, leverages output space adaptation. Various works on pixel-wise domain adaptation have been developed for natural color images [7, 11]. In medical image analysis, [13] proposed an adversarial neural network for MRI image segmentation without any additional labels on the test domain. Likewise, [12] presented a vessel segmentation approach for fundus images, which uses a gradient reversal layer for adversarial training. Recent work [2] also addressed the problem of domain adaptation by adding a differentiable penalty on the target domain. However, these domain adaptation techniques focus on data lying in Euclidean space (natural or medical images), and are not suitable for graph structures such as surface meshes.

The image space is inadequate to capture the varying geometry of the brain surface. Differences in surface geometry hinder statistical frameworks from exploiting spatial information in Euclidean space. The extension of standard convolutions to non-Euclidean spaces like manifolds and graphs has led to the development of various geometric deep learning frameworks [3, 17]. A recent work [5] proposed to use geometric deep learning for segmenting three cortical regions by relying on the spatial representation of the brain mesh. Later, based on the spectral representation of brain meshes, [10] developed a graph convolution network (GCN) to parcellate the cerebral cortex. Despite offering more flexibility than Euclidean-based approaches, these methods are domain-dependent and would fail to generalize to new datasets (domains) without explicit retraining. Adding to the challenges of the field, obtaining annotations for these new datasets is also particularly difficult in practice.
In this paper, we address the limitations of existing techniques for cortical parcellation and propose an adversarial domain adaptation method on surface graphs. Specifically, we focus on a problem shared by most GCN-based approaches, which is the need for a common basis to represent and operate on graphs. For instance, spectral GCNs [4, 6] require computing the eigendecomposition of the graph Laplacian matrix in order to embed graphs in a space defined by a fixed eigenbasis. As described in [10], separate graphs may have different eigenbases. Furthermore, the eigenvectors obtained for a given graph are only defined up to a sign (i.e., if $\mathbf{u}$ is an eigenvector, then so is $-\mathbf{u}$), and up to a rotation if different eigenvectors share close eigenvalues, which is typically observed in spectral graph analysis. Due to these ambiguities, spectral GCNs cannot be used to compare multiple graphs directly and need an explicit alignment of graph eigenbases as an additional pre-processing step, which brings its own ambiguities. Here, we focus on generalizing parcellation across multiple brain surface domains by removing the dependency on these domain alignments.
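The sign ambiguity mentioned above can be checked numerically. The following is a minimal sketch on a hypothetical 4-node path graph (not a brain mesh), using the unnormalized Laplacian for brevity; it only illustrates that an eigenvector and its negation are equally valid, so two solvers, or two graphs, may yield embeddings with flipped axes.

```python
import numpy as np

# Hypothetical toy graph: a path on 4 nodes, with Laplacian L = D - A.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

# eigh returns eigenvalues in ascending order for symmetric matrices.
eigvals, eigvecs = np.linalg.eigh(L)
u = eigvecs[:, 1]   # first non-constant eigenvector
lam = eigvals[1]

# Both u and -u satisfy L u = lam u, so either may be returned.
residual_pos = np.linalg.norm(L @ u - lam * u)
residual_neg = np.linalg.norm(L @ (-u) - lam * (-u))

# Yet the two candidate coordinates differ at the nodes.
max_gap = np.max(np.abs(u - (-u)))
```

A network that receives `u` as a node feature during training and `-u` at test time sees a different input for the same surface, which is the kind of mismatch an explicit alignment step is meant to remove.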
The contributions of our work are multifold:

We present, to the best of our knowledge, the first adversarial graph domain adaptation method for surface segmentation. Our novel method trains two networks in an adversarial manner, a fully-convolutional GCN segmentator and a GCN domain discriminator, both of which operate on the spectral components of surface graphs.

Compared to existing approaches, our surface segmentation method offers greater robustness to differences in domain-specific alignments. Hence, our method offers a better generalization on target-domain datasets where surface data are aligned differently, without requiring an explicit alignment or manual annotations of these surfaces.

We demonstrate the potential of our method for alignment-invariant parcellation of brain surfaces, using data from MindBoggle, the largest publicly-available manually-labeled surface dataset. Our results show a mean Dice improvement of 8% over using the same segmentation network without adversarial training.
In the next section, we detail the fundamentals of our graph domain adaptation method for surface segmentation, followed by experiments validating the advantages of our method and a discussion of results.
2 Method
An overview of our proposed method is shown in Fig. 1. In the initial step, the cortical brain graph is embedded into the spectral domain using the graph Laplacian operator. Next, only samples from the source domain are aligned to a reference template using the Iterative Closest Point (ICP) algorithm. Finally, a graph domain adaptation network is trained to perform alignment-independent parcellation. The segmentator network learns a generic mapping from input features of surface data, for instance, the spectral coordinates and sulcal depth of cortical points, to cortical parcel labels.
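The first step of this pipeline, the spectral embedding, can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it assumes inverse-distance edge weights and a normalized Laplacian as described in Section 2.1, a dense eigensolver (practical only for small graphs), a connected graph, and a scaling of each eigenvector by the inverse square root of its eigenvalue. The function name `spectral_embedding` and the octahedron test mesh are hypothetical.

```python
import numpy as np

def spectral_embedding(coords, edges, k=3, eps=1e-6):
    """Embed a connected graph using k non-trivial Laplacian eigenvectors.

    coords: (N, 3) array of node positions.
    edges:  iterable of (i, j) index pairs, each undirected edge listed once.
    Returns the (N, k) embedding and the k eigenvalues used.
    """
    n = coords.shape[0]
    A = np.zeros((n, n))
    for i, j in edges:
        # Edge weight: inverse Euclidean distance (eps avoids division by zero).
        w = 1.0 / (np.linalg.norm(coords[i] - coords[j]) + eps)
        A[i, j] = A[j, i] = w
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Normalized Laplacian: I - D^{-1/2} A D^{-1/2}.
    L = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)
    # Skip the trivial eigenvalue 0 and keep the next k eigenpairs.
    lam = eigvals[1:k + 1]
    U = eigvecs[:, 1:k + 1]
    # Assumed normalization: scale each eigenvector by 1 / sqrt(eigenvalue).
    return U / np.sqrt(lam), lam

# Hypothetical toy mesh: the 6 vertices and 12 edges of an octahedron.
coords = np.array([[1, 0, 0], [-1, 0, 0],
                   [0, 1, 0], [0, -1, 0],
                   [0, 0, 1], [0, 0, -1]], dtype=float)
# Vertex i and vertex i ^ 1 are antipodal and therefore not connected.
edges = [(i, j) for i in range(6) for j in range(i + 1, 6) if j != (i ^ 1)]

embedding, lam = spectral_embedding(coords, edges)
```

For meshes with thousands of nodes, a sparse eigensolver would be the natural replacement for the dense `np.linalg.eigh` call used here.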
2.1 Spectral embedding of brain graphs
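We start by describing the spectral graph convolution model used in this work. Denote as $\mathcal{G} = \{\mathcal{V}, \mathcal{E}\}$ a brain surface graph with node set $\mathcal{V}$, such that $|\mathcal{V}| = N$, and edge set $\mathcal{E}$. Each node $i \in \mathcal{V}$ has a feature vector $\mathbf{x}_i \in \mathbb{R}^3$ representing its 3D coordinates. We map $\mathcal{G}$ to a low-dimension manifold using the normalized graph Laplacian operator $\mathbf{L} = \mathbf{I} - \mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2}$, where $\mathbf{A}$ is the weighted adjacency matrix and $\mathbf{D}$ the diagonal degree matrix. Here, we consider weighted edges and measure the weight between two adjacent nodes as the inverse of their Euclidean distance, i.e. $a_{ij} = (\|\mathbf{x}_i - \mathbf{x}_j\| + \epsilon)^{-1}$ where $\epsilon$ is a small positive constant. Letting $\mathbf{L} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{\top}$ be the eigendecomposition of $\mathbf{L}$, the normalized spectral coordinates of nodes are given by $\widehat{\mathbf{U}} = \boldsymbol{\Lambda}^{-1/2}\mathbf{U}$.

Denote the neighbors of node $i$ as $\mathcal{N}_i$. The convolution operation used in our spectral GCN is defined as

$$ y_{iq}^{(l)} = f\Big( \sum_{j \in \mathcal{N}_i} \sum_{p} \sum_{k} w_{qpk}^{(l)}\, y_{jp}^{(l-1)}\, \varphi\big(\hat{\mathbf{u}}_i, \hat{\mathbf{u}}_j; \Theta_k^{(l)}\big) + b_q^{(l)} \Big) \tag{1} $$

where $y_{ip}^{(l)}$ is the feature of node $i$ in the $p$-th feature map of layer $l$, $w_{qpk}^{(l)}$ is the weight in the $k$-th convolution filter between feature maps $p$ and $q$ of subsequent layers, $b_q^{(l)}$ is the bias of feature map $q$ at layer $l$, and $f$ is a nonlinear activation function. The information of the spectral embedding relating nodes $i$ and $j$, with spectral coordinates $\hat{\mathbf{u}}_i$ and $\hat{\mathbf{u}}_j$, is included via a symmetric kernel $\varphi(\hat{\mathbf{u}}_i, \hat{\mathbf{u}}_j; \Theta_k)$ parameterized by $\Theta_k$. In this work, we follow [10] and use a Gaussian kernel: $\varphi(\hat{\mathbf{u}}_i, \hat{\mathbf{u}}_j; \boldsymbol{\mu}_k, \sigma_k) = \exp\big(-\sigma_k \|(\hat{\mathbf{u}}_j - \hat{\mathbf{u}}_i) - \boldsymbol{\mu}_k\|^2\big)$.
2.2 Graph domain adaptation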
Our graph domain adaptation algorithm contains two blocks: a segmentator GCN $S$ performing cortical parcellation and a discriminator GCN $D$ which predicts whether a given parcellation comes from a source or target graph. Let $\mathcal{G}_{src}$ be the set of source graphs and $\mathcal{G}_{tgt}$ the set of unlabeled target domain graphs, with $\mathcal{G} = \mathcal{G}_{src} \cup \mathcal{G}_{tgt}$ the entire set of graphs available in training. In the first step, we optimize the segmentator GCN using the labeled source graphs $\mathcal{G}_{src}$. We feed the segmentation predictions $S(\mathcal{G})$ to the discriminator $D$, whose role is to identify the input's domain (i.e., source or target). The gradients computed from an adversarial loss on target domain graphs are backpropagated from $D$ to $S$, forcing the segmentation to be similar for both the source and target domain graphs.
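As in other adversarial approaches, we define the learning task as a minimax problem between the segmentator and discriminator networks,

$$ \min_{S}\,\max_{D}\;\; \mathcal{L}_{seg}\big(S(\mathcal{G}_{src})\big) \;-\; \lambda\,\mathcal{L}_{dis}\big(D(S(\mathcal{G}))\big) \tag{2} $$

where $\mathcal{L}_{seg}$ is the supervised segmentation loss on the labeled source graphs $\mathcal{G}_{src}$, and $\mathcal{L}_{dis}$ is the discriminator loss on both source and target graphs $\mathcal{G}$, which is optimized in an adversarial manner for the segmentator $S$ and the discriminator $D$. The parameter $\lambda$ controls the relative importance of the two losses.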
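Segmentator loss: For each input graph with $N$ nodes, the segmentator network outputs a parcellation prediction $\widehat{\mathbf{Y}} \in [0,1]^{N \times C}$ where $\hat{y}_{ic}$ is the probability that node $i$ belongs to parcel $c$. In this work, we define the supervised segmentation loss as a combination of weighted Dice loss and weighted cross-entropy (CE),

$$ \mathcal{L}_{seg} = 1 - \frac{2\sum_{c}\omega_c\sum_{i}\hat{y}_{ic}\,y_{ic} + \epsilon}{\sum_{c}\omega_c\sum_{i}(\hat{y}_{ic} + y_{ic}) + \epsilon} \;-\; \frac{1}{N}\sum_{i}\sum_{c}\omega_c\, y_{ic}\log \hat{y}_{ic} \tag{3} $$

with $y_{ic}$ being a one-hot encoding of the reference segmentation and $\epsilon$ a small constant to avoid zero-division. The weights $\omega_c$ balance the loss for parcels by increasing the importance given to smaller-sized regions. In the loss of Eq. (3), CE improves the overall accuracy of node classification while Dice helps to produce a structured output for each parcel.

Discriminator loss: Since the discriminator is a domain classifier, we define its loss as the binary cross-entropy between its domain prediction $\hat{z} = D(S(\mathcal{G}))$ and the true domain $z \in \{0, 1\}$ (one value for source, the other for target):

$$ \mathcal{L}_{dis} = -\,z\log \hat{z} \;-\; (1 - z)\log(1 - \hat{z}) \tag{4} $$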
As mentioned before, this loss is maximized while updating the segmentator's parameters and minimized when updating the discriminator. Thus, the segmentator learns to produce surface parcellations that are domain-invariant.
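The two losses and the way they are combined can be sketched in NumPy as follows. This is a hedged illustration, not the authors' code: the exact Dice and cross-entropy normalizations, the inverse-frequency parcel weights, the averaging over nodes, and the names `segmentation_loss`, `discriminator_loss`, `segmentator_objective` and `lam` are all assumptions made for the example.

```python
import numpy as np

def segmentation_loss(y_hat, y, weights, eps=1e-6):
    """Weighted Dice loss plus weighted cross-entropy (assumed form).

    y_hat:   (N, C) predicted parcel probabilities.
    y:       (N, C) one-hot reference segmentation.
    weights: (C,) per-parcel weights, larger for smaller parcels.
    """
    inter = (weights * (y_hat * y).sum(axis=0)).sum()
    denom = (weights * (y_hat + y).sum(axis=0)).sum()
    dice = 1.0 - (2.0 * inter + eps) / (denom + eps)
    # Cross-entropy averaged over nodes; eps keeps log finite.
    ce = -(weights * y * np.log(y_hat + eps)).sum(axis=1).mean()
    return float(dice + ce)

def discriminator_loss(z_hat, z, eps=1e-6):
    """Binary cross-entropy between predicted and true domain labels."""
    z_hat = np.clip(z_hat, eps, 1.0 - eps)
    return float(-(z * np.log(z_hat) + (1 - z) * np.log(1 - z_hat)).mean())

def segmentator_objective(l_seg, l_dis, lam):
    """Quantity minimized by the segmentator: it lowers the segmentation
    loss while raising the discriminator loss (lam is the trade-off)."""
    return l_seg - lam * l_dis

# Toy example: 4 nodes, 2 parcels, the second parcel is smaller.
y = np.array([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=float)
weights = 1.0 / y.sum(axis=0)           # assumed inverse-frequency weights
loss_perfect = segmentation_loss(y, y, weights)
loss_uniform = segmentation_loss(np.full_like(y, 0.5), y, weights)

# A discriminator that cannot tell the domains apart outputs 0.5.
z = np.array([1.0, 0.0])
loss_fooled = discriminator_loss(np.array([0.5, 0.5]), z)
objective = segmentator_objective(1.0, 0.5, lam=0.1)
```

In an actual training loop the discriminator would be updated to minimize `discriminator_loss`, and the segmentator would be updated on `segmentator_objective`, alternating between the two.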
2.3 Network architecture
We now define the architecture of both the segmentator and discriminator GCN.
Segmentator: The segmentator GCN is a fully-convolutional network comprised of 3 graph convolution layers with respective feature map sizes of 256, 128, and 32. At the input, each node has 4 features: 3 spectral coordinates and an additional scalar measuring sulcal depth. All layers have Gaussian kernels similar to [10]. Since the output has 32 parcels, the size of our last layer is set to 32. In the last layer, a softmax operation is applied for parcellation prediction, and the remaining layers employ Leaky ReLU as the activation function to obtain filter responses in Eq. (1).

Discriminator: As in the segmentator network, the discriminator relies on graph convolutions. It uses 2 graph convolution layers, an average pooling layer and 3 fully-connected (linear) layers for classifying the segmentation domain. The first graph convolution layer takes a segmentation prediction with 32 feature maps as input. The output sizes of the two graph convolution layers are 128 and 64, respectively. Average pooling is used to reduce the input graph to a 1D vector for the classification task. The three fully-connected layers are placed at the end of the network, with respective sizes of 32, 16 and 1. Each graph convolution layer has Gaussian kernels, as in [10]. A sigmoid activation is applied to the last linear layer to predict the input domain of the graph sample, and the remaining layers use Leaky ReLU.
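A forward pass through a segmentator with the layer sizes given above (4, then 256, 128 and 32) can be sketched in NumPy. Everything beyond those sizes is an assumption made for illustration: the Gaussian kernel form, the number of kernels `K`, the inclusion of each node in its own neighborhood, the Leaky ReLU slope, the random weights and the hypothetical ring graph standing in for a cortical mesh.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def graph_conv(feats, coords, neighbors, W, mu, sigma, b):
    """One graph convolution with K Gaussian kernels (pre-activation).

    feats:     (N, P) input features.
    coords:    (N, 3) spectral coordinates of the nodes.
    neighbors: list of N integer arrays with neighbor indices.
    W:         (K, P, Q) filter weights; mu: (K, 3); sigma: (K,); b: (Q,).
    """
    out = np.zeros((feats.shape[0], W.shape[2]))
    for i, nbrs in enumerate(neighbors):
        diff = coords[nbrs] - coords[i]                        # (n_i, 3)
        d2 = ((diff[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        phi = np.exp(-sigma[None, :] * d2)                     # (n_i, K)
        # Sum over neighbors j, kernels k and input maps p.
        out[i] = np.einsum('jk,jp,kpq->q', phi, feats[nbrs], W) + b
    return out

rng = np.random.default_rng(0)
N, K = 12, 3                        # hypothetical graph size and kernel count
sizes = [4, 256, 128, 32]           # input features, then the three layers

coords = rng.normal(size=(N, 3))    # stand-in for spectral coordinates
depth = rng.normal(size=(N, 1))     # stand-in for sulcal depth
feats = np.hstack([coords, depth])
# Ring graph; each node is also its own neighbor (an assumption).
neighbors = [np.array([(i - 1) % N, i, (i + 1) % N]) for i in range(N)]

h = feats
for layer, (p, q) in enumerate(zip(sizes[:-1], sizes[1:])):
    W = rng.normal(scale=0.05, size=(K, p, q))
    mu = rng.normal(scale=0.5, size=(K, 3))
    sigma = np.ones(K)
    z = graph_conv(h, coords, neighbors, W, mu, sigma, np.zeros(q))
    is_last = layer == len(sizes) - 2
    h = softmax(z) if is_last else leaky_relu(z)

probs = h                           # (N, 32) parcel probabilities per node
```

The discriminator could reuse `graph_conv` for its two convolution layers, followed by a mean over nodes and the three linear layers; that part is omitted to keep the sketch short.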
3 Results
We evaluate the performance of our method using MindBoggle [14], the largest manually labelled brain surface dataset. This dataset contains the cortical mesh data of 101 subjects aggregated from multiple sites. Each brain surface includes 32 manually labeled parcels. For each subject, we subsample the mesh into 25 smaller subgraphs with 10k nodes each. All experiments are carried out using these reduced graphs on an i7 desktop computer with 16GB of RAM and an Nvidia Titan X GPU. First, we assess the impact on segmentation performance of the parameter $\lambda$, which controls the relative importance of the supervised segmentation loss and the adversarial loss in Eq. (2). Second, we benchmark our domain adaptation algorithm against other learning frameworks for cortical parcellation.
3.1 Effect of $\lambda$ on segmentation
The loss function for graph adversarial training involves the hyperparameter $\lambda$, which controls the effect of the adversarial loss on training the segmentator GCN network. We measure the parcellation performance of this network on the target domain surfaces and the discriminator accuracy over epochs. Our aim is to study how the performance varies with different values of $\lambda$. The mean Dice overlap over epochs for different $\lambda$ values is reported in Fig. 2 (left). Furthermore, Fig. 2 (right) shows the classification accuracy of the discriminator for the same $\lambda$ values.

Results of this experiment identify an intermediate value of $\lambda$ as the best choice for training the adversarial GCN. When using a too small $\lambda$, we observe a drop in segmentation performance on the unseen target domain surfaces, which illustrates that stronger adversarial learning is required to align the source and target domains. The dissimilarity between the segmentations predicted for source and target graphs is also evident from the high discriminator accuracy in Fig. 2 (right), i.e. the discriminator is not fooled in this case. On the other hand, when using a too large $\lambda$, the model focuses mostly on fooling the discriminator, leading to a poor segmentation Dice overlap. Based on this analysis, we use the best-performing value of $\lambda$ for the rest of our experiments.
3.2 Comparison with the state-of-the-art
Method | Test data alignment
 | Source | None | Target 1 | Target 2 | Target 3 | Target 4
Spectral RF [15] | 81.9 ± 3.4 | 65.4 ± 9.0 | 60.0 ± 1.8 | 55.3 ± 2.1 | 60.2 ± 4.0 | 55.2 ± 3.0
SegGCN | 86.5 ± 2.8 | 71.4 ± 7.9 | 67.8 ± 2.0 | 58.8 ± 2.8 | 63.5 ± 3.2 | 60.1 ± 3.6
AdvGCN (ours) | 85.7 ± 3.5 | 73.8 ± 6.0 | 73.5 ± 2.0 | 71.8 ± 2.6 | 71.0 ± 2.8 | 71.7 ± 3.3

Table 1: Average percentage Dice overlap and standard deviation on test data. Each domain (column) is generated by aligning the eigenbases of the samples to a reference template. Column 2 is aligned to the same reference as the source domain. Column 3 (None) is unaligned and completely ambiguous. The test sets for the target domains in columns 4 to 7 are aligned to random references from the test set.
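For reference, a percentage Dice overlap of the kind reported in Table 1 can be computed from node labels as sketched below. The averaging scheme (unweighted mean over the parcels present in either labelling) and the function name `mean_dice_percent` are assumptions; the evaluation protocol may aggregate over parcels and subjects differently.

```python
import numpy as np

def mean_dice_percent(pred, ref, num_parcels):
    """Mean Dice overlap (in percent) between two node labellings.

    pred, ref: integer arrays of shape (N,) with parcel labels per node.
    Parcels absent from both labellings are skipped.
    """
    scores = []
    for c in range(num_parcels):
        p = pred == c
        r = ref == c
        denom = p.sum() + r.sum()
        if denom == 0:
            continue
        scores.append(2.0 * np.logical_and(p, r).sum() / denom)
    return 100.0 * float(np.mean(scores))

# Toy example with 6 nodes and 3 parcels; one node is mislabelled.
ref = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 0, 1, 2, 2, 2])
score = mean_dice_percent(pred, ref, num_parcels=3)
```

Here the per-parcel Dice values are 1, 2/3 and 4/5, so `score` is their mean expressed in percent, about 82.2.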
We now compare our method with other graph-based surface parcellation approaches. The average Dice overlap is measured to assess the performance of each model. In Table 1, we report the performance on an unseen test dataset. The different target domains are generated by aligning the eigenbases of test brain graphs to the same template as the source (column 2, Source), by leaving them unaligned and thus completely ambiguous (column 3, None), or by aligning them to the eigenbases of random brain graphs from the test set (columns 4 to 7, Target 1 to 4). First, we show the limitation of point-based approaches, which ignore the relationship between nodes when predicting labels. Toward this goal, we follow the spectral random forest (RF) approach in [15] and train a random forest with 50 trees using the same input as given to the GCN networks (i.e., spectral coordinates and sulcal depth). As shown in Table 1, this Spectral RF approach achieves a mean Dice overlap of 81.9 on the aligned set and only 65.4 on the unaligned set. The random forest does not consider neighborhood information for parcellation and thus obtains low performance on the unaligned brain graphs. A graph segmentation network without an additional discriminator network (SegGCN) yields an average percentage Dice overlap of 86.5 on the aligned set and 71.4 on the unaligned set. This is an improvement of 4.6 and 6.0 over Spectral RF on the aligned and unaligned domains, respectively, obtained only from the additional neighborhood information used by the GCN segmentation network. Further, our GCN network trained in an adversarial setting (AdvGCN) produces generic segmentation maps on both aligned and unaligned brain graphs, with an average percentage Dice overlap of 85.7 on the aligned set and 73.8 on the unaligned set. AdvGCN thus improves performance on the unaligned set while remaining comparable to SegGCN on the aligned set. To better understand the significance of our graph domain adaptation network, we evaluate our method against multiple aligned domains. Table 1 shows that our model achieves improved performance on the unaligned domain and on the differently aligned target domains. Figure 3 shows qualitative results for the different graph segmentation methods.
4 Conclusion
In this paper, we present a novel adversarial domain adaptation framework for brain surface graphs. The proposed algorithm leverages an adversarial training mechanism to obtain a generalized brain surface segmentation. The reported experiments illustrate the advantages of our approach for brain surface segmentation. This method overcomes the limitations of spectral GCNs [4, 6] that require finding an explicit alignment of graph eigenbases. Table 1 shows a clear improvement in performance over the latest spectral GCN [4, 6] as well as the forest-based [15] approaches. Relative to SegGCN in Table 1, our method improves the average Dice overlap for parcellation by 2.4 points on the unaligned domain (73.8 versus 71.4) and by up to 13.0 points across the different target alignments (71.8 versus 58.8 on Target 2). The performance and time complexity of our method are similar to those of SegGCN [10] on test sets from the source domain. Fig. 3 illustrates the qualitative comparison of our adversarial GCN. The potential of our adversarial graph domain adaptation technique is demonstrated on cortical parcellation, but it can also be applied to other surface segmentation problems. For example, domain adaptation could be used for semi-supervised segmentation, thereby mitigating the requirement for large amounts of labelled surfaces.
Acknowledgments
This research work was partly funded by the Fonds de Recherche du Quebec (FQRNT) and Natural Sciences and Engineering Research Council of Canada (NSERC). We gratefully acknowledge the support of NVIDIA Corporation for the donation of the Titan X Pascal GPU used for this research.
References
 [1] (2017) Single subject prediction of brain disorders in neuroimaging: promises and pitfalls. NeuroImage.
 [2] (2019) Constrained domain adaptation for segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention.
 [3] (2017) Geometric deep learning: going beyond Euclidean data. IEEE Signal Processing.
 [4] (2014) Spectral networks and locally connected networks on graphs. In ICLR.
 [5] (2018) Convolutional neural networks for mesh-based parcellation of the cerebral cortex. In MIDL.
 [6] (2016) Convolutional neural networks on graphs with fast localized spectral filtering. In NIPS.
 [7] (2015) Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning.
 [8] (2017) Transfer learning for domain adaptation in MRI: application in brain lesion segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention.
 [9] (2014) Generative adversarial nets. In Advances in Neural Information Processing Systems.
 [10] (2019) Graph convolutions on spectral embeddings for cortical surface parcellation. Medical Image Analysis.
 [11] (2016) FCNs in the wild: pixel-level adversarial and constraint-based adaptation. arXiv preprint arXiv:1612.02649.
 [12] (2018) Domain adaptation for biomedical image segmentation using adversarial training. In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018).
 [13] (2017) Unsupervised domain adaptation in brain lesion segmentation with adversarial networks. In International Conference on Information Processing in Medical Imaging.
 [14] (2017) Mindboggling morphometry of human brains. PLOS Computational Biology.
 [15] (2015) Spectral forests: learning of surface data, application to cortical parcellation. In MICCAI.
 [16] (2015) Learning transferable features with deep adaptation networks. In International Conference on Machine Learning.
 [17] (2017) Geometric deep learning on graphs using mixture model CNNs. In CVPR.
 [18] (2019) Embracing imperfect datasets: a review of deep learning solutions for medical image segmentation. Medical Image Analysis.
 [19] (2018) Learning to adapt structured output space for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
 [20] (2019) ADVENT: adversarial entropy minimization for domain adaptation in semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
 [21] (2017) Curriculum domain adaptation for semantic segmentation of urban scenes. In Proceedings of the IEEE International Conference on Computer Vision.
 [22] (2018) Task driven generative modeling for unsupervised domain adaptation: application to X-ray image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention.
 [23] (2018) Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In Proceedings of the European Conference on Computer Vision (ECCV).