1 Introduction
As a fundamental task in medical image analysis, deformable image registration aims to establish dense, nonlinear spatial correspondences between a pair of images (denoted as the source/moving image and the target/fixed image) [1]. For example, while it is difficult to compare brain magnetic resonance imaging (MRI) scans of different subjects due to significant anatomical variability [2, 3], deformable registration enables direct comparison of anatomical structures across scans, and thus, is crucial for understanding variability across populations and the longitudinal evolution of brain anatomy for individuals with brain diseases [4, 5, 6].
Over the past few decades, a variety of non-learning-based deformable registration methods have been proposed for medical image analysis [7, 6, 8, 9]. Typically, these methods iteratively optimize a similarity function for each pair of images to nonlinearly align voxels with similar appearance, while encouraging local smoothness of the registration mapping [1]. Since the similarity function needs to be optimized from scratch for each pair of unseen images, i.e., the inherent registration patterns shared across different images are ignored, these methods are usually slow in practical applications. To address these issues, supervised learning-based and deep-learning-based approaches have been developed for deformable image registration
[10, 11, 12]. These methods typically rely on task-specific ground-truth registration to train a regression model for image registration. However, the difficulty of collecting ground-truth information often limits their utility in real-world applications. In addition, the mapping learned by these supervised methods might be biased by the selected ground-truth registration.

Recently, several unsupervised deep-learning methods have been proposed for medical image registration [13, 14, 15, 16, 17] without using any predefined supervision information for network training, thus maintaining the unsupervised nature of deformable registration. Although these methods achieve better registration performance than traditional supervised-learning-based methods, the output transformations (e.g., displacement field or flow) are usually asymmetric, i.e., the inherent inverse-consistent property of transformations between a pair of images is ignored. Here, the inverse-consistent property means that the to-be-learned optimal transformations should encourage a pair of images to be symmetrically deformed toward each other, such that the two bidirectionally deformed images are finally matched. Unfortunately, previous studies usually estimate the transformation from an image $A$ to an image $B$ and that from $B$ to $A$ independently, thus failing to ensure that these transformations are inverse mappings of one another. Besides, most of the existing (supervised or unsupervised) algorithms utilize solely a spatial smoothness penalty to constrain the transformation, which cannot completely avoid foldings (typically indicating errors) in the registration mapping. As an illustration, Fig. 1 shows two flows generated by a state-of-the-art deep-learning method [17] trained with different (i.e., strong vs. weak) contributions from the smoothness constraint. As shown in Fig. 1 (a), if we excessively encourage local smoothness of the to-be-estimated flow by using a large weight for the smoothness constraint, the obtained registration will be inaccurate due to global errors. Otherwise, there will be many foldings in the learned flow (see Fig. 1 (b)), thus generating wrong registration due to local defects. That is, it is challenging to properly tune the contribution of the smoothness constraint to simultaneously avoid foldings in the estimated flow and maintain high registration accuracy.
To address these two issues, in this paper we propose an Inverse-Consistent deep neural Network (ICNet) for unsupervised deformable image registration. Specifically, in ICNet, we develop an inverse-consistent constraint to encourage that a pair of images are symmetrically deformed toward each other in multiple passes, until the bidirectionally deformed images are matched to achieve correct registration. Besides using the conventional smoothness constraint, we also develop an anti-folding constraint to avoid foldings in the to-be-estimated flow. The proposed ICNet method does not require any supervision information, and it encourages the diffeomorphic property of the transformations via the proposed inverse-consistent and anti-folding constraints. We evaluate our method on both tasks of tissue segmentation and anatomical landmark detection with 3D T1-weighted brain MRI scans. The experimental results demonstrate the superior performance of the proposed method over several state-of-the-art methods in deformable image registration.
The rest of this paper is organized as follows. We briefly introduce relevant studies in Section 2. In Section 3, we present the proposed network, the inverse-consistent constraint, and the anti-folding constraint in detail. We then describe the studied materials, competing methods, experimental settings, and results in Section 4. We further analyze the influence of several essential strategies used in the proposed method in Section 5. We finally conclude this paper in Section 6.
2 Related Work
2.1 Deformable Image Registration
Deformable image registration refers to a nonlinear process of revealing the voxel-wise spatial correspondence between source and target images. Let $A$ denote the source image and $B$ the target image. We assume $F$ is the to-be-learned flow (i.e., displacement field) that warps $A$ to $B$. The optimization problem is typically defined as
$E(F) = \mathcal{M}(A \circ F,\ B) + \mathcal{R}(F)$  (1)
where $\circ$ denotes the transformation operator that warps the source image to the target image using the flow $F$. The first term $\mathcal{M}(A \circ F,\ B)$ in Eq. 1 is a similarity/matching/distance criterion, which quantifies the level of alignment between the warped source image $A \circ F$ and the target image $B$. The second term $\mathcal{R}(F)$ is a regularizer that constrains the transformation to favor a specific property, such as encouraging the estimated flow to be locally smooth. The optimization problem consists of either maximizing or minimizing the objective function, depending on how the first term is defined.
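As a rough illustration of Eq. 1 (our own sketch, not the paper's code), the objective can be written as an MSD matching term plus a pluggable, weighted regularizer; the function names and the weight `lam` are placeholders we introduce here:

```python
import numpy as np

def msd(warped_source, target):
    """Mean squared distance, one common choice for the matching term."""
    return np.mean((warped_source - target) ** 2)

def registration_objective(warped_source, target, flow, regularizer, lam=0.01):
    """Generic form of Eq. 1: matching term plus a weighted regularizer
    on the flow (e.g., a smoothness penalty)."""
    return msd(warped_source, target) + lam * regularizer(flow)
```

An iterative (non-learning) method would repeatedly update `flow`, re-warp the source, and re-evaluate this objective until convergence.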
Different algorithms for deformable registration mainly differ in deformation models, similarity criteria, and numerical optimization [18, 19, 20, 21, 22, 23, 1]. In the literature, many types of similarity metrics have been proposed for image registration, such as mean squared distance (MSD) [24, 25], sum-of-squares distance (SSD) [26, 27, 28], normalized cross-correlation (CC) [29, 30, 6, 11], and normalized mutual information (MI) [31, 32, 33]. Besides, various regularization terms have been developed to penalize undesired deformations, such as topology preservation [34, 35, 36, 37], volume preservation [38, 39, 40, 41, 42, 43], and rigidity constraints [44, 18, 45]. In general, existing registration algorithms can be roughly divided into two categories: 1) non-learning-based methods, and 2) learning-based methods. We now introduce relevant studies in these two categories.
2.2 Non-learning-based Registration Methods
Non-learning-based registration algorithms typically optimize a transformation iteratively based on an energy function in the form of Eq. 1. Based on how the similarity between the warped source image and the target image is computed, there are two types of registration approaches: 1) volume-based methods, where the voxel intensities in the whole volume are used to drive the registration process, and 2) landmark-based methods, where features extracted at anatomical landmarks are employed to guide the matching of local correspondences during registration.
The most popular non-learning-based methods for deformable registration include automatic image registration (AIR) [24], the automatic registration toolbox (ART) [29], HAMMER [6], Demons [26], diffeomorphic Demons [27, 46], statistical parametric mapping (SPM) [25], deformable registration via attribute matching and mutual-saliency weighting (DRAMMS) [47], DROP [48], CC/MI/SSD-FFD [8], FNIRT [28], and symmetric normalization (SyN) [5]. Most of these methods require iterative optimization algorithms for parameter tuning [1]. Also, the registration performance of these methods degrades when the source and target images have large variations in anatomical appearance. Therefore, robust and tuning-free deformable registration methods are highly desired for dealing with different data and registration tasks.
2.3 Learningbased Registration Methods
Many learning-based methods have been developed for deformable image registration [49], such as those based on random forests [50], support vector regression [51], sparse representation [52], and deep neural networks [16, 17, 12]. In these methods, deformable registration is often formulated as a learning problem to estimate the registration parameters. Compared with non-learning methods, learning-based approaches can predict the transformation efficiently for unseen testing images based on pre-trained models. According to whether supervision information is needed, existing learning-based methods for image registration can be further categorized into two types: 1) supervised learning methods, and 2) unsupervised learning methods.
2.3.1 Supervised Methods
In supervised methods for deformable image registration, task-specific supervision information (e.g., ground-truth registration) is usually required for model training. For instance, random forests have been applied to infant brain MRI registration and multi-modal image registration [50, 53, 54], based on hand-engineered imaging features and predefined ground-truth registration. However, the registration performance of such traditional learning-based methods can be suboptimal, since the process of feature extraction is independent of the model training.
Several deep-learning-based methods have been recently developed by incorporating feature extraction and registration model learning into a unified framework. Cao et al. [12] proposed a convolutional neural network (CNN) based regression model to directly learn the mapping from an input image pair (i.e., target and source images) to the corresponding deformation field. Yang et al. [55] developed a fully convolutional network (FCN) to predict 3D deformable registration, followed by a correction network to further refine the predicted transformation. Rohé et al. [56] proposed an FCN model to learn the stationary velocity field, which consists of a contracting path to extract task-relevant features and a symmetric expanding path to output the transformation parameters. Sokooti et al. [57] proposed to predict displacement vectors with CNN models. Krebs et al. [58] adopted a deep reinforcement learning framework to estimate deformation fields. These methods usually require task-specific ground-truth registration for model learning. Since ground-truth information is difficult to collect, supervised methods usually have limited utility in practice. Also, the performance of supervised methods is largely determined by the quality of the predefined ground-truth registration.

2.3.2 Unsupervised Methods
By maintaining the unsupervised nature of deformable registration, unsupervised deep-learning methods have also been applied to medical image registration [13, 14, 15, 16, 17]. That is, these methods do not rely on any predefined supervision information for network training. For example, Miao et al. [14] developed a CNN model to learn transformation parameters for 2D/3D images, where the imaging features for parameter regression need to be predefined, which means the network cannot be trained in an end-to-end manner. Wu et al. [13] proposed an unsupervised deep-learning algorithm for image registration based on image patches. Although this method can automatically extract features from images, it requires additional post-processing that cannot be handled inside CNNs. Shan et al. [15] developed an unsupervised end-to-end deep-learning model for 2D tissue registration, directly predicting the deformation field via a CNN. In [16], an end-to-end unsupervised deep-learning model, consisting of a CNN-based regressor, a spatial transformer, and a resampler, was developed for deformable registration. Then, Guha et al. [17] proposed an unsupervised CNN model, in which a spatial transformer network (STN) is also used to reconstruct one image from another while imposing smoothness constraints on the registration field. This method has achieved superior accuracy for 3D image registration in comparison to previous methods.
It is worth noting that most of the existing deep-learning methods ignore the inherent inverse-consistent property of transformations between a pair of images [15, 16, 17]. That is, by independently estimating the transformation from an image $A$ to an image $B$ and that from $B$ to $A$, these methods are unable to ensure that these transformations are inverse mappings of one another. Note that several studies tackle this shortcoming by jointly estimating the transformations from $A$ to $B$ and from $B$ to $A$, under the consistency constraint that these transformations are inverses of one another [34, 59, 60]. However, these methods are not learning-based and require human-engineered feature representations (e.g., image intensity with Fourier series parameterization) of the input images, which may not be extracted in a task-oriented manner. Motivated by these studies, we develop an unsupervised deep network with an inverse-consistent constraint to encourage the inverse-consistent property of transformations between a pair of input images. Besides, considering that using the smoothness constraint alone (as previous studies did) is not sufficient to guarantee that there is no folding in the estimated transformations, we further develop an anti-folding constraint to avoid foldings in the learned transformations.
3 Proposed Method
In this section, we first introduce the notations used in this paper. Then, we describe the proposed inverse-consistent deep neural network, as well as the objective function (with both the proposed inverse-consistent and anti-folding constraints) for network training. We finally introduce the implementation details of the proposed method.
3.1 Notations
In image registration, a pair of images is usually referred to as the source image and the target image. Since we do not rely on particular target images in our method, we denote one input image as $A$ and the other as $B$ in this paper. These two images are defined in the image domain $\Omega$. The transformation is a mapping of the image domain to itself, which spatially deforms point locations to other locations. Also, we assume that the image $A$ is deformed to match the image $B$ according to a dense flow (i.e., discrete displacement field) $F_{AB}$, while the image $B$ is deformed to match the image $A$ via another dense flow $F_{BA}$. Note that each element in a flow is a 3-dimensional vector (corresponding to the three axes, i.e., $x$, $y$, and $z$), indicating the displacement of a particular voxel from its original location to a new location. In addition, the deformed/warped images of $A$ and $B$ are denoted as $\widetilde{A}$ and $\widetilde{B}$, respectively.
3.2 Inverse-consistent Unsupervised Neural Network
Figure 2 illustrates the proposed unsupervised deep network for deformable image registration. As shown in Fig. 2 (a), we employ a fully convolutional network (FCN) to model two dense, nonlinear transformations (i.e., $F_{AB}$ and $F_{BA}$) from a pair of input images (i.e., $A$ and $B$) to their warped images (i.e., $\widetilde{A}$ and $\widetilde{B}$). There are two FCN modules in our proposed network. The first one is used to align the image $A$ to $B$ (as the target image) using the flow $F_{AB}$, generating the warped image $\widetilde{A}$. In contrast, the second FCN is designed to model the registration mapping from the image $B$ to $A$ (as the target image) via the flow $F_{BA}$, yielding the warped image $\widetilde{B}$. Since both input images are treated equally in the proposed ICNet framework, these two FCNs share the same network structure and parameters.

As shown in Fig. 2 (b), the FCN we use here follows a U-Net architecture [61] to capture and combine both global and local structural information of the input images. Specifically, the input data contain two channels (with each channel corresponding to a particular input image), and $n$ is the number of starting filter channels of the FCN. The FCN contains a contracting path for downsampling and an expanding path for upsampling. Each step in the contracting path contains a convolution, followed by a strided convolution for downsampling. Each step in the expanding path consists of a deconvolution for upsampling, followed by a concatenation process to combine the upsampled feature maps with the corresponding feature maps from the contracting path, and then a convolution. In this network, each convolution is followed by a rectified linear unit (ReLU) activation, while the output of the last layer (having three filter channels corresponding to the $x$, $y$, and $z$ axes) is constrained to a bounded range. That is, we first normalize the output of the last layer, and then multiply it by a constant (i.e., the maximum displacement magnitude).

Besides, we can see from Fig. 2 (a) that a grid sampling module is utilized to generate the warped image (e.g., $\widetilde{A}$), based on the input image (e.g., $A$) and the learned flow (e.g., $F_{AB}$). Specifically, such grid sampling is implemented via the fully differentiable spatial transformer network (STN) [62]
, containing a regular spatial grid generator and a sampler. The flow (displacement field) predicted by our image registration network is used to transform the regular spatial grid into a sampling grid. Then, the sampler uses the sampling grid to warp the input image. Bilinear interpolation is used during the sampling process, making STN fully differentiable for back propagation.
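The sampling step above can be illustrated with a minimal 2D NumPy sketch (our own simplification of the 3D STN used in the paper: the regular grid is shifted by the displacement field, and the source is sampled with bilinear interpolation; borders are handled here by simple clipping):

```python
import numpy as np

def warp_bilinear_2d(image, flow):
    """Backward-warp `image` with a displacement field `flow` ((H, W, 2),
    in voxels): sample the source at grid-plus-displacement locations
    using bilinear interpolation, as an STN sampler does."""
    H, W = image.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # sampling grid = regular grid + displacement, clipped to the image
    sy = np.clip(ys + flow[..., 0], 0, H - 1)
    sx = np.clip(xs + flow[..., 1], 0, W - 1)
    y0, x0 = np.floor(sy).astype(int), np.floor(sx).astype(int)
    y1, x1 = np.minimum(y0 + 1, H - 1), np.minimum(x0 + 1, W - 1)
    wy, wx = sy - y0, sx - x0
    # weighted combination of the four neighboring voxels
    return ((1 - wy) * (1 - wx) * image[y0, x0]
            + (1 - wy) * wx * image[y0, x1]
            + wy * (1 - wx) * image[y1, x0]
            + wy * wx * image[y1, x1])
```

Because the bilinear weights are (piecewise) differentiable in the flow, gradients can propagate back through this sampler during training.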
Furthermore, an inverse network is developed to generate an estimated inverse flow (e.g., $\widetilde{F}_{BA}$) of each transformation learned by the FCN (e.g., $F_{AB}$), based on which an inverse-consistent loss is further adopted to encourage the inverse-consistent property of the two transformations. As shown in Fig. 2 (c), we utilize the grid sampling strategy to generate the estimated inverse flow $\widetilde{F}_{BA}$, based on both $-F_{AB}$ and $F_{BA}$.
3.3 Proposed Objective Function
3.3.1 Inverse-consistent Constraint
Existing deep-learning methods typically ignore the inverse-consistent property of transformations between a pair of images [15, 16, 17]. Motivated by previous non-learning-based inverse-consistent methods [34, 59, 60], we propose to simultaneously estimate the transformation from $A$ to $B$ (i.e., $F_{AB}$) and the transformation from $B$ to $A$ (i.e., $F_{BA}$), and to enforce the consistency constraint that these bidirectional transformations are inverse mappings of one another.
Specifically, we propose an inverse-consistent regularization term to penalize the difference of each transformation from the inverse of its counterpart. As shown in Fig. 2 (c), we rely on an inverse network to generate the estimated inverse mapping (e.g., $\widetilde{F}_{BA}$) of each transformation (e.g., $F_{AB}$). Specifically, for the flow $F_{AB}$, we first obtain its negative flow (i.e., $-F_{AB}$). We then feed both $-F_{AB}$ and $F_{BA}$ to the grid sampling module (via an STN) to obtain the estimated inverse flow (i.e., $\widetilde{F}_{BA}$) of $F_{AB}$. Similarly, we feed both $-F_{BA}$ and $F_{AB}$ to the grid sampling module, and hence obtain the estimated inverse flow (i.e., $\widetilde{F}_{AB}$) of $F_{BA}$. Mathematically, the proposed inverse-consistent constraint is defined as follows
$\mathcal{L}_{inv} = \big\lVert F_{AB} - \widetilde{F}_{AB} \big\rVert_F^2 + \big\lVert F_{BA} - \widetilde{F}_{BA} \big\rVert_F^2$  (2)

with

$\widetilde{F}_{AB} = \mathcal{T}\big({-F_{BA}},\ F_{AB}\big), \quad \widetilde{F}_{BA} = \mathcal{T}\big({-F_{AB}},\ F_{BA}\big)$  (3)
where $\mathcal{T}(\cdot,\cdot)$ is the mapping generated by the grid sampling module (via an STN), and $\lVert \cdot \rVert_F$ represents the Frobenius norm of a matrix. The two terms in Eq. 2 correspond to the two inverse-consistent losses in Fig. 2 (a). By minimizing Eq. 2, we concurrently encourage both the difference between the flows $F_{AB}$ and $\widetilde{F}_{AB}$ (i.e., the estimated inverse of $F_{BA}$) and that between $F_{BA}$ and $\widetilde{F}_{BA}$ (i.e., the estimated inverse of $F_{AB}$) to be small. In this way, the inverse-consistent property of the to-be-estimated transformations can be explicitly modeled in the proposed network.
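To make the composition in Eq. 3 concrete, here is a minimal 2D sketch (our own illustration, not the paper's code): each flow's negation is resampled by its counterpart, and the squared-norm differences of Eq. 2 are summed. Nearest-neighbor sampling is used for brevity, whereas the paper's STN uses bilinear sampling:

```python
import numpy as np

def warp_flow_nn(flow_to_warp, sampling_flow):
    """Resample each channel of `flow_to_warp` at locations displaced by
    `sampling_flow` (nearest-neighbor for brevity). Fields are (H, W, 2)."""
    H, W = sampling_flow.shape[:2]
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    sy = np.clip(np.rint(ys + sampling_flow[..., 0]).astype(int), 0, H - 1)
    sx = np.clip(np.rint(xs + sampling_flow[..., 1]).astype(int), 0, W - 1)
    return flow_to_warp[sy, sx]

def inverse_consistent_loss(F_ab, F_ba):
    """Eq. 2: penalize the difference between each flow and the estimated
    inverse of its counterpart (Eq. 3)."""
    F_ab_tilde = warp_flow_nn(-F_ba, F_ab)  # estimated inverse of F_ba
    F_ba_tilde = warp_flow_nn(-F_ab, F_ba)  # estimated inverse of F_ab
    return np.sum((F_ab - F_ab_tilde) ** 2) + np.sum((F_ba - F_ba_tilde) ** 2)
```

For a pair of constant flows that are exact negatives of each other, this loss is zero, matching the intuition that the two transformations should invert one another.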
3.3.2 Anti-folding Constraint
As mentioned in Section 1 (e.g., Fig. 1), if we excessively encourage local smoothness of the to-be-estimated flow by using a large weight for the smoothness constraint, the registration results will be inaccurate. Otherwise, there will be possible foldings in the flow, thus yielding unreasonable registration. To deal with this issue, besides using the conventional smoothness constraint, we also develop an anti-folding constraint as
$\mathcal{L}_{ant} = \sum_{F \in \{F_{AB},\, F_{BA}\}} \sum_{p \in \Omega} \sum_{i \in \{x,\, y,\, z\}} \delta\big(\nabla_i F(p) + 1\big)\, \big|\nabla_i F(p)\big|^2$  (4)
where $\nabla_i F(p)$ is the gradient of the flow $F$ along the $i$-th ($i \in \{x, y, z\}$) axis at the location of the voxel $p$. Besides, the term $\delta\big(\nabla_i F(p) + 1\big)$ is an index function used to penalize the gradient of the flow at the locations with foldings. That is, $\delta(m) = 1$ if $m \le 0$, and $\delta(m) = 0$ otherwise.
The purpose of Eq. 4 can be explained as follows. If there is a folding at the location $p$ along the $i$-th axis (i.e., $\nabla_i F(p) + 1 \le 0$), we enforce the penalty on the gradient at this location, requiring this gradient to be small. In contrast, if $\nabla_i F(p) + 1 > 0$ (i.e., no folding at the location $p$ along the $i$-th axis), we do not penalize the gradient at this location. A more detailed explanation can be found in the Appendix.
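Under a finite-difference approximation of the gradient, the anti-folding penalty for a single flow can be sketched as follows (an illustrative NumPy version with names of our own choosing; the full Eq. 4 sums this over both flows):

```python
import numpy as np

def anti_folding_penalty(flow):
    """Penalize flow gradients only where folding occurs, i.e., where
    1 + dF_i/dx_i <= 0 along axis i. flow: (D, H, W, 3) displacement field."""
    penalty = 0.0
    for i in range(3):
        # finite-difference gradient of channel i along spatial axis i
        grad = np.diff(flow[..., i], axis=i)
        folded = (grad + 1.0) <= 0.0          # index function delta(.)
        penalty += np.sum(folded * grad ** 2)
    return penalty
```

A smooth (e.g., zero) flow contributes nothing, while a flow whose displacement decreases faster than one voxel per voxel is penalized exactly at the folded locations.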
3.3.3 Smoothness Constraint
In previous studies, the to-be-estimated deformation field is generally encouraged to be locally smooth via a smoothness constraint on its spatial gradients [16, 17]. Here, we also use such a smoothness constraint in the objective function:
$\mathcal{L}_{smo} = \sum_{p \in \Omega} \Big( \big\lVert \nabla F_{AB}(p) \big\rVert_2^2 + \big\lVert \nabla F_{BA}(p) \big\rVert_2^2 \Big)$  (5)
where $\nabla F_{AB}(p)$ is the gradient of the flow $F_{AB}$ at the voxel $p$, while $\nabla F_{BA}(p)$ denotes the gradient of the flow $F_{BA}$ at the voxel $p$. The operation $\lVert \cdot \rVert_2$ represents the $\ell_2$ norm of a vector. Here, we approximate the spatial gradients using the differences between neighboring voxels.
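With the neighboring-voxel difference approximation, the per-flow term of Eq. 5 can be sketched as follows (illustrative only; Eq. 5 applies this helper to both flows and sums the results):

```python
import numpy as np

def smoothness_loss(flow):
    """Sum of squared spatial gradients of one flow, with gradients
    approximated by differences between neighboring voxels.
    flow: (D, H, W, 3) displacement field."""
    loss = 0.0
    for axis in range(3):                   # the three spatial axes
        diff = np.diff(flow, axis=axis)     # neighboring-voxel differences
        loss += np.sum(diff ** 2)
    return loss
```

A constant flow incurs zero loss; a linear ramp incurs a loss proportional to its slope squared, which is what drives the field toward local smoothness.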
3.3.4 Objective Function
In this work, we utilize the mean squared distance (MSD) as the similarity metric to compare the alignment between each warped image and its corresponding target image. Specifically, the MSD-based symmetric similarity is employed to measure the shape differences between the deformed image $\widetilde{A}$ and the image $B$, as well as those between the deformed image $\widetilde{B}$ and the image $A$, which is defined as
$\mathcal{L}_{sim} = \big\lVert \widetilde{A} - B \big\rVert_2^2 + \big\lVert \widetilde{B} - A \big\rVert_2^2$  (6)
where $\widetilde{A} = \mathcal{T}(A,\ F_{AB})$ and $\widetilde{B} = \mathcal{T}(B,\ F_{BA})$, and $\mathcal{T}(\cdot,\cdot)$ is the mapping function learned in the grid sampling module.
Accordingly, the loss function of our proposed ICNet for deformable registration is formulated as follows
$\mathcal{L} = \mathcal{L}_{sim} + \alpha\, \mathcal{L}_{smo} + \beta\, \mathcal{L}_{inv} + \gamma\, \mathcal{L}_{ant}$  (7)
where the parameters $\alpha$, $\beta$, and $\gamma$ are used to balance the contributions of the smoothness, inverse-consistent, and anti-folding regularizers, respectively.
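Putting the four terms together, Eq. 7 is simply a weighted sum; in the sketch below the weights are placeholders of our own (the paper tunes them on a validation set, as described in the experiments):

```python
def icnet_loss(L_sim, L_smo, L_inv, L_ant, alpha, beta, gamma):
    """Eq. 7: total training loss as a weighted sum of the similarity,
    smoothness, inverse-consistent, and anti-folding terms."""
    return L_sim + alpha * L_smo + beta * L_inv + gamma * L_ant
```

Setting `beta = 0` or `gamma = 0` recovers the ICNet1 and ICNet2 ablations discussed later.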
3.4 Implementation
We implement the proposed deep network with PyTorch [63]. The objective function in Eq. 7 is optimized by the Adam algorithm [64], combined with backpropagation for computing gradients and updating network parameters. The learning rate for Adam is set empirically. In Fig. 3, we report the change curves of both the training (red) and validation (green) losses on the public Alzheimer's Disease Neuroimaging Initiative (ADNI1) database [65], where a subset of subjects is randomly selected as the validation data and the remaining subjects are used as the training data. This figure indicates that the proposed ICNet method generalizes well with almost no overfitting, and the proposed objective function converges quickly within a limited number of iterations. For readers' convenience, our code and trained model will be made publicly available online.

4 Experiments
In this section, we first introduce the studied materials, competing methods, and experimental settings. Then, we present results of brain tissue segmentation and anatomical landmark detection based on the warped MR images achieved by different registration methods. We finally analyze the computational costs of different methods.
4.1 Materials and Image Preprocessing
We perform experiments on subjects from two subsets of the public ADNI database (http://adni.loni.usc.edu) [65], i.e., 1) ADNI1 and 2) ADNI2. Specifically, subjects with baseline structural brain MRI scans are taken from ADNI1, while the remaining subjects with baseline structural MRI data are randomly selected from ADNI2. Since several subjects participated in both ADNI1 and ADNI2, we simply remove these subjects from ADNI2 to ensure that ADNI1 and ADNI2 are independent datasets in the experiments. Notably, the studied subjects from ADNI1 and ADNI2 have T1-weighted MRI scans acquired at different field strengths.
For all structural brain MR images, we perform both spatial normalization and intensity normalization. For spatial normalization, we first perform skull stripping and cerebellum removal for all brain MRIs, and then linearly align them to the common Colin27 [66] template. We further resample all linearly aligned images to the same spatial resolution, followed by cropping them to the same image size. For intensity normalization, we first match the intensity histogram of each brain MRI to that of the Colin27 template by using a histogram matching algorithm. Then, we perform z-score normalization so that the mean intensity of each image is zero and the standard deviation is one.
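The z-score step can be sketched as follows (a minimal illustration with our own function name; the preceding histogram-matching step is omitted for brevity):

```python
import numpy as np

def zscore_normalize(volume):
    """Z-score intensity normalization: shift and scale a volume so its
    mean intensity is zero and its standard deviation is one."""
    v = np.asarray(volume, dtype=np.float64)
    return (v - v.mean()) / v.std()
```

This puts all scans on a comparable intensity scale before they are fed to the registration network.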
In the experiments, we perform two tasks to evaluate the registration performance of different methods, including 1) brain tissue segmentation and 2) anatomical landmark detection. In the task of brain tissue segmentation, the ground-truth segmentation is generated by first using FAST in FSL [67] to obtain the tissue segmentation map, followed by further manual correction. As illustrated in Fig. 4 (b), three tissues are segmented from each brain MR image, including cerebrospinal fluid (CSF), gray matter (GM), and white matter (WM). In the task of anatomical landmark detection, the ground-truth landmarks are manually annotated by an experienced radiologist. As shown in Fig. 4 (c), five anatomical landmarks are annotated in each brain MR image, which are mainly located in the ventricles.
4.2 Competing Methods
We compare the proposed ICNet method with three state-of-the-art methods for deformable image registration, including 1) Demons with the symmetric local correlation coefficient used as the similarity metric [46], 2) symmetric normalization (denoted as SyN) [5], and 3) an unsupervised deep-learning (denoted as DL) method with a minimum mean squared error (MMSE) loss and a smoothness constraint [17]. Among them, Demons and SyN are non-learning-based methods, while DL is an unsupervised learning method. For a fair comparison, the network architecture of the DL method is similar to that of our ICNet, while its objective function is different from ours. Specifically, only the first two terms in Eq. 7 are included in the objective function of DL, and hence DL can be regarded as a degenerate variant of ICNet.
[Table I. Brain tissue segmentation results of the six methods (Demons, SyN, DL, ICNet1, ICNet2, and ICNet) for CSF, GM, and WM, in terms of DSC (%), SEN (%), PPV (%), ASD, and HD.]
To evaluate the effectiveness of the proposed two constraints (i.e., the inverse-consistent constraint and the anti-folding constraint), we further compare ICNet with two of its variants. The first variant is denoted as ICNet1, in which the proposed inverse-consistent constraint is removed. Similarly, the second variant is denoted as ICNet2, in which the proposed anti-folding constraint is not used.
4.3 Experimental Settings
Since the six methods under comparison are all unsupervised, we do not need to generate any ground-truth registration for them. For the non-learning-based methods (i.e., Demons and SyN), we utilize their recommended parameter settings in the experiments. For the learning-based methods (i.e., DL, ICNet1, ICNet2, and ICNet), we treat ADNI1 and ADNI2 as the training set and the testing set, respectively. We randomly select subjects from ADNI1 as the validation data, while the remaining subjects are used as the training data.
In the training stage, we randomly select pairs of MR images from ADNI1 as the input for each learning-based method. In the testing stage, to perform deformable registration with all competing methods, we first select MR images from ADNI2 as atlas images, while the remaining MR images are used as testing images. Using the different deformable registration algorithms, we first warp each of the atlases to a particular testing image, thus generating warped atlas images for this testing image. Then, we employ a multi-atlas segmentation algorithm with a majority voting strategy [68] to perform brain tissue segmentation in each testing image. Similarly, for landmark detection, each landmark in the atlases is first mapped onto a particular testing image via the corresponding deformable transformation [69]. Thus, given a testing image, we obtain multiple warped positions for each landmark, and average these positions to generate the final location of the landmark. It is worth noting that, to evaluate the performance of the proposed ICNet for deformable image registration, we only utilize atlas-based methods for tissue segmentation and landmark detection, while supervised learning methods for these tasks are beyond the scope of this paper.
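The majority-voting label fusion can be sketched as follows (our own minimal version; [68] describes the actual multi-atlas segmentation algorithm):

```python
import numpy as np

def majority_vote(warped_label_maps):
    """Fuse warped atlas label maps into one segmentation: each voxel
    takes the most frequent label across the atlases."""
    stacked = np.stack(warped_label_maps)              # (n_atlases, ...)
    n_labels = int(stacked.max()) + 1
    # count, per voxel, how many atlases vote for each label
    votes = [(stacked == lab).sum(axis=0) for lab in range(n_labels)]
    return np.argmax(np.stack(votes), axis=0)
```

The quality of the fused segmentation thus directly reflects how accurately each registration method warps the atlases onto the testing image.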
In ICNet, the parameter $\gamma$ is empirically set to a large value to avoid foldings in the flow as much as possible, and the other two parameters (i.e., $\alpha$ and $\beta$) are determined via grid search on the validation set. Similarly, the corresponding parameters in the three competing deep-learning methods (i.e., DL, ICNet1, and ICNet2) are selected from the same range through cross-validation. Besides, we set $\beta = 0$ in ICNet1 and $\gamma = 0$ in ICNet2. The number of starting filter channels (see Fig. 2 (b)) for ICNet and its two variants (i.e., ICNet1 and ICNet2) is set empirically. In each of the four deep-learning methods (i.e., DL, ICNet1, ICNet2, and ICNet), the output of the last layer (with three filter channels corresponding to the $x$, $y$, and $z$ axes) is constrained into a bounded range. Following [70], we empirically set the maximum displacement magnitude, considering that displacements are usually small in the deformable registration of brain MRI scans.
[Table II. Landmark detection errors of the six methods (Demons, SyN, DL, ICNet1, ICNet2, and ICNet) for the five anatomical landmarks, together with the average error.]
In the experiments on brain tissue segmentation, five complementary metrics are used for quantitative evaluation of segmentation performance, including 1) the Dice similarity coefficient (DSC), 2) sensitivity (SEN), 3) positive predictive value (PPV), 4) average symmetric surface distance (ASD), and 5) Hausdorff distance (HD). In the experiments on anatomical landmark detection, for each landmark, we report the landmark detection error by computing the Euclidean distance between the estimated landmark location (achieved by a specific method) and its ground-truth location. For the evaluation metrics DSC, SEN, and PPV, higher values indicate better performance. For the remaining three metrics (i.e., ASD, HD, and detection error), lower values denote better performance.

4.4 Results of Brain Tissue Segmentation
In the first group of experiments, we perform the segmentation of three types of brain tissues (i.e., CSF, GM and WM), based on the warped atlas images generated by six different methods. The experimental results are shown in Table I.
From Table I, one can make the following observations. First, in most cases, the proposed methods (i.e., ICNet1, ICNet2, and ICNet) achieve the overall best performance (regarding DSC, SEN, PPV, ASD, and HD) for segmenting all three types of tissues. For instance, the DSC value achieved by ICNet for CSF segmentation is higher than that produced by the conventional deep-learning method (i.e., DL). Second, even though no supervision information is required, the four unsupervised deep-learning-based methods (i.e., DL, ICNet1, ICNet2, and ICNet) usually outperform the two non-learning-based methods (i.e., Demons and SyN). The underlying reason may be that deep-learning-based methods can extract task-oriented features via neural networks, while conventional methods simply employ hand-engineered features of brain MRIs. Besides, we can see that our ICNet method usually outperforms its two variants (i.e., ICNet1 and ICNet2). Note that ICNet1 does not use the proposed inverse-consistent constraint, and ICNet2 does not utilize the proposed anti-folding constraint. This implies that including both the inverse-consistent and anti-folding constraints to train our ICNet can boost the deformable image registration performance.
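As a reference for how the voxel-overlap metrics in Table I are computed, here is a simplified sketch on binary masks (our own illustration, not the paper's evaluation code; the surface-distance metrics ASD and HD are omitted for brevity):

```python
import numpy as np

def dsc(seg, gt):
    """Dice similarity coefficient (%) between two binary masks."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    return 200.0 * np.logical_and(seg, gt).sum() / (seg.sum() + gt.sum())

def sen_ppv(seg, gt):
    """Sensitivity (%) = TP/(TP+FN), positive predictive value (%) = TP/(TP+FP)."""
    seg, gt = seg.astype(bool), gt.astype(bool)
    tp = np.logical_and(seg, gt).sum()
    return 100.0 * tp / gt.sum(), 100.0 * tp / seg.sum()
```

In the multi-class setting, each tissue (CSF, GM, WM) is evaluated as a one-vs-rest binary mask.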
We further visually compare the registration results achieved by different methods for a source brain MR image in Fig. 5. From Fig. 5, we can see that the proposed ICNet method brings impressive improvement in the registration results compared with the competing methods. For instance, the regions of the left and right planum temporale are more accurately registered to the target image by ICNet, as indicated by the red arrow (left planum temporale) and the yellow arrow (right planum temporale) in Fig. 5.
4.5 Results of Anatomical Landmark Detection
In the second group of experiments, we perform landmark detection based on the deformed atlas images generated by different registration methods, with the results reported in Table II. From Table II, we can see that the overall performance of our ICNet method is superior to that of the five competing methods. For instance, the average landmark detection error achieved by ICNet is , which is lower than that of Demons (i.e., ).
It is worth noting that our methods (i.e., ICNet1, ICNet2, and ICNet) are unsupervised and do not need ground-truth flows for the to-be-registered images during network training. This is a particularly useful property for a deformable registration algorithm, as it not only maintains the unsupervised nature of deformable registration but also avoids the challenge of collecting accurate ground-truth registration.
4.6 Computational Cost
We now analyze the computational costs of the proposed ICNet method and the competing methods. For the four deep-learning-based methods (i.e., DL, ICNet1, ICNet2, and ICNet), the training process is performed offline, while the non-learning-based methods (i.e., Demons and SyN) do not need any training process. Hence, we only analyze the online computational cost for nonlinearly registering a new brain MRI in the application/testing stage. Table III reports the computational costs of different methods. Note that Demons and SyN are implemented on a CPU (i7-7700, 3.6 GHz), while the remaining four methods run on a GPU (GTX 1080ti). We can observe from Table III that the four deep-learning-based methods require only second for deformable registration of one MRI, which is much faster than Demons ( minutes) and SyN ( hours). These results further demonstrate the potential utility of our method in practical applications.
Method  Demons  SyN  DL  ICNet1  ICNet2  ICNet 
Time 
5 Discussion
In this section, we first investigate the effect of two essential components (i.e., the inverse-consistent and anti-folding constraints) in the proposed ICNet method. We then analyze the influence of the network architecture and of a network refining strategy used in the application stage.
5.1 Influence of Inverse-Consistent Constraint
To evaluate the influence of the proposed inverse-consistent constraint in Eq. 2, we visually illustrate the flows estimated by ICNet with different contributions from this constraint. Fig. 6 shows a pair of input images, as well as the flows and estimated inverse flows achieved by ICNet1 and ICNet under different parameter settings. Here, we fix the parameter for the anti-folding constraint, while the inverse flows are generated by linear interpolation via the proposed inverse network (see Fig. 2 (c)). Results in Fig. 6 (b)-(c) are generated by ICNet1 without the inverse-consistent constraint (i.e., ) but with different weights for the smoothness constraint, while those in Fig. 6 (d) are yielded by ICNet with the inverse-consistent constraint.
From Fig. 6 (b), we can observe that using a small weight (i.e., ) for the smoothness term in ICNet1 cannot generate good results (with foldings in the estimated inverse flow), and the flow between the two images is not inverse-consistent. For instance, the estimated inverse flow is not consistent with , while looks different from . Fig. 6 (c) suggests that using a large weight (i.e., ) for the smoothness constraint in ICNet1 generates over-smooth flows, which may degrade the registration accuracy. Fig. 6 (d) shows that ICNet () can generate flows with a reasonable degree of smoothness. Also, it can be seen from Fig. 6 (d) that is similar to , and also looks similar to . This suggests that ICNet can well preserve the inverse-consistent property of the bidirectional flows.
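The inverse-consistent property can also be checked numerically: composing the forward flow with the inverse flow should return every point to where it started. A minimal 1-D sketch (names are our own; real flows are 3-D and the composition uses trilinear interpolation):

```python
import numpy as np

def sample(flow, pos):
    # linear interpolation of a 1-D flow at fractional positions
    pos = np.clip(pos, 0, flow.size - 1)
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, flow.size - 1)
    w = pos - lo
    return (1 - w) * flow[lo] + w * flow[hi]

def inverse_consistency_error(f_ab, f_ba):
    # residual of composing the forward and inverse flows:
    # f_ab(p) + f_ba(p + f_ab(p)) should be ~0 at every p
    p = np.arange(f_ab.size, dtype=float)
    residual = f_ab + sample(f_ba, p + f_ab)
    return float(np.mean(np.abs(residual)))

n = 100
f_ab = np.full(n, 2.0)   # constant shift of +2 voxels
f_ba = np.full(n, -2.0)  # its exact inverse: composition error is ~0
```

Independently estimated forward and backward flows typically leave a nonzero residual; the inverse-consistent constraint explicitly drives this residual toward zero during training.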
Furthermore, we show the validation loss achieved by ICNet1 and ICNet with different contributions from the smoothness and inverse-consistent constraints in Fig. 7. For ICNet1 with , the inverse-consistent term in Eq. 7 is not used for network optimization, and we only record the corresponding loss here. In Fig. 7, the red line denotes the validation loss of ICNet1 with , the green line represents the loss of ICNet1 with for the smoothness constraint, and the blue line denotes the loss of ICNet with and . As shown in the figure, using a large weight for the smoothness term (green lines) yields a relatively small loss , but the loss is quite large, implying that the warped source image is largely different from the target image. Conversely, using a small weight for the smoothness regularizer (red lines) yields a relatively large loss , but a good loss concerning the similarity between the warped source image and the target image. In contrast, the losses of ICNet with and (blue lines) suggest that ICNet can not only produce inverse-consistent registration but also keep the warped source image as similar as possible to the target image.
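The trade-off visible in these curves can be made concrete with a toy weighted objective (a simplified sketch under our own naming, not the exact loss of Eq. 7; in particular, the inverse-consistency term below is only valid for constant 1-D flows):

```python
import numpy as np

def msd(a, b):
    # similarity term: mean squared distance between warped source and target
    return float(np.mean((a - b) ** 2))

def smoothness(flow):
    # smoothness term: quadratic penalty on the discrete flow gradient
    return float(np.mean(np.diff(flow) ** 2))

def inverse_consistency(f_ab, f_ba):
    # simplified inverse-consistency term (for constant 1-D flows the
    # composition of the two flows reduces to a pointwise sum)
    return float(np.mean((f_ab + f_ba) ** 2))

def total_loss(warped, target, f_ab, f_ba, alpha, beta):
    # raising alpha favors smooth flows at the cost of similarity;
    # beta controls how strongly inverse-consistency is enforced
    return (msd(warped, target)
            + alpha * smoothness(f_ab)
            + beta * inverse_consistency(f_ab, f_ba))
```

Sweeping alpha reproduces the qualitative behavior of the red and green curves, while a nonzero beta corresponds to the blue curves that balance both goals.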
5.2 Influence of Anti-Folding Constraint
We then study the influence of the proposed anti-folding constraint by comparing ICNet with ICNet2 (which does not use the anti-folding constraint). In this group of experiments, we fix the parameters for the smoothness and inverse-consistent constraints (i.e., and ). Fig. 8 shows the flows generated by ICNet2 (left) and ICNet (right). It can be observed from Fig. 8 (a) that the flow generated by ICNet2 without the anti-folding regularizer includes many foldings (see the black rectangle) that would degrade the registration accuracy. In contrast, Fig. 8 (b) suggests that ICNet, using the anti-folding regularizer, can effectively avoid foldings in the flow. These results demonstrate the effectiveness of the anti-folding constraint in preventing foldings in the learned transformation.
5.3 Influence of Network Architecture
We also investigate the influence of the network architecture on the performance of ICNet, where the number of starting filter channels in the FCN (see Fig. 2 (b)) is the essential component. In the above experiments, we empirically set . In this group of experiments, we compare ICNet with its variant with , denoted as ICNet_16, and report the tissue segmentation results achieved by these two methods in Fig. 9.
It can be seen from Fig. 9 that ICNet_16 achieves slightly better results in segmenting the three types of tissues, compared with ICNet. This implies that using more filter channels in the FCN within the proposed ICNet framework helps boost the registration accuracy, thus improving the tissue segmentation performance. Note that ICNet_16 requires much larger memory () for network training, while ICNet with starting filter channels needs only memory. Also, the training time of ICNet_16 is about double that of ICNet. Considering the marginal improvement in registration performance shown in Fig. 9, one can flexibly choose the number of starting filter channels in practice, based on the available memory capacity.
5.4 Influence of Network Refining Strategy
The proposed ICNet is unsupervised and does not use any ground-truth registration results. Therefore, given a pair of new testing images, we can feed them to ICNet (trained on the training data) to further refine the network, thus adapting it to the testing images. Here, we denote ICNet with such a network refining process as ICNet_R. In ICNet_R, we first optimize the network parameters using the training data, and then refine the network (with the learned parameters as initialization) for the new pair of testing images. In the refining stage, we use a small learning rate (i.e., ) for optimization, and the number of iterations is empirically set to . After refinement, we use the newly learned network parameters to produce the final registration results for the testing images. The experimental results on tissue segmentation achieved by ICNet and its refined variant (i.e., ICNet_R) are shown in Fig. 10.
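The refinement idea can be illustrated on a toy 1-D problem (entirely our own construction: a single translation parameter stands in for the network weights and is refined by gradient descent on the MSD loss for one test pair, starting from a "pretrained" value):

```python
import numpy as np

def warp(image, shift):
    # sample the image at positions x + shift via linear interpolation
    x = np.arange(image.size) + shift
    x = np.clip(x, 0, image.size - 1)
    lo = np.floor(x).astype(int)
    hi = np.minimum(lo + 1, image.size - 1)
    w = x - lo
    return (1 - w) * image[lo] + w * image[hi]

def msd(a, b):
    return float(np.mean((a - b) ** 2))

def refine(source, target, shift_init, lr=50.0, iters=300, eps=1e-3):
    # instance-specific refinement: start from the "pretrained" parameter
    # and descend the MSD loss for this particular test pair
    shift = shift_init
    for _ in range(iters):
        g = (msd(warp(source, shift + eps), target) -
             msd(warp(source, shift - eps), target)) / (2 * eps)
        shift -= lr * g
    return shift

x = np.linspace(0, 4 * np.pi, 200)
target = np.sin(x)
source = np.sin(x - 3 * (x[1] - x[0]))  # target shifted by three voxels
refined = refine(source, target, shift_init=1.0)
```

In ICNet_R the same loop runs over the full network weights with backpropagated gradients rather than a single finite-difference parameter, but the structure (pretrained initialization, a few low-learning-rate iterations on the test pair) is the same.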
It can be seen from Fig. 10 that ICNet_R consistently outperforms ICNet in segmenting the three types of tissues, regarding all five evaluation metrics. For instance, the PPV value of ICNet_R in segmenting CSF is , while that of ICNet is only . The possible reason is that the refining strategy allows the network to better adapt to the new input images, thus reducing the negative influence of distribution differences between training and test data. It is worth noting that such a network refining strategy is a general approach, which can also be applied to improve other unsupervised algorithms for image registration.
6 Conclusion and Future Work
In this paper, we propose an inverse-consistent deep network (ICNet) for unsupervised deformable image registration. Specifically, we develop an inverse-consistent constraint to encourage that a pair of images are symmetrically deformed toward one another, and propose an anti-folding constraint to minimize foldings in the estimated flow. The proposed method is evaluated on the registration of T1-weighted brain MR images for tissue segmentation and anatomical landmark detection. Experimental results demonstrate that our ICNet outperforms several state-of-the-art algorithms for deformable image registration.
In the current work, we utilize the mean squared distance (MSD) to measure the similarity between a warped source image and a target image, while many other similarity measures (such as correlation or mutual information) could be employed in the proposed deep-learning framework. Besides, only a simple network refining strategy is proposed in this work to handle the challenge of data heterogeneity (e.g., in image contrast, resolution, and noise level), while more advanced domain adaptation algorithms [71, 72] could be used to further boost the registration performance.
Appendix
As mentioned in the main text, besides using the conventional smoothness constraint [16], we also develop an anti-folding constraint to further prevent folding in the learned discrete displacement field (i.e., flow). Note that each element in a flow is a three-dimensional vector (corresponding to the three axes), indicating the displacement of a particular voxel from its original location to a new location. We now explain the details of this anti-folding constraint defined in the main text.
We denote the to-be-estimated flow from a source image to a target image as F, defined over the image domain. For simplicity, we denote the displacement in F along the i-th axis as F_i. As shown in Fig. 11, for a point p and its nearest neighboring point p + 1 along the i-th axis in the domain of F, we denote their displacements as F_i(p) and F_i(p + 1), respectively.
To avoid folding at the location p, the new locations of these two points (after displacement via F) should satisfy
(8)  p + F_i(p) < (p + 1) + F_i(p + 1),
where p + F_i(p) is the new location of the point p, and (p + 1) + F_i(p + 1) denotes the new location of the point p + 1. In discrete problems, the gradient of F_i at the location p along the i-th axis is typically defined as
(9)  ∇F_i(p) = F_i(p + 1) − F_i(p).
Combining Eq. 9 with Eq. 8, we obtain
(10)  ∇F_i(p) + 1 > 0,
which indicates that, if ∇F_i(p) + 1 > 0, there is no folding at the location p. In contrast, if ∇F_i(p) + 1 ≤ 0, there is folding at the location p.
Accordingly, to avoid folding in the learned flow, we propose an anti-folding constraint that penalizes the gradient of the flow at locations violating the rule in Eq. 8. Specifically, we define an index function δ(p), where δ(p) = 1 if ∇F_i(p) + 1 ≤ 0, and δ(p) = 0 otherwise. In the proposed anti-folding constraint (see the main text), if there is folding at the location p along the i-th axis (i.e., ∇F_i(p) + 1 ≤ 0), we impose a penalty on the gradient at this location, requiring it to be small. In contrast, if there is no folding at the location p along the i-th axis (i.e., ∇F_i(p) + 1 > 0), we do not penalize the gradient of the estimated flow at this location.
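The anti-folding rule translates directly into code. A minimal 1-D sketch (names are our own; the quadratic penalty is one plausible form and not necessarily the exact weighting used in the main text):

```python
import numpy as np

def folding_mask(flow_1d):
    # forward-difference gradient of the displacement field;
    # folding occurs wherever grad + 1 <= 0, i.e. where the displaced
    # neighbor overtakes the displaced point
    grad = np.diff(flow_1d)
    return grad + 1 <= 0

def antifold_penalty(flow_1d):
    # penalize the gradient only at locations that fold
    grad = np.diff(flow_1d)
    viol = grad + 1 <= 0
    return float(np.sum((grad[viol] + 1) ** 2))

smooth = np.array([0.0, 0.2, 0.4, 0.6])    # gentle gradients: no folding
folded = np.array([0.0, 0.2, -1.5, -1.3])  # one step of -1.7: folding
```

During training, this penalty is added to the total loss so that gradient descent pushes violating flow gradients back above the folding threshold, while non-violating locations are left untouched.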
References
 [1] A. Sotiras, C. Davatzikos, and N. Paragios, “Deformable medical image registration: A survey,” IEEE Transactions on Medical Imaging, vol. 32, no. 7, pp. 1153–1190, 2013.
 [2] M. Liu, J. Zhang, E. Adeli, and D. Shen, “Landmark-based deep multi-instance learning for brain disease diagnosis,” Medical Image Analysis, vol. 43, pp. 157–168, 2018.
 [3] C. Lian, J. Zhang, M. Liu, X. Zong, S.C. Hung, W. Lin, and D. Shen, “Multi-channel multi-scale fully convolutional network for 3D perivascular spaces segmentation in 7T MR images,” Medical Image Analysis, vol. 46, pp. 106–117, 2018.
 [4] J. V. Hajnal and D. L. Hill, Medical Image Registration. CRC Press, 2001.
 [5] B. B. Avants, C. L. Epstein, M. Grossman, and J. C. Gee, “Symmetric diffeomorphic image registration with crosscorrelation: Evaluating automated labeling of elderly and neurodegenerative brain,” Medical Image Analysis, vol. 12, no. 1, pp. 26–41, 2008.
 [6] D. Shen and C. Davatzikos, “HAMMER: Hierarchical attribute matching mechanism for elastic registration,” IEEE Transactions on Medical Imaging, vol. 21, no. 11, pp. 1421–1439, 2002.
 [7] R. Bajcsy and S. Kovačič, “Multiresolution elastic matching,” Computer Vision, Graphics, and Image Processing, vol. 46, no. 1, pp. 1–21, 1989.
 [8] D. Rueckert, L. I. Sonoda, C. Hayes, D. L. Hill, M. O. Leach, and D. J. Hawkes, “Nonrigid registration using free-form deformations: Application to breast MR images,” IEEE Transactions on Medical Imaging, vol. 18, no. 8, pp. 712–721, 1999.
 [9] J.P. Thirion, “Image matching as a diffusion process: An analogy with Maxwell’s demons,” Medical Image Analysis, vol. 2, no. 3, pp. 243–260, 1998.
 [10] X. Yang, R. Kwitt, and M. Niethammer, “Fast predictive image registration,” in Deep Learning and Data Labeling for Medical Applications. Springer, 2016, pp. 48–57.
 [11] X. Cao, J. Yang, Y. Gao, Q. Wang, and D. Shen, “Region-adaptive deformable registration of CT/MRI pelvic images via learning-based image synthesis,” IEEE Transactions on Image Processing, vol. 27, no. 7, pp. 3500–3512, 2018.
 [12] X. Cao, J. Yang, J. Zhang, Q. Wang, P.T. Yap, and D. Shen, “Deformable image registration using cue-aware deep regression network,” IEEE Transactions on Biomedical Engineering, vol. 65, no. 9, pp. 1900–1911, 2018.
 [13] G. Wu, M. Kim, Q. Wang, Y. Gao, S. Liao, and D. Shen, “Unsupervised deep feature learning for deformable registration of MR brain images,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2013, pp. 649–656.
 [14] S. Miao, Z. J. Wang, and R. Liao, “A CNN regression approach for real-time 2D/3D registration,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1352–1363, 2016.
 [15] S. Shan, X. Guo, W. Yan, E. I. Chang, Y. Fan, Y. Xu et al., “Unsupervised end-to-end learning for deformable medical image registration,” arXiv preprint arXiv:1711.08608, 2017.
 [16] B. D. de Vos, F. F. Berendsen, M. A. Viergever, M. Staring, and I. Išgum, “End-to-end unsupervised deformable image registration with a convolutional neural network,” in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, 2017, pp. 204–212.
 [17] G. Balakrishnan, A. Zhao, M. R. Sabuncu, J. Guttag, and A. V. Dalca, “An unsupervised learning model for deformable medical image registration,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9252–9260.
 [18] M. Staring, S. Klein, and J. P. W. Pluim, “A rigidity penalty term for nonrigid registration,” Medical Physics, vol. 34, no. 11, pp. 4098–4108, 2007.
 [19] H. Lester and S. R. Arridge, “A survey of hierarchical nonlinear medical image registration,” Pattern Recognition, vol. 32, no. 1, pp. 129–149, 1999.
 [20] D. L. G. Hill, P. G. Batchelor, M. Holden, and D. J. Hawkes, “Medical image registration,” Physics in Medicine and Biology, vol. 46, no. 3, pp. R1–R45, 2001.
 [21] B. Zitová and J. Flusser, “Image registration methods: A survey,” Image and Vision Computing, vol. 21, no. 11, pp. 977–1000, 2003.
 [22] M. Holden, “A review of geometric transformations for nonrigid body registration,” IEEE Transactions on Medical Imaging, vol. 27, no. 1, pp. 111–128, 2008.
 [23] D. Rueckert and J. A. Schnabel, Medical Image Registration. Springer Berlin Heidelberg, 2011, pp. 131–154.
 [24] R. P. Woods, S. T. Grafton, C. J. Holmes, S. R. Cherry, and J. C. Mazziotta, “Automated image registration: I. General methods and intrasubject, intramodality validation,” Journal of Computer Assisted Tomography, vol. 22, no. 1, pp. 139–152, 1998.
 [25] P. Hellier, J. Ashburner, I. Corouge, C. Barillot, and K. J. Friston, “Intersubject registration of functional and anatomical data using SPM,” in Medical Image Computing and Computer-Assisted Intervention. Springer Berlin Heidelberg, 2002, pp. 590–597.
 [26] J.P. Thirion, “Image matching as a diffusion process: An analogy with Maxwell’s demons,” Medical Image Analysis, vol. 2, no. 3, pp. 243–260, 1998.
 [27] T. Vercauteren, X. Pennec, A. Perchant, and N. Ayache, “Diffeomorphic demons: Efficient nonparametric image registration,” NeuroImage, vol. 45, no. 1, pp. S61–S72, 2008.
 [28] A. J., S. Suzi, and J. Mark, “FNIRT - FMRIB’s nonlinear image registration tool,” Human Brain Mapping, 2008.
 [29] B. A. Ardekani, S. Guckemus, A. Bachman, M. J. Hoptman, M. Wojtaszek, and J. Nierenberg, “Quantitative comparison of algorithms for intersubject registration of 3D volumetric brain MRI scans,” Journal of Neuroscience Methods, vol. 142, no. 1, pp. 67–76, 2005.
 [30] D. L. Collins and A. C. Evans, “Animal: Validation and applications of nonlinear registration-based segmentation,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 11, no. 8, pp. 1217–1294, 1997.
 [31] J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever, “Mutual-information-based registration of medical images: A survey,” IEEE Transactions on Medical Imaging, vol. 22, no. 8, pp. 986–1004, 2003.
 [32] F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, “Multimodality image registration by maximization of mutual information,” IEEE Transactions on Medical Imaging, vol. 16, no. 2, pp. 187–198, 1997.
 [33] W. M. Wells, P. Viola, H. Atsumi, S. Nakajima, and R. Kikinis, “Multimodal volume registration by maximization of mutual information,” Medical Image Analysis, vol. 1, no. 1, pp. 35–51, 1996.
 [34] G. E. Christensen and H. J. Johnson, “Consistent image registration,” IEEE Transactions on Medical Imaging, vol. 20, no. 7, pp. 568–582, 2001.
 [35] D. Rueckert, P. Aljabar, R. A. Heckemann, J. V. Hajnal, and A. Hammers, “Diffeomorphic registration using B-splines,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2006, pp. 702–709.
 [36] O. Musse, F. Heitz, and J. P. Armspach, “Topology preserving deformable image matching using constrained hierarchical parametric models,” IEEE Transactions on Image Processing, vol. 10, no. 7, pp. 1081–1093, 2001.
 [37] M. Sdika, “A fast nonrigid image registration with constraints on the Jacobian using large scale constrained optimization,” IEEE Transactions on Medical Imaging, vol. 27, no. 2, pp. 271–281, 2008.
 [38] T. Mansi, X. Pennec, M. Sermesant, H. Delingette, and N. Ayache, “iLogDemons: A demonsbased registration algorithm for tracking incompressible elastic biological tissues,” International Journal of Computer Vision, vol. 92, no. 1, pp. 92–111, 2011.
 [39] C. Tanner, J. A. Schnabel, D. Chung, M. J. Clarkson, D. Rueckert, D. L. G. Hill, and D. J. Hawkes, “Volume and shape preservation of enhancing lesions when applying nonrigid registration to a time series of contrast enhancing MR breast images,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2000.
 [40] W. Greene, S. Chelikani, K. Purushothaman, J. Knisely, Z. Chen, X. Papademetris, L. Staib, and J. Duncan, “Constrained nonrigid registration for use in image-guided adaptive radiotherapy,” Medical Image Analysis, vol. 13, no. 5, pp. 809–817, 2009.
 [41] T. Rohlfing, C. R. Maurer, D. A. Bluemke, and M. A. Jacobs, “Volume-preserving nonrigid registration of MR breast images using free-form deformation with an incompressibility constraint,” IEEE Transactions on Medical Imaging, vol. 22, no. 6, pp. 730–741, 2003.
 [42] A. Bistoquet, J. Oshinski, and O. Skrinjar, “Myocardial deformation recovery from cine MRI using a nearly incompressible biventricular model,” Medical Image Analysis, vol. 12, no. 1, pp. 69–85, 2008.
 [43] J. Dauguet, A. Herard, J. Declerck, and T. Delzescaux, “Locally constrained cubic B-spline deformations to control volume variations,” in IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2009, pp. 983–986.
 [44] D. Loeckx, F. Maes, D. Vandermeulen, and P. Suetens, “Nonrigid image registration using free-form deformations with a local rigidity constraint,” in Medical Image Computing and Computer-Assisted Intervention. Springer Berlin Heidelberg, 2004, pp. 639–646.
 [45] J. Modersitzki, “FLIRT with rigidity - image registration with a local non-rigidity penalty,” International Journal of Computer Vision, vol. 76, no. 2, pp. 153–163, 2008.
 [46] M. Lorenzi, N. Ayache, G. B. Frisoni, and X. Pennec, “LCC-Demons: A robust and accurate symmetric diffeomorphic registration algorithm,” NeuroImage, vol. 81, pp. 470–483, 2013.
 [47] Y. Ou and C. Davatzikos, “DRAMMS: Deformable registration via attribute matching and mutual-saliency weighting,” Medical Image Analysis, vol. 15, no. 4, 2009.
 [48] B. Glocker, N. Komodakis, G. Tziritas, N. Navab, and N. Paragios, “Dense image registration through MRFs and efficient linear programming,” Medical Image Analysis, vol. 12, no. 6, pp. 731–741, 2008.
 [49] Z. Xue, D. Shen, B. Karacali, J. Stern, D. Rottenberg, and C. Davatzikos, “Simulating deformations of MR brain images for validation of atlas-based segmentation and registration algorithms,” NeuroImage, vol. 33, no. 3, pp. 855–866, 2006.
 [50] L. Wei, X. Cao, Z. Wang, Y. Gao, S. Hu, L. Wang, G. Wu, and D. Shen, “Learning-based deformable registration for infant MRI by integrating random forest with auto-context model,” Medical Physics, vol. 44, no. 12, pp. 6289–6303, 2017.
 [51] M. Kim, G. Wu, P. Yap, and D. Shen, “A general fast registration framework by learning deformation-appearance correlation,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1823–1833, 2012.
 [52] Q. Wang, M. Kim, Y. Shi, G. Wu, and D. Shen, “Predict brain MR image registration via sparse learning of appearance and transformation,” Medical Image Analysis, vol. 20, no. 1, pp. 61–75, 2015.
 [53] B. Gutiérrez-Becker, D. Mateus, L. Peter, and N. Navab, “Learning optimization updates for multimodal registration,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing, 2016, pp. 19–27.
 [54] X. Cao, J. Yang, Y. Gao, Y. Guo, G. Wu, and D. Shen, “Dual-core steered non-rigid registration for multimodal images via bi-directional image synthesis,” Medical Image Analysis, vol. 41, pp. 18–31, 2017.
 [55] X. Yang, R. Kwitt, M. Styner, and M. Niethammer, “Quicksilver: Fast predictive image registration - A deep learning approach,” NeuroImage, vol. 158, pp. 378–396, 2017.
 [56] M.M. Rohé, M. Datar, T. Heimann, M. Sermesant, and X. Pennec, “SVF-Net: Learning deformable image registration using shape matching,” in Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing, 2017, pp. 266–274.
 [57] H. Sokooti, B. de Vos, F. Berendsen, B. P. F. Lelieveldt, I. Išgum, and M. Staring, “Nonrigid image registration using multi-scale 3D convolutional neural networks,” in Medical Image Computing and Computer-Assisted Intervention. Springer International Publishing, 2017, pp. 232–239.
 [58] J. Krebs, T. Mansi, H. Delingette, L. Zhang, F. C. Ghesu, S. Miao, A. K. Maier, N. Ayache, R. Liao, and A. Kamen, “Robust nonrigid registration through agent-based action learning,” in Medical Image Computing and Computer-Assisted Intervention.
 [59] A. Leow, S.C. Huang, A. Geng, J. Becker, S. Davis, A. Toga, and P. Thompson, “Inverse consistent mapping in 3D deformable image registration: Its construction and statistical properties,” in Biennial International Conference on Information Processing in Medical Imaging. Springer, 2005, pp. 493–503.
 [60] J. He and G. E. Christensen, “Large deformation inverse consistent elastic image registration,” in Biennial International Conference on Information Processing in Medical Imaging. Springer, 2003, pp. 438–449.
 [61] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 234–241.
 [62] M. Jaderberg, K. Simonyan, A. Zisserman et al., “Spatial transformer networks,” in Advances in Neural Information Processing Systems, 2015, pp. 2017–2025.
 [63] N. Ketkar, “Introduction to PyTorch,” in Deep Learning with Python. Springer, 2017, pp. 195–208.
 [64] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
 [65] C. R. Jack, M. A. Bernstein, N. C. Fox, P. Thompson, G. Alexander, D. Harvey, B. Borowski, P. J. Britson, J. L. Whitwell, and C. Ward, “The Alzheimer’s disease neuroimaging initiative (ADNI): MRI methods,” Journal of Magnetic Resonance Imaging, vol. 27, no. 4, pp. 685–691, 2008.
 [66] C. J. Holmes, R. Hoge, L. Collins, R. Woods, A. W. Toga, and A. C. Evans, “Enhancement of MR images using registration for signal averaging,” Journal of Computer Assisted Tomography, vol. 22, no. 2, pp. 324–333, 1998.
 [67] Y. Zhang, M. Brady, and S. Smith, “Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm,” IEEE Transactions on Medical Imaging, vol. 20, no. 1, pp. 45–57, 2001.
 [68] P. Aljabar, R. A. Heckemann, A. Hammers, J. V. Hajnal, and D. Rueckert, “Multi-atlas based segmentation of brain images: Atlas selection and its effect on accuracy,” NeuroImage, vol. 46, no. 3, pp. 726–738, 2009.
 [69] J. Zhang, M. Liu, and D. Shen, “Detecting anatomical landmarks from limited medical imaging data using two-stage task-oriented deep neural networks,” IEEE Transactions on Image Processing, vol. 26, no. 10, pp. 4753–4764, 2017.
 [70] X. Cao, J. Yang, J. Zhang, D. Nie, M. Kim, Q. Wang, and D. Shen, “Deformable image registration based on similarity-steered CNN regression,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2017, pp. 300–308.
 [71] R. Gopalan, R. Li, and R. Chellappa, “Unsupervised adaptation across domain shifts by generating intermediate data representations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 11, pp. 2288–2302, 2014.
 [72] I.H. Jhuo, D. Liu, D. Lee, and S.F. Chang, “Robust visual domain adaptation with low-rank reconstruction,” in IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012, pp. 2168–2175.