Discriminative Consistent Domain Generation for Semi-supervised Learning

Jun Chen, et al., Imperial College London (07/24/2019)

Deep learning based task systems normally rely on a large amount of manually labeled training data, which is expensive to obtain and subject to operator variations. Moreover, it does not always hold that the manually labeled data and the unlabeled data follow the same distribution. In this paper, we alleviate these problems by proposing a discriminative consistent domain generation (DCDG) approach to semi-supervised learning. The discriminative consistent domain is achieved by a double-sided domain adaptation, which fuses the feature spaces of the labeled and unlabeled data and thereby accommodates the differences between their distributions. To preserve the discriminativeness of the generated consistent domain for task learning, we apply indirect learning for the double-sided domain adaptation. Based on the generated discriminative consistent domain, the unlabeled data can be used to learn the task model along with the labeled data via consistent image generation. We demonstrate the performance of our proposed DCDG on late gadolinium enhancement cardiac MRI (LGE-CMRI) images acquired from patients with atrial fibrillation in two clinical centers, for the segmentation of the left atrium (LA) anatomy and the proximal pulmonary veins (PVs). The experiments show that our semi-supervised approach achieves compelling segmentation results, demonstrating the robustness of DCDG when learning from unlabeled data along with labeled data acquired in single-center or multicenter studies.


1 Introduction

Accommodating possible distribution differences between labeled data and unlabeled data is of high importance for semi-supervised learning. The use of unlabeled data can overcome the limitation of insufficient labeled data, which is a common hurdle in medical image analysis. In practice, however, incorporating unlabeled data may fail due to the domain shift between the labeled and unlabeled data [1].

Domain adaptation can learn a generically adaptive representation domain, but the resulting feature domain may be insufficiently discriminative for task model learning. On the one hand, recent domain adaptation approaches usually introduce a discriminator that encourages data from one domain to generate a feature domain similar to the other one, keeping the representations of the two domains invariant to each other. These approaches are based on a single adaptation direction, so the generated feature space is limited to one of the two domains. Because the original feature space of the other domain is lost, this can result in a reduced feature space of the two domains. On the other hand, widespread domain adaptation approaches use only the labeled data for task model learning; the unlabeled data serve only to generate the domain adaptation space along with the labeled data through adversarial learning. Therefore, the discriminativeness of the adapted features for the subsequent task model still relies entirely on the labeled data, and it is hard to guarantee that the adapted features are discriminative enough for task learning on the unlabeled data.

Figure 1: Indirect double-sided domain adaptation. The labeled data (X_L) and the unlabeled data (X_U) are mapped to the adapted consistent feature domain (F_L and F_U). The consistent feature domain is generated by the double-sided domain adaptation via an indirect adversarial learning, in which the domain discriminator is applied to the feature domain F_D obtained from the predicted label. Feature matching is then used to keep the consistency between F and F_D.

In this work, we propose a discriminative consistent domain generation (DCDG) method based on double-sided domain adaptation (as shown in Fig. 1) to learn a task model (TM) in a semi-supervised manner. In our proposed DCDG, the available labeled data X_L and the unlabeled data X_U come from the same or a similar domain. We adopt the double-sided domain adaptation to generate a consistent feature domain that fuses the feature spaces of X_L and X_U, instead of extracting the common parts of the two domains or making one domain adapt to the other. DCDG uses a shared feature representation generator G that maps X_L and X_U to the consistent feature domain (F_L and F_U, collectively F). For the purpose of discriminative feature generation, we map F to the predicted label domain P via the task model. The domain discriminator D then maps the predicted label domain to another feature domain F_D to make an identification, achieving an indirect double-sided domain adaptation for F. During the indirect double-sided domain adaptation, the parameters of the task model are fixed and we constrain F_D to match the F generated by G. In this way, we adapt F indirectly by adapting F_D, which guarantees the discriminativeness of the feature domain for the subsequent learning of the task model. During the discriminative consistent feature domain generation, we can exploit X_U even though no labels are available for it to directly supervise the task model: F_D is matched with F, which is produced from both X_L and X_U, and we further map F_D to generated labeled and unlabeled images (X̂_L and X̂_U), using image consistency as the semi-supervised signal to learn the task model. We demonstrate the performance of our proposed DCDG for left atrium segmentation [2, 3] on an LGE-CMRI dataset, a task that plays an important role in the management of atrial fibrillation and myocardial infarction [4, 5].

2 Method

The proposed DCDG tries to generate the discriminative consistent feature domain, with a fused feature space from the labeled data X_L and the unlabeled data X_U, by indirect double-sided domain adaptation. Meanwhile, we introduce an additional consistent image generation as the semi-supervised signal, so that the segmentation model can also be trained by X_U along with X_L. The detailed network configuration can be found in the supplementary materials.
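As a concrete illustration, the following is a minimal PyTorch sketch of the four DCDG components described in this section. The layer sizes, module names (FeatureGenerator for G, SegmentationModel for S, DomainDiscriminator for D, ReverseMapping for R) and the 2D single-channel input are our own illustrative assumptions; the actual network configuration is the one given in the supplementary materials.

```python
# Illustrative sketch of the four DCDG components, assuming 2D single-channel
# LGE-CMRI slices. Layer sizes are placeholders, not the paper's configuration.
import torch
import torch.nn as nn

class FeatureGenerator(nn.Module):           # G: X_L / X_U -> F_L / F_U
    def __init__(self, in_ch=1, feat_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.net(x)

class SegmentationModel(nn.Module):          # S: F -> P (LA/PVs probability map)
    def __init__(self, feat_ch=64, n_classes=2):
        super().__init__()
        self.head = nn.Conv2d(feat_ch, n_classes, 1)
    def forward(self, f):
        return torch.softmax(self.head(f), dim=1)

class DomainDiscriminator(nn.Module):        # D: P -> F_D -> real/fake logit
    def __init__(self, n_classes=2, feat_ch=64):
        super().__init__()
        self.features = nn.Sequential(       # final conv layer yields F_D
            nn.Conv2d(n_classes, feat_ch, 3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(feat_ch, 1),
        )
    def forward(self, p):
        f_d = self.features(p)
        return self.classifier(f_d), f_d     # scalar logit and feature domain F_D

class ReverseMapping(nn.Module):             # R: F_D -> reconstructed image
    def __init__(self, feat_ch=64, out_ch=1):
        super().__init__()
        self.net = nn.Conv2d(feat_ch, out_ch, 3, padding=1)
    def forward(self, f_d):
        return self.net(f_d)
```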

Figure 2: Illustration of the proposed DCDG. X_L and X_U represent the labeled data and unlabeled data respectively. They share a common feature generator (G) to generate the consistent features F_L and F_U. The features are then mapped to the probability maps P_L and P_U of the LA and PVs masks for X_L and X_U by the segmentation model (S). Next, two domain discriminators with shared weights map P_L and P_U to a feature domain (F_D) to make an identification via indirect double-sided adversarial learning for F. During the adversarial learning, the parameters of S are fixed and we constrain F_D to match the F generated by G. Finally, F_D is mapped to the reconstructed images X̂_L and X̂_U, which are matched with X_L and X_U.

Discriminative feature extraction via indirect learning.

We aim to generate the consistent feature domain via double-sided domain adaptation, in order to learn a segmentation model using both the labeled data X_L and the unlabeled data X_U. It is important to maintain the discriminativeness of the feature representations in the generated consistent feature domain; hence, we introduce indirect learning for the discriminative feature domain extraction. In our proposed DCDG, a feature generator G generates the consistent feature domain (F_L and F_U) from X_L and X_U without using any knowledge of the source of images during testing [6]. To guarantee the discriminativeness of the generated features for the subsequent segmentation model, we introduce indirect rather than direct domain adaptation for F. We map the generated features to the estimated LA and PVs probability maps (P_L and P_U) of X_L and X_U via the segmentation model (S). The domain discriminator (D) then produces the feature domain (F_{D_L} and F_{D_U}) with its final convolutional layer from P_L and P_U, and finally outputs a scalar value to identify P_L and P_U. During the indirect learning, we fix the parameters of S and match the feature domains between F and F_D via the squared ℓ2-norm, which is defined as:

\mathcal{L}_{FM} = \| F_{D_L} - F_L \|_2^2 + \| F_{D_U} - F_U \|_2^2    (1)

where F_L and F_U are produced by G from X_L and X_U respectively, and F_{D_L} and F_{D_U} are produced by the final convolutional layer of the domain discriminator.

Therefore, we adapt F indirectly by identifying F_D, which guarantees the discriminativeness of the feature domain for the learning of the segmentation network.
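As a sketch, the feature-matching term of Eq. (1) can be implemented as below, using PyTorch's mean-reduced mse_loss as the squared ℓ2-norm; the tensor names follow the notation above and are our own assumptions.

```python
# Hedged sketch of the feature-matching term in Eq. (1): the squared
# (mean-reduced) l2 distance between the generator features F and the
# discriminator features F_D, for labeled and unlabeled batches.
import torch.nn.functional as F

def feature_matching_loss(f_l, f_dl, f_u, f_du):
    """f_l/f_u: features from G; f_dl/f_du: final-conv features of D."""
    return F.mse_loss(f_dl, f_l) + F.mse_loss(f_du, f_u)
```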

Consistent feature domain generation via double-sided adaptation. In order to generate the consistent feature domain, we introduce two discriminators (D_1 and D_2) to achieve a double-sided domain adaptation, which enables the features produced by X_L and X_U to adapt to each other as shown in Fig. 2. D_1 is used to encourage X_U to generate a feature domain similar to the one produced by X_L, and D_2 is used to force X_L to generate a feature domain similar to the one produced by X_U. Hence, a double-sided adversarial training is used to achieve the double-sided domain adaptation. During this training, for the learning of the domain discriminators, F_L is used as the real feature and F_U as the fake feature to learn D_1, while F_U is used as the real feature and F_L as the fake feature to learn D_2 simultaneously. For the learning of the feature generator G, D_1 tries to identify F_U as real features and D_2 tries to identify F_L as real features. In order to reduce the number of network parameters, we let D_1 and D_2 share a single discriminator D to achieve the double adversarial learning directly: we take F_L and F_U as real and fake features respectively to learn D, and when we learn the generator, F_L and F_U are assigned the fake label and the real label respectively. Overall, during the double-sided adaptation, we fix the parameters of the segmentation model and optimize the following L_D and L_G for learning the domain discriminator and the feature generator respectively:

\mathcal{L}_D = \mathcal{L}_{bce}(D(S(G(x_L))), 1) + \mathcal{L}_{bce}(D(S(G(x_U))), 0)    (2)
\mathcal{L}_G = \mathcal{L}_{bce}(D(S(G(x_L))), 0) + \mathcal{L}_{bce}(D(S(G(x_U))), 1)    (3)

where x_L and x_U represent the inputs of the labeled and unlabeled data, respectively, \mathcal{L}_{bce} is the binary cross-entropy loss, and S represents the segmentation model, whose parameters are fixed during the domain adaptation.
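A hedged sketch of one round of this double-sided adversarial training, following Eqs. (2) and (3) and the module sketch above: D is updated with F_L labeled real (1) and F_U labeled fake (0), then G is updated with the labels flipped, while S stays frozen throughout.

```python
# Minimal sketch of one double-sided adversarial step (Eqs. (2)-(3)),
# assuming the modules sketched earlier. The segmentation model s is fixed.
import torch
import torch.nn.functional as F

def adversarial_step(g, s, d, x_l, x_u, opt_d, opt_g):
    bce = F.binary_cross_entropy_with_logits
    for p in s.parameters():                 # freeze S during adaptation
        p.requires_grad_(False)

    # --- discriminator update, Eq. (2): F_L real, F_U fake ---
    with torch.no_grad():
        p_l, p_u = s(g(x_l)), s(g(x_u))
    logit_l, _ = d(p_l)
    logit_u, _ = d(p_u)
    loss_d = (bce(logit_l, torch.ones_like(logit_l))
              + bce(logit_u, torch.zeros_like(logit_u)))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # --- generator update, Eq. (3): labels flipped ---
    logit_l, _ = d(s(g(x_l)))
    logit_u, _ = d(s(g(x_u)))
    loss_g = (bce(logit_l, torch.zeros_like(logit_l))
              + bce(logit_u, torch.ones_like(logit_u)))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

    for p in s.parameters():                 # unfreeze S afterwards
        p.requires_grad_(True)
    return loss_d.item(), loss_g.item()
```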

Semi-supervised segmentation model learning. During the generation of the discriminative consistent feature domain, we learn the segmentation model from X_L and X_U. Since no labels are available for X_U, it cannot be used directly to train the segmentation model along with X_L. However, during the double-sided domain adaptation we obtain the matched feature domains between F and F_D, where F_D is generated from both X_L and X_U. Hence, we introduce a reverse mapping (R) that maps F_D to generated images X̂_L and X̂_U, which are matched with X_L and X_U, to achieve the consistent image generation. X_U can then contribute a consistent-image loss as a semi-supervised loss to train the segmentation model, alongside the supervised loss from X_L. Finally, we use X_L and X_U to train the model with the losses defined as follows:

\mathcal{L}_{sup} = \mathcal{L}_{Dice}(P_L, Y_L) + \| \hat{X}_L - X_L \|_2^2    (4)
\mathcal{L}_{semi} = \| \hat{X}_U - X_U \|_2^2    (5)

where P_L is the estimated LA and PVs map, Y_L is the ground truth, ||·||_2^2 is the squared ℓ2-norm, and X̂_L and X̂_U are generated by the reverse mapping from the matched features. \mathcal{L}_{Dice} represents the Dice loss function. We train the segmentation model after each epoch of the domain adaptation.
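The following sketch implements Eqs. (4) and (5) as read from the surrounding text: a Dice loss on the labeled predictions plus squared-ℓ2 image-consistency terms between the reverse-mapped images and the inputs. The unit weighting of the terms is an assumption.

```python
# Hedged sketch of the supervised (Eq. (4)) and semi-supervised (Eq. (5))
# losses. pred/target are (N, C, H, W) probability maps / one-hot masks.
import torch
import torch.nn.functional as F

def dice_loss(pred, target, eps=1e-6):
    inter = (pred * target).sum(dim=(2, 3))
    union = pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    return 1.0 - ((2 * inter + eps) / (union + eps)).mean()

def supervised_loss(p_l, y_l, x_l_rec, x_l):      # Eq. (4)
    return dice_loss(p_l, y_l) + F.mse_loss(x_l_rec, x_l)

def semi_supervised_loss(x_u_rec, x_u):           # Eq. (5)
    return F.mse_loss(x_u_rec, x_u)
```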

Evaluation criteria. We use the region-based metrics of the Dice coefficient (Dice) and intersection-over-union (IoU), which validate the predicted LA and PVs segmentation (P) against the ground truth (G), and the surface-based metric of mean surface distance, defined as \mathrm{MSD} = \frac{1}{|\mathcal{M}_P|} \sum_{p \in \mathcal{M}_P} \min_{g \in \mathcal{M}_G} \| p - g \|, which is the mean of the distances between every surface voxel in the predicted mesh \mathcal{M}_P and the closest surface voxel in the ground-truth mesh \mathcal{M}_G.
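For reference, here is a simple NumPy/SciPy sketch of the three metrics on boolean masks; the surface extraction via binary erosion is our own approximation of the mesh surface voxels described above.

```python
# Dice, IoU, and one-directional mean surface distance on boolean 3D masks.
import numpy as np
from scipy import ndimage

def dice(p, g):
    return 2.0 * np.logical_and(p, g).sum() / (p.sum() + g.sum())

def iou(p, g):
    return np.logical_and(p, g).sum() / np.logical_or(p, g).sum()

def mean_surface_distance(p, g, spacing=1.0):
    surf = lambda m: m & ~ndimage.binary_erosion(m)   # surface voxels
    # distance from every predicted surface voxel to the closest
    # ground-truth surface voxel
    dt_g = ndimage.distance_transform_edt(~surf(g), sampling=spacing)
    return dt_g[surf(p)].mean()
```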

3 Experimental Results and Discussion

In our experiments, we used datasets from two centers (detailed imaging parameters can be found in the supplementary materials) for the LA and PVs segmentation to validate the proposed DCDG. The final segmentation model was obtained using 'early stopping' on validation data. To demonstrate the performance of our proposed DCDG, we compare it with fully supervised methods, e.g., 2D UNet [7], SegNet [8] and a recent state-of-the-art 3D segmentation architecture, 3D DenseNet [9], and also make a comparison between DCDG and a semi-supervised method [10] (AR) along with a domain adaptation method [11] (ASOS) with single-level adversarial learning. In addition, we also compared DCDG to itself used in a fully supervised manner, namely fully supervised segmentation (FSS).

Method        IoU   MSD (mm)   Dice
2D UNet       –     –          –
SegNet        –     –          –
3D DenseNet   –     –          –
FSS           –     –          –
AR (25%)      –     –          –
ASOS (25%)    –     –          –
DCDG (25%)    –     –          –
AR (50%)      –     –          –
ASOS (50%)    –     –          –
DCDG (50%)    –     –          –
AR (75%)      –     –          –
ASOS (75%)    –     –          –
DCDG (75%)    –     –          –
Table 1: Comparison of the performance of our proposed DCDG with the baselines on C1; percentages give the ratio of labeled training data.

Experiments on a single-center dataset. We performed multiple experiments on data from the same image domain based on different ratios of labeled data acquired at center 1, denoted as C1. The total number of cases is 175, of which 140 samples were randomly selected to train the model. We randomly selected 15 samples for model validation (7 pre-ablation and 8 post-ablation samples) and 20 samples for independent testing (10 pre-ablation and 10 post-ablation samples). During the experiment, we randomly selected a ratio r of labeled cases (r ∈ {25%, 50%, 75%}) from the 140 samples, together with the remaining ratio (1−r) as unlabeled data, for the semi-supervised learning of DCDG and AR, while ASOS was learned from the labeled data of ratio r together with the test data. The fully supervised methods were trained in the standard supervised manner on all the labeled data. The quantitative results are summarized in Table 1. When we use 50% labeled data, the performance of DCDG is superior to 2D UNet, SegNet and 3D DenseNet; when we use 75% labeled data, it is superior to FSS, which uses 100% of the labeled data with fully supervised learning. Compared to those methods, we can use much less labeled data to obtain better results, which is of great significance for avoiding costly manual labeling when expert availability is limited. Furthermore, compared with AR and ASOS, our proposed DCDG also achieves the best results.
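A small sketch of the split described here, assuming a hypothetical list of case identifiers; the seed, helper name and ordering are our own.

```python
# Illustrative sketch of the C1 split: 175 cases into 140 train / 15 val /
# 20 test, then a ratio r of the training cases kept labeled.
import random

def split_c1(cases, r, seed=0):
    assert len(cases) == 175
    rng = random.Random(seed)
    shuffled = rng.sample(cases, k=len(cases))
    train, val, test = shuffled[:140], shuffled[140:155], shuffled[155:]
    n_labeled = round(r * len(train))
    labeled, unlabeled = train[:n_labeled], train[n_labeled:]
    return labeled, unlabeled, val, test
```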

Method        IoU   MSD (mm)   Dice
2D UNet       –     –          –
SegNet        –     –          –
3D DenseNet   –     –          –
AR            –     –          –
ASOS          –     –          –
DCDG          –     –          –
FSS           –     –          –
Table 2: Comparison of the performance of our proposed DCDG with the baselines on C1 and C2.
Figure 3: Qualitative visualization of LA and PVs segmentation results compared to the manual delineation for representative slices of pre-ablation and post-ablation 3D LGE-CMRI images. Each estimated segmentation is represented as a dashed green contour, and its corresponding manual delineation as a red contour.

Experiments on a two-center dataset. For this experiment, the data from C1 were labeled while the data from center 2 (C2) were unlabeled. The total number of C2 cases is 94. We randomly selected 20 samples for testing (10 pre-ablation and 10 post-ablation samples). The remaining 74 samples, with no labels, were used to train the DCDG along with the 140 labeled samples from C1. We also used the same 15-sample validation set from the single-center experiment to validate the segmentation model during training. For 2D UNet, SegNet, 3D DenseNet and FSS, supervised learning was performed on the 140 labeled samples from C1. As shown in Table 2, although 2D UNet, SegNet and 3D DenseNet have achieved great success in many medical image segmentation applications, they cannot be directly applied to learn a good model across the distribution shift from one domain to the other, and thus obtain inferior performance. In addition, our DCDG also obtains the best results compared to AR and ASOS. Fig. 3 gives a further visual illustration of the good performance of DCDG.

Figure 4: Boxplots of IoU, MSD and Dice evaluations for the ablation tests.
Figure 5: (a) The stability of the double-sided domain adaptation. MIL and MIU are the mean identification values of the discriminator on labeled samples and unlabeled samples respectively after each round of the double-sided domain adaptation. The intervals between the two vertical lines and the two horizontal lines are the 95% limits-of-agreement confidence intervals for MIL−MIU and MIL+MIU. (b) and (c): t-Distributed Stochastic Neighbor Embedding (t-SNE) visualization of the feature distribution on the two-center data with double-sided adaptation (c) compared to without adaptation (b). The red and blue points represent samples from C1 and C2 respectively.

Model variation study. To verify the effectiveness of each component in our proposed DCDG, we perform a model variation study on C1 and C2. We take the structure of DCDG as the baseline and then train the model with single-sided domain adaptation (SDA), without domain adaptation (WDA), without feature matching (WFM), and with direct double-sided domain adaptation (DDDA) for F. We compare these variants with DCDG and show the results in Fig. 4. Our proposed DCDG achieves the best results across these models on the IoU, MSD and Dice measures, which proves the effectiveness of its design.

Effectiveness of double-sided domain adaptation. In our proposed DCDG, the double-sided domain adaptation aims to generate the discriminative consistent feature domain. Its effectiveness can be demonstrated by the identification accuracy of the discriminator on the labeled and unlabeled data. Ideally, the probability values assigned by the discriminator to labeled and unlabeled samples are both 0.5. In our experiment, we record the mean identification values MIL and MIU, as shown in Fig. 5(a). Ideally, MIL plus MIU is 1 while MIL minus MIU is 0; in this situation, the identification values of the discriminator on labeled data and unlabeled data are both 0.5, which indicates that the model achieves the double-sided domain adaptation. As shown in Fig. 5(a), in each epoch during the double-sided domain adaptation, MIL + MIU is close to 1 while MIL − MIU is close to 0. Furthermore, the adapted features show a fusion of the features of C1 and C2 (Fig. 5(c)) compared to the unadapted feature distribution (Fig. 5(b)).
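This diagnostic can be computed in a few lines; a sketch assuming the module interfaces from the earlier code blocks:

```python
# Sketch of the MIL/MIU diagnostic: mean discriminator output on labeled
# and unlabeled samples after each adaptation epoch. Under a successful
# double-sided adaptation, MIL + MIU ~= 1 and MIL - MIU ~= 0 (both ~0.5).
import torch

@torch.no_grad()
def mil_miu(d, s, g, x_l, x_u):
    prob = lambda x: torch.sigmoid(d(s(g(x)))[0]).mean().item()
    mil, miu = prob(x_l), prob(x_u)
    return mil, miu, mil + miu, mil - miu
```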

4 Conclusion

In this paper, we proposed a discriminative consistent domain generation (DCDG) approach for semi-supervised learning. In DCDG, we investigate a double-sided domain adaptation based on indirect adversarial learning to fit the differences between labeled and unlabeled data and to generate a discriminative feature domain with a fused feature space. Our framework has been validated against manually delineated ground truth on the LA and PVs segmentation task. Compared to other supervised, semi-supervised and domain adaptation methods, our DCDG has demonstrated superior performance. In conclusion, our proposed DCDG makes it possible to build a robust semi-supervised learning model using unlabeled data along with labeled data collected from single-center or multicenter studies, and it can be extended to other medical image analysis problems.

5 Acknowledgments

This work was supported in part by the Young Scientists Fund of the National Natural Science Foundation of China (61602003), in part by the Natural Science Foundation of China (61771464 and U1801265) and in part by the Guangdong Science and Technology (2018A050506031 and 2019B010110001).
