Adversarial Continual Learning for Multi-Domain Hippocampal Segmentation

07/19/2021
by   Marius Memmel, et al.

Deep learning for medical imaging suffers from temporal and privacy-related restrictions on data availability. To still obtain viable models, continual learning aims to train in sequential order, as and when data is available. The main challenge that continual learning methods face is to prevent catastrophic forgetting, i.e., a decrease in performance on data encountered earlier. This issue makes continuous training of segmentation models for medical applications extremely difficult. Yet, often, data from at least two different domains is available, which we can exploit to train the model such that it disregards domain-specific information. We propose an architecture that leverages the simultaneous availability of two or more datasets to learn a disentanglement between content and domain in an adversarial fashion. The domain-invariant content representation then lays the base for continual semantic segmentation. Our approach takes inspiration from domain adaptation and combines it with continual learning for hippocampal segmentation in brain MRI. We showcase that our method reduces catastrophic forgetting and outperforms state-of-the-art continual learning methods.

1 Introduction

In medical imaging, privacy regulations and temporal restrictions limit access to data [8]. These limitations inhibit the application of traditional supervised deep learning methods for medical imaging tasks, which require the simultaneous availability of all data during training. Continual learning reframes the problem into a sequential training process, where not all datasets are available at each time step. However, when we evaluate continual learning models, they still experience a significant drop in performance, caused by catastrophic forgetting, i.e., the model adapting too strongly to particularities of the last training batch [16].

The leading cause of catastrophic forgetting in medical imaging is data originating from multiple domains [24]. These domains result from diverse disease patterns among the examined subjects and from divergent technologies and standards used during the acquisition process. In magnetic resonance imaging (MRI), it is common practice for institutions to operate scanners from various vendors and to employ disparate protocols [20]. Additionally, MRI datasets frequently contain subjects that are either healthy or suffer from various pathological conditions. To solve a task with data from multiple domains sufficiently well, models have to adapt and learn in a domain-invariant fashion.

Figure 1: The Adversarial Continual Segmenter (ACS) can react to different dataset availability through training (black) or freezing (gray) network parts. (a) In stage 1, the disentanglement is trained on two datasets from different domains. In stage 2, a new dataset is available, and the previous ones are not; ACS is now fine-tuned on the new one. As a fourth dataset is introduced in stage 3, fine-tuning is repeated. (b) If the first two datasets are still available when a third becomes accessible, disentanglement can be repeated in stage 2 with three datasets.

The limited availability of multi-domain data makes developing a general-purpose model for continual hippocampal segmentation difficult. Commonly, at least two datasets from different domains are accessible simultaneously due to open access, relaxed restrictions, or access to historical data acquired with older scanners and protocols within the institution. This serendipity allows for learning a disentanglement between the domains and the content needed for segmentation, which continual learning methods do not yet exploit. Inspired by image-to-image translation (I2I), we utilize adversarial training to learn a disentanglement between a domain-invariant content representation sufficient for segmentation and a dataset-specific domain representation [10, 12, 18]. We train an encoder for each representation and share the content encoder with our segmentation module. Finally, we extend our approach to continual learning. In Fig. 1, we describe how our architecture could react to common dataset availability scenarios. We perform experiments on a subset of those hypothetical scenarios.

We contribute the Adversarial Continual Segmenter (ACS) for continual semantic segmentation of multi-domain data through adversarial disentanglement and latent space regularization that reduces catastrophic forgetting in hippocampal segmentation of brain MRIs.

2 Related work

Adversarial disentanglement: Several Generative Adversarial Networks (GANs) disentangle the feature space to improve interpretability [14]. Chen et al. [6] take a mutual-information-based approach, while Karras et al. [13] directly modify the generator to achieve automatic separation of high-level attributes. Adversarial disentanglement shows promising results when applied to segmentation in a multi-domain [11] and multi-modal setting [5]. In domain adaptation, Kamnitsas et al. [12] utilize domain-invariant features for a segmentation task [25] and learn those with adversarial regularization of numerous layer outputs.

Cross-domain disentanglement: I2I translation extends cross-domain feature disentanglement by splitting the latent space into a content and a style encoding to achieve better translation results [21]. The content encoding is assumed to capture only task-specific information, while the style encoding holds the domain-specific information. Huang et al. [10] and Lee et al. [18] further assume that the complexity of the content outweighs that of the domain, which they reflect in different encoder complexities.

Continual learning: The main problem that continual learning methods face is catastrophic forgetting. To counteract this, regularization-based approaches constrain important parameters from changing [3, 28, 16, 19]. With a similar goal, Memory Aware Synapses (MAS) [2] learns importance weights for each parameter and uses those to penalize parameter changes. Knowledge distillation methods try to preserve specific model outputs to retain performance on old data [7, 22]. Keeping a subset of training data is also widely used, e.g., in dynamic memory [9] and rehearsal methods [27]. However, keeping parts of the old data is not feasible in most medical imaging scenarios due to privacy concerns [8].

Whereas existing continual learning methods focus solely on a sequential learning process and do not consider the simultaneous availability of datasets and their divergent domains, we specifically exploit these circumstances through adversarial disentanglement.

Figure 2: Detailed visualization of the ACS architecture. We achieve feature disentanglement through two encoders, one for content and one for domain. The content and domain discriminators regularize the latent content representation, and, in the case of the content discriminator, also the skip-connections of the U-Net, to be domain-invariant. Segmentation is done by a forward pass of the U-Net.

3 Methods

We first describe how we disentangle input images into content and domain representations and then introduce the adversarial approach. Our domain representation models the heterogeneity of the acquisition modality, e.g., varying protocols and machine vendors, as well as different disease patterns. To learn the domain-invariant content representation, we initially train on two datasets simultaneously, as is common in I2I translation. This representation then acts as the basis for our U-Net segmenter.

Variational autoencoder loss: We model the domain encoder $E_d$ as a variational autoencoder (VAE) [15], which encodes the domain of an input image $x$ as a distribution $\mathcal{N}(\mu, \sigma^2)$ parameterized by variance $\sigma^2$ and mean $\mu$. Because the complexity of the domain is assumed to be lower than the content complexity, we limit the dimensionality of the domain representation to one float value. We use a combination of a reconstruction loss $\mathcal{L}_{rec}$ between the input image $x$ and the generator output $\hat{x}$ and a Kullback–Leibler regularization term weighted by hyperparameter $\lambda_{KL}$ to draw the encoding close to a normally distributed Gaussian prior of $\mathcal{N}(0, 1)$:

$\mathcal{L}_{VAE} = \mathcal{L}_{rec}(x, \hat{x}) + \lambda_{KL} \, D_{KL}\big(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, 1)\big)$   (1)
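
To make Eq. 1 concrete, the following is a minimal PyTorch sketch of the domain VAE objective; the function names, the L1 reconstruction term, and the default weight are our illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def vae_domain_loss(x, x_rec, mu, log_var, lambda_kl=0.01):
    """Reconstruction + KL regularization toward a N(0, 1) prior (Eq. 1).

    x:           input image batch
    x_rec:       generator reconstruction of x
    mu, log_var: mean / log-variance predicted by the domain encoder;
                 the domain representation is a single float per image here.
    """
    rec = F.l1_loss(x_rec, x)  # reconstruction term (L1 is one common choice)
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1)
    kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return rec + lambda_kl * kl

def sample_domain(mu, log_var):
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1)
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * log_var) * eps
```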

Latent regression loss: To prepare the domain information for the generator input, we first sample $z \sim \mathcal{N}(\mu, \sigma^2)$ from the learned domain distribution and then pass the samples into the latent scale layer as proposed by Alharbi et al. [1] to produce a latent domain scale. To give additional information about the domain, we inject a domain code $d$ using central biasing instance normalization [11, 29]. We model this domain code with a one-hot vector. As proposed by Jiang and Veeraraghavan [11] and based on Lee et al. [18] and Huang et al. [10], we also define a latent code regression loss $\mathcal{L}_{lr}$, which constrains the generator to produce unique mappings for a latent code $z$.
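
To illustrate the two injection mechanisms, here is a hedged PyTorch sketch: `LatentScale` follows the latent filter scaling idea of Alharbi et al. [1], and `CBIN` follows central biasing instance normalization [29]; the class names, layer shapes, and their placement inside the generator are our assumptions.

```python
import torch
import torch.nn as nn

class LatentScale(nn.Module):
    """Map a sampled one-float domain code to per-channel scales for a
    generator feature map, in the spirit of latent filter scaling [1]."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Linear(1, channels)

    def forward(self, feat, z):            # feat: (B, C, H, W), z: (B, 1)
        scale = self.fc(z)                 # (B, C) per-channel scales
        return feat * scale[:, :, None, None]

class CBIN(nn.Module):
    """Central biasing instance normalization [29]: instance-norm the
    feature map, then shift it by a bias computed from the one-hot code."""
    def __init__(self, channels, num_domains):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.bias = nn.Linear(num_domains, channels)

    def forward(self, feat, domain_code):  # domain_code: (B, D) one-hot
        b = torch.tanh(self.bias(domain_code))  # bounded per-channel bias
        return self.norm(feat) + b[:, :, None, None]
```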

GAN loss: We interpret the encoder of the U-Net as content encoder $E_c$, with the output of the bottom layer treated as content representation $c$. Both $c$ and the scaled domain sample $z$ are fed into generator $G$ to reconstruct the input sample $\hat{x} = G(c, z)$. To now disentangle the content and domain, we introduce adversarial training. We deploy a domain discriminator $D_d$ that regularizes $E_c$, $E_d$, and $G$ by discriminating whether an input image is part of a given domain $d$. $D_d$ trains on a combination of real and generated images as described in the corresponding discriminator loss in Eq. 2. $G$ generates these images from content $c$ and a random domain sample. $E_c$, $E_d$, and $G$ counter the discriminator by minimizing a negative binary cross-entropy loss shown in Eq. 3. The training forces $E_c$ to produce a domain-invariant output, which we utilize for the segmentation task.

$\mathcal{L}_{D_d} = -\,\mathbb{E}_{x}\big[\log D_d(x \mid d)\big] - \mathbb{E}_{\hat{x}}\big[\log\big(1 - D_d(\hat{x} \mid d)\big)\big]$   (2)

$\mathcal{L}_{adv} = \mathbb{E}_{\hat{x}}\big[\log\big(1 - D_d(\hat{x} \mid d)\big)\big]$   (3)
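
As a minimal rendering of Eqs. 2 and 3, the adversarial terms can be written with binary cross-entropy on the discriminator logits; the function signatures and the way the domain code conditions $D_d$ are simplified assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def discriminator_loss(d_real_logits, d_fake_logits):
    # Eq. 2: D_d learns to accept real images of a domain
    # and to reject generated ones.
    real = F.binary_cross_entropy_with_logits(
        d_real_logits, torch.ones_like(d_real_logits))
    fake = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.zeros_like(d_fake_logits))
    return real + fake

def adversarial_loss(d_fake_logits):
    # Eq. 3: encoders and generator minimize a negative BCE, i.e. they are
    # rewarded when D_d mistakes generated images for real domain members.
    return -F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.zeros_like(d_fake_logits))
```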

Content adversarial loss: The information about the domain can still flow from the encoder to the decoder via the skip-connections of the U-Net. To prevent this, we introduce a content discriminator $D_c$ inspired by Jiang et al. and Kamnitsas et al. [11, 12]. $D_c$ regularizes $c$ as well as the skip-connections because we do not want to leak any domain information to the segmenter. The discriminator design is similar to a reversed U-Net decoder, i.e., it takes $c$ as input together with the skip-connections of $E_c$ at each corresponding layer. We train both discriminators using multi-class cross-entropy losses. In the case of the generated images, we represent the domain code as a placeholder class. To ensure that the adversarial training is stable, we train the discriminators and the remaining architecture at separate steps [11].
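
One plausible reading of this design, sketched below under assumed channel sizes and depth: the discriminator downsamples the skip features step by step, fuses them with the content representation $c$ at the bottleneck resolution, and emits multi-class domain logits with one extra placeholder class for generated images.

```python
import torch
import torch.nn as nn

class ContentDiscriminator(nn.Module):
    """Illustrative reversed-decoder-style content discriminator;
    channel sizes and depth are assumptions, not the exact architecture."""
    def __init__(self, skip_channels=(64, 128, 256), bottleneck_ch=512,
                 num_domains=2):
        super().__init__()
        downs, in_ch = [], 0
        for ch in skip_channels:
            downs.append(nn.Sequential(
                nn.Conv2d(in_ch + ch, ch, kernel_size=3, stride=2, padding=1),
                nn.LeakyReLU(0.2)))
            in_ch = ch
        self.downs = nn.ModuleList(downs)
        self.fuse = nn.Conv2d(in_ch + bottleneck_ch, 256, kernel_size=1)
        self.head = nn.Linear(256, num_domains + 1)  # +1 placeholder class

    def forward(self, skips, content):
        # skips: encoder skip features, finest resolution first;
        # content: bottleneck representation c.
        h = None
        for down, s in zip(self.downs, skips):
            h = s if h is None else torch.cat([h, s], dim=1)
            h = down(h)                      # halve resolution each step
        h = self.fuse(torch.cat([h, content], dim=1))
        h = h.mean(dim=(2, 3))               # global average pooling
        return self.head(h)                  # multi-class domain logits
```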

Segmentation: To produce the segmentation mask $\hat{m}$, we encode input $x$ into $c$ through $E_c$. We then pass $c$ and the skip-connections of $E_c$ into the U-Net decoder $S$ to compute $\hat{m}$. We train the U-Net for semantic segmentation through a pixel-wise combination of a Dice and a binary cross-entropy loss between the target mask $m$ and the prediction $\hat{m}$. After initially training the architecture on two or more datasets, the model has sufficiently learned the disentanglement between content and domain. To train on a new dataset, we only fine-tune the last four convolutional layers of $S$.
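
A compact sketch of the segmentation objective and the fine-tuning step; the soft-Dice formulation and the way the last four convolutional layers of $S$ are selected are illustrative assumptions, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

def segmentation_loss(logits, target, eps=1e-6):
    """Pixel-wise BCE plus soft Dice between prediction and target mask."""
    bce = F.binary_cross_entropy_with_logits(logits, target)
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(1, 2, 3))
    union = prob.sum(dim=(1, 2, 3)) + target.sum(dim=(1, 2, 3))
    dice = 1 - (2 * inter + eps) / (union + eps)
    return bce + dice.mean()

def prepare_finetuning(segmenter, trainable_convs=4):
    """Freeze everything, then unfreeze only the last N conv layers."""
    for p in segmenter.parameters():
        p.requires_grad = False
    convs = [m for m in segmenter.modules()
             if isinstance(m, torch.nn.Conv2d)]
    for m in convs[-trainable_convs:]:
        for p in m.parameters():
            p.requires_grad = True
```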

4 Datasets & Experiments

Datasets: All datasets are from different domains and contain T1-weighted MRIs. The first dataset was released as part of the 2018 Medical Segmentation Decathlon challenge [26] and consists of 195 subjects in total, 90 healthy and 105 with non-affective psychotic disorder. The scans were collected using a Philips Achieva scanner, and the mean size of the volumes is . The second dataset was published in Scientific Data [17] and contains 25 healthy subjects. All scans were acquired using 3 Tesla MRI systems. The mean standard resolution is . Finally, the third dataset is provided by the Alzheimer's Disease Neuroimaging Initiative [4] and consists of 68 subjects that are either part of the control group or suffer from mild cognitive impairment or Alzheimer's disease. The images were acquired with scanners from Siemens, GE, and Philips with 23, 24, and 21 scans, respectively. The mean volume size is . All three datasets provide reference segmentation masks for the hippocampus. The masks were annotated manually with the protocols defined in the respective publications. We evaluate our architecture on all three datasets, which we refer to as A, B, and C, respectively.

Experimental setup: We split each dataset into 70% train, 20% test, and 10% validation and use the latter to select the hyperparameters. We train slice-by-slice and upsample via bilinear interpolation to achieve uniform slices. We compare ACS with the following baselines: first, just the U-Net block of ACS (U-Net-b) shown in Fig. 2, and second, a standard U-Net. Furthermore, we extend the U-Net by knowledge distillation on the output layer (OL-KD) as proposed by Michieli and Zanuttigh [22], and by Memory Aware Synapses adapted to brain segmentation (BS-MAS) by Özgün et al. [23]. As suggested in BS-MAS, we divide the surrogate loss by the number of network parameters and normalize the resulting importance values between zero and one. We report the Intersection over Union (IoU) and Dice coefficient on the hippocampus class of the test set. We use a batch size of 40 and train on four Tesla V100 SXM3 GPUs. Each method is trained for 60 epochs; after 30 epochs, training continues only with the third dataset. We repeat training for every combination of the three datasets (AB-C, AC-B, BC-A), e.g., initial training on datasets A and B, then on C (AB-C). Additionally, we jointly train ACS and the U-Net on all datasets simultaneously (ABC). To justify the necessity of all mechanisms in our method, as described in Sec. 3, we conduct an ablation study in Tab. 3. Implementation details and qualitative results, including the disentanglement, can be found in the supplementary material and code at github.com/MECLabTUDA/ACS.
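
For reference, the two reported metrics can be computed on binary hippocampus masks as in the following generic sketch (not tied to the authors' evaluation code):

```python
import torch

def iou_and_dice(pred, target, eps=1e-6):
    """pred/target: binary mask tensors of identical shape."""
    pred, target = pred.bool(), target.bool()
    inter = (pred & target).sum().float()
    union = (pred | target).sum().float()
    iou = (inter + eps) / (union + eps)
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    return iou.item(), dice.item()
```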

5 Results & Discussion

To assess the continual learning performance, we evaluate the results after stages 1 and 2, corresponding to Fig. 1(a). An ideal algorithm should perform equally well or better on the initial training datasets from epoch 30 to 60, while improving on the third dataset added after 30 epochs.

Stage 1: Tab. 1 shows the results for all methods after 30 epochs on the initial two datasets. All baselines obtain the same scores because they apply their regularization only in the second training stage, whereas ACS performs disentanglement during the initial training phase. ACS outperforms them by a Dice of (IoU ) averaged over all combinations and datasets.

              Dataset A       Dataset B       Dataset C       Average
              IoU    Dice     IoU    Dice     IoU    Dice     IoU    Dice
AB Baselines  0.641  0.779    0.779  0.875    0.358  0.512    0.593  0.722
   ACS (ours) 0.749  0.855    0.793  0.884    0.478  0.628    0.673  0.789
AC Baselines  0.080  0.147    0.265  0.416    0.380  0.547    0.241  0.370
   ACS (ours) 0.646  0.782    0.731  0.844    0.727  0.841    0.702  0.822
BC Baselines  0.260  0.407    0.749  0.856    0.649  0.784    0.553  0.682
   ACS (ours) 0.239  0.376    0.798  0.887    0.710  0.829    0.582  0.698

Table 1: Comparison of all baselines to ACS after 30 epochs.

Stage 2: To measure overall continual learning performance, i.e., the combination of learning and forgetting, we inspect the average scores over all datasets after 60 epochs in Tab. 2. While the comparison methods’ results fluctuate, our approach achieves a consistently higher performance across all combinations and datasets. This observation manifests in an increase of the average Dice score by over the U-Net, over the U-Net-b, over BS-MAS, and over OL-KD. On combination AB-C, the U-Net drops by an IoU of (Dice ) on dataset A and by (Dice ) on dataset B. The remaining methods, including ACS, show a significantly lower decline and effectively reduce catastrophic forgetting.

                   Dataset A       Dataset B       Dataset C       Average
                   IoU    Dice     IoU    Dice     IoU    Dice     IoU    Dice
AB-C    U-Net      0.238  0.381    0.501  0.664    0.621  0.764    0.454  0.603
        U-Net-b    0.339  0.503    0.570  0.724    0.473  0.631    0.461  0.619
        BS-MAS     0.304  0.464    0.568  0.722    0.624  0.766    0.499  0.651
        OL-KD      0.578  0.729    0.727  0.841    0.473  0.633    0.593  0.734
        ACS (ours) 0.640  0.779    0.760  0.863    0.572  0.718    0.657  0.787
AC-B    U-Net      0.295  0.451    0.718  0.836    0.387  0.547    0.467  0.611
        U-Net-b    0.273  0.425    0.567  0.723    0.419  0.584    0.419  0.577
        BS-MAS     0.307  0.466    0.702  0.825    0.364  0.523    0.458  0.604
        OL-KD      0.094  0.171    0.381  0.549    0.400  0.571    0.292  0.430
        ACS (ours) 0.681  0.808    0.787  0.880    0.679  0.808    0.716  0.832
BC-A    U-Net      0.745  0.852    0.668  0.800    0.418  0.579    0.610  0.743
        U-Net-b    0.450  0.615    0.591  0.742    0.497  0.661    0.513  0.673
        BS-MAS     0.731  0.847    0.626  0.768    0.409  0.569    0.589  0.728
        OL-KD      0.347  0.511    0.766  0.867    0.639  0.776    0.584  0.718
        ACS (ours) 0.600  0.747    0.649  0.786    0.465  0.631    0.571  0.721
ABC     U-Net      0.277  0.431    0.458  0.627    0.431  0.599    0.388  0.440
(joint) ACS (ours) 0.737  0.847    0.760  0.863    0.724  0.839    0.740  0.849

Table 2: Comparison of all baselines and ACS after 60 epochs. ABC represents joint training on all datasets at once.

Combination AC-B shows the clear advantage of our approach. Dataset A contains four types of disorders recorded by a single scanner, while dataset C holds three disease patterns recorded by three different scanners. The baselines struggle with the diversity of these domains, and our model outperforms them by an IoU of (Dice ). These observations show that our model learns a sufficient content representation that can deal with diverse cognitive impairments and scans acquired by scanners of various vendors.

We trace the low performance on dataset A in combination BC-A back to A exceeding B and C in variability and number of subjects. The high performance of the U-Net thereby originates from overfitting on A, which, through its high variability, still allows it to perform well on B and C. Because ACS is only fine-tuned on A, it cannot fully exploit this anomaly, but it still shows competitive results.

Ablation study: The ablation study in Tab. 3 verifies that all losses contribute to the performance of ACS. Only on AC-B does the combination of all losses underperform slightly, but it remains competitive. For more detailed numbers, we direct the reader to the supplementary material.

       AB-C            AC-B            BC-A            Average
       IoU    Dice     IoU    Dice     IoU    Dice     IoU    Dice
X      0.620  0.759    0.730  0.843    0.526  0.686    0.626  0.763
X X X  0.574  0.721    0.709  0.849    0.524  0.681    0.603  0.750
X      0.637  0.772    0.732  0.843    0.555  0.711    0.642  0.776
X      0.636  0.772    0.721  0.835    0.551  0.707    0.636  0.772
       0.658  0.787    0.716  0.832    0.572  0.721    0.649  0.780

Table 3: Snapshot of the ablation study on ACS trained for 60 epochs with individual losses deactivated (X) during training; the last row has all losses active. Average reported over all test datasets in a configuration.

The results demonstrate that leveraging the availability of multiple datasets increases multi-domain segmentation performance by sufficiently learning a domain-invariant representation. This assumption is further supported by the joint training results in Tab. 2 showing the superior capability of ACS in comparison to the U-Net. Additionally, our method outperforms the state-of-the-art on most continual learning setups and effectively reduces catastrophic forgetting.

6 Conclusion

We propose ACS, an architecture for continual semantic segmentation of multi-domain data that leverages the simultaneous availability of datasets. In real clinical practice, multiple datasets are available at the beginning of the continual training process through, among other sources, public or accessible historical data. Unlike current methods, we leverage this serendipity to disentangle MRI images into content and domain representations through adversarial training. We then perform multi-domain hippocampal segmentation directly on the domain-invariant content representation. We demonstrate drastic improvements through domain disentanglement of multi-domain data in the first training stage. In the second training stage, the benefits of our proposal for continual learning become clear by showcasing that using all available data reduces catastrophic forgetting and outperforms current state-of-the-art methods. Our method pushes continual learning closer towards a clinical application where various degrees of variability such as disease patterns, scan vendors, and acquisition protocols exist and further enables the continual usage of deep learning models in clinical practice.

References

  • [1] Y. Alharbi, N. Smith, and P. Wonka (2019-06) Latent filter scaling for multimodal unsupervised image-to-image translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Cited by: §3.
  • [2] R. Aljundi, F. Babiloni, M. Elhoseiny, M. Rohrbach, and T. Tuytelaars (2018) Memory aware synapses: learning what (not) to forget. In Computer Vision – ECCV 2018, V. Ferrari, M. Hebert, C. Sminchisescu, and Y. Weiss (Eds.), Cham, pp. 144–161. External Links: ISBN 978-3-030-01219-9 Cited by: §2.
  • [3] C. Baweja, B. Glocker, and K. Kamnitsas (2018) Towards continual learning in medical imaging. CoRR abs/1811.02496. External Links: 1811.02496 Cited by: §2.
  • [4] M. Boccardi, M. Bocchetta, F. C. Morency, D. L. Collins, M. Nishikawa, R. Ganzola, M. J. Grothe, D. Wolf, A. Redolfi, M. Pievani, L. Antelmi, A. Fellgiebel, H. Matsuda, S. Teipel, S. Duchesne, C. R. Jack, and G. B. Frisoni (2015) Training labels for hippocampal segmentation based on the eadc-adni harmonized hippocampal protocol. Alzheimer's & Dementia 11 (2), pp. 175–183. External Links: ISSN 1552-5260, Link Cited by: §4.
  • [5] A. Chartsias, T. Joyce, G. Papanastasiou, S. Semple, M. Williams, D. E. Newby, R. Dharmakumar, and S. A. Tsaftaris (2019) Disentangled representation learning in cardiac image analysis. Medical Image Analysis 58, pp. 101535. External Links: ISSN 1361-8415 Cited by: §2.
  • [6] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel (2016) InfoGAN: interpretable representation learning by information maximizing generative adversarial nets. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS’16, Red Hook, NY, USA, pp. 2180–2188. External Links: ISBN 9781510838819 Cited by: §2.
  • [7] A. Douillard, Y. Chen, A. Dapogny, and M. Cord (2021-06) PLOP: learning without forgetting for continual semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4040–4050. Cited by: §2.
  • [8] C. González, G. Sakas, and A. Mukhopadhyay (2020) What is wrong with continual learning in medical image segmentation?. CoRR abs/2010.11008. External Links: 2010.11008 Cited by: §1, §2.
  • [9] J. Hofmanninger, M. Perkonigg, J. A. Brink, O. Pianykh, C. Herold, and G. Langs (2020) Dynamic memory to alleviate catastrophic forgetting in continuous learning settings. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2020, A. L. Martel, P. Abolmaesumi, D. Stoyanov, D. Mateus, M. A. Zuluaga, S. K. Zhou, D. Racoceanu, and L. Joskowicz (Eds.), Cham, pp. 359–368. External Links: ISBN 978-3-030-59713-9 Cited by: §2.
  • [10] X. Huang, M. Liu, S. Belongie, and J. Kautz (2018-09) Multimodal unsupervised image-to-image translation. In Proceedings of the European Conference on Computer Vision (ECCV), Cited by: §1, §2, §3.
  • [11] J. Jiang and H. Veeraraghavan (2020) Unified cross-modality feature disentangler for unsupervised multi-domain mri abdomen organs segmentation. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2020, A. L. Martel, P. Abolmaesumi, D. Stoyanov, D. Mateus, M. A. Zuluaga, S. K. Zhou, D. Racoceanu, and L. Joskowicz (Eds.), Cham, pp. 347–358. External Links: ISBN 978-3-030-59713-9 Cited by: §2, §3, §3.
  • [12] K. Kamnitsas, C. Baumgartner, C. Ledig, V. Newcombe, J. Simpson, A. Kane, D. Menon, A. Nori, A. Criminisi, D. Rueckert, and B. Glocker (2017) Unsupervised domain adaptation in brain lesion segmentation with adversarial networks. In Information Processing in Medical Imaging, M. Niethammer, M. Styner, S. Aylward, H. Zhu, I. Oguz, P. Yap, and D. Shen (Eds.), Cham, pp. 597–609. External Links: ISBN 978-3-319-59050-9 Cited by: §1, §2, §3.
  • [13] T. Karras, S. Laine, and T. Aila (2019-06) A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §2.
  • [14] S. Kazeminia, C. Baur, A. Kuijper, B. van Ginneken, N. Navab, S. Albarqouni, and A. Mukhopadhyay (2018) GANs for medical image analysis. CoRR abs/1809.06222. External Links: 1809.06222 Cited by: §2.
  • [15] D. P. Kingma and M. Welling (2014) Auto-encoding variational bayes.. In ICLR, Y. Bengio and Y. LeCun (Eds.), Cited by: §3.
  • [16] J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, D. Hassabis, C. Clopath, D. Kumaran, and R. Hadsell (2017) Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences 114 (13), pp. 3521–3526. External Links: ISSN 0027-8424, https://www.pnas.org/content/114/13/3521.full.pdf Cited by: §1, §2.
  • [17] J. Kulaga-Yoskovitz, B. C. Bernhardt, S. Hong, T. Mansi, K. E. Liang, A. J. W. van der Kouwe, J. Smallwood, A. Bernasconi, and N. Bernasconi (2015) Multi-contrast submillimetric 3 tesla hippocampal subfield segmentation protocol and dataset. Scientific Data 2 (1), pp. 150059. External Links: Link Cited by: §4.
  • [18] H. Lee, H. Tseng, J. Huang, M. Singh, and M. Yang (2018-09) Diverse image-to-image translation via disentangled representations. In Proceedings of the European Conference on Computer Vision (ECCV), Cited by: §1, §2, §3.
  • [19] M. Lenga, H. Schulz, and A. Saalbach (2020) Continual learning for domain adaptation in chest x-ray classification. In Proceedings of the Third Conference on Medical Imaging with Deep Learning, T. Arbel, I. Ben Ayed, M. de Bruijne, M. Descoteaux, H. Lombaert, and C. Pal (Eds.), Proceedings of Machine Learning Research, Vol. 121, pp. 413–423. Cited by: §2.
  • [20] H. Li, S. M. Smith, S. Gruber, S. E. Lukas, M. M. Silveri, K. P. Hill, W. D. S. Killgore, and L. D. Nickerson (2020) Denoising scanner effects from multimodal mri data using linked independent component analysis. NeuroImage 208, pp. 116388. External Links: ISSN 1053-8119 Cited by: §1.
  • [21] Y. Liu, Y. Yeh, T. Fu, S. Wang, W. Chiu, and Y. F. Wang (2018-06) Detach and adapt: learning cross-domain disentangled deep representation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Cited by: §2.
  • [22] U. Michieli and P. Zanuttigh (2019-10) Incremental learning techniques for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Cited by: §2, §4.
  • [23] S. Özgün, A. Rickmann, A. G. Roy, and C. Wachinger (2020) Importance driven continual learning for segmentation across domains. In Machine Learning in Medical Imaging, M. Liu, P. Yan, C. Lian, and X. Cao (Eds.), Cham, pp. 423–433. External Links: ISBN 978-3-030-59861-7 Cited by: §4.
  • [24] O. S. Pianykh, G. Langs, M. Dewey, D. R. Enzmann, C. J. Herold, S. O. Schoenberg, and J. A. Brink (2020) Continuous learning ai in radiology: implementation principles and early applications. Radiology 297 (1), pp. 6–14. Note: PMID: 32840473 Cited by: §1.
  • [25] T. Prangemeier, C. Wildner, A. O. Françani, C. Reich, and H. Koeppl (2020) Multiclass yeast segmentation in microstructured environments with deep learning. In 2020 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Vol. , pp. 1–8. Cited by: §2.
  • [26] A. L. Simpson, M. Antonelli, S. Bakas, M. Bilello, K. Farahani, B. van Ginneken, A. Kopp-Schneider, B. A. Landman, G. Litjens, B. H. Menze, O. Ronneberger, R. M. Summers, P. Bilic, P. F. Christ, R. K. G. Do, M. Gollub, J. Golia-Pernicka, S. Heckers, W. R. Jarnagin, M. McHugo, S. Napel, E. Vorontsov, L. Maier-Hein, and M. J. Cardoso (2019) A large annotated medical image dataset for the development and evaluation of segmentation algorithms. CoRR abs/1902.09063. External Links: 1902.09063, Link Cited by: §4.
  • [27] G. Sokar, D. C. Mocanu, and M. Pechenizkiy (2021) Learning invariant representation for continual learning. CoRR abs/2101.06162. External Links: 2101.06162 Cited by: §2.
  • [28] K. A. van Garderen, S. V. D. Voort, F. Incekara, M. Smits, and S. Klein (2019) Towards continuous learning for glioma segmentation with elastic weight consolidation. ArXiv abs/1909.11479. Cited by: §2.
  • [29] X. Yu, Z. Ying, and G. Li (2018) Multi-mapping image-to-image translation with central biasing normalization. CoRR abs/1806.10050. External Links: 1806.10050 Cited by: §3.