Adversarially Trained Convolutional Neural Networks for Semantic Segmentation of Ischaemic Stroke Lesion using Multisequence Magnetic Resonance Imaging

08/03/2019 ∙ by Rachana Sathish, et al. ∙ IIT Kharagpur 1

Ischaemic stroke is a medical condition caused by occlusion of blood supply to the brain tissue, thus forming a lesion. A lesion is zoned into a core associated with irreversible necrosis, typically located at the center of the lesion, while reversible hypoxic changes in the outer regions of the lesion are termed the penumbra. Early estimation of core and penumbra in ischaemic stroke is crucial for timely intervention with thrombolytic therapy to reverse the damage and restore normalcy. Multisequence magnetic resonance imaging (MRI) is commonly employed for clinical diagnosis. However, no single sequence has been found sufficient to differentiate between core and penumbra; a combination of sequences is required to determine the extent of the damage. The challenge, however, is that as the number of sequences increases, it becomes cognitively taxing for the clinician to discover symptomatic biomarkers in these images. In this paper, we present a data-driven, fully automated method for estimation of core and penumbra in ischaemic lesions using diffusion-weighted imaging (DWI) and perfusion-weighted imaging (PWI) sequence maps of MRI. The method employs recent developments in convolutional neural networks (CNN) for semantic segmentation in medical images. In the absence of a large amount of labeled data, the CNN is trained using an adversarial approach employing cross-entropy as a segmentation loss along with losses aggregated from three discriminators, of which two employ a relativistic visual Turing test. This method is experimentally validated on the ISLES-2015 dataset through three-fold cross-validation, obtaining an average Dice score of 0.82 and 0.73 for segmentation of penumbra and core respectively.




I Introduction

Cerebrovascular accident (CVA), more commonly known as stroke, is one of the most common causes of death and disability in the world [1]. It is characterized by a sudden focal neurological deficit due to cerebral infarction caused by poor vascular supply. Ischaemic stroke is the most common type of stroke [13]. It is caused by occlusion of a blood vessel, either due to atherosclerotic stenosis or by an embolus, which may arise from atherosclerosis in a large artery or be of cardiac origin. This perfusion deficit causes irreversible necrosis of a small area of cerebral tissue which becomes completely devoid of blood supply. This area is called the core of the lesion and is surrounded by an area of hypoperfusion, which develops a reversible functional impairment due to temporary hypoxia. If the perfusion is not restored, this surrounding area, which is called the penumbra of the lesion, undergoes delayed apoptosis over the following days and weeks to form a permanent structural lesion with irreversible loss of function [3]. Early intervention to restore perfusion to this salvageable area can reverse the impairment and prevent the extension of the lesion [4]. Therefore, one of the most crucial investigations in stroke lesion detection is evaluating the extent of the penumbra as compared to the core of the lesion, which helps the physician decide on interventions like thrombolytic therapy. Multisequence magnetic resonance imaging (MRI) is classically employed, especially using perfusion-weighted imaging (PWI) and diffusion-weighted imaging (DWI), where PWI indicates the region of hypoperfusion, as in the core and the penumbra, while DWI indicates the region of restricted diffusion, as in the core.

Fig. 1: Overview of the proposed method

Challenge: There is a wide range of variability in visual appearance across these sequences, which along with growth in the number of sequences makes symptomatic biomarker discovery for clinical use challenging.


In this context we propose a method for core and penumbra estimation that uses recent developments in deep learning (DL), namely adversarial learning of a convolutional neural network (CNN) for semantic segmentation, as illustrated in Fig. 1. The limited availability of annotated data in stroke segmentation makes it difficult to train deep neural networks for automated detection with good generalisability, and hence an adversarial approach is employed.

(a) Phase 1: Train with segmentation loss
(b) Phase 2: Train discriminator 1
(c) Phase 3: Train discriminator 2
(d) Phase 4: Train discriminator 3
(e) Phase 5: Train with adversarial loss
(f) Effect of including adversarial loss during training
Fig. 2: Training of the proposed framework using adversarial losses from three relativistic discriminators in addition to the segmentation loss, for boosting the ability to detect fine lesions.

Related work: Random Fields (RF) based techniques have been most commonly employed for penumbra segmentation [7, 5, 10] on the ISLES-2015 dataset. A patch-based stacked sparse auto-encoder for feature learning followed by a support vector machine (SVM) classifier has also been employed as a data-driven approach [9]. A multi-scale 11-layer deep 3D CNN with 3D conditional random field (CRF) based post-processing was proposed by [6]. More recently, several other approaches [14] based on modifications of UNet [11] have been proposed for later versions of the ISLES challenge on stroke lesion segmentation.

Organization: The problem statement is defined in Sec. II, with the proposed solution detailed in Sec. III. Sec. IV outlines the experiments conducted for validation of performance. Results and their discussion are presented in Sec. V, and Sec. VI concludes the work.

II Problem Statement

We consider the task of lesion segmentation to be carried out on a per-slice basis. Given all the MRI sequences for a particular slice, they are arranged as a tensor whose first dimension is the number of MRI sequences and whose remaining two dimensions are the spatial size of each slice. We model the segmentation problem as a three-class semantic segmentation where each pixel is classified as belonging to {un-annotated brain tissue and background}, denoted as class 0, {penumbra}, denoted as class 1, or {core}, denoted as class 2.
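The input and label layout described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy array sizes and the rule that core overrides penumbra where both masks are set are all assumptions made for the example.

```python
import numpy as np

# Toy sizes chosen only for illustration (assumption, not from the paper).
NUM_SEQUENCES, HEIGHT, WIDTH = 3, 4, 4


def stack_sequences(slices):
    """Stack one 2D slice per MRI sequence into a (sequences, H, W) tensor."""
    return np.stack(slices, axis=0)


def make_label_map(penumbra_mask, core_mask):
    """Build a three-class label map: 0 = background / un-annotated tissue,
    1 = penumbra, 2 = core."""
    labels = np.zeros(penumbra_mask.shape, dtype=np.int64)
    labels[penumbra_mask.astype(bool)] = 1
    # Assumption: where both masks are set, core takes precedence.
    labels[core_mask.astype(bool)] = 2
    return labels


rng = np.random.default_rng(0)
slices = [rng.normal(size=(HEIGHT, WIDTH)) for _ in range(NUM_SEQUENCES)]
x = stack_sequences(slices)

penumbra = np.zeros((HEIGHT, WIDTH), dtype=bool)
penumbra[1:4, 1:4] = True
core = np.zeros((HEIGHT, WIDTH), dtype=bool)
core[2, 2] = True
y = make_label_map(penumbra, core)
```

Here `x` is the per-slice network input and `y` is the per-pixel target with values in {0, 1, 2}.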

TABLE I: Performance comparison with baselines. Columns: Dice, precision and recall, each reported for penumbra (Pen.) and core.

III Method

In Phase 1 the segmentation CNN in Fig. 2(a) predicts a segmentation map, a tensor whose first dimension is the number of tissue classes and whose remaining dimensions match the spatial size of the slice, and the objective is to minimize the cross entropy loss between this prediction and the ground truth (GT). Subsequently, in Phase 2 the first relativistic Turing test discriminator in Fig. 2(b) learns to identify the GT annotation for penumbra from the segmented map obtained from the segmentation CNN by minimizing a binary cross entropy loss. Similarly, in Phase 3 the second discriminator in Fig. 2(c) learns to identify the GT annotation of core from the segmented map by minimizing a binary cross entropy loss. In Phase 4 the third discriminator learns to predict which channel contains the penumbra when fed with shuffled channels in the input, as in Fig. 2(d), again minimizing a binary cross entropy loss. Finally, in Phase 5 the parameters of the segmentation CNN are optimized to minimize the adversarial loss and thus learn to produce a segmentation which closely resembles the GT. The impact of incorporating the adversarial losses in learning is visible in Fig. 2(f), where finer details of core and penumbra are evident in our segmentation approach.
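One plausible way to build the discriminator inputs and targets for Phases 2 to 4 is sketched below. The text does not spell out the input format, so the pairing scheme is an assumption: the GT and predicted maps are placed in two channels in random order and the discriminator target records which channel holds the GT (Phases 2 and 3) or the penumbra (Phase 4). All function names are hypothetical.

```python
import numpy as np


def relativistic_pair(gt_map, pred_map, rng):
    """Place GT and prediction in two channels in random order.
    Target is 1.0 when the GT sits in channel 0, else 0.0 (assumed convention)."""
    gt_first = bool(rng.integers(0, 2))
    channels = (gt_map, pred_map) if gt_first else (pred_map, gt_map)
    return np.stack(channels, axis=0), float(gt_first)


def shuffled_class_pair(penumbra_map, core_map, rng):
    """Place penumbra and core maps in two channels in random order.
    Target is 1.0 when the penumbra sits in channel 0, else 0.0 (assumed convention)."""
    pen_first = bool(rng.integers(0, 2))
    channels = (penumbra_map, core_map) if pen_first else (core_map, penumbra_map)
    return np.stack(channels, axis=0), float(pen_first)


def binary_cross_entropy(prob, target, eps=1e-7):
    """Binary cross entropy for one discriminator output in (0, 1)."""
    prob = float(np.clip(prob, eps, 1.0 - eps))
    return -(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))
```

In a training loop, the discriminator would receive the stacked pair and its sigmoid output would be scored against the target with `binary_cross_entropy`; in Phase 5 the same discriminator outputs would contribute to the adversarial loss for the segmentation CNN.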

Segmentation CNN: The segmentation CNN used is an encoder-decoder like architecture [8] with the encoder having layer definitions similar to that of VGG11 [12]. Concatenation of features across matched layers in the encoder and decoder is present in this architecture along with the passing of max pooling indices for up-sampling in the decoder. We additionally add batch normalization after each convolutional layer. The VGG11 like encoder is initialized with ImageNet pre-trained model weights.

Discriminator networks: Each of the three discriminators is a shallow convolutional neural network with five convolutional layers using identically sized kernels, interleaved with batch normalization layers and leaky ReLU non-linearity. Sigmoid activation is added to the last layer. The first layer has 32 channels. The number of channels in the subsequent layers is multiplied by a factor of 2.

IV Experiments

Dataset description: This method is experimentally validated using the Ischaemic Stroke Lesion Segmentation Challenge (ISLES) 2015. We have used the SPES dataset from the challenge, which consists of data from 30 subjects with an average of 70 slices per patient. Seven sequence maps, viz. T1c, T2, DWI, CBF, CBV, TTP and Tmax, are available for each patient. We have trained our network using only the DWI, TTP and Tmax maps. A whitening transform is performed on the slices using the mean and standard deviation of the corresponding sequences in the training set. Performance evaluation was conducted using 3-fold cross-validation. In each fold, 20 subjects were used for training, 5 for validation and 5 for testing.
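The whitening and fold split described above can be sketched as follows. The exact partitioning scheme is not stated in the text, so the rotation used in `make_folds` is an assumption, as are the function names and the small constant added to the standard deviation.

```python
import numpy as np


def make_folds(num_subjects=30, num_folds=3, n_val=5, n_test=5, seed=0):
    """Return per-fold subject indices with 20 train / 5 val / 5 test subjects.
    Assumption: each fold rotates a fixed random permutation so that the
    held-out (val + test) subjects differ between folds."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_subjects)
    held_out = n_val + n_test
    folds = []
    for k in range(num_folds):
        rolled = np.roll(perm, -k * held_out)
        folds.append({
            "test": rolled[:n_test],
            "val": rolled[n_test:held_out],
            "train": rolled[held_out:],
        })
    return folds


def whiten(volumes, train_idx, eps=1e-8):
    """Whiten data shaped (subjects, sequences, H, W) per sequence, using the
    mean and standard deviation of the training subjects only."""
    train = volumes[train_idx]
    mean = train.mean(axis=(0, 2, 3), keepdims=True)
    std = train.std(axis=(0, 2, 3), keepdims=True)
    return (volumes - mean) / (std + eps)
```

Computing the statistics on the training subjects alone keeps validation and test data from influencing the normalisation.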

Baselines: The following baselines are used for comparison of performance. BL1: SegNet [2] trained using only segmentation loss. BL2: SegNet trained using adversarial losses as employed in the proposed framework. BL3: SUMNet trained using segmentation loss only. The performance of the proposed method in comparison with the baselines is tabulated in Tab. I. The reported scores are the mean and standard deviation across the three folds on the held-out test set. Performance is evaluated using the Dice coefficient, precision, and recall.
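The three evaluation metrics can be computed per class from a predicted and a ground-truth label map, as in the sketch below. The function name and the conventions for empty classes (Dice of 1.0 when the class is absent from both maps, precision and recall of 0.0 when their denominators are zero) are assumptions for illustration.

```python
import numpy as np


def class_metrics(pred, gt, cls):
    """Dice, precision and recall for one class label in two label maps."""
    p = pred == cls
    g = gt == cls
    tp = int(np.logical_and(p, g).sum())
    fp = int(np.logical_and(p, ~g).sum())
    fn = int(np.logical_and(~p, g).sum())
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom > 0 else 1.0  # assumed empty-class convention
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return dice, precision, recall


# Toy label maps: 0 = background, 1 = penumbra, 2 = core.
gt = np.array([[0, 1, 1],
               [0, 2, 2],
               [0, 0, 0]])
pred = np.array([[0, 1, 0],
                 [0, 2, 2],
                 [0, 1, 0]])

penumbra_scores = class_metrics(pred, gt, cls=1)  # one hit, one miss, one false alarm
core_scores = class_metrics(pred, gt, cls=2)      # perfect overlap
```

Averaging such per-slice or per-subject scores over the test subjects of each fold gives the kind of summary reported in Tab. I.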

(a) Tmax
(b) TTP
(c) DWI
(d) GT
(e) SegNet Core - BL1
(f) SegNet Pen. - BL1
(g) SegNet Core - BL2
(h) SegNet Pen. - BL2
(i) SUMNet Core - BL3
(j) SUMNet Pen. - BL3
(k) Proposed Core
(l) Proposed Pen.
Fig. 3: Performance comparison for different baselines and our approach for estimating the core and penumbra using multisequence MRI. (a-c) denote the different sequences which constitute the input to the network, (d) represents the GT with black representing the unannotated class 0, gray representing penumbra as class 1 and white representing core as class 2. (e-f) represent the results of BL1, (g-h) of BL2, (i-j) of BL3 and (k-l) of our proposed approach. Red denotes under-segmentation, green denotes over-segmentation and white denotes proper segmentation.

Training setup: The segmentation network and the discriminators were trained using Adam optimizer for epochs, with a learning rate of without any augmentation of the dataset. In the adversarial loss, , and .

V Results and Discussion

Qualitative results for the proposed method on a sample slice from the test set are illustrated in Fig. 3 along with the sequences and the ground truth. The models compared were trained on the same fold of the data. It can be observed that segmentation performance improves with the proposed adversarial training framework, with a notable reduction in both over-segmentation and under-segmentation.

Also, it can be seen from Tab. I that the proposed method exhibits the least standard deviation in terms of Dice coefficient across different folds of the dataset. This indicates the generalization capability of the proposed framework despite being trained with few annotated samples.

VI Conclusion

Early estimation of the extent of penumbra is one of the most crucial aspects of stroke management. This delineation between core and penumbra helps the physician decide on thrombolytic therapy that could reverse the damage to the salvageable tissue. Traditional methods of acute stroke lesion estimation utilize distinct image processing algorithms and handcrafted machine learning-based feature extraction techniques to separately segment core and penumbra from DWI and PWI respectively. Our proposed method gives a unified framework that uses both diffusion and perfusion maps as inputs for deep learning based supervised learning of features to segment both core and penumbra with comparable accuracy. The limited availability of annotated diffusion and perfusion maps for the same patient has led to over-fitting of networks to the training data. This is mitigated by the use of adversarial learning.


  • [1] J. Adamson, A. Beswick, and S. Ebrahim (2004) Is stroke the most common cause of disability?. J. Stroke Cerebrovascular Diseases 13 (4), pp. 171–177. Cited by: §I.
  • [2] V. Badrinarayanan, A. Kendall, and R. Cipolla (2017) Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Patt. Anal. Mach. Intell. 39 (12), pp. 2481–2495. Cited by: §IV.
  • [3] U. Dirnagl, C. Iadecola, and M. A. Moskowitz (1999) Pathobiology of ischaemic stroke: an integrated view. Trends in Neurosciences 22 (9), pp. 391–397. Cited by: §I.
  • [4] W. Hacke, G. Donnan, C. Fieschi, M. Kaste, J. Broderick, T. Brott, M. Frankel, J. Grotta, J. E. Haley, T. Kwiatkowski, et al. (2004) Association of outcome with early stroke treatment: pooled analysis of atlantis, ecass, and ninds rt-pa stroke trials. The Lancet 363 (9411), pp. 768–774. Cited by: §I.
  • [5] H. Halme, A. Korvenoja, and E. Salli (2015) ISLES (siss) challenge 2015: segmentation of stroke lesions using spatial normalization, random forest classification and contextual clustering. In Int. Workshop on Brain Lesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, pp. 211–221. Cited by: §I.
  • [6] K. Kamnitsas, C. Ledig, V. F. Newcombe, J. P. Simpson, A. D. Kane, D. K. Menon, D. Rueckert, and B. Glocker (2017) Efficient multi-scale 3d cnn with fully connected crf for accurate brain lesion segmentation. Med. Image Anal. 36, pp. 61–78. Cited by: §I.
  • [7] O. Maier, B. H. Menze, J. von der Gablentz, L. Häni, M. P. Heinrich, M. Liebrand, S. Winzeck, A. Basit, P. Bentley, L. Chen, et al. (2017) ISLES 2015-a public evaluation benchmark for ischemic stroke lesion segmentation from multispectral mri. Med. Image Anal. 35, pp. 250–269. Cited by: §I.
  • [8] S. Nandamuri, D. China, P. Mitra, and D. Sheet (2019) SUMNet: fully convolutional model for fast segmentation of anatomical structures in ultrasound volumes. arXiv preprint arXiv:1901.06920. Cited by: §III.
  • [9] G. Praveen, A. Agrawal, P. Sundaram, and S. Sardesai (2018) Ischemic stroke lesion segmentation using stacked sparse autoencoder. Comp. Bio. Med. 99, pp. 38–52. Cited by: §I.
  • [10] S. M. Reza, L. Pei, and K. Iftekharuddin (2015) Ischemic stroke lesion segmentation using local gradient and texture features. In Proc. Int. Conf. Med. Image Comput. Computer Assisted Interv., pp. 23–26. Cited by: §I.
  • [11] O. Ronneberger, P. Fischer, and T. Brox (2015) U-net: convolutional networks for biomedical image segmentation. In Int. Conf. Medical Image Comput. Comp. Assist. Interv., pp. 234–241. Cited by: §I.
  • [12] K. Simonyan and A. Zisserman (2015) Very deep convolutional networks for large-scale image recognition. In Int. Conf. Learn. Repr., Cited by: §III.
  • [13] C. Warlow (1998) Epidemiology of stroke. The Lancet 352, pp. S1–S4. Cited by: §I.
  • [14] S. Winzeck, A. Hakim, R. McKinley, J. A. Pinto, V. Alves, C. Silva, M. Pisov, E. Krivov, M. Belyaev, M. Monteiro, et al. (2018) ISLES 2016 and 2017-benchmarking ischemic stroke lesion outcome prediction based on multispectral mri. Frontiers in Neurology 9. Cited by: §I.