Deep Recurrent Level Set for Segmenting Brain Tumors

10/10/2018 ∙ by T. Hoang Ngan Le, et al. ∙ Carnegie Mellon University

Variational Level Set (VLS) has been a widely used method in medical segmentation. However, the segmentation accuracy of VLS decreases dramatically in the presence of intervening factors such as lighting, shadows and colors. Additionally, results are quite sensitive to initial settings and highly dependent on the number of iterations. To address these limitations, the proposed method incorporates VLS into deep learning by defining a novel end-to-end trainable model called Deep Recurrent Level Set (DRLS). The proposed DRLS consists of three types of layers, i.e., convolutional layers, deconvolutional layers with skip connections, and LevelSet layers. Brain tumor segmentation is taken as an instance to illustrate the performance of the proposed DRLS. The convolutional layers learn visual representations of the brain tumor at different scales. Since brain tumors occupy a small portion of the image, the deconvolutional layers are designed with skip connections to obtain a high-quality feature map. The LevelSet layer drives the contour towards the brain tumor. In each step, the convolutional layers are fed with the LevelSet map to obtain a brain tumor feature map, which in turn serves as input for the LevelSet layer in the next step. Experimental results have been obtained on the BRATS 2013, BRATS 2015 and BRATS 2017 datasets. The proposed DRLS model improves both computational time and segmentation accuracy compared to the classic VLS-based method. Additionally, as a fully end-to-end system, DRLS achieves state-of-the-art segmentation on brain tumors.




1 Introduction

According to CBTRUS (Central Brain Tumor Registry of the United States), an estimated 78,980 new cases of primary malignant and non-malignant brain and other CNS tumors are expected to be diagnosed in the United States in 2018. Moreover, 16,616 deaths are likely to be attributed to primary malignant brain and other CNS tumors in the US in 2018. Magnetic resonance imaging (MRI) and computed tomography (CT) scans provide high-resolution images of the brain. Depending on the degree of excitation and the repetition times, different modalities of MRI images may be obtained, i.e., fluid attenuation inversion recovery (FLAIR), spin-lattice relaxation T1-weighted (T1), spin-spin relaxation T2-weighted (T2) and T1-weighted contrast-enhanced with gadolinium-DTPA (T1C). These modalities are highly useful in detecting the different subregions of a brain tumor, namely edema (whole tumor), non-enhancing solid core (tumor core), necrotic/cystic core and enhancing core. Manual detection and segmentation of brain tumors for cancer diagnosis, from the large amounts of MRI images generated during clinical routine, is a difficult and time-consuming task. Automatic brain tumor segmentation from MRI is therefore of substantial importance for diagnosis and radiotherapy.

A fundamental difficulty with segmenting brain tumors automatically is that they can appear anywhere in the brain and vary in shape, size and structure. Additionally, brain tumors, along with their surrounding edema, are often diffuse, poorly contrasted, and have extended tentacle-like structures. The VLS method with Active Contouring (AC) is widely applied in image segmentation [1] due to its ability to automatically handle such topological changes. Some remarkable works on brain tumor segmentation [2], [3], [4] have shown the potential of VLS in achieving highly accurate results. However, the segmentation accuracy of VLS-based methods drops dramatically in the presence of intervening factors such as lighting, shadows, colors and backgrounds with large variety or complexity, and MRI images contain such factors. The limitations of the VLS approaches can be summarized as follows. Firstly, VLS methods are largely handicapped in capturing variations of real-world objects because they depend solely on pixel values. Secondly, VLS methods fail to memorize and fully infer target objects since they have no learning capability. Thirdly, VLS-based methods are limited in segmenting multiple objects with semantic information. Finally, the segmentations generated by LS methods are quite sensitive to numerous predefined parameters, such as the initial contour and the number of iterations.

To overcome the limitations of VLS in brain tumor segmentation, our motivation is to answer the following questions. (1) VLS provides accurate segmentations but depends on parameters, while deep neural network algorithms are able to learn parameters. How can LS be incorporated into deep learning to inherit the merits of both the LS algorithm and deep learning? (2) Most deep learning-based semantic image segmentation methods perform the segmentation task using a softmax function. Is it possible to replace the softmax function by an LS energy minimization to get a better outcome on MRI images? If so, how is curve evolution performed within the forward and backward processes of the deep learning framework? (3) In LS-based approaches, the foreground-background separation depends on the zero LS function, which is computed by a sequence of iterations. How are such iterative processes performed in the deep framework?

To address these issues and turn the classic VLS methods into learnable deep network approaches, we propose a new formulation of VLS integrated in a deep framework, called the Deep Recurrent Level Set (DRLS), which combines the advantages of both the fully convolutional network (FCN) [5] and the LS method [6]. For MRI images, local, global, and contextual information is important to obtain a high-quality feature map. To achieve this goal, the proposed DRLS is designed by incorporating LevelSet layers into VGGNet-16 [7], giving three types of layers: convolutional layers, deconvolutional layers and LevelSet layers. The proposed DRLS contains two main parts corresponding to visual representation and curve evolution. The first part extracts features using a Fully Convolutional Network (FCN) while incorporating skip connections to up-sample the feature map. The second part is composed of a LevelSet layer that drives the contour such that the energy function attains a minimum, as shown in Fig. 1. Notably, our target is to show that it is entirely possible to promote VLS to a learnable framework.

2 Literature review

Over the years, discriminative, generative and deep learning methods have been used to segment brain tumors from MRI images. The following is a brief description of such methodologies.

Classical segmentation methods

Discriminative methods mostly use supervised learning techniques to learn the relationship between the input image and the ground truth by learning features. Anitha et al. [8] proposed segmentation using adaptive pillar K-means, followed by extraction of crucial features from the segmented image using discrete wavelet transforms. The features are passed through two-tier classifiers, namely a k-Nearest Neighbor classifier (k-NN) and self-organizing maps (SOM). Dimah et al. [9] proposed a level set based approach for tumor segmentation using histogram-based clustering. The method also provides a local statistical characterization of the image by integrating the probabilistic non-negative matrix factorization (PNMF) framework into the level set formulation. The model showed more robustness to noise and intensity inhomogeneity. A few papers make use of Support Vector Machines, such as [10] and [11]; [11] additionally incorporates conditional random fields to refine the segmentation. Tustison et al. [12] used asymmetry and first-order statistical features to train concatenated Random Forests (RF), introducing the output of the first RF as an input to the second.

Level-Set Methods

Some of the initial works that utilized level sets for brain tumor segmentation are [2] and [3]. [2] combined level set evolution and global smoothness with the flexibility of topology changes, followed by mathematical morphology, thus achieving significant advantages over conventional statistical classification. The algorithm was evaluated based on volume overlap and Hausdorff distance. A major challenge of level set algorithms is setting the equation parameters, more specifically the speed function. [3] introduced a threshold-based scheme that uses level sets for 3D tumor segmentation (TLS). A global threshold is used to design the speed function, which is defined based on a confidence interval and is iteratively updated throughout the evolution process, using search-based and adaptive schemes that require different degrees of user involvement. Thapaliya et al. [4] introduced a new signed pressure function (SPF) that can efficiently stop the contours at weak or blurred edges. The algorithm differentiates tumors from the rest of the image using local statistics. Additionally, the calculation of the basic thresholds, and therefore of the different parameters for different types of images, was automatic.

Deep Learning Methods

In 2015, the top finisher of the BRATS 2015 challenge was the first to apply Convolutional Neural Networks (CNN) to brain tumor segmentation [13]. The proposed CNN architecture exploits both local features and more global contextual features simultaneously and was 30 times faster than the then state-of-the-art solutions. Additionally, the architecture uses a convolutional implementation of a fully connected layer, allowing a 40-fold speed-up. Urban et al. [14] proposed a 3D CNN architecture which extracts 3D voxel patches from different brain MRI modalities. The tissue label of the center voxel is predicted by feeding 3D voxels into a 4-layered CNN architecture. To avoid the high computational cost of 3D voxels, Zikic et al. [15] transformed the 4D data into 2D data so that standard 2D-CNN architectures can be used to solve the brain tumor segmentation task.

Recently, [16] evaluated an 11-layered CNN architecture on the BRATS dataset using small 3 x 3 filters in the convolutional layers and reported comparable Dice scores. To improve performance and overcome the limitation of training data, CNNs have also been combined with other classification or clustering methods [17]. One of the state-of-the-art deep learning-based approaches for segmenting brain tumors, called DMRes, was developed by [18]; it is an improvement of Deep Medic [19].

3 Proposed Network

The pipeline of the proposed network is illustrated in Figure 1. In this section, the formulation of the traditional LS is first reviewed. Then, the proposed Recurrent Fully Convolutional Neural Network (RFCN) is introduced. Finally, the proposed end-to-end Deep Recurrent Level Set (DRLS), which incorporates VLS into the RFCN, is detailed.

(a) The proposed DRLS network with two main parts: visual representation by recurrent FCN and curve evolution by the proposed LevelSet layer
(b) Pseudo code of the proposed network
Figure 1: The proposed DRLS network and algorithm

3.1 Formulation of Level Sets

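Consider a binary image segmentation problem in 2D space, $\Omega$. The boundary $C$ of an open set $\omega \subset \Omega$ is defined as $C = \partial\omega$. In the VLS framework, the boundary $C$ can be represented by the zero level set of a function $\phi$ as follows:

$$\forall (x, y) \in \Omega:\quad \begin{cases} C = \{(x, y) : \phi(x, y) = 0\} \\ \text{inside}(C) = \{(x, y) : \phi(x, y) > 0\} \\ \text{outside}(C) = \{(x, y) : \phi(x, y) < 0\} \end{cases} \tag{1}$$

For image segmentation, $\Omega$ denotes the entire domain of an image I. The zero LS function $\phi$ divides the region $\Omega$ into two regions: the region inside the contour (foreground), denoted inside(C), and the region outside the contour (background), denoted outside(C). The length of the contour C is defined as

$$\text{Length}(C) = \int_\Omega |\nabla H(\phi(x, y))|\,dx\,dy = \int_\Omega \delta(\phi(x, y))\,|\nabla \phi(x, y)|\,dx\,dy$$

and the area inside the contour C is defined as

$$\text{Area}(C) = \int_\Omega H(\phi(x, y))\,dx\,dy,$$

where $H(\cdot)$ is the Heaviside function and $\delta(\cdot)$ is its derivative, the Dirac delta function.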
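As a rough numerical illustration of these definitions, the sketch below evaluates the area and length integrals on a pixel grid for a circular contour. It is not taken from the paper: the arctangent-smoothed Heaviside function, the smoothing width `eps`, the sign convention (positive inside) and all helper names are assumptions chosen for illustration.

```python
import math

def heaviside(phi, eps=1.0):
    """Smoothed Heaviside step (assumed arctangent form); sharper as eps shrinks."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(phi / eps))

def dirac(phi, eps=1.0):
    """Derivative of the smoothed Heaviside, i.e. a smoothed Dirac delta."""
    return (eps / math.pi) / (eps * eps + phi * phi)

def signed_distance_circle(size, radius):
    """phi > 0 inside a centred circle, phi < 0 outside, phi = 0 on the contour."""
    c = (size - 1) / 2.0
    return [[radius - math.hypot(x - c, y - c) for x in range(size)]
            for y in range(size)]

def area(phi, eps=1.0):
    """Area(C): sum of H(phi) over all pixels."""
    return sum(heaviside(v, eps) for row in phi for v in row)

def length(phi, eps=1.0):
    """Length(C): sum of delta(phi) * |grad phi|, using central differences."""
    n = len(phi)
    total = 0.0
    for y in range(1, n - 1):
        for x in range(1, n - 1):
            gx = (phi[y][x + 1] - phi[y][x - 1]) / 2.0
            gy = (phi[y + 1][x] - phi[y - 1][x]) / 2.0
            total += dirac(phi[y][x], eps) * math.hypot(gx, gy)
    return total

phi = signed_distance_circle(size=65, radius=20.0)
print("area  :", area(phi), "vs pi*r^2 =", math.pi * 20.0 ** 2)
print("length:", length(phi), "vs 2*pi*r =", 2 * math.pi * 20.0)
```

Because the Heaviside function is smoothed, the two sums only approximate the exact area and perimeter of the circle; the agreement improves on larger grids.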

Typically, LS-based segmentation methods start with an initial level set and a given image I. The LS update is performed via gradient descent by minimizing an energy function, which is defined based on the difference in image features, such as color and texture, between foreground and background. LS utilizes shape and region information to improve performance. Since LS uses only low-level features, it is limited when dealing with complex images. Deep networks can compensate for this limitation because they are able to learn and encode useful high-level features.

3.2 Recurrent Fully Convolutional Neural Network

CNNs are built from neural network blocks whose weights are shared across spatial locations, which makes them translation invariant. They have three main components, namely convolution, pooling, and activation functions. The output feature map is obtained by convolving convolution kernels with the input feature map of fixed size.

[20] extended the classic CNNs to infer and learn from arbitrary-sized inputs. Later, [5] proposed a Fully Convolutional Neural Network (FCN) model which adapts and extends deep classification architectures to learn efficiently from whole input images and whole ground truth images. By casting fully connected layers into convolutional layers with kernels that cover their entire input regions, FCN can take an input of any size and generate spatial outputs in one forward pass. To map the coarse feature map into a pixel-wise prediction of the input image, FCN up-samples the coarse feature map with a stack of deconvolution layers. Figures 2 (a-1, a-2) compare the classic CNN and the FCN.

The recurrent fully-convolutional network (RFCN) is an extension of the FCN architecture and is shown in Figure 2 (a-3). In the proposed RFCN, the output feature map of the current step is the input to the next step.
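The recurrence described here can be sketched as a plain loop in which the same function is applied at every step and each output is fed back in as the next input. This is only an illustration: `unroll` and `toy_step` are hypothetical names, and the 3-tap averaging filter merely stands in for the FCN, whose actual layers are not modelled.

```python
from typing import Callable, List

FeatureMap = List[float]

def unroll(step: Callable[[FeatureMap], FeatureMap],
           initial: FeatureMap, num_steps: int) -> List[FeatureMap]:
    """Apply `step` repeatedly, feeding each output back in as the next input."""
    outputs = []
    current = list(initial)
    for _ in range(num_steps):
        current = step(current)  # the same function (shared weights) at every step
        outputs.append(current)
    return outputs

def toy_step(x: FeatureMap) -> FeatureMap:
    """Hypothetical stand-in for the FCN: 3-tap average with replicated borders."""
    padded = [x[0]] + list(x) + [x[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(x))]

history = unroll(toy_step, [0.0, 0.0, 3.0, 0.0, 0.0], num_steps=3)
for t, feature_map in enumerate(history, start=1):
    print(t, feature_map)
```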

(a) Comparison of different deep models. (1) Convolution network. (2) Fully convolutional network. (3): recurrent fully convolutional network
(b) VGG-16 with 5 convolutional layers and 5 deconvolutional layers on which DRLS is built
Figure 2: The details of the proposed DRLS

3.3 Deep Recurrent Level Set (DRLS) - Proposed

To overcome the limitations of traditional LS, we incorporate LS into a deep network, which makes it more robust and powerful in dealing with complex images. The proposed DRLS is based on the observation that LS segmentation uses gradient descent, which fits seamlessly into a deep framework for brain tumor segmentation. The DRLS network is built on VGG-16 with three types of layers: convolutional, deconvolutional and LevelSet layers, as shown in Figure 1.

3.3.1 Convolutional Layer:

Convolutional layers form the building blocks of a CNN. Their weights are shared across spatial locations, which makes them translation invariant. The input and output of each convolutional layer is a feature map (tensor). The output feature map is computed by convolving the input feature map with convolution kernels:

$$F_s(X; W, b) = W *_s X + b$$

where $X$ is the input feature map, $W$ and $b$ are the convolution kernel and bias, and $*_s$ indicates convolution at a stride $s$. $F_s(\cdot; \theta)$ denotes the output feature map generated by the convolutional layers with a total stride of $s$ and parameterized by $\theta$. Because of the stride of the convolutional and pooling layers, the final output feature map is downsampled by a factor of the total stride $s$ compared to the input feature map.
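To make the downsampling effect of the stride concrete, here is a minimal 1D sketch. The function names and the averaging kernel are illustrative assumptions; the paper's layers are 2D and learned. The last line uses the fact that VGG-16 contains five stride-2 pooling stages.

```python
def conv1d(x, w, b=0.0, stride=1):
    """Valid 1D convolution (cross-correlation form) of x with kernel w at a stride."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) + b
            for i in range(0, len(x) - k + 1, stride)]

def total_stride(strides):
    """The total stride of stacked layers is the product of the per-layer strides."""
    total = 1
    for s in strides:
        total *= s
    return total

x = [float(v) for v in range(16)]
y1 = conv1d(x, [0.5, 0.5], stride=2)   # 16 samples -> 8 samples
y2 = conv1d(y1, [0.5, 0.5], stride=2)  # 8 samples -> 4 samples

print(len(x), len(y1), len(y2))        # lengths shrink by the total stride
print(total_stride([2, 2]))            # two stride-2 layers
print(total_stride([2] * 5))           # five stride-2 stages, as in VGG-16
```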

3.3.2 Deconvolutional Layer:

The deconvolutional layer is used to upsample the input feature maps using the stored max-pooling indices from the corresponding convolutional feature map. Here, a skip connection is introduced to concatenate the output of the deconvolutional feature map with the corresponding convolutional feature map. Figure 2 (b) illustrates the network's architecture.

A deconvolutional layer takes the output feature maps of the previous convolutional layer as its input feature maps and up-samples them by a factor equal to the stride. The output is then concatenated with the corresponding convolutional feature map via a skip connection. Unlike simple bilinear interpolation, the parameters of the deconvolutional layers are learned jointly with the rest of the network.
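The index-based upsampling and the skip connection can be sketched in 1D as follows. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the learned deconvolution kernels are omitted, and concatenation is modelled as stacking two channels.

```python
def max_pool_with_indices(x, size=2):
    """Non-overlapping max pooling that also records where each maximum came from."""
    pooled, indices = [], []
    for start in range(0, len(x) - size + 1, size):
        window = x[start:start + size]
        offset = max(range(size), key=window.__getitem__)
        pooled.append(window[offset])
        indices.append(start + offset)
    return pooled, indices

def max_unpool(pooled, indices, length):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [0.0] * length
    for value, idx in zip(pooled, indices):
        out[idx] = value
    return out

def concat_skip(upsampled, encoder_features):
    """Skip connection modelled as channel-wise concatenation of two maps."""
    return [upsampled, encoder_features]

x = [1.0, 3.0, 2.0, 0.0, 5.0, 4.0]
pooled, idx = max_pool_with_indices(x)
up = max_unpool(pooled, idx, len(x))
merged = concat_skip(up, x)

print(pooled, idx)
print(up)
print(len(merged), "channels of length", len(merged[0]))
```

In a real network the unpooled map would then pass through learned convolution kernels, which is what distinguishes this layer from fixed bilinear interpolation.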

3.3.3 LevelSet Layer

To incorporate the RFCN into the LS framework, the output feature maps $y$ of the FCN are converted to the range [-0.5, 0.5] via a Euclidean distance transformation, so that the result can be treated as a level set function $\phi$. The network is trained to minimize an energy function (Eq. 2) composed of five terms.


In Eq. 2, the first term measures the area inside the contour C, whereas the second term measures the length of the contour (the segmentation boundary). The first term is ignored by setting its weight to zero. Unlike traditional VLS, which chooses the weight of the length term for robustness to noise, here that weight is chosen to capture more information about the different shapes of a brain tumor. The third term involves the ground truth; minimizing it supervises the network to learn where a brain tumor occurs in the MRI images. The last two terms correspond to the energy inside and outside of the contour C. To force the feature map to be uniform both inside and outside of the brain tumor regions, the weights of these two terms are set to be positive, and $c_1$ and $c_2$ are two constants. To optimize the energy function, the calculus of variations is used.
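Read together, the five terms described above suggest an energy of the following general shape. This is a sketch under assumptions rather than the paper's exact equation: the first, second, fourth and fifth terms follow the standard Chan–Vese energy [6], while the weight names $\nu$, $\mu$, $\gamma$, $\lambda_1$, $\lambda_2$ and the squared-error form of the ground-truth term are assumed here for illustration, with $x$ denoting the feature map and $GT$ the ground truth mask.

```latex
E(c_1, c_2, \phi) =
    \nu \int_\Omega H(\phi)\, d\Omega                      % area inside C (assumed weight \nu, set to 0)
  + \mu \int_\Omega \delta(\phi)\, |\nabla \phi|\, d\Omega  % length of C (assumed weight \mu)
  + \gamma \int_\Omega |H(\phi) - GT|^2\, d\Omega           % supervision term (assumed form)
  + \lambda_1 \int_\Omega |x - c_1|^2\, H(\phi)\, d\Omega   % energy inside C
  + \lambda_2 \int_\Omega |x - c_2|^2\, (1 - H(\phi))\, d\Omega  % energy outside C
```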


Differentiating the energy function with respect to $c_1$ and $c_2$ while keeping the level set function fixed, and setting the derivatives to zero, gives $c_1$ and $c_2$ as the average values inside and outside of the contour, respectively.
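A minimal sketch of this averaging step is given below, assuming an arctangent-smoothed Heaviside function as the soft inside/outside indicator. The function names, the smoothing width `eps` and the flat list representation of the feature map are illustrative assumptions.

```python
import math

def heaviside(phi, eps=1.0):
    """Smoothed Heaviside step (assumed arctangent form)."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(phi / eps))

def region_means(features, phi, eps=1.0):
    """Return (c1, c2): Heaviside-weighted means inside and outside the contour."""
    h = [heaviside(p, eps) for p in phi]
    inside_weight = sum(h)
    outside_weight = sum(1.0 - w for w in h)
    c1 = sum(f * w for f, w in zip(features, h)) / inside_weight
    c2 = sum(f * (1.0 - w) for f, w in zip(features, h)) / outside_weight
    return c1, c2

features = [0.0, 0.0, 10.0, 10.0]   # toy feature map, flattened
phi = [-5.0, -5.0, 5.0, 5.0]        # negative outside, positive inside
c1, c2 = region_means(features, phi, eps=1e-3)
print(c1, c2)                       # close to the inside and outside means
```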


Keeping $c_1$ and $c_2$ fixed and minimizing the energy function with respect to the level set function $\phi$ yields the associated Euler–Lagrange equation for $\phi$. The descent direction is parameterized by an artificial time $t$, and the level set is updated iteratively by gradient descent.
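For intuition, the sketch below runs this kind of gradient-descent evolution on a 1D toy signal. It is not the paper's update rule: it keeps only the two region terms of a Chan–Vese style energy [6], i.e. it assumes the update phi <- phi + dt * delta(phi) * (-lam1 * (x - c1)^2 + lam2 * (x - c2)^2), and omits the length, area and ground-truth terms. The names `evolve`, `lam1`, `lam2`, `dt` and `eps` are illustrative assumptions.

```python
import math

def heaviside(phi, eps=1.0):
    """Smoothed Heaviside step (assumed arctangent form)."""
    return 0.5 * (1.0 + (2.0 / math.pi) * math.atan(phi / eps))

def dirac(phi, eps=1.0):
    """Derivative of the smoothed Heaviside."""
    return (eps / math.pi) / (eps * eps + phi * phi)

def region_means(x, phi, eps=1.0):
    """Heaviside-weighted means inside (c1) and outside (c2) the contour."""
    h = [heaviside(p, eps) for p in phi]
    c1 = sum(v * w for v, w in zip(x, h)) / sum(h)
    c2 = sum(v * (1.0 - w) for v, w in zip(x, h)) / sum(1.0 - w for w in h)
    return c1, c2

def evolve(x, phi, steps=20, dt=0.5, lam1=1.0, lam2=1.0, eps=1.0):
    """Alternate between updating c1, c2 and taking one descent step on phi."""
    phi = list(phi)
    for _ in range(steps):
        c1, c2 = region_means(x, phi, eps)
        phi = [p + dt * dirac(p, eps)
               * (-lam1 * (v - c1) ** 2 + lam2 * (v - c2) ** 2)
               for v, p in zip(x, phi)]
    return phi

x = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 0.0, 0.0]       # bright object at indices 3..5
phi0 = [-1.0, -1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # misaligned initial contour
phi = evolve(x, phi0)
print([p > 0 for p in phi])  # foreground mask after evolution
```

Starting from a contour that is shifted by one sample, the sign of the evolved level set moves onto the bright region, which is the behaviour the LevelSet layer relies on.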


Unlike traditional deep learning methods, the segmentation predicted by the proposed method is computed via the Heaviside function defined in Section 3.1. In the context of brain tumor segmentation, the foreground measurement is of higher interest, i.e., the FLAIR, T1, T2 or T1C value of each pixel in the given image. At the initial stage, the DRLS takes an input image I as its input feature maps and generates the output feature maps x. As shown in Figure 2(b), the proposed DRLS is built on VGG-16 with 5 convolutional layers and 5 deconvolutional layers. The DRLS forward procedure is given in Figure 1(b).

3.3.4 Implementation Details

Figure 2 (b) shows the structure of the proposed DRLS network, which is built upon a VGG16-based FCN. The proposed network is composed of three types of layers, i.e., convolutional, deconvolutional and LevelSet layers. ReLU layers are applied after each convolutional layer. The LevelSet layer takes the feature map produced by the FCN as its input. An initial learning rate of 1e-4 is used to update the weights, and it is reduced when validation performance stops improving. The network is implemented in the PyTorch framework in a Python environment and runs on a machine with a Core i7-6700 @3.4GHz CPU, 64.00 GB RAM and a single NVIDIA GTX Titan X GPU. To set up training and testing, the BRATS 2017 dataset is divided into an 80% training set and a 20% testing set.
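The learning-rate policy described here (start at 1e-4, reduce when validation performance stops improving) can be sketched in plain Python as follows. The reduction factor and the patience are assumptions, since the paper does not state them, and `PlateauScheduler` is a hypothetical name; in PyTorch, the built-in `torch.optim.lr_scheduler.ReduceLROnPlateau` implements the same idea.

```python
class PlateauScheduler:
    """Reduce the learning rate when the validation loss stops improving."""

    def __init__(self, lr=1e-4, factor=0.1, patience=2):
        self.lr = lr                # initial learning rate (1e-4 in the paper)
        self.factor = factor        # assumed multiplicative reduction
        self.patience = patience    # assumed number of tolerated bad epochs
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one validation result and return the learning rate to use next."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr

scheduler = PlateauScheduler()
for loss in [1.0, 0.8, 0.8, 0.8, 0.8]:
    print(loss, scheduler.step(loss))
```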

4 Experimental Results

Dataset & Measurements: The proposed DRLS method is evaluated on BRATS 2017 data. Additionally, the BRATS 2013 and BRATS 2015 datasets were used to compare its performance with state-of-the-art techniques. The BRATS 2017 dataset is provided by MICCAI for the automated brain tumor segmentation task. Each dataset contains two subsets corresponding to LGG (low grade gliomas, grade one and grade two) and HGG (high grade gliomas). The dataset is divided into training (80%) and testing (20%) sets. The network is first trained on the 168 HGG and 60 LGG cases of the BRATS 2017 training set and then fine-tuned on the training sets of BRATS 2015 and 2013, respectively.

Results: The algorithm was tested on 42 HGG and 15 LGG patients from BRATS 2017, 44 HGG and 11 LGG patients from BRATS 2015, and 9 HGG and 7 LGG patients from BRATS 2013 for comparison with other methods. The algorithm integrates the merits of both level sets and deep learning and performs on par with state-of-the-art methods when run on local machines.

Evaluation metrics: To evaluate the performance of the proposed method, the standard metrics suggested in the BRATS challenge [21] are used, namely the Dice score, sensitivity (Sens) and specificity (Spec). Besides metric scores, time consumption is also a key factor. Certain methods, such as Tustison et al. [12], take 100 minutes to compute predictions per brain, whereas the proposed algorithm, when run on 4 GPUs, shows a run time of just 55 seconds per patient. Overall, on the BRATS 2017 dataset, the DRLS algorithm achieved average Dice scores of 0.86, 0.89 and 0.77 for the whole tumor (WT), core tumor (CT) and enhancing core tumor (ET) regions, respectively, which puts it on an equal footing with other state-of-the-art methods. The sensitivity and specificity values achieved on the BRATS 2017 dataset are (0.89, 0.88, 0.91) and (0.88, 0.78, 0.73), respectively. Although the algorithms can be compared using Dice scores, the comparisons cannot be considered completely fair, since certain algorithms such as Urban et al. [14] and Chang et al. [22] do not report sensitivity values. Additionally, most of the algorithms do not report specificity values. Furthermore, [15] do not report results on the BRATS 2013 test dataset. A summary of the comparison on the BRATS 2013 and 2015 datasets is provided in Tables 1 and 2.
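For reference, the three metrics can be computed from binary masks as in the sketch below, using the usual definitions: Dice = 2TP / (2TP + FP + FN), sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The function names and the flat 0/1 list representation of the masks are illustrative assumptions; empty denominators are not handled.

```python
def confusion_counts(pred, truth):
    """Count true/false positives/negatives for two equal-length binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    return tp, fp, fn, tn

def dice(pred, truth):
    """Dice score: overlap between prediction and ground truth."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    return 2.0 * tp / (2.0 * tp + fp + fn)

def sensitivity(pred, truth):
    """Fraction of tumor pixels that were detected."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return tp / (tp + fn)

def specificity(pred, truth):
    """Fraction of non-tumor pixels that were correctly left out."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return tn / (tn + fp)

pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
print(dice(pred, truth), sensitivity(pred, truth), specificity(pred, truth))
```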

Methodology           Dice Score             Sensitivity            Specificity
                      WT     CT     ET       WT     CT     ET       WT     CT     ET
Havaei et al. [13]    0.88   0.79   0.73     0.87   0.79   0.80     0.89   0.79   0.68
Urban et al. [14]     0.87   0.77   0.73     0.92   0.79   0.70     -      -      -
Zikic et al. [15]     0.837  0.736  0.69     -      -      -        -      -      -
Pereira et al. [16]   0.88   0.83   0.77     0.89   0.83   0.81     -      -      -
Proposed Algorithm    0.89   0.79   0.74     0.90   0.89   0.93     0.91   0.82   0.73

Table 1: Performance of DRLS in comparison with other methods when tested on BRATS 2013 dataset

Table 1 and Table 2 show that the proposed algorithm is competitive with the other methods in terms of Dice score and sensitivity, and achieves the best value in several columns (for example, WT Dice and CT and ET sensitivity on BRATS 2013). The performance can be credited to the availability of additional training data from the BRATS 2017 dataset, which helped in fine-tuning the hyper-parameters of the algorithm.

Methodology           Dice Score             Sensitivity            Specificity
                      WT     CT     ET       WT     CT     ET       WT     CT     ET
Pereira et al. [16]   0.78   0.65   0.7      -      -      -        -      -      -
Pavel et al. [17]     0.83   0.75   0.77     -      -      -        -      -      -
Chang et al. [22]     0.87   0.81   0.72     -      -      -        -      -      -
Deep Medic [19]       0.896  0.754  0.718    0.903  0.73   0.73     -      -      -
DMRes [18]            0.896  0.763  0.724    0.922  0.754  0.763    -      -      -
Proposed Algorithm    0.88   0.82   0.73     0.91   0.76   0.78     0.90   0.81   0.71
Table 2: Performance of DRLS in comparison with other methods when tested on BRATS 2015 dataset

As mentioned above, the results reported for DRLS are the metrics obtained by testing on local machines on 20% of the BRATS datasets. The algorithm is robust to outliers, runs fast, and consistently shows improved core tumor segmentation.

5 Conclusion

In this paper, a novel algorithm for automatic brain tumor segmentation using deep recurrent level sets, which integrates the advantages of both deep learning and level sets, is proposed, and current state-of-the-art solutions are briefly reviewed. The results confirm that, by integrating level sets and recurrent FCN architectures, the proposed DRLS is a cutting-edge solution. Additionally, DRLS substantially improves the speed of segmenting brain tumors, making it a practical solution.


  • [1] S. Osher and J. A. Sethian, “Fronts propagating with curvature-dependent speed: algorithms based on hamilton-jacobi formulations,” Journal of computational physics, vol. 79, no. 1, pp. 12–49, 1988.
  • [2] S. Ho, E. Bullitt, and G. Gerig, “Level-set evolution with region competition: Automatic 3-d segmentation of brain tumors,” in ICPR, 2002.
  • [3] S.Taheri, S.H.Ong, and V.F.H.Chong, “Level-set segmentation of brain tumors using a threshold-based speed function,” Image and Vision Computing, vol. 28, no. 1, pp. 26–37, 2010.
  • [4] K. Thapaliya, J.-Y. Pyun, C.-S. Park, and G.-R. Kwon, “Level set method with automatic selective local statistics for brain tumor segmentation in mr images,” Computerized Medical Imaging and Graphics, vol. 37, no. 7, pp. 522 – 537, 2013.
  • [5] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 3431–3440, 2015.
  • [6] T. F. Chan and L. A. Vese, “Active contours without edges,” IEEE Transactions on Image Processing (TIP), vol. 10, no. 2, pp. 266–277, 2001.
  • [7] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
  • [8] Anitha.V and Murugavalli.S, “Brain tumor classification using two-tier classifier with adaptive segmentation technique,” IET Comput.Vis, vol. 10, no. 1, pp. 9–17, 2016.
  • [9] Dimah.D, Nidhal.B, and Hassan.M.F, “Assessing the non-negative matrix factorization level set segmentation on the brats benchmark,” in Proceedings MICCAI-BRATS Workshop 2016, 2016.
  • [10] Havaei.M, Larochelle.H, Poulin.P, and Jadoin.P.M, “Within-brain classification for brain tumor segmentation,” Int J Cars, vol. 11, pp. 777–788, 2016.
  • [11] Bauer.S, Nolte.L.P, and Reyes.M, Fully automatic segmentation of brain tumor images using support vector machine classification in combination with hierarchical conditional random field regularization, pp. 354–361. 2011.
  • [12] Tustison, Nicholas.J, Shrinidhi.K.L, Wintermark, Max, and D. et al, “Optimal symmetric multimodal templates and concatenated random forests for supervised brain tumor segmentation (simplified) with antsr,” Neuroinformatics, vol. 13, pp. 209–225, Apr 2015.
  • [13] M. Havaei, A. Davy, D. Warde-Farley, A. Biard, A. Courville, Y. Bengio, C. Pal, P.-M. Jodoin, and H. Larochelle, “Brain tumor segmentation with deep neural networks,” Medical image analysis, vol. 35, pp. 18–31, 2017.
  • [14] G. Urban, M. Bendszus, F. Hamprecht, and J. Kleesiek, “Multi-modal brain tumor segmentation using deep convolutional neural networks,” MICCAI BraTS (Brain Tumor Segmentation) Challenge. Proceedings, winning contribution, pp. 31–35, 2014.
  • [15] D. Zikic, Y. Ioannou, M. Brown, and A. Criminisi, “Segmentation of brain tumor tissues with convolutional neural networks,” Proceedings MICCAI-BRATS, pp. 36–39, 2014.
  • [16] S.Pereira, A.Pinto, V.Alves, and C.A.Silva, “Brain tumor segmentation using convolutional neural networks in mri images,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1240–1251, 2016.
  • [17] P. Dvořák and B. Menze, “Local structure prediction with convolutional neural networks for multimodal brain tumor segmentation,” in International MICCAI Workshop on Medical Computer Vision, pp. 59–71, Springer, 2015.
  • [18] K. Kamnitsas, C. Ledig, V. F. Newcombe, J. P. Simpson, A. D. Kane, D. K. Menon, D. Rueckert, and B. Glocker, “Efficient multi-scale 3d cnn with fully connected crf for accurate brain lesion segmentation,” Medical image analysis, vol. 36, pp. 61–78, 2017.
  • [19] “Efficient multi-scale 3d cnn with fully connected crf for accurate brain lesion segmentation,” Medical Image Analysis, vol. 36, pp. 61 – 78, 2017.
  • [20] O. Matan, C. J. Burges, Y. LeCun, and J. S. Denker, “Multi-digit recognition using a space displacement neural network,” in Advances in neural information processing systems, pp. 488–495, 1992.
  • [21] B. H. Menze, A. Jakab, Bauer, et al., “The multimodal brain tumor image segmentation benchmark (brats),” IEEE transactions on medical imaging, vol. 34, no. 10, pp. 1993–2024, 2015.
  • [22] P. D. Chang, “Fully convolutional neural networks with hyperlocal features for brain tumor segmentation,” in Proceedings MICCAI-BRATS Workshop 2016, pp. 4–9, 2016.