1 Introduction
Glioblastoma (GBM) is a highly malignant brain tumor with a dismal prognosis [1]. Most patients experience disease progression within 7-10 months, and targeted therapies have not increased survival [2, 3]. Accurate brain tumor segmentation is therefore an important yet challenging task for computer-aided diagnosis. Semi-automatic segmentation remains a bottleneck in mining medical imaging data because it requires the involvement of human experts. Automated methods such as graph cuts tend to produce high-bias models that have not significantly improved accuracy [8].
While data-driven models such as Convolutional Neural Networks (CNNs) are increasingly prevalent [11, 12, 15, 17], high variance limits their use for medical image analysis because many medical data sets contain at most hundreds of patient samples. Strategies such as data augmentation and transfer learning can help produce more generalizable low-bias models [9, 10], but they do not consider network structures that incorporate domain knowledge specific to tumor MRI. In this paper, we identify two key weaknesses of previous approaches that apply CNN models to medium-sized imaging data sets. Canonically, CNNs use layers of 2D convolutions as filters for feature learning and feed the outputs of these convolutional layers into a fully connected neural network. We propose a 3D CNN model for brain tumor segmentation that generalizes the conventional 2D architecture to take full advantage of 3D multimodality MRI data, along with several important advances leading to accurate segmentation performance.
First, most prior methods for volumetric image data use either 2D convolutions or limited 3D convolutions restricted to the three orthogonal image planes. By contrast, we propose a true generalization to 3D CNNs, made computationally feasible by transformation into Fourier space. This innovation makes our system more robust, with minimal loss of spatial information during convolution.
Second, using CNN models with medium-sized data sets is likely to lead to high variance because there is too little training data to learn the network weights. Related algorithms use filters pretrained on ImageNet [11], but these 2D filters are optimized for object classification in real-world images rather than volumetric medical images [11, 12, 13]. Since texture filters have proven effective for image data analysis, in this study we perform 3D convolutions using predefined difference-of-Gaussian (DoG) filters, which are rotationally symmetric and act as effective blob detectors [14]. Subsequent CNN layers use 1×1×1 convolutions to decouple pixels, expanding the effective data size from the number of patients to the total number of pixels and significantly reducing variance. Our 3D CNN leverages the structure of medical imaging data to train a robust and efficient algorithm for learning from 3D images. We apply our framework to segmenting brain tumors and compare with previous approaches as well as with expert annotations.
2 Methods
2.1 3D Convolutional Neural Networks Architecture
Our CNN architecture utilizes 5 convolutional layers (Figure 1). Starting from inputs of volumetric image data in 4 MRI modalities (channels), the first layer performs a 3D convolution over all input channels with 72 predefined filters. We then train 4 convolutional layers of 1×1×1 filters over the channels of the preceding layer (72 for the first and 100 for all subsequent layers). The final output layer of 5 channels represents scores for the predicted class probabilities of each pixel as either non-tumor or one of 4 tumor subregions.
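As a rough sanity check on model size, the number of trained weights implied by this layer configuration (72 → 100 → 100 → 100 → 5 channels, 1×1×1 kernels) can be tallied directly; counting one bias per output channel is our assumption, not stated in the text:

```python
# Weight count for the four trained 1x1x1 layers described above,
# assuming channel sizes 72 -> 100 -> 100 -> 100 -> 5 and one bias per
# output channel (the bias convention is an assumption).
layers = [(72, 100), (100, 100), (100, 100), (100, 5)]
n_weights = sum(cin * cout + cout for cin, cout in layers)
print(n_weights)  # 28005
```

A few tens of thousands of parameters is small by CNN standards, which is consistent with the paper's variance-reduction argument.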
2.2 3D Convolution
Next, we minimized the potential bias of the CNN architecture by treating the volumetric image data as a 3D space of pixels. For a 3D image I and filter K, the usual 2D filters of CNNs generalize to 3D convolutions (Eq. 1) as defined below,

(I * K)(x, y, z) = \sum_{i,j,k} I(x - i, y - j, z - k) K(i, j, k).   (1)
For N×N×N images and M×M×M filters, the time complexity of direct 3D convolution is O(N³M³). Since convolution in space is equivalent to elementwise multiplication in Fourier space, this complexity can be reduced to O(N³ log N).
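This equivalence is easy to verify numerically; the sketch below (Python with NumPy/SciPy, our choice of tooling) checks a direct spatial 3D convolution against an FFT-based one on a small random volume:

```python
import numpy as np
from scipy.ndimage import convolve      # direct spatial convolution, O(N^3 M^3)
from scipy.signal import fftconvolve    # FFT-based convolution, O(N^3 log N)

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 32))   # toy 3D "image"
kernel = rng.standard_normal((5, 5, 5))     # toy 3D filter

direct = convolve(image, kernel, mode='constant', cval=0.0)
fast = fftconvolve(image, kernel, mode='same')  # elementwise product in Fourier space

# Both routes compute the same 3D convolution, up to floating-point error.
assert np.allclose(direct, fast, atol=1e-8)
```

For realistic volume sizes the FFT route dominates, which is what makes full 3D convolution tractable here.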
Previously, the first layer of trained CNNs has repeatedly been found to learn weights that resemble Gabor-like filters [10]. Thus, to save computational time during training, we preselect the first layer's filters to function as edge detectors. More specifically, we use 3D difference-of-Gaussian (DoG) filters, each represented by the difference of two normalized 3D Gaussians of scales σ₁ and σ₂, as defined in Eq. 2.
DoG(x, y, z) = G_{σ₁}(x, y, z) − G_{σ₂}(x, y, z), where G_σ(x, y, z) = (2πσ²)^{−3/2} exp(−(x² + y² + z²) / (2σ²)).   (2)
We created 8 such filters at a range of scales. Previous algorithms have shown the efficacy of DoG filters in blob detection [14]; in particular, their rotational symmetry enables the CNN to learn a blob profile for each pixel. By contrast, while Gabor texture filters have emerged as a common theme in deep learning on image data [12], their lack of rotational symmetry would require learning a full feature vector for each pixel at every possible orientation, greatly increasing learning complexity.
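A 3D DoG filter of this kind can be built directly from two sampled Gaussians; the size and scale values below are illustrative stand-ins, not the paper's settings:

```python
import numpy as np

def gaussian_3d(size, sigma):
    # Normalized isotropic 3D Gaussian sampled on a cubic grid.
    ax = np.arange(size) - (size - 1) / 2.0
    x, y, z = np.meshgrid(ax, ax, ax, indexing='ij')
    g = np.exp(-(x**2 + y**2 + z**2) / (2 * sigma**2))
    return g / g.sum()

def dog_3d(size, sigma1, sigma2):
    # Difference of two normalized Gaussians: a rotationally
    # symmetric band-pass filter that acts as a blob detector.
    return gaussian_3d(size, sigma1) - gaussian_3d(size, sigma2)

f = dog_3d(9, 1.0, 2.0)       # example size/scales (assumptions)
assert abs(f.sum()) < 1e-12   # zero-sum: flat regions respond with 0
# Symmetry up to grid sampling: axis permutations leave it unchanged.
assert np.allclose(f, f.transpose(1, 0, 2))
```

The zero-sum and symmetry checks are exactly the properties the text relies on: no response to constant intensity, and one blob profile per pixel regardless of orientation.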
Next, we apply the 8 DoG filters to the original input images as well as to the magnitude of their gradients. This yields 18 "feature" images per modality: the original pixel intensities and their 8 filter products, plus the gradient magnitudes and their 8 filter products. Applied across the 4 MRI modalities, this design produces a 72-dimensional feature space for each pixel. Overall, this non-trained convolutional layer amounts to a 3D convolution of the input data with 72 predefined filters.
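The 18-features-per-modality construction can be sketched as follows; the Sobel gradient operator and the random stand-in filters are our assumptions for illustration:

```python
import numpy as np
from scipy.ndimage import convolve, sobel

def features_for_modality(vol, filters):
    """1 intensity image + its filter responses, plus 1 gradient-magnitude
    image + its filter responses: 2 * (1 + len(filters)) feature images."""
    grad = np.sqrt(sum(sobel(vol, axis=a) ** 2 for a in range(3)))
    feats = [vol] + [convolve(vol, f) for f in filters]
    feats += [grad] + [convolve(grad, f) for f in filters]
    return np.stack(feats)

filters = [np.random.rand(5, 5, 5) for _ in range(8)]        # stand-ins for the 8 DoG filters
modalities = [np.random.rand(16, 16, 16) for _ in range(4)]  # 4 MRI modalities
stack = np.concatenate([features_for_modality(v, filters) for v in modalities])
assert stack.shape == (72, 16, 16, 16)  # 18 features x 4 modalities = 72 channels
```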
2.3 Subsequent Convolution Layers as Pixelwise Neural Network
Each subsequent convolution layer consists of 1×1×1 kernels over all input channels. This choice, which enables training on a CPU cluster, is motivated by two benefits: (1) a drastic decrease in the number of weights to be trained, and (2) decoupling of pixels, which allows a fully connected neural-network implementation of the trained convolution layers. This decoupling is possible because the 1×1×1 convolution layers and the softmax loss function operate independently for each pixel.
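The decoupling argument can be checked directly: a 1×1×1 convolution over channels is exactly a per-voxel matrix multiply, so the trained layers can be applied independently at every voxel. A minimal sketch under assumed channel sizes (72 in, 100 out):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((72, 8, 8, 8))   # 72 feature channels over a toy volume
W = rng.standard_normal((100, 72))       # 1x1x1 kernel: one weight per channel pair

# "Convolution" view: apply W at every voxel position.
conv_out = np.einsum('oc,cdhw->odhw', W, x)

# Fully connected view: flatten voxels into rows, one matmul for all of them.
mlp_out = (x.reshape(72, -1).T @ W.T).T.reshape(100, 8, 8, 8)

assert np.allclose(conv_out, mlp_out)  # identical up to floating point
```

This identity is what expands the effective training set from patients to voxels: each voxel's 72-vector is an independent sample for the fully connected network.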
Thus, given the 72-dimensional feature vector produced for each pixel by the first convolutional layer, we classify each pixel using a fully connected neural network. Our network consists of 3 hidden layers of 100 neurons each, with rectified linear units (ReLU) as the activation function and the softmax function as the loss function. The output layer of five neurons predicts the following classes: non-tumor, necrosis, edema, non-enhancing tumor, and enhancing tumor. The final classification step follows a voting algorithm as described previously for pooling expert segmentations from the BRATS data set [16].
3 Experiments
3.1 Brain Tumor MRI Data
We used the Brain Tumor Image Segmentation Challenge (BRATS) [16] to evaluate the performance of the proposed approach. The 2015 BRATS data set consists of 274 samples: 220 patients with high-grade GBM (HGG) and 54 with low-grade GBM (LGG). Each patient has 4 modalities (T1 post-contrast, T1 pre-contrast, T2-weighted, and FLAIR) and an expert segmentation that we treat as ground truth. The expert segmentation provides pixel-wise labeling into five classes based on the consensus of eleven radiologists: non-tumor, necrosis, edema, non-enhancing tumor, and enhancing tumor. We additionally included the BRATS 2013 data set to compare with prior studies. All images were preprocessed by stripping the skull, co-registering the images, and interpolating them to isotropic resolution (Figure 2).
3.2 Evaluation
We evaluated our algorithms on three clinically relevant segmentations: "whole" or "total," referring to the entire tumor; "core," including all structures except "edema"; and "active," including only the "enhancing" subregions unique to HGG [16]. For each of these three regions, accuracy is reported using the Dice coefficient, comparing the predicted segmentation with the expert reference and with previously developed algorithms. This score, given in Eq. 3, is equivalent to the harmonic mean of precision and recall.

Dice(P, T) = 2 |P ∩ T| / (|P| + |T|)   (3)
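As a concrete check of the harmonic-mean equivalence, a minimal Dice implementation over boolean masks:

```python
import numpy as np

def dice(pred, truth):
    # Dice = 2 |P ∩ T| / (|P| + |T|), computed on boolean masks.
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

pred = np.array([1, 1, 0, 1, 0], dtype=bool)   # toy predicted mask
truth = np.array([1, 0, 0, 1, 1], dtype=bool)  # toy reference mask

precision = np.logical_and(pred, truth).sum() / pred.sum()
recall = np.logical_and(pred, truth).sum() / truth.sum()

# Dice equals the harmonic mean of precision and recall.
assert np.isclose(dice(pred, truth), 2 * precision * recall / (precision + recall))
```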
4 Results
The performance of the 3D CNN for "total," "core," and "active" tumor regions on the 2015 data set is shown in Figure 3, with a high median accuracy on total tumor detection that compares favorably with inter-radiologist reproducibility. Slice-level comparisons of our algorithm's labels with the expert segmentation are shown for representative samples spanning a range of Dice scores (Figure 4).
Table 1 compares the performance of our algorithm with expert segmentations and competing methods for brain tumor segmentation. Overall, our algorithm is highly competitive with prior approaches. First, our results are comparable with annotations by individual radiologists and approach those of the expert segmentation generated by a voting algorithm [16]. Additionally, we evaluated our method on the 2013 BRATS data set against the best combination of programs from the 2013 BRATS challenge [16]. While each individual program from the 2013 challenge performs worse than the combination, our algorithm, trained only on the 2013 data, matches or exceeds the combination in all three categories. Finally, we compared our method with other methods on the 2015 data [18]. Our algorithm achieves Dice scores for whole, core, and active tumor detection of 87%, 76%, and 80%, with the highest performance in two of the three clinically used regions. The similar outcomes on the 2015 and 2013 BRATS data reaffirm the strong performance of the proposed 3D CNN model, with a notable improvement in classification accuracy for active tumor regions.
Method | Description | Whole (HGG/LGG) | Core (HGG/LGG) | Active
Rater v. Rater | Comparison between radiologists using 2013 BRATS challenge data. | 85 (88/84) | 75 (95/67) | 74
Rater v. Fused | Comparison between radiologists and fused segmentation. | 91 (93/92) | 86 (96/80) | 85
Combination | The best combination of 2013 BRATS challenge programs using Algorithm 1. | 88 (89/86) | 78 (82/66) | 71
This work | 3D Convolutional Neural Network, using 2013 data set. | 89 (89/88) | 78 (79/74) | 71
Davy | Deep neural networks. 2014 Workshop. | 85 | 74 | 68
Goetz | Extremely randomized trees. 2014 Workshop. | 83 | 71 | 68
Kleesiek | ilastik algorithm. 2014 Workshop. | 84 (84/82) | 68 (71/61) | 72
Kwon | GLISTR algorithm. 2014 Workshop. | 88 | 83 | 72
Meier | Appearance and context-sensitive features. 2014 Workshop. | 83 (84/-) | 66 (73/-) | 68
This work | 3D Convolutional Neural Network, using 2015 data set. | 89 (89/87) | 76 (79/69) | 80
5 Conclusion
We have proposed a 3D fully convolutional network that generalizes conventional CNNs to learning from 3D tumor MRI data. Specifically, we first use a non-trained convolutional layer with predefined DoG filters to perform a true 3D convolution that incorporates information about the local neighborhood of each pixel of the output. We then use trained 1×1×1 convolutional layers that act to decouple voxels, under the assumption that voxels are coupled only through the information already incorporated by the initial 3D convolution. This architecture of a fully connected neural network at the level of pixels allows us to greatly increase the effective training data size, from the number of patient samples to the number of pixels. We show that the use of a non-trained initial convolutional layer greatly reduces variance by increasing the number of training samples. Patient-based samples could in principle support complex features relating distant parts of the brain, but voxel-based training data allows the fully connected feed-forward neural network to learn higher-level features from a much larger training set in pixel space. Overall, our generalization to a 3D CNN incorporates several key innovations that address problems with existing approaches to deep learning on medium-sized imaging data sets.
6 Acknowledgement
This work was supported by the National Institutes of Health (NIH) under Award Number R01EB020527.
7 References
[1] Adamson C, et al. Glioblastoma multiforme: a review of where we have been and where we are going. Expert Opin. Investig. Drugs 18(8) (2009) 1061-1083.
[2] Omuro A, DeAngelis LM. Glioblastoma and Other Malignant Gliomas: A Clinical Review. JAMA 310(17) (2013) 1842-1850.
[3] Omuro AM, Faivre S, Raymond E. Lessons learned in the development of targeted therapy for malignant gliomas. Mol. Cancer Ther. 6(7) (2007) 1909-1919.
[4] Barajas RF, et al. Glioblastoma Multiforme Regional Genetic and Cellular Expression Patterns: Influence on Anatomic and Physiologic MR Imaging. Radiology 254(2) (2010) 564-576.
[5] Gutman DA, et al. MR Imaging Predictors of Molecular Profile and Survival: Multi-institutional Study of the TCGA Glioblastoma Data Set. Radiology 267(2) (2013) 560-569.
[6] Jain R, et al. Genomic Mapping and Survival Prediction in Glioblastoma: Molecular Subclassification Strengthened by Hemodynamic Imaging Biomarkers. Radiology 267(1) (2013) 212-220.
[7] Birkbeck N, et al. An Interactive Graph Cut Method for Brain Tumor Segmentation. IEEE WACV (2009) 1-7.
[8] Njeh I, et al. 3D multimodal MRI brain glioma tumor and edema segmentation: A graph cut distribution matching approach. Comput. Med. Imag. Graph. 40 (2015) 108-119.
[9] Cui X, Goel V, Kingsbury B. Data Augmentation for Deep Neural Network Acoustic Modeling. IEEE ICASSP (2014) 5582-5586.
[10] Yosinski J, et al. How transferable are features in deep neural networks? Adv. Neural Inf. Process. Syst. 27 (2014) 3320-3328.
[11] Deng J, et al. ImageNet: A Large-Scale Hierarchical Image Database. IEEE Comput. Vision and Pattern Recognit. (2009) 248-255.
[12] Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. (2012) 1-9.
[13] Bar Y, et al. Chest Pathology Detection Using Deep Learning with Non-Medical Training. IEEE International Symposium on Biomedical Imaging (ISBI) (2015).
[14] Lowe DG. Object recognition from local scale-invariant features. Proc. Int. Conf. Comput. Vision 2 (1999) 1150-1157.
[15] Lawrence S, et al. Face Recognition: A Convolutional Neural-Network Approach. IEEE Trans. on Neural Networks 8(1) (1997) 98-113.
[16] Menze BH, et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Trans. Med. Imag. 34(10) (2015) 1993-2024.
[17] Wei S, et al. Multi-scale convolutional neural networks for lung nodule classification. International Conference on Information Processing in Medical Imaging (2015) 588-599.
[18] BraTS Challenge Manuscripts. MICCAI (2014).