Crossbar-Net: A Novel Convolutional Network for Kidney Tumor Segmentation in CT Images

04/27/2018 ∙ by Qian Yu, et al.

Due to the irregular motion, similar appearance and diverse shape, accurate segmentation of kidney tumors in CT images is a challenging task. To this end, we present a novel automatic segmentation method, termed Crossbar-Net, with the goal of accurately segmenting kidney tumors. First, considering that traditional learning-based segmentation methods normally employ either whole images or square patches as training samples, we sample orthogonal non-square patches (namely crossbar patches) that fully cover the whole kidney tumor in either the horizontal or the vertical direction. These crossbar patches not only represent the detailed local information of the kidney tumor, as traditional patches do, but also describe its global appearance from the horizontal or vertical direction using contextual information. Second, with the obtained crossbar patches, we train a convolutional neural network with two sub-models (a horizontal sub-model and a vertical sub-model) in a cascaded manner, to integrate the segmentation results from the two directions. This cascaded training strategy helps ensure consistency between the sub-models by feeding each of them with the most difficult samples of the other, for a better segmentation. In the experiments, we evaluate our method on a real CT kidney tumor dataset of 3,500 images collected from 94 patients. Compared with state-of-the-art segmentation methods, our method achieves superior results in Dice ratio score, true positive fraction, centroid distance and Hausdorff distance. Moreover, we have extended Crossbar-Net to a different task, cardiac segmentation, with promising results that indicate good generalization.




1 Introduction

Quantification and correct classification of renal tumors strongly influence computer-aided diagnosis and treatment of renal cell carcinoma, for which the most significant prerequisite is accurate kidney tumor segmentation. According to clinical observation, the locations of kidney tumors in medical images (e.g., CT or MR) are very difficult to predict, since tumors can appear in very different places in different patients. Moreover, tumors normally have diverse shapes and sizes, and most of them look similar to the renal parenchyma and other surrounding tissues. Therefore, segmenting kidney tumors in CT images is a very challenging task. Although several methods for kidney tumor segmentation have been proposed recently [Lee et al.2017, Hodgdon et al.2015, Linguraru et al.2009, Linguraru et al.2011, Skalski et al.2016], their segmentation performance is far from meeting clinical requirements.

Deep convolutional neural networks (CNNs) have recently demonstrated superior performance on learning-based segmentation problems in different imaging modalities and for different organs, e.g., prostate [Shi et al.2017], lung nodule [Wang et al.2017] and brain [Havaei et al.2017]. Unfortunately, to the best of our knowledge, CNN-based segmentation methods have not yet been adopted to segment kidney tumors in CT images. Moreover, directly applying traditional CNN-based methods (e.g., image-based methods [Ronneberger et al.2015] and patch-based methods [Wang et al.2017]) is unsuitable for segmenting kidney tumors. Among the image-based methods, U-net [Ronneberger et al.2015] is the representative model, which builds a hierarchical multi-scale structure for medical image segmentation; however, it is not very suitable for our case according to our experimental validation. For the patch-based methods, the major reason is that the obtained patches and the subsequent learning process cannot fully describe the shape and appearance of kidney tumors. In CT images, kidney tumors normally appear subrounded, with a certain degree of symmetry, and square patches fail to capture local and contextual information at the same time. For example, the red and yellow squares in Fig. 1 are square patches. The tumor and surrounding organs cannot be easily distinguished with these square patches. In contrast, compared with a square patch of the same area, a non-square rectangular patch captures more information along one direction (i.e., horizontal or vertical) (Fig. 1). That is to say, if we sample non-square patches that fully cover the whole tumor along one direction from side to side, we integrate contextual and symmetry information simultaneously.

Thus, building on the patch-based approach, we propose crossbar patches, which consist of a vertical patch and a horizontal patch and aim to capture local appearance and contextual information from the vertical and horizontal directions, respectively.

Figure 1: An example of kidney tumors. (a) ground truth. (b) squared patches. (c) crossbar patches.
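To make the notion of a crossbar patch concrete, the following minimal NumPy sketch cuts one vertical and one horizontal patch around the same center pixel. The function name, the patch sizes (`long_side`, `short_side`) and the zero-padding at image borders are illustrative assumptions, not settings taken from our experiments.

```python
import numpy as np

def extract_crossbar_patches(image, row, col, long_side=100, short_side=20):
    """Return (vertical_patch, horizontal_patch) centred on pixel (row, col).

    long_side and short_side are hypothetical sizes chosen for illustration.
    The image is zero-padded (an assumption) so that patches near the border
    keep their full size.
    """
    pad = long_side // 2
    padded = np.pad(image, pad, mode="constant")
    r, c = row + pad, col + pad          # centre in padded coordinates
    hl, hs = long_side // 2, short_side // 2
    vertical = padded[r - hl:r + hl, c - hs:c + hs]    # tall and narrow
    horizontal = padded[r - hs:r + hs, c - hl:c + hl]  # short and wide
    return vertical, horizontal
```

Both patches share the same center pixel, so the label of that pixel can serve as the training target for either sub-model.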

Moreover, because the large diversity of tumor appearance makes the segmentation model difficult to learn, and inspired by cascaded learning strategies (i.e., boosting and co-training) [Walach and Wolf2016], we also devise a cascaded, boosting-like co-training framework. Specifically, with the obtained crossbar patches, we first train two sub-models separately, termed the vertical and the horizontal sub-model. Then, in each round, we select the mis-segmented regions of one sub-model and use them, through a patch re-sampling strategy, to fine-tune the other sub-model in the next round. We repeat this process until the convergence condition is met. This process means that if one sub-model cannot segment a region correctly in one direction, the other sub-model can complement it in the other direction. Furthermore, if both sub-models in the same round fail in a common region, the process also ensures that the role of this common region is enhanced in the next round. In the testing stage, for each new image, we sample the crossbar patches with the same settings and then predict, with the trained cascaded models, whether the center pixel of each patch is tumor or not. We name our method Crossbar-Net. Overall, our contributions in this work are three-fold:

The crossbar patches are easy to sample and introduce no additional parameters to train in ConvNets. Nevertheless, they largely capture both the local appearance and the contextual information.

The cascaded boosting training style provides complementary information across different sub-models, which enhances the final segmentation.

Our model is easy to implement and allows different sub-models to be integrated. It can also be applied to cardiac segmentation, showing promising results and good generalization.

2 Related work

In terms of feature representation, previous kidney tumor segmentation methods are mostly low-level methods that either employ energy minimization-based models [Linguraru et al.2011, Skalski et al.2016, Lee et al.2017] or learn the segmentation model on hand-crafted features [Hodgdon et al.2015, Linguraru et al.2009]. We term these methods hand-crafted based methods (HCM). There are three basic operations in HCM: extracting the whole kidney including tumors, calculating features manually, and classifying or segmenting tumors by non-CNN methods. HCM work well in simple cases where tumors show a different appearance from the surrounding tissues. However, their performance cannot be guaranteed when the tumors are very similar to the surrounding tissues.

Figure 2: Framework of the proposed method, showing the vertical and horizontal sub-models of each round.

Although attempts to develop specific deep feature-based methods for segmenting kidney tumors are very limited, related deep feature-based methods for segmenting other organs can be borrowed for our task. [Ciresan et al.2012] use multiple deep networks to segment biological neuron membranes with square patches sampled by a sliding window, but the sliding window leads to large patch redundancy and label imbalance. [Wang et al.2017] devise a multi-branch CNN model to segment lung nodules. [Shi et al.2017] propose a cascaded deep domain adaptation model to segment the prostate. [Havaei et al.2017] present a cascade architecture of CNNs to segment glioblastomas in MR images.

It is not a good choice to directly apply the above methods, each of which is specially designed for segmenting one particular organ, to our task. So, besides proposing the crossbar patch, we also propose a new patch extraction method and a new cascaded boosting training style. The extraction method, termed the basic sampling strategy, reduces redundancy and avoids label imbalance. As for the cascaded boosting training style, as mentioned in Section 1, the methods in [Walach and Wolf2016, Karianakis et al.2015, Shalev-Shwartz2014] have introduced boosting into their deep learning models. Both [Shalev-Shwartz2014] and [Karianakis et al.2015] boost within one network. [Walach and Wolf2016] introduce a multi-scale net model, in which the CNNs are added iteratively and trained in stages. Although it achieves a promising result in the object counting task, the whole model in [Walach and Wolf2016] depends too heavily on the first net. Therefore, we adopt the idea of boosting while avoiding this shortcoming of [Walach and Wolf2016] to design Crossbar-Net. In Crossbar-Net, we keep all sub-models equal and make the sub-models in each round complementary and beneficial to each other.

3 Framework of Crossbar-Net

Our proposed method involves (1) crossbar patches, obtained by sampling orthogonal non-square patches that fully cover the whole kidney tumor in either the horizontal or the vertical direction, and (2) a cascaded training process that integrates the results from the two directions through a cover re-sampling strategy; both differ substantially from current segmentation methods. The training stage of Crossbar-Net is schematized in Fig. 2. First, crossbar patches are extracted from the training CT images under the basic sampling strategy, with the manual segmentation available as the ground truth. Then, in each round of the cascaded training process, we evaluate the segmentation performance of the currently trained vertical and horizontal sub-models. We select the mis-segmented regions of each sub-model, re-sample the corresponding patches using the cover re-sampling strategy, and feed the re-sampled patches together with the basic sampled patches to the other sub-model for parameter fine-tuning. For example, if a non-square region represented by a vertical patch is largely mis-segmented, the corresponding re-sampled patches (which fully cover the region) are fed to the horizontal sub-model. This cover re-sampling strategy ensures that the role of mis-segmented patches is enhanced in the next round. We repeat the process until the segmentation error converges or the maximum number of rounds is reached. Finally, in the testing stage, for a new CT image, the trained sub-models of all rounds are gathered to perform a majority vote that yields the final segmentation.

4 Model training

4.1 Initialization and Setting

Parameter     Layer1  Layer2  Layer3  Layer4  Layer5  Layer6  Layer7  Layer8  Layer9  Layer10  Layer11
Vertical sub-model
Layer type    C       C       P       C       P       C       C       C       C       C        Softmax
Feature maps  16      36      36      64      64      64      64      64      500     2        2
Kernel size                                                                                    -
Stride size                                                                                    -
Input size
Horizontal sub-model
Layer type    C       C       P       C       P       C       C       C       C       C        Softmax
Feature maps  16      36      36      64      64      64      64      64      500     2        2
Kernel size                                                                                    -
Stride size                                                                                    -
Input size
Table 1: Details of the Crossbar-Net architectures. “C” and “P” denote convolutional layers and pooling layers, respectively.

To extract the training crossbar patches in the first round, with the goal of making the segmentation model focus more on the region surrounding the tumor boundary (which is hard to segment in practice), we develop the basic sampling strategy, which increases the number of patches close to the tumor and reduces the redundant patches far away from it. This sampling strategy is also used in each round of cascaded training.

In the basic sampling strategy, we select pixels according to their distance from the center of the tumor. We first extract crossbar patches uniformly in the tumor region, with one-third of the total pixels being selected as crossbar patch centers. Then, we sample non-tumor patches densely near the tumor and sparsely in the far region. Specifically, we select pixels on circles centered on the center of the tumor with certain intervals. Radius of these circles are denoted by , then , where is radius of the tumor’s incircle. is 1.5 times of tumor’s circumcircle radius. is parameter defined as , in both equations, is a constant, here 3.5 is commonly used.

We now discuss the architecture of the sub-models in Crossbar-Net. Both the vertical and the horizontal sub-model consist of eight convolutional layers, two max pooling layers and one softmax layer, designed with regard to the non-square shape of crossbar patches. Details of the sub-models are given in Table 1. In both sub-models, each convolutional layer is followed by a rectified linear unit (ReLU) [Nair and Hinton2010] activation. Since our crossbar patches are non-square, the convolutional kernels are non-square, too. Also, each layer uses 0 padding. In addition, dropout [Hinton et al.2012] is applied after the last convolutional layer to avoid over-fitting.
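The basic sampling strategy of this section can be sketched as follows. This is only an illustration under stated assumptions: the geometric growth of the gap between circles (`first_gap`, `growth`) and the arc spacing (`arc_step`) are hypothetical stand-ins for the interval rule, the radii are assumed to run from the incircle radius to 1.5 times the circumcircle radius, and "one-third of the total pixels" is read as one-third of the tumor pixels.

```python
import numpy as np

def circle_radii(r_in, r_circum, first_gap=2.0, growth=1.5):
    """Radii from the incircle radius up to 1.5x the circumcircle radius.

    The gap between consecutive circles grows geometrically, so circles are
    dense near the tumor and sparse far away. first_gap and growth are
    hypothetical parameters, not the interval rule of the paper.
    """
    r_max = 1.5 * r_circum
    radii, r, gap = [], float(r_in), float(first_gap)
    while r <= r_max:
        radii.append(r)
        r += gap
        gap *= growth
    return radii

def points_on_circle(center, radius, arc_step=3.0):
    """Integer (row, col) pixels spaced about arc_step pixels along a circle."""
    n = max(int(2 * np.pi * radius / arc_step), 1)
    angles = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    rows = np.rint(center[0] + radius * np.sin(angles)).astype(int)
    cols = np.rint(center[1] + radius * np.cos(angles)).astype(int)
    return np.unique(np.stack([rows, cols], axis=1), axis=0)

def tumor_centers(mask, rng, every=3):
    """Pick one in `every` tumor pixels uniformly at random as patch centers."""
    coords = np.argwhere(mask)
    k = max(len(coords) // every, 1)
    idx = rng.choice(len(coords), size=k, replace=False)
    return coords[idx]
```

In use, the circle points would still be filtered against the tumor mask and the image bounds, so that only non-tumor pixels inside the image become centers of non-tumor patches.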

4.2 Cascaded Training

Figure 3: Illustration of re-sampling a mis-segmented region. The vertical patch is the mis-segmented region. The horizontal patches are the re-sampled patches.

We now discuss how to train Crossbar-Net in a cascaded manner, which is a boosting-like training style. The advantages of our cascaded training can be summarized as follows:

  • The vertical and horizontal sub-models benefit each other during cascaded training by feeding each other with newly generated crossbar patches obtained with the cover re-sampling strategy (detailed below).

  • As in boosting-like algorithms, our vertical (and horizontal) sub-model can perform self-improvement by emphasizing the learning on the mis-segmented regions.

Formally, we denote the vertical and horizontal sub-model in the -th training round as and , respectively. In particular, in our cascaded training, we first train the initial sub-models as and . Typically, the filter weights are initialized randomly from a Gaussian distribution and updated by stochastic gradient descent.

Secondly, we continuously update Crossbar-Net by fine-tuning the current sub-models. Specifically, in -th round,

  • Perform the evaluation on (i.e., check the mis-classified pixels against the ground truth), and thus determine the mis-segmented regions in .

  • Perform both the cover re-sampling strategy (on the mis-segmented regions in ) and the basic sampling strategy, to obtain the newly generated vertical patches.

  • Employ these newly generated patches to fine-tune to obtain .

  • Similarly, update the from and .

Finally, we repeat the aforementioned step until the convergence condition or the maximum number of rounds is reached.
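The control flow of the cascaded training can be summarized in a short sketch. All callables (`find_errors`, `cover_resample`, `fine_tune`) are hypothetical placeholders for the evaluation, re-sampling and fine-tuning steps described above, and the stopping test on the mean error (`tol`) is an assumption. The sketch also assumes that both sub-models of a round are updated from the errors of the previous round; it is meant as an illustration of the scheme, not as our exact implementation.

```python
def cascaded_training(v0, h0, basic_v, basic_h, find_errors,
                      cover_resample, fine_tune, max_rounds=3, tol=1e-3):
    """Alternately fine-tune two sub-models on each other's mistakes.

    Hypothetical callables (assumptions, not a real API):
      find_errors(model)               -> (mis_segmented_regions, error_rate)
      cover_resample(regions, kind)    -> list of re-sampled patches
      fine_tune(model, patches)        -> fine-tuned model
    Returns the (vertical, horizontal) sub-models of every round.
    """
    rounds = [(v0, h0)]
    prev_err = float("inf")
    for _ in range(1, max_rounds):
        v_prev, h_prev = rounds[-1]
        v_regions, v_err = find_errors(v_prev)
        h_regions, h_err = find_errors(h_prev)
        err = 0.5 * (v_err + h_err)
        if prev_err - err < tol:   # error no longer decreasing: stop
            break
        prev_err = err
        # Errors of the horizontal sub-model become vertical patches,
        # and errors of the vertical sub-model become horizontal patches.
        v_new = fine_tune(v_prev, cover_resample(h_regions, "vertical") + basic_v)
        h_new = fine_tune(h_prev, cover_resample(v_regions, "horizontal") + basic_h)
        rounds.append((v_new, h_new))
    return rounds
```

Keeping the sub-models of every round, rather than only the last pair, is what makes the vote over all sub-models at test time possible.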

Cover re-sampling strategy: As shown in Fig. 3, we assume that the vertical patch is a mis-segmented region of the vertical sub-model. Our purpose is to use the horizontal sub-model to segment this region well using horizontal patches. In particular, we sample horizontal patches that fully cover the mis-segmented region, according to the location of the mis-classified central pixel in the vertical sub-model. To cover the region, we sample horizontal patches whose central pixels are located on three columns of the vertical patch: the center (cyan), the right (magenta) and the left (yellow) column. To avoid sampling redundant horizontal patches, we re-sample every three pixels along each column. Normally, for one vertical patch, we obtain roughly 40 horizontal patches with the cover re-sampling strategy. In this way, if both sub-models, i.e., and , fail to segment in the same region, the role of this region will still be enhanced in the next round, thus the performance of is expected to be superior to , and better than . That is to say, the two sub-models benefit each other.
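As an illustration of the cover re-sampling strategy, the sketch below lists the centers of the horizontal patches generated for one mis-segmented vertical patch: three columns (left, center, right) sampled every `step` pixels. The function name and the patch size defaults (`height`, `width`) are assumptions made for the example.

```python
def cover_resample_centers(center_row, center_col, height=100, width=20, step=3):
    """Centers of horizontal patches covering one mis-segmented vertical patch.

    Centers lie on the left, center and right columns of the vertical patch
    and are spaced `step` pixels apart along each column. height and width
    are hypothetical patch sizes.
    """
    top = center_row - height // 2
    cols = (center_col - width // 2,        # left column of the patch
            center_col,                     # center column
            center_col + width // 2 - 1)    # right column of the patch
    rows = range(top, top + height, step)
    return [(r, c) for c in cols for r in rows]
```

The number of re-sampled patches is three times the number of sampled rows, so it grows with the height of the vertical patch and shrinks with the step.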

5 Model testing

In the testing phase, for a new image, we first extract the crossbar patches in a sliding-window manner and then feed these patches to the trained sub-models of every round. Each sub-model outputs a segmentation result, and the final result is generated by a majority vote of all sub-models. For example, if is the maximum number of rounds, we get sub-models, which are and . The result is decided by these sub-models, and the weights of and are larger than the others.
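The weighted majority vote can be sketched in a few lines of NumPy. The weight of 1.5 for the two sub-models of the last round and 1 for the others follows the setting reported in Section 6.1; the function name, the ordering convention of `masks` and the tie-breaking rule (ties go to background) are assumptions.

```python
import numpy as np

def weighted_vote(masks, last_weight=1.5):
    """Fuse binary masks from all sub-models by a weighted majority vote.

    masks: list of equally shaped binary arrays, ordered by round so that
    the last two entries come from the final round (an assumed convention).
    A pixel is labeled tumor when the weighted votes exceed half of the
    total weight; exact ties are labeled background.
    """
    weights = np.ones(len(masks))
    weights[-2:] = last_weight
    stacked = np.stack(masks).astype(float)
    score = np.tensordot(weights, stacked, axes=1)   # weighted sum per pixel
    return score > weights.sum() / 2
```

With six sub-models the total weight is 7, so three early sub-models voting tumor (score 3) are outvoted, whereas one early sub-model plus the two final ones (score 4) are not.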

6 Experiments

In this section, we report the characteristics and the segmentation results of Crossbar-Net qualitatively and quantitatively. For the evaluation metrics, we employ the Dice ratio (DR) score, the True Positive Fraction (TPF), the Centroid Distance (CD) and the Hausdorff distance (HD). Large DR and TPF indicate high segmentation accuracy. CD is the Euclidean distance in 2-D space between the central pixels of the automatic segmentation result and the manual result. More details about these metrics are given in [Shi et al.2017] and [Faragallah et al.2017]. 3-fold cross validation is employed.
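For reference, the four metrics can be computed on binary masks as in the sketch below. It uses the common definitions (DR as twice the overlap divided by the sum of the two areas, TPF as the fraction of ground-truth pixels that are recovered), which we assume match those of the cited works; the Hausdorff distance is taken here over all foreground pixels rather than over contours, which is a simplifying assumption and is only practical for small masks.

```python
import numpy as np

def dice_ratio(pred, truth):
    """2 * |P and T| / (|P| + |T|) for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

def true_positive_fraction(pred, truth):
    """Fraction of ground-truth pixels that are also predicted."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    return np.logical_and(pred, truth).sum() / truth.sum()

def centroid_distance(pred, truth):
    """Euclidean distance between the centroids of the two masks."""
    cp = np.argwhere(pred).mean(axis=0)
    ct = np.argwhere(truth).mean(axis=0)
    return float(np.linalg.norm(cp - ct))

def hausdorff_distance(pred, truth):
    """Symmetric Hausdorff distance between the two foreground pixel sets."""
    a = np.argwhere(pred).astype(float)
    b = np.argwhere(truth).astype(float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # pairwise
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

All four functions assume that both masks are non-empty; an empty prediction would need to be handled separately.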

3,500 CT images of 94 subjects with kidney tumors are used to implement and test Crossbar-Net. To reduce irrelevant information, the slices are cropped into the size of . According to our observation, most of the tumor diameters range from 10 pixels to 90 pixels, so we set the size of horizontal patch as and vertical patch as , respectively. For each image, the tumor is manually annotated and used as the ground truth to train the models. We run our networks with the MatConvNet toolbox [Vedaldi and Lenc2015], and the learning rate is 0.0005. Each sub-model reaches convergence within 20 epochs. In addition, in the fine-tuning process, the interval in the basic sampling strategy is different from that of the previous round.

Figure 4: Error of each sub-model.
Figure 5: Performance of each sub-model. The left column is the ground truth image. The second to the fourth columns and the right three columns are the segmentation results of the three vertical sub-models and the three horizontal sub-models, respectively.

6.1 Characteristics of Crossbar-Net

The characteristics of Crossbar-Net include four aspects: sub-models can perform self-improvement, sub-models in the same round can benefit each other, the result is combined from all sub-models, and the crossbar patches are effective.

Self-improvement. We first train the vertical sub-model separately to obtain . We then re-sample the mis-segmented regions with the cover re-sampling strategy, where the re-sampled patches are vertical patches, and fine-tune with these patches together with those obtained by the basic sampling strategy to get . We repeat the fine-tuning of the sub-model 10 times, by which point the segmentation error converges. The same process is also applied to the horizontal sub-model. As illustrated in Fig. 4, the error rate of each sub-model shows an overall downward trend.

Benefit and complement each other. Here, we get six sub-models, which are used in all the following kidney tumor experiments. As shown in Fig. 5, the first tumor is a typical case. In this case, we can see fails to properly segment the upper and lower parts of the tumor from the vertical direction, but does well from the horizontal direction. Similarly, cannot separate the tumor from its left and right background correctly, while can do this. At the same time, the degree of mis-segmentation is reduced in and ; and both achieve promising results at last. For other tumors, in addition to complementary regions, there are more or less common mis-segmented regions. For example, in the second tumor, the upper right boundary and the lower right boundary are incorrectly segmented by both sub-models in the first two rounds. In this case, and are obviously superior to their former sub-models, and and achieve the best results. This phenomenon verifies that the sub-models can benefit and complement each other. Overall, in all cases, the performance of later sub-models is superior to that of the former sub-models.

The way of results combination. In line with Fig. 5, Table 2 illustrates that a later sub-model works better than the previous sub-model. However, we cannot simply combine the results of the last two sub-models; instead, a majority vote of all sub-models is adopted, in which the weights of the last sub-models are greater than the others. We set the weights of the last two models to 1.5 and the rest to 1. Fig. 7 illustrates that the DR and TPF of the combination of the last two sub-models are about 2% lower than those of the majority vote of all six sub-models.

DR 0.853 0.870 0.882 0.846 0.870 0.881
TPF 0.844 0.875 0.883 0.834 0.869 0.883
HD 10.200 9.586 9.223 11.315 10.001 9.890
CD 4.346 3.424 2.790 4.403 3.560 2.909
Table 2: Average metrics for each sub-model
(a) Cluster of vertical patches
(b) Cluster of squared patches
Figure 6: t-SNE visualization of the high level representations in Crossbar-net. All patches are extracted from the same image.
Figure 7: DR and TPF of Crossbar-Net.
Figure 8: Kernel Density Estimation of Dice Ratio score.

Effectiveness of crossbar patches. We maintain the main structure and training procedure of Crossbar-Net, and change the patch size into and pixels. The training data of both the vertical and the horizontal sub-models are sampled on the same images, but with different sampling intervals. The results are shown in Fig. 7: DR and TPF are much lower than with crossbar patches. Furthermore, in order to highlight the effectiveness of the non-square patch, we also compare the high-level features learned from crossbar patches and square patches. We use t-SNE (t-distributed Stochastic Neighbour Embedding) [Maaten and Hinton2008] to evaluate the high-level features. The 500-dimensional features in are taken from vertical patch and patch, respectively. As shown in Fig. 6, each point represents a patch projected from 500 dimensions into two dimensions; the purple points are tumor cases and the red points are non-tumor cases. The positive and negative cases represented by square patch features are almost inseparable in Fig. 6(b), while the cases in Fig. 6(a), which are represented by vertical patches, are well separated.

Figure 9: Examples of segmentation results with ground truth. The red curves are manual annotation, the blue curves are Crossbar-net contours, yellow ones are HCM, green ones are U-net and cyan are multi-scale CNN.

6.2 Comparison with baselines

We compare Crossbar-Net with three baselines, namely the hand-crafted based methods (HCM), the multi-scale CNN and U-net. As mentioned in Section 2, there are three basic operations in HCM, and we apply them to our data set. The second baseline is a state-of-the-art multi-scale CNN model proposed in [moeskops2016automatic], in which all parameters are kept except that the number of nodes in the output layer is changed from 9 to 2. Since U-net is widely used in medical image segmentation, it is necessary to compare Crossbar-Net with it.

We sample testing images from one test set, which consists of 30 subjects, with 20 images randomly sampled from each subject. Fig. 8 depicts the kernel density estimation of the DR of these 600 tumors after segmentation with all compared methods. It can be seen that Crossbar-Net achieves promising DR scores (greater than 0.9) on most of the tumors. Many low DR scores occur for the multi-scale CNN and HCM. U-net performs better than HCM and the multi-scale CNN on the whole, but it still fails to obtain good segmentation results on tumors with low contrast.

In Table 3, we list the average values of DR, TPF, HD and CD over all test sets for the different methods. Crossbar-Net outperforms the other methods, with the highest DR and TPF. Moreover, as shown in Table 3, Crossbar-Net has the smallest HD and CD values, which reflects high-quality segmentation.

Fig. 9 shows examples of the segmentations obtained by all methods together with the corresponding manual segmentation. Crossbar-Net segmentation is similar to the ground truth in most cases. For tumors with simple texture, such as the first two images of the first row in Fig. 9, HCM works well. However, in other cases, HCM cannot achieve appealing performance. The multi-scale CNN does not achieve competitive results, especially for small tumors. U-net works well on big tumors, but fails on small ones. As shown in the first and second images in the first row of Fig. 9, the former tumor is only located by U-net and no tumor is detected in the second one.

Metric  HCM    U-net  Multi-scale CNN  Crossbar-Net
DR      0.686  0.835  0.718            0.913
TPF     0.788  0.830  0.709            0.915
HD      25.99  13.21  20.98            8.89
CD      11.2   4.51   9.85             2.62
Table 3: Comparison among different methods.
Figure 10: Examples of resulting segmentations for subjects 24-33. Red/green contours are the endocardial/epicardial manual segmentations; blue/fuchsia contours are the Crossbar-Net segmentations.

6.3 Crossbar-net for cardiac segmentation

Although proposed for kidney tumors, Crossbar-Net can also be used for the segmentation of other tissues, such as the left ventricle (LV) in cardiac MRI. We use a public dataset of cardiac MRI [Andreopoulos and Tsotsos2008]. This dataset comprises cardiac MRI sequences from 33 subjects with a total of 7980 2D images. The image resolution is . In each image, the endocardial and epicardial contours of the LV are provided as ground truth. Both the vertical and the horizontal sub-model are trained only once in this case, and the result is the average of the two sub-models. The quantitative results are shown in Table 4. Our goal is to show that Crossbar-Net can be applied to other organs and tissues, so we do not compare with existing methods in detail, but the performance of Crossbar-Net is competitive with other state-of-the-art methods [Santiago et al.2016, Avendi et al.2016, Tan et al.2017]. For example, the DR of endocardial segmentation in [Santiago et al.2016] is lower than 0.85 while our DR is higher than 0.9, and both methods were evaluated on the same dataset. The other methods [Avendi et al.2016, Tan et al.2017] are applied to different datasets, so we do not compare with them quantitatively, but we show some segmentation results. In Fig. 10, two or three representative samples from each sequence of subjects 24-33 are displayed.

Contour      DR     TPF    HD     CD
Endocardium  0.915  0.937  3.624  1.335
Epicardium   0.940  0.947  3.384  1.041
Table 4: Segmentation metrics obtained by Crossbar-Net on the cardiac dataset.

7 Conclusion

In this paper, we propose a new CNN model, Crossbar-Net, which consists of two main innovations. One is the use of crossbar patches, which cover the kidney tumor in both the horizontal and the vertical direction and capture local and contextual information simultaneously. The other is a cascaded boosting training style with a cover re-sampling strategy. In Crossbar-Net, the segmentation result of one sub-model can be complemented by fine-tuning the other sub-model, and each sub-model can perform self-improvement by re-sampling the mis-segmented regions. The combination of the basic sampling strategy and the cover re-sampling strategy not only enhances the role of mis-segmented regions, but also prevents the sub-models from over-emphasizing the mis-segmented regions. Our model can simultaneously learn a variety of information and achieves promising segmentation results on kidney tumors of different size, shape, contrast and appearance. Moreover, the promising results on cardiac segmentation show that Crossbar-Net has a wide range of applications. Future work is to extend the direction of symmetric information from the horizontal and vertical axes to other axes.


  • [Andreopoulos and Tsotsos2008] Alexander Andreopoulos and John K Tsotsos. Efficient and generalizable statistical models of shape and appearance for analysis of cardiac mri. Medical Image Analysis, 12(3):335–357, 2008.
  • [Avendi et al.2016] MR Avendi, Arash Kheradvar, and Hamid Jafarkhani. A combined deep-learning and deformable-model approach to fully automatic segmentation of the left ventricle in cardiac mri. Medical image analysis, 30:108–119, 2016.
  • [Ciresan et al.2012] Dan Ciresan, Alessandro Giusti, Luca M Gambardella, and Jürgen Schmidhuber. Deep neural networks segment neuronal membranes in electron microscopy images. In Advances in neural information processing systems, pages 2843–2851, 2012.
  • [Faragallah et al.2017] Osama S Faragallah, Ghada Abdel-Aziz, and Hamdy M Kelash. Efficient cardiac segmentation using random walk with pre-computation and intensity prior model. Applied Soft Computing, 2017.
  • [Havaei et al.2017] Mohammad Havaei, Axel Davy, David Warde-Farley, Antoine Biard, Aaron Courville, Yoshua Bengio, Chris Pal, Pierre-Marc Jodoin, and Hugo Larochelle. Brain tumor segmentation with deep neural networks. Medical image analysis, 35:18–31, 2017.
  • [Hinton et al.2012] Geoffrey E Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, and Ruslan R Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
  • [Hodgdon et al.2015] Taryn Hodgdon, Matthew DF McInnes, Nicola Schieda, Trevor A Flood, Leslie Lamb, and Rebecca E Thornhill. Can quantitative ct texture analysis be used to differentiate fat-poor renal angiomyolipoma from renal cell carcinoma on unenhanced ct images? Radiology, 276(3):787–796, 2015.
  • [Karianakis et al.2015] Nikolaos Karianakis, Thomas J Fuchs, and Stefano Soatto. Boosting convolutional features for robust object proposals. arXiv preprint arXiv:1503.06350, 2015.
  • [Lee et al.2017] Han Sang Lee, Helen Hong, and Junmo Kim. Detection and segmentation of small renal masses in contrast-enhanced ct images using texture and context feature classification. In Biomedical Imaging (ISBI 2017), 2017 IEEE 14th International Symposium on, pages 583–586. IEEE, 2017.
  • [Linguraru et al.2009] Marius George Linguraru, Shijun Wang, Furhawn Shah, Rabindra Gautam, James Peterson, W Marston Linehan, and Ronald M Summers. Computer-aided renal cancer quantification and classification from contrast-enhanced ct via histograms of curvature-related features. In Engineering in Medicine and Biology Society, 2009. EMBC 2009. Annual International Conference of the IEEE, pages 6679–6682. IEEE, 2009.
  • [Linguraru et al.2011] Marius George Linguraru, Shijun Wang, Furhawn Shah, Rabindra Gautam, James Peterson, W Marston Linehan, and Ronald M Summers. Automated noninvasive classification of renal cancer on multiphase ct. Medical physics, 38(10):5738–5746, 2011.
  • [Maaten and Hinton2008] Laurens van der Maaten and Geoffrey Hinton. Visualizing data using t-sne. Journal of Machine Learning Research, 9(Nov):2579–2605, 2008.
  • [Nair and Hinton2010] Vinod Nair and Geoffrey E Hinton. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th international conference on machine learning (ICML-10), pages 807–814, 2010.
  • [Ronneberger et al.2015] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
  • [Santiago et al.2016] Carlos Santiago, Jacinto C Nascimento, and Jorge S Marques. A new robust active shape model formulation for cardiac mri segmentation. In Image Processing (ICIP), 2016 IEEE International Conference on, pages 4112–4115. IEEE, 2016.
  • [Shalev-Shwartz2014] Shai Shalev-Shwartz. Selfieboost: A boosting algorithm for deep learning. arXiv preprint arXiv:1411.3436, 2014.
  • [Shi et al.2017] Yinghuan Shi, Wanqi Yang, Yang Gao, and Dinggang Shen. Does manual delineation only provide the side information in ct prostate segmentation? pages 692–700, 2017.
  • [Skalski et al.2016] Andrzej Skalski, Jacek Jakubowski, and Tomasz Drewniak. Kidney tumor segmentation and detection on computed tomography data. In Imaging Systems and Techniques (IST), 2016 IEEE International Conference on, pages 238–242. IEEE, 2016.
  • [Tan et al.2017] Li Kuo Tan, Yih Miin Liew, Einly Lim, and Robert A McLaughlin. Convolutional neural network regression for short-axis left ventricle segmentation in cardiac cine mr sequences. Medical Image Analysis, 39:78–86, 2017.
  • [Vedaldi and Lenc2015] Andrea Vedaldi and Karel Lenc. Matconvnet: Convolutional neural networks for matlab. In Proceedings of the 23rd ACM international conference on Multimedia, pages 689–692. ACM, 2015.
  • [Walach and Wolf2016] Elad Walach and Lior Wolf. Learning to count with cnn boosting. In European Conference on Computer Vision, pages 660–676. Springer, 2016.
  • [Wang et al.2017] Shuo Wang, Mu Zhou, Zaiyi Liu, Zhenyu Liu, Dongsheng Gu, Yali Zang, Di Dong, Olivier Gevaert, and Jie Tian. Central focused convolutional neural networks: Developing a data-driven model for lung nodule segmentation. Medical image analysis, 40:172–183, 2017.