Combining CNN and Hybrid Active Contours for Head and Neck Tumor Segmentation in CT and PET images

12/28/2020 ∙ by Jun Ma, et al. ∙ Nanjing University

Automatic segmentation of head and neck tumors plays an important role in radiomics analysis. In this short paper, we propose an automatic segmentation method for head and neck tumors from PET and CT images based on the combination of convolutional neural networks (CNNs) and hybrid active contours. Specifically, we first introduce a multi-channel 3D U-Net to segment the tumor with the concatenated PET and CT images. Then, we estimate the segmentation uncertainty by model ensembles and define a segmentation quality score to select the cases with high uncertainties. Finally, we develop a hybrid active contour model to refine the high uncertainty cases. Our method ranked second place in the MICCAI 2020 HECKTOR challenge with average Dice Similarity Coefficient, precision, and recall of 0.752, 0.838, and 0.717, respectively.




1 Introduction

Head and neck cancers are among the most common cancers [14]. Extracting quantitative image biomarkers from PET and CT images has shown tremendous potential to optimize patient care, for example by predicting disease characteristics [9, 15]. However, radiomics analysis relies on an expensive and error-prone manual annotation of Regions of Interest (ROIs) to focus the analysis. Fully automatic segmentation methods for head and neck tumors in PET and CT images are therefore in high demand, because they would enable the validation of radiomics models on very large cohorts with optimal reproducibility.

PET and CT modalities provide complementary and synergistic information for tumor segmentation; the key question is how to exploit this complementary information. Several methods have been proposed for joint PET-CT segmentation. Kumar et al. [8] proposed a co-learning CNN to improve the fusion of complementary information in multi-modality PET-CT, comprising two modality-specific encoders, a co-learning component, and a reconstruction component. Li et al. [11] proposed a deep learning based variational method for non-small cell lung cancer segmentation: a 3D fully convolutional network (FCN) was trained on CT images to produce a probability map, a fuzzy variational model was then proposed to incorporate the probability map and the PET intensity image, and a split Bregman algorithm was used to minimize the variational model. Recently, Andrearczyk et al. [1] used 2D and 3D V-Nets to segment head and neck tumors from PET and CT images; their results showed that using both modalities yields a statistically significant improvement over using CT or PET images alone.

Figure 1: Visual examples of PET and CT images and the corresponding ground truth.

Active contours [7, 3, 4] were among the most widely used segmentation methods before the deep learning era. The basic idea is to formulate the image segmentation task as an energy functional minimization problem. According to the information used in the energy functional, active contours can be classified into three categories: edge-based active contours [3] that rely on image gradient information, region-based active contours [10] that rely on image-intensity region descriptors, and hybrid active contours [18, 19] that use both gradient and intensity information.
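For concreteness, the edge indicator at the heart of edge-based models such as geodesic active contours [3] can be sketched in a few lines of NumPy. This is a minimal illustration of the standard construction g = 1 / (1 + |∇(G_σ * I)|²), not code from any of the cited papers; the function name and σ value are our own.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def edge_indicator(img, sigma=1.0):
    """Edge-stopping function g = 1 / (1 + |grad(G_sigma * I)|^2):
    close to 0 on strong edges, close to 1 in flat regions."""
    smoothed = gaussian_filter(img.astype(float), sigma)
    grads = np.gradient(smoothed)                 # one array per axis
    grad_mag2 = sum(g ** 2 for g in grads)        # squared gradient magnitude
    return 1.0 / (1.0 + grad_mag2)
```

On a step image, g stays near 1 in the flat regions and drops sharply at the intensity boundary, which is what slows the contour down at edges.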

In this short paper, we propose an automatic segmentation method for head and neck tumors from PET and CT images based on the combination of convolutional neural networks (CNNs) and hybrid active contours. Specifically, we first introduce a multi-channel 3D U-Net to segment the tumor with the concatenated PET and CT images. Then, we estimate the segmentation uncertainty by model ensembles, and define a quality score to select the cases with high uncertainties. Finally, we develop a hybrid active contour model to refine the high uncertainty cases.

2 Method

2.1 CNN backbone

Our network backbone is the standard 3D U-Net [5], with 32 feature maps in the first block; the number of feature maps doubles at each downsampling stage. The implementation is based on nnU-Net [6]. The network input is configured with a batch size of 2 and a fixed patch size. The optimizer is stochastic gradient descent with an initial learning rate of 0.01 and Nesterov momentum of 0.99. To avoid overfitting, standard data augmentation techniques are used during training, such as rotation, scaling, adding Gaussian noise, and gamma correction. The loss function is the sum of the Dice loss and the TopK loss [13]. We train the 3D U-Net with five-fold cross validation. Each fold is trained on a TITAN V100 GPU for 1000 epochs, and training takes about 4 days.
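As an illustration, the Dice + TopK compound loss can be sketched in NumPy as follows. This is a simplified, non-differentiable sketch for a binary probability map; the function names and the k value are our own, and actual training uses the differentiable implementation inside the nnU-Net framework.

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss over a probability map and a binary target."""
    inter = np.sum(prob * target)
    denom = np.sum(prob) + np.sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)

def topk_loss(prob, target, k=10, eps=1e-6):
    """TopK loss: mean binary cross-entropy over the hardest k% of voxels."""
    ce = -(target * np.log(prob + eps) + (1 - target) * np.log(1 - prob + eps))
    n_keep = max(1, int(ce.size * k / 100))
    hardest = np.sort(ce.ravel())[-n_keep:]      # keep the k% largest losses
    return hardest.mean()

def dice_topk_loss(prob, target, k=10):
    """Sum of Dice loss and TopK loss, as in the training objective."""
    return dice_loss(prob, target) + topk_loss(prob, target, k)
```

The Dice term optimizes global overlap directly, while the TopK term focuses the gradient on the hardest voxels, which helps with the small tumor volumes in this task.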

2.2 Uncertainty quantification

We train five U-Net models with five-fold cross validation. During testing, we infer each test case with the five trained models, so each test case has five predictions. Let $p_i$ denote the predicted probability map of the $i$-th model ($i = 1, \dots, 5$). The final segmentation $S$ is obtained by thresholding the average probability:

$$S = \mathbb{1}\!\left(\frac{1}{5}\sum_{i=1}^{5} p_i > 0.5\right).$$

Then, we compute the normalized surface Dice $\mathrm{NSD}(S_i, S)$ between each binarized prediction $S_i$ and the final segmentation $S$; details and the code are publicly available. Finally, the uncertainty $U$ of the prediction is estimated by

$$U = 1 - \frac{1}{5}\sum_{i=1}^{5} \mathrm{NSD}(S_i, S).$$

If a case has an uncertainty value over 0.2, it is selected for the subsequent refinement.
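The ensemble-and-select step can be sketched as follows. This is a simplified illustration: the agreement score here is a plain volumetric Dice used as a stand-in for the normalized surface Dice, and the 0.5 binarization threshold is an assumption.

```python
import numpy as np

def ensemble_segmentation(probs):
    """Average the folds' probability maps and threshold at 0.5."""
    mean_prob = np.mean(probs, axis=0)
    return (mean_prob > 0.5).astype(np.uint8)

def dice(a, b, eps=1e-6):
    """Volumetric Dice between two binary masks."""
    inter = np.sum(a * b)
    return (2.0 * inter + eps) / (np.sum(a) + np.sum(b) + eps)

def uncertainty(probs, seg, threshold=0.5):
    """Uncertainty = 1 - mean agreement between each fold's binarized
    prediction and the ensemble segmentation (Dice as a stand-in for
    the normalized surface Dice)."""
    scores = [dice((p > threshold).astype(np.uint8), seg) for p in probs]
    return 1.0 - float(np.mean(scores))
```

When the five folds agree, the uncertainty is near zero; when they disagree, the uncertainty rises above the 0.2 selection threshold and the case is flagged for active-contour refinement.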

2.3 Refinement with Hybrid active contours

This step aims to refine the segmentation results of the high-uncertainty cases by exploiting the complementary information among CT images, PET images, and network probabilities. Basically, CT images provide edge information, while PET images and network probabilities provide location or region information. We propose the following hybrid active contour model:

$$E(u; c_1, c_2) = \lambda \sqrt{\frac{\pi}{\tau}} \int_\Omega g \, u \, \big(G_\tau * (1-u)\big) \, dx + \int_\Omega (f - c_1)^2 \, u \, dx + \int_\Omega (f - c_2)^2 \, (1-u) \, dx,$$

where $u \in \{0, 1\}$ indicates the segmentation, $f$ denotes the image intensity values (combining the PET image and the network probability), $g$ is an edge indicator computed from the CT image, $c_1$ and $c_2$ are the average image intensities inside and outside the segmentation contour, respectively, and $G_\tau$ is the Gaussian kernel, which is defined by

$$G_\tau(x) = \frac{1}{(4\pi\tau)^{3/2}} \, e^{-\frac{|x|^2}{4\tau}}.$$

The hybrid active contour model is solved by the iterative convolution-thresholding method; details can be found in [16, 17, 12].
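A minimal 2D sketch of the iterative convolution-thresholding idea is given below. This is our own simplified formulation, not the exact algorithm of [16, 17, 12]: the variable names, the λ and τ values, and the use of `gaussian_filter` to play the role of convolution with $G_\tau$ are all illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ictm_refine(f, g, u0, tau=1.0, lam=0.5, n_iter=20):
    """Iterative convolution-thresholding sketch.
    f: region image (e.g. fused PET / network probability),
    g: edge indicator from CT, u0: initial binary segmentation."""
    u = u0.astype(float)
    for _ in range(n_iter):
        c1 = (f * u).sum() / max(u.sum(), 1e-6)              # mean inside
        c2 = (f * (1 - u)).sum() / max((1 - u).sum(), 1e-6)  # mean outside
        # region fidelity plus edge-weighted smoothed boundary penalty;
        # gaussian_filter approximates convolution with the kernel G_tau
        phi = (f - c1) ** 2 - (f - c2) ** 2 \
              + lam * g * (gaussian_filter(1 - u, tau) - gaussian_filter(u, tau))
        u_new = (phi < 0).astype(float)    # thresholding step: keep where phi < 0
        if np.array_equal(u_new, u):       # fixed point reached
            break
        u = u_new
    return u.astype(np.uint8)
```

Each iteration only needs Gaussian convolutions and a pointwise threshold, which is what makes this family of methods fast compared with level-set evolution.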

3 Experiments and results

3.1 Dataset

We use the official HECKTOR dataset [2] to evaluate the proposed method. The training set comprises 201 cases from four centers (CHGJ, CHMR, CHUM, and CHUS); the test set comprises 53 cases from another center (CHUV). Each case includes a CT image, a PET image, and the primary Gross Tumor Volume (GTVt) in NIfTI format, as well as the bounding box location and patient information. We crop all images with the official bounding boxes and resample them to an isotropic resolution. Specifically, we use third-order spline interpolation for the images and nearest-neighbor (zero-order) interpolation for the labels. Furthermore, we apply Z-score normalization (mean subtraction and division by the standard deviation) to each PET and CT image separately.
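The preprocessing steps above can be sketched for a single cropped case as follows. This is a simplified illustration; the function name and the 1 mm isotropic target spacing are our assumptions, since the paper's exact spacing value is not restated here.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_case(ct, pet, label, spacing, target=(1.0, 1.0, 1.0)):
    """Resample one cropped case to isotropic resolution and z-score
    normalize each modality separately. `spacing` is the original voxel
    spacing in mm; the 1 mm target spacing is an assumed value."""
    factors = tuple(s / t for s, t in zip(spacing, target))
    ct_r = zoom(ct.astype(float), factors, order=3)    # 3rd-order spline
    pet_r = zoom(pet.astype(float), factors, order=3)
    label_r = zoom(label, factors, order=0)            # nearest neighbour
    zscore = lambda x: (x - x.mean()) / (x.std() + 1e-8)
    return zscore(ct_r), zscore(pet_r), label_r
```

Nearest-neighbor interpolation for the labels avoids introducing fractional label values, while spline interpolation keeps the image intensities smooth.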

3.2 Quantitative and qualitative results

Table 1 and Figure 2 present the quantitative and qualitative results on the test set, respectively. The proposed method achieved 2nd place on the official leaderboard, very close to the 1st-place performance. Our segmentation results have higher precision but lower recall, indicating that most of the predicted segmentations are correct but some tumors are missed by the method.

Participants DSC Precision Recall Rank
andrei.iantsen 0.759 0.833 0.740 1
junma (Ours) 0.752 0.838 0.717 2
badger 0.735 0.833 0.702 3
deepX 0.732 0.785 0.732 4
AIView_sjtu 0.724 0.848 0.670 5
DCPT 0.705 0.765 0.705 6
Table 1: Quantitative results on the testing set.
Figure 2: Visual examples of segmentation results from testing set.

4 Conclusion

In this paper, we proposed a fully automatic segmentation method for head and neck tumors in CT and PET images that combines modern deep learning methods with traditional active contours. Experiments on the official HECKTOR challenge dataset demonstrate the effectiveness of the proposed method. The main limitation of our method is its relatively low recall, indicating that some lesions are missed in the segmentation results. Improving the recall towards higher performance will be the focus of our future work.


This project is supported by the National Natural Science Foundation of China (No. 11531005, No. 11971229). The authors of this paper declare that the segmentation method they implemented for participation in the HECKTOR challenge has not used any pre-trained models nor additional datasets other than those provided by the organizers. We also thank the HECKTOR organizers for the public dataset and for hosting this great challenge.


  • [1] V. Andrearczyk, V. Oreiller, M. Vallières, J. Castelli, H. Elhalawani, M. Jreige, S. Boughdad, J. O. Prior, and A. Depeursinge (2020) Automatic segmentation of head and neck tumors and nodal metastases in PET-CT scans. In Proceedings of Machine Learning Research. Cited by: §1.
  • [2] V. Andrearczyk, V. Oreiller, M. Vallières, M. Jreige, J. O. Prior, and A. Depeursinge (2021) Overview of the hecktor challenge at miccai 2020: automatic head and neck tumor segmentation in pet/ct. In Lecture Notes in Computer Science (LNCS) Challenges, Cited by: §3.1.
  • [3] V. Caselles, R. Kimmel, and G. Sapiro (1997) Geodesic active contours. International Journal of Computer Vision 22 (1), pp. 61–79. Cited by: §1.
  • [4] T. F. Chan and L. A. Vese (2001) Active contours without edges. IEEE Transactions on image processing 10 (2), pp. 266–277. Cited by: §1.
  • [5] Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger (2016) 3D u-net: learning dense volumetric segmentation from sparse annotation. In International conference on medical image computing and computer-assisted intervention, pp. 424–432. Cited by: §2.1.
  • [6] F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier-Hein (2020) NnU-net: a self-configuring method for deep learning-based biomedical image segmentation. Nature Methods. Cited by: §2.1.
  • [7] M. Kass, A. Witkin, and D. Terzopoulos (1988) Snakes: active contour models. International journal of computer vision 1 (4), pp. 321–331. Cited by: §1.
  • [8] A. Kumar, M. Fulham, D. Feng, and J. Kim (2019) Co-learning feature fusion maps from pet-ct images of lung cancer. IEEE Transactions on Medical Imaging 39 (1), pp. 204–217. Cited by: §1.
  • [9] P. Lambin, E. Rios-Velazquez, R. Leijenaar, S. Carvalho, R. G. Van Stiphout, P. Granton, C. M. Zegers, R. Gillies, R. Boellard, A. Dekker, et al. (2012) Radiomics: extracting more information from medical images using advanced feature analysis. European journal of cancer 48 (4), pp. 441–446. Cited by: §1.
  • [10] C. Li, C. Kao, J. C. Gore, and Z. Ding (2008) Minimization of region-scalable fitting energy for image segmentation. IEEE transactions on image processing 17 (10), pp. 1940–1949. Cited by: §1.
  • [11] L. Li, X. Zhao, W. Lu, and S. Tan (2020) Deep learning for variational multimodality tumor segmentation in pet/ct. Neurocomputing 392, pp. 277–295. Cited by: §1.
  • [12] J. Ma, D. Wang, X. Wang, and X. Yang (2020) A fast algorithm for geodesic active contours with applications to medical image segmentation. arXiv preprint arXiv:2007.00525. Cited by: §2.3.
  • [13] J. Ma (2020) Segmentation loss odyssey. arXiv preprint arXiv:2005.13449. Cited by: §2.1.
  • [14] R. L. Siegel, K. D. Miller, and A. Jemal (2020) Cancer statistics, 2020. CA: A Cancer Journal for Clinicians 70 (1), pp. 7–30. Cited by: §1.
  • [15] M. Vallieres, E. Kay-Rivest, L. J. Perrin, X. Liem, C. Furstoss, H. J. Aerts, N. Khaouam, P. F. Nguyen-Tan, C. Wang, K. Sultanem, et al. (2017) Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer. Scientific reports 7 (1), pp. 1–14. Cited by: §1.
  • [16] D. Wang, H. Li, X. Wei, and X. Wang (2017) An efficient iterative thresholding method for image segmentation. Journal of Computational Physics 350, pp. 657–667. Cited by: §2.3.
  • [17] D. Wang and X. Wang (2019) The iterative convolution-thresholding method (ictm) for image segmentation. arXiv preprint arXiv:1904.10917. Cited by: §2.3.
  • [18] W. Zhang, X. Wang, J. Chen, and W. You (2020) A new hybrid level set approach. IEEE Transactions on Image Processing 29, pp. 7032–7044. Cited by: §1.
  • [19] Y. Zhang, B. J. Matuszewski, L. Shark, and C. J. Moore (2008) Medical image segmentation using new hybrid level-set method. In 2008 fifth international conference biomedical visualization: information visualization in medical and biomedical informatics, pp. 71–76. Cited by: §1.