Accurate segmentation of the aorta in CT can be used to analyze morphology and detect pathology such as atherosclerotic plaque and aneurysms. Moreover, the location of specific shape changes or pathology in the aorta is relevant for diagnosis and risk assessment in patients [1, 2]. However, manual annotation of the aorta and its subdivision into segments is time-consuming and cumbersome, especially in low-dose chest CT scans, where the lack of contrast enhancement leads to low soft-tissue contrast and acquisition with a low radiation dose may result in high levels of image noise.
Thus far, several methods have been developed for automatic segmentation of the aorta in low-dose non-contrast-enhanced chest CT scans. Kurugol et al. used Hough transforms on computed oblique and on axial slices to segment the aorta. Using the results of the Hough transforms, the surface of the aorta was reconstructed, and the segmentations were then refined using level sets. Xie et al. proposed an algorithm that iteratively fits cylinders of varying lengths to track the aorta in the image. The cylindrical model is fitted in the image space defined by previously segmented organs surrounding the aorta, such as the lungs and trachea. Finally, the segmentation of the aorta is refined using local image intensities. Išgum et al. employed atlas-based registration that locally combines atlases based on the registration success of each atlas.
Even though these methods generally obtain good results, they require either segmentation of neighboring organs [3, 4] (e.g. lungs or airways) or manual tuning of parameters, such as the atlases used. Furthermore, existing automatic methods only segment the complete thoracic aorta and do not subdivide it into segments. Therefore, we propose an automatic method to segment the ascending aorta, aortic arch and descending aorta in low-dose, non-contrast-enhanced chest CT. We employ a dilated convolutional neural network (CNN) that analyzes axial, coronal and sagittal CT slices to classify voxels into one of the three aortic segments or background. The results obtained in each image plane are merged to provide the final segmentation result. Unlike previous methods that exploit the expected shape of the aorta, CNNs take the CT images directly as input and automatically learn the hierarchical feature representations needed for the segmentation task.
This study included 24 low-dose chest CT scans, randomly chosen from a set of baseline scans acquired in the National Lung Screening Trial (NLST). All scans were acquired during inspiratory breath-hold in supine position with the arms elevated above the head and included the outer rib margin at the widest patient dimension. The selected scans were acquired on seven different scanners of three major CT scanner vendors (GE, Siemens and Philips). Depending on patient weight, a tube voltage of 120 kVp or 140 kVp and a tube current ranging between 30 and 160 mAs were used. Scans were made using an axial reconstruction with an in-plane resolution varying between 0.46 and 0.86 mm, a slice thickness varying between 1.25 and 4.00 mm, and a slice spacing varying between 0.63 and 3.00 mm. No contrast enhancement or ECG-triggering was applied.
Reference annotations were obtained by manual voxel painting of the aorta in the axial plane (Fig. 7). Specific labels were assigned to the ascending aorta, the aortic arch and the descending aorta. The aortic arch was defined as the section of the aorta where the ascending and descending aorta are connected. The ascending aorta was defined from the aortic root up to the aortic arch and the descending aorta was defined from the aortic arch down to the last axial slice of a scan.
To segment the aorta, a CNN is trained to assign a class label to every voxel in a scan based on classification in three orthogonal image slices. A lack of contrast enhancement in scans leads to homogeneous image intensities, especially around the ascending aorta (Fig. 7). Hence, the precise location of the aorta has to be inferred from a larger image context. To use a large receptive field and to keep the number of parameters low, a dilated CNN (Fig. 8) is employed. Its architecture is similar to that of the networks described by Wolterink et al. and Yu et al., and it analyzes 2D image slices using ten convolutional layers. The size of the receptive field is set to voxels but, due to increasing dilation factors in subsequent convolutional layers, the network only contains 72,643 trainable parameters. Dropout (p=0.5) and batch normalization are applied to the fully connected layers to prevent overfitting. To compensate for varying in-plane resolutions, all scans are resized to an isotropic resolution of 1 mm prior to analysis.
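The way dilation enlarges the receptive field without adding weights can be illustrated with a short calculation. The sketch below is not the configuration used in this work: the kernel sizes and dilation factors are hypothetical, chosen in the style of Yu and Koltun (eight 3×3 layers with growing dilation followed by two 1×1 layers), and `receptive_field` is an illustrative helper.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        # A k-wide kernel with dilation d spans (k - 1) * d + 1 voxels,
        # so each layer widens the field by (k - 1) * d.
        rf += (k - 1) * d
    return rf


# Hypothetical ten-layer configuration (an assumption, not taken from the paper).
KERNELS = (3, 3, 3, 3, 3, 3, 3, 3, 1, 1)
DILATIONS = (1, 1, 2, 4, 8, 16, 32, 1, 1, 1)

rf_dilated = receptive_field(KERNELS, DILATIONS)
rf_plain = receptive_field(KERNELS, (1,) * len(KERNELS))  # same layers, no dilation
padding = (rf_dilated - 1) // 2  # border needed so every voxel gets a prediction

print(rf_dilated, rf_plain, padding)
```

Under these assumed settings the field spans 131 voxels with 65 voxels of padding per side, against 17 voxels for the same layers without dilation. Both variants have the same number of weights, because dilation only spreads the kernel taps apart.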
The CNN is purely convolutional and can therefore analyze images of variable size. During training, batches containing sub-images in the axial, sagittal and coronal planes are analyzed, and during testing, full slices padded with 65 voxels in all directions are used as input. Moreover, all slices from the axial, sagittal and coronal planes of a scan are analyzed. This results in three multi-class 3D probability maps: one map for each plane orientation. A final probability map is determined by averaging these three multi-class probability maps. Results are resampled from isotropic resolution to the original image resolution using trilinear interpolation, and each voxel is subsequently assigned the class with the highest probability. To prevent small isolated clusters of voxels from being segmented, only the largest connected component of each class is included in the final segmentation.
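The merging and clean-up steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the `(classes, z, y, x)` array layout, background as class 0 and the use of 6-connectivity are all assumptions, and the trilinear resampling step is omitted.

```python
from collections import deque

import numpy as np


def merge_planes(prob_axial, prob_coronal, prob_sagittal):
    """Average three (classes, z, y, x) probability maps and pick the top class."""
    mean_prob = (prob_axial + prob_coronal + prob_sagittal) / 3.0
    return mean_prob.argmax(axis=0)


def largest_component(mask):
    """Keep only the largest 6-connected component of a boolean 3D mask."""
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    visited = np.zeros(mask.shape, dtype=bool)
    best = []
    for seed in zip(*np.nonzero(mask)):
        if visited[seed]:
            continue
        # Breadth-first flood fill starting from an unvisited foreground voxel.
        visited[seed] = True
        queue, component = deque([seed]), [seed]
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in offsets:
                n = (z + dz, y + dy, x + dx)
                inside = all(0 <= c < s for c, s in zip(n, mask.shape))
                if inside and mask[n] and not visited[n]:
                    visited[n] = True
                    queue.append(n)
                    component.append(n)
        if len(component) > len(best):
            best = component
    out = np.zeros(mask.shape, dtype=bool)
    if best:
        out[tuple(np.array(best).T)] = True
    return out


def postprocess(labels, n_classes=4):
    """Per foreground class, keep the largest component; the rest becomes background."""
    cleaned = np.zeros_like(labels)
    for c in range(1, n_classes):
        cleaned[largest_component(labels == c)] = c
    return cleaned
```

The pure-Python flood fill keeps the sketch dependency-light; for full-size scans a library routine for connected-component labeling would be far faster.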
Performance of the trained network was evaluated by the Dice coefficient as an overlap measure between automatically obtained and reference segmentations. Furthermore, the Average Symmetrical Surface Distance (ASSD) was computed to evaluate the segmentation along the aortic boundary. The evaluation was performed for each class separately.
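For reference, both metrics can be computed as in the sketch below. It is an illustration under assumptions: surface voxels are taken to be mask voxels with at least one background 6-neighbor, distances are evaluated by brute force (practical only for small masks), and both masks are assumed to be non-empty. Definitions of ASSD vary slightly between evaluation toolkits, so values from this sketch need not match the ones reported here.

```python
import numpy as np


def dice(a, b):
    """Dice overlap between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


def surface_points(mask, spacing):
    """Physical coordinates of mask voxels that touch background."""
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    inner = (slice(1, -1),) * mask.ndim
    interior = np.ones_like(mask)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel is interior only if all of its 6-neighbors are foreground.
            interior &= np.roll(padded, shift, axis=axis)[inner]
    surface = mask & ~interior
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def assd(a, b, spacing=(1.0, 1.0, 1.0)):
    """Average symmetric surface distance, in the units of `spacing`."""
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    # Sum of nearest-surface distances in both directions, over all surface points.
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(pa) + len(pb))
```

Evaluating per class, as described above, amounts to calling these functions on the binary masks `labels == c` of the automatic and reference segmentations for each aortic segment `c`.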
4 Experiments and Results
Two-fold cross-validation experiments were performed with 24 CT scans. In each experiment, ten scans were used for training and another ten scans were used for testing the method. The remaining four scans were used as validation set to ensure no overfitting occurred during training. Unlike in the experiments presented by Wolterink et al., where categorical cross-entropy was used as a loss function, the current work employed the Dice coefficient as a loss function to address class imbalance in our data set. The Adam optimization algorithm (learning rate = 0.001) was used to optimize the network parameters during 250,000 training iterations. In each iteration, a mini-batch containing 16 randomly sampled sub-images from the three planes was provided to the network. The same hyperparameters were used for both cross-validation experiments.
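The effect of a Dice-based loss under class imbalance can be illustrated with a small sketch. The exact formulation used in this work is not given here, so the version below is an assumption: a soft Dice computed per class on probabilities and averaged over all classes, with a small `eps` for numerical stability. The name `soft_dice_loss` and the `(classes, ...)` layout are likewise illustrative.

```python
import numpy as np


def soft_dice_loss(probs, onehot, eps=1e-7):
    """1 - mean per-class soft Dice; inputs have shape (classes, ...)."""
    axes = tuple(range(1, probs.ndim))  # sum over all voxel axes
    intersection = (probs * onehot).sum(axis=axes)
    denom = probs.sum(axis=axes) + onehot.sum(axis=axes)
    dice_per_class = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice_per_class.mean()


# Toy example: ten voxels, only one of which is foreground.
target = np.zeros((2, 10))
target[0, :9] = 1.0  # background
target[1, 9] = 1.0   # single foreground voxel

all_background = np.zeros((2, 10))
all_background[0, :] = 1.0  # predicts background everywhere

print(soft_dice_loss(target, target))          # close to 0 for a perfect prediction
print(soft_dice_loss(all_background, target))  # above 0.5 despite 90% voxel accuracy
```

Because every class contributes equally to the average, a prediction that ignores a small structure is penalized heavily even when most voxels are labeled correctly, which is the property that motivates a Dice loss for imbalanced segmentation.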
|Method|Segmentation task|Training images|Test images|Dice|ASSD (mm)|
|---|---|---|---|---|---|
|Kurugol et al.|Thoracic aorta|-|45|-||
|Xie et al.|Thoracic aorta|20|60|||
|Išgum et al.|Thoracic aorta|15|14|-||
Table 1 lists the average (standard deviation) Dice coefficients and ASSDs achieved on the test scans. The best performance was obtained for the descending aorta, both in terms of Dice coefficient and ASSD. In contrast, the lowest performance was obtained for the ascending aorta.
Previously described methods only segmented the aorta as a whole. To compare the performance of the proposed method with previous work, we retrained the network to perform two-class classification (aorta and background). This two-class segmentation network obtained slightly better results than the network trained for multi-class segmentation of the aorta (Table 1). Compared with other methods, both the multi-class and the two-class segmentation networks obtained competitive results. However, because the data and evaluation procedures differ among studies, the results cannot be compared directly and should only be taken as an indication of relative performance.
Fig. 18 shows segmentations obtained with the presented network trained for the multi-class and two-class segmentation problem. Results show that inaccuracies in classification may occur on the interface between different aortic segments. Nevertheless, no large differences are seen between automatic segmentations obtained with the multi-class and the two-class segmentation networks.
5 Discussion and Conclusion
We have presented a method for automatic segmentation of the ascending aorta, the aortic arch and the thoracic descending aorta in low-dose, non-contrast-enhanced chest CT scans using a purely convolutional neural network with dilated convolutions. The network is able to accurately segment the aorta. In addition, the proposed method obtained results similar to those of a network classifying voxels into only two classes (aorta and background). Moreover, the obtained results are on par with those reported in previous studies that only segment the aorta as a whole [3, 4, 5].
Dilated convolutions enable analysis with a large receptive field while keeping the number of network parameters low. This large receptive field allowed accurate detection of the aorta based on context information. Furthermore, because the network is purely convolutional, it is able to analyze images of a variable size. Hence, full slices could be segmented during testing even though the network was only trained with sub-images. On average, the segmentation took only 61.5 seconds per scan, making it suitable for application in studies including large numbers of images.
In this study, results were least accurate in the ascending aorta. This is in line with other studies, which also reported their least accurate segmentation results near the aortic root. In low-dose non-contrast-enhanced chest CT it is often very difficult to outline the aortic root due to low soft-tissue contrast. A previous study reported substantial inter-observer disagreement in that region. Visual inspection of the automatic results obtained here revealed occasional inaccuracies near the aortic root. Furthermore, segmentation of the descending aorta below the lungs was sometimes difficult due to high levels of image noise. Nevertheless, the employed CNN segmented the aorta accurately overall.
Our experiments showed that the overall aorta segmentation was slightly more accurate when using two-class segmentation than when using multi-class segmentation. This could be due to differences between the two tasks. First, the multi-class segmentation problem can be considered more complex than the two-class segmentation problem, and may require more labeled training samples. Second, the emphasis on accurate overall aorta segmentation is stronger in the two-class task than in the multi-class task due to the Dice loss function used. In future work, the loss function for the multi-class segmentation task could potentially be adapted to correct for this.
In this study, three image planes were analyzed independently. In future work we will investigate whether a different way of merging the results from the three image planes, or alternatively extending the analysis to 3D, might be beneficial. In addition, we will increase the size of the dataset so that the training and test images cover large anatomical variability (e.g. atherosclerotic plaque, aneurysm) and a wide range of image acquisition parameters (different hospitals, scanners and reconstruction parameters), and we will verify that the method segments the aorta accurately under these conditions. Given that clinical analysis of the morphology of the aorta is routinely performed on contrast-enhanced images, we will extend the evaluation to clinically acquired contrast-enhanced chest CT scans.
6 New or Breakthrough Work to Be Presented
A method for automatic segmentation of the thoracic aorta into the ascending aorta, the aortic arch and the descending aorta in low-dose, non-contrast-enhanced chest CT scans is presented. This could be a first step towards large-scale studies analyzing anatomical location of pathology and morphology of the thoracic aorta.
Acknowledgements. The authors thank the National Cancer Institute for access to NCI’s data collected by the National Lung Screening Trial. The statements contained herein are solely those of the authors and do not represent or imply concurrence or endorsement by NCI.
-  Erbel, R. and Eggebrecht, H., “Aortic dimensions and the risk of dissection,” Heart 92(1), 137–142 (2006).
-  French Study of Aortic Plaques in Stroke Group, “Atherosclerotic disease of the aortic arch as a risk factor for recurrent ischemic stroke,” N Engl J Med 334(19), 1216–1221 (1996).
-  Kurugol, S., Estepar, R. S. J., Ross, J., and Washko, G. R., “Aorta segmentation with a 3D level set approach and quantification of aortic calcifications in non-contrast chest CT,” in [Engineering in Medicine and Biology Society (EMBC), 2012 Annual International Conference of the IEEE ], 2343–2346, IEEE (2012).
-  Xie, Y., Padgett, J., Biancardi, A. M., and Reeves, A. P., “Automated aorta segmentation in low-dose chest CT images,” Int J Comput Assist Radiol Surg 9(2), 211–219 (2014).
-  Išgum, I., Staring, M., Rutten, A., Prokop, M., Viergever, M. A., and Van Ginneken, B., “Multi-atlas-based segmentation with local decision fusion—application to cardiac and aortic segmentation in CT scans,” IEEE Trans Med Imag 28(7), 1000–1010 (2009).
-  National Lung Screening Trial Research Team, “The National Lung Screening Trial: overview and study design,” Radiology 258(1), 243–253 (2011).
-  de Vos, B. D., Wolterink, J. M., de Jong, P. A., Leiner, T., Viergever, M. A., and Išgum, I., “ConvNet-based localization of anatomical structures in 3D medical images,” IEEE Trans Med Imag 36(7), 1470–1481 (2017).
-  Wolterink, J. M., Leiner, T., Viergever, M. A., and Išgum, I., “Dilated convolutional neural networks for cardiovascular MR segmentation in congenital heart disease,” in [International Workshop on Reconstruction and Analysis of Moving Body Organs ], LNCS 10129, 95–102, Springer (2017).
-  Yu, F. and Koltun, V., “Multi-scale context aggregation by dilated convolutions,” ICLR (2016).
-  Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R., “Dropout: a simple way to prevent neural networks from overfitting,” J Mach Learn Res 15(1), 1929–1958 (2014).
-  Ioffe, S. and Szegedy, C., “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in [Proceedings of the 32nd International Conference on Machine Learning ], Proceedings of Machine Learning Research 37, 448–456, PMLR (2015).
-  Milletari, F., Navab, N., and Ahmadi, S.-A., “V-net: fully convolutional neural networks for volumetric medical image segmentation,” in [Fourth International Conference on 3D Vision (3DV) ], 565–571, IEEE (2016).
-  Kingma, D. and Ba, J., “Adam: A method for stochastic optimization,” ICLR (2015).