Extracting tubular objects, e.g., blood vessels, has become a crucial task in computer-assisted diagnosis (CAD) of many diseases. For example, vessel lumen segmentation and centerline extraction are prerequisites for vessel curved-planar reconstruction (CPR) from computed tomography angiography (CTA) images, which further facilitates stenosis detection and plaque identification in clinical diagnosis. However, manually segmenting vessels and extracting centerlines from various medical images is usually time-consuming. Automatic vessel segmentation and centerline detection therefore play an increasingly important role in the quantitative analysis of vascular diseases.
Recently, convolutional neural networks (CNNs) have been widely applied in 3D medical image segmentation. However, segmenting vessels in 3D medical images is still very challenging. Blood vessels have delicate tubular structures with a large variety in long-range topology, which cannot be captured by the slice-wise or patch-wise convolutional operations used in most deep learning based segmentation methods. Moreover, in the presence of the imaging artifacts that often exist in medical images, CNN-based segmentation algorithms are prone to missing some segments of vessels, resulting in discontinuities in the extracted vessel centerline. To preserve the correct topology of the extracted vessels, previous methods may rely on user input that annotates the start and end points of each vessel. A more complete vessel centerline can then be found by a minimal-cost path-based algorithm. To avoid manual input, another approach to ensuring correct topology is to build an atlas or template of the target vessels from training samples and register the template to the test image. However, this approach does not generalize well, as the vascular structure of the test sample might be very different from the template.
In this paper, we present a novel approach for vessel centerline extraction, which is able to ensure the connectivity of extracted vessels without the need for any manual input or vessel template. The key idea is to use a patch-wise 3D CNN to segment the vessel mask and regress the vessel centerline heatmap from the input image, and meanwhile use a point-cloud network to label the extracted vessel segments, such that segments belonging to the same vessel can be connected in a post-processing step. This hybrid approach makes the best of both worlds: patch-wise CNNs for local appearance learning and point-cloud networks for global geometry learning, resulting in a robust and efficient algorithm for automatic centerline extraction. We also propose a geometry-aware grouping strategy to improve the performance of the point-cloud network for vessel labeling. The effectiveness of the proposed framework is validated on two datasets: a public dataset of coronary artery CTA scans and an in-house dataset of head and neck artery CTA scans. Experimental results show that our approach outperforms existing baseline methods in terms of both accuracy and completeness of the extracted centerlines.
In summary, we make the following contributions: (1) a novel hybrid representation learning approach for fully automatic and template-free vessel centerline extraction; (2) a geometry-aware grouping method that utilizes the skeleton’s connection property to improve the performance of vessel labeling; (3) state-of-the-art performance on the public benchmark.
Given a 3D CTA image consisting of a sequence of 2D slices, the objective is to segment the arteries and delineate their centerlines. The state-of-the-art segmentation methods are mostly based on CNNs. Due to the heavy computation of 3D convolutions, an input 3D image needs to be divided into overlapping patches that are fed into the segmentation network separately. This leads to restricted local receptive fields, which may not provide sufficient information to distinguish between arteries and veins, resulting in false detections. Moreover, patch-based CNNs give no guarantee on the connectivity of the extracted vessel segments. One solution is to connect the segments that belong to the same vascular branch in post-processing. To achieve this, however, we need to label all the segments, which is also difficult for a patch-based CNN because vessel labeling requires considering the global geometry of the vessels.
To address these issues, we propose a hybrid approach that consists of a patch-based CNN for local vessel segmentation, a point-cloud network for global vessel labeling, and a path-finding algorithm for final centerline extraction. Figure 1 provides an overview of our approach.
2.1 Vessel Segmentation
First, we use a 3D CNN to learn local vascular appearance features from 3D patches of the original CTA images and produce a coarse vessel segmentation. A UNet backbone architecture with an encoding-decoding module is selected to transform the input image into the segmentation mask. Moreover, to explore long-range contextual information inside 3D patches, we embed a dual attention module on top of the UNet backbone. Finally, a combination of a binary cross-entropy loss and a Dice loss is used as the total segmentation loss. Please refer to the supplementary material for more details.
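The combined segmentation loss can be illustrated with a minimal sketch in plain Python. The equal weighting of the two terms, the smoothing constant, and the function names are assumptions made for illustration; the paper defers the exact formulation to its supplementary material.

```python
import math

def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over a flat list of voxel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)

def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2|P.T| + s) / (|P| + |T| + s); `smooth` is an assumed constant."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)

def segmentation_loss(probs, targets, w_bce=1.0, w_dice=1.0):
    """Weighted sum of both terms; equal weights are an assumption, not the paper's setting."""
    return w_bce * bce_loss(probs, targets) + w_dice * dice_loss(probs, targets)
```

In a real training loop these terms would be computed on tensors by a deep learning framework; the list-based version only makes the arithmetic explicit.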
2.2 Vessel Labeling
Due to the lack of global information in the patch-wise 3D CNN, the vessel segments obtained from the vessel segmentation procedure tend to contain some false positives, such as veins, and to miss some parts of tiny or tortuous vessels. We therefore propose a vessel labeling procedure that classifies the segmented vessels into different branches. These semantic labels can then be used to remove non-vascular segments and group discontinuous vessel segments, as shown in Figure 3. The vessel labeling procedure is implemented by first generating a set of points that represent the vascular skeleton from the vessel segmentation results and then using a point-cloud network to predict the labels of these skeleton points.
Point-cloud Generation. As it is inefficient to directly label vessel segments in 3D volumes using CNNs, we propose to perform the labeling on vascular skeletons represented by a set of points, which reduces complexity while preserving the original geometric information of the vessels. To generate vascular skeletons from the vessel segments, a 3D thinning algorithm is applied to erode the vessel segments until single-voxel-width skeleton points are obtained, as shown in Figure 3. The generated skeletons are composed of discrete and unordered points lying at the center of the vascular lumen, which represent the vascular geometry.
Vessel Labeling Network. Given a set of vascular skeleton points generated from vessel segments, the target of the vessel labeling network is to predict the label of each point, as shown in Figure 3. This is similar to point-cloud semantic segmentation tasks in 3D computer vision. Any point-cloud network can be adopted, such as the state-of-the-art PointNet++ and dynamic graph CNN (DGCNN). We provide a comparison between them in the experiments.
Geometry-aware Grouping. A particular property of the skeleton points compared to a general point cloud is the given connectivity among adjacent skeleton points. Specifically, skeleton points can be divided into separate components using a connected-component labeling (CCL) algorithm. However, both PointNet++ and DGCNN group points with the k-nearest neighbor (k-NN) algorithm based on L2 distance, which ignores the given geometry of the vessel skeletons. As demonstrated in Figure 5(a), the L2-based methods are more likely to group points belonging to different components. To address this issue and leverage the skeletons’ connection property, we propose a geometry-aware grouping method (GAG) that modifies the distance based on the connection relationship. The modified distance between two points is computed from their L2 distance, a weight, and the connected components to which the two points belong. As shown in Figure 5(b), the grouping area in GAG tends to stretch along the skeleton lines, so points belonging to the same connected component are more likely to be grouped. This design facilitates local feature consistency within the same component and consequently improves the accuracy of vessel labeling, as evaluated in the experiments.
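To make the grouping idea concrete, the following sketch labels skeleton voxels by connected component and then runs a k-NN query with a modified distance. The specific form used here, in which the L2 distance is multiplied by the weight when both points lie in the same component and left unchanged otherwise, is an assumption chosen to match the described behavior; the 26-connectivity and all function names are likewise illustrative.

```python
import math
from collections import deque

def connected_components(points):
    """Map each integer voxel coordinate to a component id (26-connectivity assumed)."""
    point_set = set(points)
    offsets = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
               for dz in (-1, 0, 1) if (dx, dy, dz) != (0, 0, 0)]
    comp = {}
    next_id = 0
    for start in points:
        if start in comp:
            continue
        comp[start] = next_id
        queue = deque([start])
        while queue:  # breadth-first flood fill over adjacent skeleton voxels
            x, y, z = queue.popleft()
            for dx, dy, dz in offsets:
                nb = (x + dx, y + dy, z + dz)
                if nb in point_set and nb not in comp:
                    comp[nb] = next_id
                    queue.append(nb)
        next_id += 1
    return comp

def gag_distance(p, q, comp, weight=0.3):
    """Assumed form: shrink the L2 distance for points in the same component."""
    d = math.dist(p, q)
    return weight * d if comp[p] == comp[q] else d

def gag_knn(query, points, comp, k, weight=0.3):
    """Return the k nearest neighbours of `query` under the modified distance."""
    others = [p for p in points if p != query]
    others.sort(key=lambda p: gag_distance(query, p, comp, weight))
    return others[:k]
```

With a weight of 1.0 the query reduces to ordinary L2-based k-NN, which makes it easy to compare the two groupings on the same skeleton.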
2.3 Centerline Extraction
After vessel labeling, each vessel segment is assigned a semantic label. The semantic label can be used to remove non-vascular tissues and to guide the connection of disjoint vessel segments. Specifically, we first determine which segments should be connected and then use a minimal cost path method to connect them and output the final centerlines based on a cost map. The cost map is constructed from a centerline heatmap regressed by another 3D CNN and the labeled vessel skeletons.
Centerline Heatmap Regression. The centerline heatmap is defined as the opposite of a distance map, where points closer to the vessel centerline have larger values and points outside the vascular radius have a step-down low value, as shown in Figure 5(a). Because the probability map obtained from the segmentation network has only learned the difference between the background and the vessel (e.g., edge features) rather than the centerline feature within the vessel lumen, we adopt another 3D CNN, similar to the network used in vessel segmentation, to regress the centerline heatmap. The mean squared error (MSE) loss is used to train the network. Based on the centerline heatmap, we construct a cost map to guide the minimal cost path search algorithm, as shown in Figure 5(b). In areas where vessel segments exist, the cost map directly assigns a large value to the skeleton points (red lines in Figure 5(b)) and a small value to the points elsewhere (gray areas in Figure 5(b)).
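One possible way to build such a regression target is sketched below. The linear falloff, the value of 0.5 at the vessel wall, and the outside value of 0.0 are assumptions chosen only to reproduce the described shape (largest on the centerline, with a step down outside the radius); the function names are hypothetical.

```python
import math

def heatmap_value(voxel, centerline_points, radius, outside_value=0.0):
    """Assumed target: 1.0 on the centerline, 0.5 at the vessel wall, step down outside."""
    d = min(math.dist(voxel, c) for c in centerline_points)
    if d > radius:
        return outside_value  # step-down low value outside the vascular radius
    return 1.0 - 0.5 * d / radius

def mse_loss(predicted, target):
    """Mean squared error between two flat lists of heatmap values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)
```

A per-point radius could replace the single `radius` argument when the annotation provides one.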
Minimal Cost Path. Given the vessel skeletons and their labels, we successively merge the disjoint segments that share the same label. Paired boundary points with the same label are put in a priority queue ordered by the distance between the two points in the pair. We take pairs from the queue one at a time and use Dijkstra's algorithm to find the minimal cost path connecting the two boundary points, as shown in Figure 5(c). If the two segments represented by the two points have already been connected in a previous step, we skip the pair. The process continues until the queue is empty.
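The merging procedure can be sketched as follows. Here `cost` is a hypothetical dictionary mapping voxel coordinates to a non-negative traversal cost in which lower values are cheaper; how the values of the paper's cost map translate into traversal costs is not specified, so this convention is an assumption. The segment representation and the union-find bookkeeping are also illustrative choices.

```python
import heapq
import math
from itertools import product

def dijkstra_path(cost, start, goal):
    """Minimal-cost path between two voxels; entering voxel v costs cost[v]."""
    offsets = [o for o in product((-1, 0, 1), repeat=len(start)) if any(o)]
    best = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > best[u]:
            continue  # stale queue entry
        for o in offsets:
            v = tuple(a + b for a, b in zip(u, o))
            if v not in cost:
                continue
            nd = d + cost[v]
            if nd < best.get(v, math.inf):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def connect_segments(segments, cost):
    """segments: list of {"label": int, "endpoints": [coord, ...]} (hypothetical format)."""
    queue = []
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if segments[i]["label"] != segments[j]["label"]:
                continue
            for p in segments[i]["endpoints"]:
                for q in segments[j]["endpoints"]:
                    heapq.heappush(queue, (math.dist(p, q), i, j, p, q))
    parent = list(range(len(segments)))
    paths = []
    while queue:
        _, i, j, p, q = heapq.heappop(queue)  # closest pair first
        ri, rj = find(parent, i), find(parent, j)
        if ri == rj:
            continue  # segments already connected in a previous step
        path = dijkstra_path(cost, p, q)
        if path is not None:
            parent[ri] = rj
            paths.append(path)
    return paths
```

Tracking connectivity with union-find is one simple way to implement the skip rule; any structure that records which segments have already been merged would serve.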
We evaluate the proposed method on two datasets: a public coronary artery dataset and a private head and neck artery dataset. The first dataset is mainly used to compare our method with existing baseline methods in the literature. The second dataset is mainly used for ablation studies that verify our system designs. Following the standardized evaluation methodology for coronary artery centerline extraction, extracted centerlines are evaluated based on three metrics, namely total overlap (OV), overlap until first error (OF), and overlap with the clinically relevant part of the vessel (OT). The stage-wise results for the proposed framework are shown in Figure 8.
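As an illustration of the overlap idea, the sketch below computes a simplified, point-based version of OV. It assumes that a reference point counts as covered when some extracted point lies within the annotated radius at that reference point, and that an extracted point counts as correct under the symmetric condition; the official benchmark tooling may differ in details such as resampling, and OF and OT are not covered here.

```python
import math

def overlap_ov(reference, radii, extracted):
    """Simplified OV = (TPM + TPR) / (TPM + TPR + FN + FP) on point lists."""
    # Reference points with at least one extracted point inside their radius
    tpr = sum(1 for r, rad in zip(reference, radii)
              if any(math.dist(r, e) < rad for e in extracted))
    fn = len(reference) - tpr
    # Extracted points lying inside the radius of at least one reference point
    tpm = sum(1 for e in extracted
              if any(math.dist(e, r) < rad for r, rad in zip(reference, radii)))
    fp = len(extracted) - tpm
    total = tpm + tpr + fn + fp
    return (tpm + tpr) / total if total else 0.0
```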
Implementation details. In vessel segmentation, we randomly crop 3D patches, with separate patch sizes for the head and neck dataset and the coronary artery dataset. ResNet34 is used as the encoder in the UNet architecture, which starts with 32 feature channels that are doubled at each scale, and the max-pooling layer is removed from the original residual network. All convolutions share the same kernel specification except those in the last two ResNet blocks, which use a different one to reduce the parameter count. We employ the Adam optimizer with a polynomial learning rate schedule. In vessel labeling, the inputs are skeleton points, which are generated from vessel segments and resampled to 3000 points for each sample. The initial learning rate is set to 0.001 and is halved every 30 epochs. The weight in GAG is set to 0.3.
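The two learning rate schedules can be written out as small functions. The step schedule follows the stated settings (0.001, halved every 30 epochs); for the polynomial schedule, the commonly used form with an exponent of 0.9 is shown purely as an assumption, since the exact expression is not given here.

```python
def step_lr(epoch, base_lr=0.001, step=30, gamma=0.5):
    """Vessel labeling schedule: halve the learning rate every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

def poly_lr(iteration, max_iter, base_lr, power=0.9):
    """Assumed polynomial decay; `base_lr` and `power` are hypothetical values."""
    return base_lr * (1.0 - iteration / max_iter) ** power
```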
Experiments on the Coronary Artery Dataset. This public dataset contains 100 cardiac CT angiography (CCTA) scans collected from the clinic for training and 32 CCTA scans from the public benchmark for evaluation. We train the vessel segmentation network and the vessel labeling network on the annotated coronary artery CTA images, and the vessel skeletons are labeled with three categories: right arteries, left arteries, and false-positive venous vessels. Based on the ablation study on the head and neck artery dataset, we use PointNet++ with the GAG module as our vessel labeling network. The quantitative comparison is listed in Table 1 and the results are visualized in Figure 7. Our hybrid approach achieves the highest performance in terms of both OV and OT, indicating that the centerlines extracted by the proposed method are more complete than those produced by other methods.
Experiments on the Head and Neck Artery Dataset. This private dataset collected from the clinic contains 450 CTA scans, each of which has a manually annotated vessel mask. The dataset is split into 380 scans for training, 20 for validation, and 50 for testing. In the dataset, vessel skeletons are labeled with 17 categories, including the left and right common carotid arteries (L/RCCA), the left and right vertebral arteries (L/RVA), etc. Table 2 shows the evaluation results of several variants of our system with different point-cloud network designs. With the geometry-aware grouping (GAG) method, the vessel labeling accuracy of both PointNet++ and DGCNN improves. Figure 7 shows that GAG facilitates local consistency of the skeleton components.
We propose an automatic and template-free approach to 3D vessel centerline extraction based on hybrid representations, which ensures the connectivity of extracted centerlines. We show that the hybridization between learning local appearance with patch-based CNNs and learning global geometry with point-cloud networks results in an efficient and robust framework to extract geometric objects from 3D data. We demonstrate superior performance on artery centerline extraction from CTA images and believe that the proposed approach can also be applied in other centerline or skeleton extraction tasks.
This work is funded by the National Key Research and Development Program of China (No. 2019YFC0118100), and is partially supported by the National Key Research and Development Program of China with Grant No. 2018AAA0101900/2018AAA0101902, the Beijing Municipal Commission of Science and Technology under Grant No. Z181100008918005, the National Natural Science Foundation of China (NSFC Grant No. 61772039 and No. 91646202), the Hong Kong Research Grant Council [12301417, 16307818, 16301419], and the Hong Kong University of Science and Technology [R9405, IGN17SC02, Z0428].
- A higher-order tensor vessel tractography for segmentation of vascular structures. IEEE transactions on medical imaging 34 (10), pp. 2172–2185.
- (2001) Fast extraction of minimal paths in 3d images and applications to virtual endoscopy. Medical image analysis 5 (4), pp. 281–299.
- (2019) Dual attention network for scene segmentation. In CVPR.
- (2016) Coronary centerline extraction via optimal flow paths and cnn path pruning. In Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, S. Ourselin, L. Joskowicz, M. R. Sabuncu, G. Unal, and W. Wells (Eds.), pp. 317–325.
- (2017) Efficient multi-scale 3d cnn with fully connected crf for accurate brain lesion segmentation. Medical image analysis 36, pp. 61–78.
- (2002) CPR: curved planar reformation. In VIS.
- (2012) Automatic coronary extraction by supervised detection and shape matching. In ISBI.
- (2007) Vessels as 4-d curves: global minimal 4-d paths to extract 3-d tubular surfaces and centerlines. IEEE transactions on medical imaging 26 (9), pp. 1213–1223.
- (2017) A survey on deep learning in medical image analysis. Medical image analysis 42, pp. 60–88.
- (2017) VTrails: inferring vessels with geodesic connectivity trees. In International Conference on Information Processing in Medical Imaging, pp. 672–684.
- (2001) A sequential 3d thinning algorithm and its medical applications. In IPMI.
- (2000) Fast connected component labeling algorithm using a divide and conquer technique. Computers and Their Applications 4, pp. 4–7.
- (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In CVPR.
- (2017) Pointnet++: deep hierarchical feature learning on point sets in a metric space. In NIPS.
- (2015) U-net: convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi (Eds.), pp. 234–241.
- (2009) Standardized evaluation methodology and reference database for evaluating coronary artery centerline extraction algorithms. Medical image analysis 13 (5), pp. 701–714.
- (2018) Extracting tree-structures in ct data by tracking multiple statistically ranked hypotheses. arXiv preprint arXiv:1806.08981.
- (2001) Highly automated segmentation of arterial and venous trees from three-dimensional magnetic resonance angiography (mra). The international journal of cardiovascular imaging 17 (1), pp. 37–47.
- (2008) Automatic coronary tree modeling. The Insight Journal.
- (2010) Dijkstra’s algorithm applied to 3d skeletonization of the brain vascular tree: evaluation and application to symbolic. In EMBC.
- (2018) Dynamic graph cnn for learning on point clouds. arXiv preprint arXiv:1801.07829.
- (2019) Coronary artery centerline extraction in cardiac ct angiography using a cnn-based orientation classifier. Medical image analysis 51, pp. 46–60.
- (2012) Automatic centerline extraction of coronary arteries in coronary computed tomographic angiography. The international journal of cardiovascular imaging 28 (4), pp. 921–933.
- Recurrent saliency transformation network: incorporating multi-stage visual cues for small organ segmentation. In CVPR.
- (2008) Shape and appearance models for automatic coronary artery tracking. the midas journal. In MICCAI Workshop–Grand Challenge Coronary Artery Tracking. http://hdl.handle.net/10380/1420.
- (2019) Automatic quantitative analysis of pulmonary vascular morphology in ct images. Medical physics 46 (9), pp. 3985–3997.
- Coronary motion estimation from cta using probability atlas and diffeomorphic registration. In MIVR.
- (2013) Robust and accurate coronary artery centerline extraction in cta by combining model-driven and data-driven approaches. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2013, K. Mori, I. Sakuma, Y. Sato, C. Barillot, and N. Navab (Eds.), pp. 74–81.