Type-B aortic dissection (TBAD) is the surging of blood through a tear in the aortic intima, with separation of the intima and media and creation of a false lumen (channel), as shown in Fig. 1. It is one of the most serious cardiovascular events. TBAD affects 3 to 4 per 100 000 people per year [Karthikesalingam2010TheDA]. Approximately 20% of patients with TBAD die before admission [Karthikesalingam2010TheDA]. Without treatment, 1 to 3% of patients die per hour during the first 24 hours, 30% within the first week, 80% within 2 weeks, and 90% within the first year [Hagan2000TheIR]. With thoracic endovascular aortic repair (TEVAR) surgery and proper prognosis, the reported 30-day mortality rate is 10% or less [Hagan2000TheIR]. Recently, TBAD has attracted considerable attention due to its growing yearly incidence [suzuki2003clinical] and the severity of its prognosis.
Computed Tomography Angiography (CTA) is routinely adopted for the diagnosis, surgical planning, and prognosis of TBAD. In particular, quantitative assessment of anatomical features in CTA plays a key role in surgical procedure and treatment planning for prognosis, and segmentation of the true lumen (TL), false lumen (FL), and false lumen thrombus (FLT) is a significant step of this assessment. However, manual slice-by-slice segmentation is time-consuming and requires expertise, while current computer-aided approaches focus on segmenting the entire aorta and cannot separate TL, FL, or FLT. Automatic segmentation of the substructures of TBAD is therefore urgently needed, and some studies have already tried to solve this problem. Specifically, Melito et al. use an adaptive algorithm together with the metamodel technique of Polynomial-Chaos Kriging to define the areas in the cross-section plane in which a point can be used to enrich the dissected segmentation for aortic dissection reconstruction. When establishing mathematical and computational models of aortic dissection, the level of uncertainty is extremely high. They point out that "One of the leading causes of this uncertainty is the lack of useful datasets" [melito2019reliability]. Gamechi et al. propose a fully automatic method combining multi-atlas registration, aorta centerline extraction, and an optimal surface segmentation approach to extract the aorta surface around the centerline. Their method can reliably assess diameters in the thoracic aorta even in non-ECG-gated, non-contrast CT scans, which makes it a promising tool to assess aortic dilatation in screening and in clinical practice. However, it still cannot detect FLT, mainly due to the lack of a dataset with FLT annotations [gamechi2019automated].
In particular, some works already use neural networks to automatically segment TL, FL, and the aorta [li2018multi, cao2019fully]. Li et al. report a fully automatic approach based on a 3-D multi-task deep convolutional neural network that segments the entire aorta and the true and false lumen from CTA images in a unified framework. Their approach achieves mean Dice similarity scores (DSC) of 0.910, 0.849, and 0.821 for the entire aorta, true lumen, and false lumen, respectively. Cao et al. also use a convolutional neural network and achieve mean Dice coefficients above 90% for each lumen of TBAD when FLT is not considered. They provide a promising approach for accurate and efficient segmentation of TBAD and make automated measurement of TBAD anatomical features possible. However, existing works focus only on TL, FL, or both [melito2019reliability, gamechi2019automated, li2018multi, cao2019fully], and FLT information is poorly explored, partially because of the lack of a dataset. Some other works consider thrombus in other diseases such as abdominal aortic aneurysm [Lisowska2017ThrombusDI, Yong2017LinearregressionCN, Lopez-Linares2018]; however, TBAD research has not yet advanced to the quantitative measurement of FLT as abdominal aortic aneurysm research has.
In fact, quantitative assessment of FLT is also critical for surgical planning and prognosis. First, the FLT description in clinical radiology reports plays a pivotal role in guiding endovascular intervention surgery [Dohle2017TheIO]. Second, FLT greatly affects patients' postoperative complications [Higashigaito2019AorticGA] and is thus also a significant independent predictor of post-discharge mortality in prognosis [Higashigaito2019AorticGA, Trimarchi2013ImportanceOF]. Automatic, efficient, and accurate assessment of FLT is therefore particularly useful for clinical decision-making on TBAD.
In this paper, we propose ImageTBAD, the first 3D CTA image dataset of TBAD with annotation of TL, FL, and FLT. For simplicity of discussion, FL in our paper refers to the part of the traditional FL without FLT. The proposed dataset contains 100 TBAD CTA images, which is of decent size compared with existing medical imaging datasets. Compared with TL and FL, FLT can appear almost anywhere along the aorta with irregular shapes, which makes its accurate segmentation challenging. FLT segmentation represents a wide class of segmentation problems where targets exist in a variety of positions with irregular shapes. We further propose a baseline method based on 3D U-net [cciccek20163d] for automatic segmentation of TBAD. Results show that the baseline method achieves results comparable with existing works on aorta and TL segmentation. However, the segmentation accuracy of FLT is only 52%, which leaves large room for improvement and also shows the challenge of our dataset. To facilitate further research on this challenging topic, our dataset and code are released to the public [ourdataset].
|Sex = Female (%)||31 (31%)|
|Age (Mean ± SD)||52.5 ± 11.3|
|Manufacturer = Philips (%)||77 (77%)|
|Spacing between slices (mm)||0.75|
|Size of the images ()||(135416)|
|Typical voxel size (mm³)||0.25 × 0.25 × 0.25|
Table 1: Characteristics of the ImageTBAD dataset.
2 The ImageTBAD Dataset
The ImageTBAD dataset consists of a total of 100 3D CTA images gathered from Guangdong Provincial People's Hospital from January 1, 2013, to April 23, 2015. Images were acquired from two kinds of scanners (Siemens SOMATOM Force and Philips 256-slice Brilliance iCT system). The characteristics of the ImageTBAD dataset are detailed in Table 1. All the images are pre-operative TBAD CTA images whose top and bottom are around the neck and the brachiocephalic vessels, respectively, in the axial view. The segmentation labeling was performed by a team of two cardiovascular radiologists who have extensive experience with TBAD. The segmentation label of each image was produced by one radiologist and checked by the other. Labeling each image takes around 1 to 1.5 hours. The segmentation includes three substructures: TL, FL, and FLT. There are 68 images containing FLT, while 32 images are free of FLT.
By analyzing all the labels, we find that the segmentation of FLT is challenging for two reasons. First, FLT can appear almost anywhere along the aorta, with irregular shapes, although most FLT appears at the top of the aorta. Fig. 2 shows a variety of relative positions of FLT. Fig. 2(a-c) shows the most common locations of FLT, while the location in Fig. 2(d) is also common in clinical practice. Fig. 2(e-h) shows some typical cases where FLT is distributed along the whole FL and is discontinuous in multiple locations. Most FLT lies at the surface of the aorta, but some is located at the center of the aorta, between the FL and the TL.
Within the eight cases in Fig. 2, we can notice a large variety in the shapes of FLT. Most FLTs are rather thin and long, while some others form a pile at the top of the aorta. In addition, some FLTs are small and therefore relatively difficult to segment, as shown in Fig. 2(g). Second, the contrast between FLT and other tissues is rather low. As shown in Fig. 3, the intensities of the FLT and the nearby tissues are almost the same, which makes the FLT hard to recognize visually. By zooming in on the boundary area, we can notice some parts of the boundary, as shown in Fig. 3(a)(b), but other parts remain highly uncertain, as shown in Fig. 3(c). The low contrast brings further challenges to FLT segmentation.
3 Method and Experiment
3.1 The Baseline Method
By analyzing the dataset, we make the following three observations. First, the segmentation area is usually rather long in the axial view, which needs to be considered in the design of the input size. Second, the segmentation target is rather small compared with the size of the input, so processing the whole image is not efficient. Third, in most cases, the combination of TL, FL, and FLT has a shape similar to that of the aorta. In fact, the part corresponding to FLT is a part of the aorta in normal anatomy. We can therefore also obtain FLT by removing TL and FL from the combination of the three. This approach is expected to be more effective than direct segmentation of FLT because the complexity of the shapes and positions of FLT can be avoided. For simplicity of discussion, the combination of the three parts is denoted as the aorta.
Based on the above observations, we propose a baseline method which is a processing pipeline shown in Fig. 4. The processing pipeline includes two steps: region of interest (RoI) extraction, and RoI segmentation.
RoI extraction: The RoI extraction aims to obtain a precise bounding box of the target area, which is achieved with two croppings. The first cropping obtains a rough bounding box by segmenting the aorta on a resized input (from the original size to 64 × 64 × 64) using 3D U-net. Based on the rough bounding box, the rough RoI is cropped from the original input and then resized to S × S × 2S. The cropping refinement then segments the aorta again on the rough RoI, and a more precise bounding box of the RoI is obtained.
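The coarse-to-fine cropping can be sketched as follows. This is a minimal illustration, not the released implementation: it assumes the coarse aorta mask is given as a set of (z, y, x) voxel indices on the downsampled grid, and the function names and the `margin` parameter are hypothetical.

```python
import math


def bounding_box(voxels):
    """Return ((zmin, ymin, xmin), (zmax, ymax, xmax)) of a non-empty voxel set."""
    voxels = list(voxels)
    if not voxels:
        raise ValueError("mask is empty; no bounding box can be computed")
    mins = tuple(min(v[i] for v in voxels) for i in range(3))
    maxs = tuple(max(v[i] for v in voxels) for i in range(3))
    return mins, maxs


def scale_box(box, low_shape, full_shape, margin=0):
    """Map an inclusive box from the low-resolution grid to the original grid.

    margin is a hypothetical safety border, in original-resolution voxels,
    added on every side before clipping to the image extent.
    """
    mins, maxs = box
    lo, hi = [], []
    for i in range(3):
        factor = full_shape[i] / low_shape[i]
        start = math.floor(mins[i] * factor) - margin
        stop = math.ceil((maxs[i] + 1) * factor) - 1 + margin  # last covered voxel
        lo.append(max(start, 0))
        hi.append(min(stop, full_shape[i] - 1))
    return tuple(lo), tuple(hi)


# Example: a coarse mask on a 64 x 64 x 64 grid mapped back to a 256 x 512 x 512 volume.
coarse_box = bounding_box({(10, 20, 30), (12, 25, 31)})
rough_roi = scale_box(coarse_box, (64, 64, 64), (256, 512, 512))
```

The same two functions could be applied a second time on the rough RoI to mimic the refinement step; rounding the box outwards avoids cutting off voxels at the border.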
RoI segmentation: The RoI segmentation performs the segmentation tasks on the refined RoI. We discuss two approaches. In Approach A, we combine TL and FL segmentation with aorta segmentation; in Approach B, we perform direct segmentation of the three substructures. In Approach A, FLT is obtained by removing the segmented TL and FL from the segmented aorta, following the third observation above. Note that all the modules adopt the same 3D U-net structure, as shown in Fig. 4. Four resolution levels are adopted, each of which contains two convolutional layers and one pooling/up-convolutional layer. The number of filters is , , , and for the four resolution levels, respectively. and the input size vary for different modules as discussed above. The only post-processing is upsampling to the original size.
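The subtraction idea behind Approach A can be illustrated with a short sketch. It assumes each predicted structure is represented as a set of (z, y, x) voxel indices; the function name and the integer label values are hypothetical and chosen only for this example.

```python
def flt_from_aorta(aorta, true_lumen, false_lumen):
    """Approach A idea: FLT = aorta minus (TL union FL)."""
    return set(aorta) - (set(true_lumen) | set(false_lumen))


def to_label_map(true_lumen, false_lumen, flt):
    """Merge the three structures into one voxel -> label dictionary.

    The label values (1 = TL, 2 = FL, 3 = FLT) are an assumption for illustration.
    """
    labels = {}
    for label, voxels in ((1, true_lumen), (2, false_lumen), (3, flt)):
        for voxel in voxels:
            labels[voxel] = label
    return labels


# Example: four aorta voxels, one TL voxel, one FL voxel inside the aorta.
aorta = {(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)}
tl = {(0, 0, 0)}
fl = {(0, 0, 1)}
flt = flt_from_aorta(aorta, tl, fl)
label_map = to_label_map(tl, fl, flt)
```

Because the thrombus is defined as whatever remains of the aorta, errors in the TL, FL, or aorta masks propagate directly into the FLT mask, which is one possible reason why the two approaches can behave differently on FLT.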
We implemented our baseline method using PyTorch based on [isensee2018nnu]. Both Dice loss and cross entropy loss were used, and the number of training epochs was 5 for all 3D U-nets. Data augmentation and normalization were also adopted with the same configuration as in [payer2017multi] for 3D U-net. For both networks and all the analyses, three-fold cross validation was performed (about 33 images for testing and 67 images for training). We split the dataset so that the number of images containing FLT in each fold was the same.
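A fold assignment that balances FLT-containing images could look like the sketch below. It is an assumption about how such a split might be built, not the authors' procedure: the function name, the seed, and the round-robin strategy are illustrative, and since 68 is not divisible by 3 the per-fold FLT counts here are only nearly equal (23, 23, 22).

```python
import random


def stratified_folds(has_flt, n_folds=3, seed=0):
    """Split image indices into folds with near-equal numbers of FLT images.

    has_flt: list of booleans, one per image (True if the image contains FLT).
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    position = 0  # running counter so leftovers spread over different folds
    for flag in (True, False):
        group = [i for i, f in enumerate(has_flt) if f == flag]
        rng.shuffle(group)
        for idx in group:
            folds[position % n_folds].append(idx)
            position += 1
    return folds


# Example with the dataset composition given in the text: 68 with FLT, 32 without.
flags = [True] * 68 + [False] * 32
folds = stratified_folds(flags)
```

Each fold then serves once as the test set while the other two are used for training.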
We implemented two configurations, with S = 64 and S = 96, respectively. Accordingly, and the batch size was 4 when S = 64, and and the batch size was 3 when S = 96. All the experiments ran on an Nvidia GTX 1080Ti GPU with 11 GB of memory.
Dice score and Hausdorff distance were selected as the evaluation metrics. For images without FLT, the Dice score is 1 if there is no FLT in the segmentation result, and 0 otherwise. As Approach B in RoI segmentation is similar to the methods that achieve the state-of-the-art results in TBAD [li2018multi, cao2019fully], we compared our method with theirs, though their datasets and methods did not focus on the segmentation of FLT. Meanwhile, the Hausdorff distance evaluates the shape similarity between the prediction of the proposed method and the ground truth, and is formulated as follows,
$$H(G, S) = \max\left\{ \sup_{g \in G} \inf_{s \in S} d(g, s),\; \sup_{s \in S} \inf_{g \in G} d(g, s) \right\},$$
where G and S represent the ground truth and the predicted segmentation, respectively, and d(g, s) is the distance between points g and s.
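The two metrics can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions: segmentations are sets of (z, y, x) voxel indices, distances are Euclidean and measured in voxel units (no physical spacing is applied), the Hausdorff distance is the standard symmetric form, and the function names are hypothetical. `math.dist` requires Python 3.8 or later.

```python
import math


def dice_score(truth, pred):
    """Dice score with the convention from the text for images without FLT."""
    truth, pred = set(truth), set(pred)
    if not truth:
        # No target in the ground truth: 1 if nothing is predicted, else 0.
        return 1.0 if not pred else 0.0
    return 2.0 * len(truth & pred) / (len(truth) + len(pred))


def hausdorff_distance(truth, pred):
    """Symmetric Hausdorff distance between two non-empty voxel sets (brute force)."""
    truth, pred = list(truth), list(pred)
    if not truth or not pred:
        raise ValueError("Hausdorff distance is undefined for an empty set")

    def directed(a, b):
        # Largest distance from a point of a to its nearest point in b.
        return max(min(math.dist(p, q) for q in b) for p in a)

    return max(directed(truth, pred), directed(pred, truth))


# Example: two voxels each, one shared.
g = {(0, 0, 0), (0, 0, 1)}
s = {(0, 0, 1), (0, 0, 2)}
example_dice = dice_score(g, s)
example_hd = hausdorff_distance(g, s)
```

The brute-force search is quadratic in the number of voxels, so for full-size volumes one would normally restrict the computation to surface voxels or use an optimized library routine.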
Differences between results are compared using the independent two-sample t-test. A p-value of less than 0.05 is considered statistically significant.
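For reference, the test statistic of an independent two-sample t-test can be computed as in the sketch below. It assumes the classic Student form with pooled variance (the text does not say whether equal variances were assumed), and the function name is hypothetical. Turning the statistic into a p-value requires the Student t distribution, which is not in the Python standard library; a statistics package such as SciPy (`scipy.stats.ttest_ind`) is one option.

```python
import math
import statistics


def two_sample_t(a, b):
    """Return (t statistic, degrees of freedom) for Student's two-sample t-test."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    standard_error = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, na + nb - 2


# Example with made-up per-image scores for two approaches.
t_value, dof = two_sample_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```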
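|Approach A||Approach B||t-value||p|
Table 2: Mean and standard deviation of the Dice score of the baseline method, and t-test values between Approach A and Approach B, for segmentation of the four substructures in TBAD.
|Approach A||Approach B||t-value||p|
Table 3: Mean and standard deviation of the Hausdorff distance of the baseline method, and t-test values between Approach A and Approach B, for segmentation of the four substructures in TBAD.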
4 Results and Discussion
Table 2 and Table 3 report the mean and standard deviation of the Dice score and the Hausdorff distance, respectively, of the baseline methods (Approach A and Approach B), together with their t-test values and p-values, for segmentation of the four substructures in TBAD. In terms of substructures, both Approach A and Approach B achieve their highest scores on the aorta, with small Hausdorff distances. However, both methods fail to segment TL, FL, and FLT well: the three are parts of the aorta without remarkable boundaries between them and are thus relatively harder to segment. TL achieves a better Dice score and Hausdorff distance than FL, which may be caused by the low contrast between FL and FLT. FLT obtains the lowest performance due to the challenges discussed in Section 2. As for the two methods, though Approach A with a multi-task segmentation module achieves a slightly higher Dice score with a lower Hausdorff distance than Approach B using direct segmentation, it fails to achieve higher performance on the other two parts, especially on FLT. Approach B obtains a large improvement over Approach A on FLT. This may be due to the fact that direct segmentation imposes more constraints that define FLT more accurately than multi-task segmentation does. On the other hand, we also notice some impact from the input size. The Dice score with S = 96 is slightly higher than that with S = 64 due to the higher resolution. However, the improvement is small, and there is no improvement for FLT, which indicates that higher resolution has very limited benefit for FLT segmentation. In particular, for all the 32 images without FLT, the baseline method with both configurations correctly obtains a Dice score of 1, which indicates that the FLT segmentation accuracy for images with FLT is much lower than 52% (by roughly 20 percentage points).
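The last point can be checked with a short calculation. The sketch assumes that the reported 52% is the mean FLT Dice score over all 100 images and that each of the 32 FLT-free images contributes a score of 1; the function name is hypothetical.

```python
def mean_dice_on_flt_images(overall_mean, n_total, n_without_flt, dice_without=1.0):
    """Back out the mean Dice on FLT-containing images from the overall mean."""
    n_with = n_total - n_without_flt
    return (overall_mean * n_total - dice_without * n_without_flt) / n_with


# (0.52 * 100 - 1.0 * 32) / 68 = 20 / 68, i.e. roughly 0.29
flt_only_mean = mean_dice_on_flt_images(0.52, 100, 32)
```

Under these assumptions the mean Dice score on the 68 images that actually contain FLT is about 0.29, which is the more informative number when judging the difficulty of the task.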
The existing works most relevant to ours are those proposed by a group from Tsinghua University [li2018multi, cao2019fully], though the datasets and labels are different. The method in [li2018multi] achieves Dice scores of 0.92, 0.85, and 0.85 on aorta, TL, and FL, respectively, on the same kind of machine (11 GB GPU memory) as ours. The improved version [cao2019fully] obtains Dice scores of 0.93, 0.93, and 0.91 on aorta, TL, and FL, respectively, on a more powerful machine (32 GB GPU memory). Compared with [li2018multi], our method achieves almost the same performance on aorta and TL, but much lower performance on FL. Compared with [cao2019fully], our method obtains comparable performance only on aorta, and much lower performance on TL and FL. The comparable results on aorta indicate that our baseline method is also a strong one. The gaps in TL and FL may be due to the differences in datasets, labels, and method details.
Despite these differences, we can still notice that accurate segmentation of FLT is rather challenging. We hope our dataset and baseline method can help fill the gap and tackle this challenge.
Good segmentation: Examples of good segmentation results are shown in Fig. 5. Overall, the segmentation results match the ground truth well. However, we can still notice that, compared with TL and FL, FLT has more segmentation flaws, which corresponds well to the Dice scores in Table 2. There is a tiny FL island at the top of the aorta which should be FLT, as shown in Fig. 5(a). Another three tiny FLT islands which should be FL exist at similar positions, as shown in Fig. 5(f), Fig. 5(h) and Fig. 5(h), respectively. The most serious flaw of FLT is the inaccurate segmentation of its boundaries. As shown in Fig. 5(b)(e)(f), there is noticeable error in the boundary segmentation. The situations in Fig. 5(g)(h) are much worse, and a large part of FLT is misclassified as FL. Most of the inaccurate boundary segmentation happens at the descending aorta. The low performance on FLT is usually caused by the low contrast, which also degrades the segmentation performance of FL. TL usually has a much better performance as its contrast is much higher, and there are only some tiny flaws, as shown in Fig. 5(c).
Poor segmentation: Examples of poor segmentation results are shown in Fig. 6. Overall, there are serious segmentation errors, especially for FLT. With the context of TL and FL, the shape of FLT in Fig. 6(a) can be easily recognized by humans. However, only part of the shape is correctly segmented because of the low contrast, as shown in the zoomed CTA image. A part of FLT is lost in Fig. 6(d)(e), which is also due to the low contrast. The quality gets worse in Fig. 6(b)(c), in which FLT is almost totally lost. The boundaries are difficult to tell visually in Fig. 6(b)(c). There is also some inaccurate segmentation between TL and FL, as shown in Fig. 6(d)(e). An incorrect connection exists between TL and FL in Fig. 6(d), and the low contrast in the CTA images leads to the inaccurate segmentation between FL and TL shown in Fig. 6(e).
In this paper, we introduce the ImageTBAD dataset to the community, which is the first 3D computed tomography angiography (CTA) image dataset of TBAD with annotation of true lumen (TL), false lumen (FL), and false lumen thrombus (FLT). We further propose a baseline method based on 3D U-net for automatic segmentation of TBAD. Results show that the baseline method achieves results comparable with existing works on aorta and TL segmentation. However, the segmentation accuracy of FLT is only 52%, which leaves large room for improvement and demonstrates the challenge of our dataset. FLT segmentation represents a wide class of segmentation problems where targets exist in a variety of positions with irregular shapes. We hope that the open-sourced code of our baseline method and the dataset can encourage the community to tackle this problem.
Conflict of Interest Statement
The authors declare no conflict of interest.
Zeyang Yao, Hailong Qiu contributed to data collection. Haiyun Yuan, Jian Zhuang, Jiawei Zhang, Qianjun Jia, Tianchen Wang, and Yiyu Shi contributed to analysis and writing. Meiping Huang, Yuhao Dong, and Xiaowei Xu contributed to project planning, development, discussion and writing.
This work was supported by the National Key Research and Development Program of China (No. 2018YFC1002600), the Science and Technology Planning Project of Guangdong Province, China (No. 2017B090904034, No. 2017B030314109, No. 2018B090944002, No. 2019B020230003), the Guangdong Peak Project (No. DFJH201802), and the National Natural Science Foundation of China (No. 62006050).
This work was approved by the Research Ethics Committee of Guangdong General Hospital, Guangdong Academy of Medical Science under Protocol No. 20140316. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.