Joint Left Atrial Segmentation and Scar Quantification Based on a DNN with Spatial Encoding and Shape Attention

by Lei Li, et al.
FUDAN University

We propose an end-to-end deep neural network (DNN) that simultaneously segments the left atrial (LA) cavity and quantifies LA scars. The framework incorporates the continuous spatial information of the target by introducing a spatially encoded (SE) loss based on the distance transform map. Compared to a conventional binary-label-based loss, the proposed SE loss can reduce the noisy patches in the resulting segmentation that are commonly seen with deep learning-based methods. To fully utilize the inherent spatial relationship between the LA and LA scars, we further propose a shape attention (SA) mechanism through an explicit surface projection to build an end-to-end trainable model. Specifically, the SA scheme is embedded into a two-task network to perform joint LA segmentation and scar quantification. Moreover, the proposed method can alleviate the severe class-imbalance problem when detecting small and discrete targets such as scars. We evaluated the proposed framework on 60 LGE MRI scans from the MICCAI2018 LA challenge. For LA segmentation, the proposed method reduced the mean Hausdorff distance from 36.4 mm to 20.0 mm compared to the basic 3D U-Net using the binary cross-entropy loss. For scar quantification, the method was compared with results or algorithms reported in the literature and demonstrated better performance.



1 Introduction

Atrial fibrillation (AF) is the most common cardiac arrhythmia and increases the risk of stroke, heart failure and death [2]. Radiofrequency ablation is a promising procedure for treating AF, and patient selection and outcome prediction for this therapy can be improved through left atrial (LA) scar localization and quantification. Atrial scars are located on the LA wall, so scar analysis normally requires LA or LA wall segmentation to exclude confounding enhanced tissues from other substructures of the heart. Late gadolinium enhanced magnetic resonance imaging (LGE MRI) has been an important tool for scar visualization and quantification. Manual delineation of LGE MRI can be subjective and labor-intensive. However, automating this segmentation remains challenging, mainly due to the varied LA shapes, the thin LA wall, poor image quality and enhanced noise from surrounding tissues.

Few studies in the literature have developed automatic LA segmentation and scar quantification algorithms. For LA segmentation, Xiong et al. proposed a dual fully convolutional neural network (CNN) [11]. In an LA segmentation challenge [14], Chen et al. presented a two-task network for atrial segmentation and post-/pre-ablation classification to incorporate prior information on the patient category [1]. Nuñez-Garcia et al. achieved LA segmentation by combining multi-atlas segmentation and shape modeling of the LA [9]. Recently, Yu et al. designed an uncertainty-aware semi-supervised framework for LA segmentation [12]. For scar quantification, most current works adopt threshold-based methods that rely on manual LA wall segmentation [6]. Some other conventional algorithms, such as the Gaussian mixture model (GMM) [4], also require an accurate initialization of the LA or LA wall. However, automatic LA wall segmentation is complex and challenging because the wall is inherently thin [5]. Recent studies show that the thickness can be ignored, as clinical studies mainly focus on the location and extent of scars [10, 7]. For example, Li et al. proposed a graph-cuts framework for scar quantification on the LA surface mesh, where the weights of the graph were learned via a multi-scale CNN [7]. However, they did not achieve end-to-end training, i.e., the multi-scale CNN and graph-cuts were separated into two sub-tasks.

Recently, deep learning (DL)-based methods have achieved promising performance for cardiac image segmentation. However, most DL-based segmentation methods are trained with a loss that only considers a label mask in a discrete space. Because spatial information is lacking, predictions tend to be blurry at the boundary, which leads to noisy segmentation with large outliers. To solve this problem, several strategies have been employed, such as graph-cuts/CRF regularization [7, 3] and deformation combined with shape priors [13].

Figure 1: The proposed MTL-SESA network for joint LA segmentation and scar quantification. Note that the skip connections between the encoder and two decoders are omitted here.

In this work, we present an end-to-end multi-task learning network for joint LA segmentation and scar quantification. The proposed method incorporates spatial information in the pipeline to eliminate outliers for LA segmentation, with additional benefits for scar quantification. This is achieved by introducing a spatially encoded loss based on the distance transform map, without any modifications of the network. To utilize the spatial relationship between LA and scars, we adopt the LA boundary as an attention mask on the scar map, namely surface projection, to achieve shape attention. Therefore, an end-to-end learning framework is created for simultaneous LA segmentation, scar projection and quantification via the multi-task learning (MTL) network embedding the spatial encoding (SE) and boundary shape attention (SA), namely MTL-SESA network.

2 Method

Fig. 1 provides an overview of the proposed framework. The proposed network is a modified U-Net consisting of two decoders for LA segmentation and scar quantification, respectively. In Section 2.1, an SE loss based on the distance transform map is introduced as a regularization term for LA segmentation. For scar segmentation, an SE loss based on the distance probability map is employed, followed by a spatial projection (see Section 2.2). Section 2.3 presents the specific SA scheme embedded in the MTL network for the predictions of LA and LA scars in an end-to-end style.

2.1 Spatially Encoded Constraint for LA Segmentation

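An SE loss based on the signed distance transform map (DTM) is employed as a regularization term to represent the spatial vicinity to the target label. Given a target label, the signed DTM for each pixel $x_i$ can be defined as:

$$\phi(x_i) = \begin{cases} -d^{\beta}, & x_i \in \Omega_{in} \\ 0, & x_i \in S \\ d^{\beta}, & x_i \in \Omega_{out} \end{cases} \qquad (1)$$

where $\Omega_{in}$ and $\Omega_{out}$ respectively indicate the region inside and outside the target label, $S$ denotes the surface boundary, $d$ represents the distance from pixel $x_i$ to the nearest point on $S$, and $\beta$ is a hyperparameter. The binary cross-entropy (BCE) loss and the additional SE loss for LA segmentation can be defined as:

$$\mathcal{L}^{BCE}_{LA} = -\sum_{i=1}^{N} \left[ y(x_i) \log \hat{y}(x_i;\theta) + \left(1 - y(x_i)\right) \log\left(1 - \hat{y}(x_i;\theta)\right) \right]$$

$$\mathcal{L}^{SE}_{LA} = \sum_{i=1}^{N} \left( \hat{y}(x_i;\theta) - 0.5 \right) \cdot \phi(x_i)$$

where $\hat{y}$ and $y$ ($y \in \{0, 1\}$) are the prediction of LA and its ground truth, respectively, and $\cdot$ denotes element-wise product.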
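The role of the signed DTM as a regularizer can be illustrated with a small sketch. It assumes the DTM is zero on the boundary, negative inside the label and positive outside, with the distance raised to the power of the hyperparameter, and that the SE loss sums (prediction - 0.5) times the DTM value over pixels. The function names `signed_dtm` and `se_loss` and the brute-force 2D implementation are illustrative choices, not the authors' code.

```python
import math


def signed_dtm(mask, beta=1.0):
    """Brute-force signed distance transform map (DTM) of a 2D binary mask.

    Assumed convention: 0 on the boundary, -d**beta inside, +d**beta outside,
    where d is the Euclidean distance to the nearest boundary pixel.
    """
    rows, cols = len(mask), len(mask[0])

    def is_boundary(r, c):
        # A boundary pixel is a foreground pixel with a background 4-neighbour.
        if not mask[r][c]:
            return False
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and not mask[rr][cc]:
                return True
        return False

    boundary = [(r, c) for r in range(rows) for c in range(cols) if is_boundary(r, c)]
    if not boundary:
        raise ValueError("mask has no boundary pixels")

    dtm = []
    for r in range(rows):
        row = []
        for c in range(cols):
            d = min(math.hypot(r - br, c - bc) for br, bc in boundary)
            if d == 0:
                row.append(0.0)
            else:
                sign = -1.0 if mask[r][c] else 1.0
                row.append(sign * d ** beta)
        dtm.append(row)
    return dtm


def se_loss(pred, dtm):
    """Sum over pixels of (prediction - 0.5) * signed DTM value."""
    return sum(
        (p - 0.5) * phi
        for pred_row, dtm_row in zip(pred, dtm)
        for p, phi in zip(pred_row, dtm_row)
    )
```

Under these assumptions, a false positive far from the target is multiplied by a large positive DTM value and is therefore penalized more than an error near the boundary, which is one way to read the reported reduction in outliers. For real 3D volumes, a library routine such as `scipy.ndimage.distance_transform_edt` would be a more practical way to obtain the distances.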

2.2 Spatially Encoded Constraint with an Explicit Projection for Scar Quantification

For scar quantification, we encode the spatial information by adopting the distance probability map of the normal wall and scar region as the ground truth instead of a binary scar label. This is because the scar region can be very small and discrete, so its detection presents significant challenges to current DL-based methods due to the class-imbalance problem. In contrast to traditional DL-based algorithms optimizing in a discrete space, the distance probability map considers the continuous spatial information of scars. Specifically, we separately obtain the DTM of the scar and normal wall from a manual scar label, and convert both into probability maps $p = [p_{normal}, p_{scar}]$. Here $p = e^{-d'}$ and $d'$ is the nearest distance to the boundary of normal wall or scar for pixel $x_i$. Then, the SE loss for scar quantification can be defined as:

$$\mathcal{L}^{SE}_{scar} = \sum_{i=1}^{N} \left\| \hat{p}(x_i;\theta) - p(x_i) \right\|_2^2$$

where $\hat{p}$ ($\hat{p} = [\hat{p}_{normal}, \hat{p}_{scar}]$) is the predicted distance probability map of both normal wall and scar region. Note that the situation of $\hat{p}_{normal} + \hat{p}_{scar} > 1$ sometimes exists. One can compare these two probabilities to extract scars instead of employing a fixed threshold.
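A minimal sketch of the distance probability maps and of extracting scars by comparing the two predicted probabilities follows. It assumes that distances are mapped to probabilities with exp(-d) and that the scar SE loss is a sum of squared differences over both channels; the dictionary layout with `"normal"` and `"scar"` keys and all function names are hypothetical.

```python
import math


def distance_to_probability(dist_map):
    """Map distances to (0, 1] with exp(-d) (assumed form): 1 at distance 0."""
    return [[math.exp(-d) for d in row] for row in dist_map]


def se_loss_scar(pred, target):
    """Sum of squared differences over the normal-wall and scar channels."""
    total = 0.0
    for channel in ("normal", "scar"):
        for pred_row, target_row in zip(pred[channel], target[channel]):
            for p, t in zip(pred_row, target_row):
                total += (p - t) ** 2
    return total


def extract_scar(pred):
    """Label a pixel as scar where the scar probability exceeds the normal one."""
    return [
        [1 if s > n else 0 for n, s in zip(normal_row, scar_row)]
        for normal_row, scar_row in zip(pred["normal"], pred["scar"])
    ]


# Two pixels: the first lies on normal wall, the second on scar.
target = {
    "normal": distance_to_probability([[0.0, 2.0]]),
    "scar": distance_to_probability([[2.0, 0.0]]),
}
# The two channels are regressed independently, so they need not sum to 1.
pred = {"normal": [[0.9, 0.7]], "scar": [[0.3, 0.8]]}
scar_labels = extract_scar(pred)
```

Comparing the two channels avoids choosing a fixed threshold, which matters here because the second pixel has a scar probability of 0.8 and a normal-wall probability of 0.7 at the same time.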

To ignore the wall thickness, which varies across positions and patients [5], the extracted scars are explicitly projected onto the LA surface. The volume-based scar segmentation is thereby converted into a surface-based scar quantification through the spatially explicit projection. However, pixel-based classification in the surface-based quantification task includes only very limited information, i.e., the intensity value of one pixel. Instead of extracting multi-scale patches along the LA surface as in [7], we employ the SE loss to learn the spatial features near the LA surface. As with the approach in [7], the SE loss can also improve the robustness of the framework against LA segmentation errors.

2.3 Multi-task Learning with an End-to-end Trainable Shape Attention

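To employ the spatial relationship between the LA and atrial scars, we design an MTL network including two decoders, i.e., one for LA segmentation and the other for scar segmentation. As Fig. 1 shows, Decoder$_{LA}$ is supervised by $\mathcal{L}^{BCE}_{LA}$ and $\mathcal{L}^{SE}_{LA}$, and Decoder$_{scar}$ is supervised by $\mathcal{L}^{SE}_{scar}$. To explicitly learn the relationship between the two tasks, we extract the LA boundary from the predicted LA as an attention mask for the training of Decoder$_{scar}$, namely the explicit projection mentioned in Section 2.2. An SA loss is introduced to enforce the attention of Decoder$_{scar}$ on the LA boundary:

$$\mathcal{L}^{SA}_{scar} = \sum_{i=1}^{N} \left( M \cdot \left( \nabla\hat{p}(x_i;\theta) - \nabla p(x_i) \right) \right)^2$$

where $\nabla\hat{p} = \hat{p}_{normal} - \hat{p}_{scar}$, $\nabla p = p_{normal} - p_{scar}$, and $M$ is the boundary attention mask, which can be generated from the gold standard segmentation of LA ($M_1$) as well as the predicted LA ($M_2$). Hence, the total loss of the framework is defined by combining all the losses mentioned above:

$$\mathcal{L} = \mathcal{L}^{BCE}_{LA} + \lambda_{LA}\mathcal{L}^{SE}_{LA} + \lambda_{scar}\mathcal{L}^{SE}_{scar} + \lambda_{M}\mathcal{L}^{SA}_{scar}$$

where $\lambda_{LA}$, $\lambda_{scar}$, and $\lambda_{M}$ are balancing parameters.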
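The boundary attention and the weighted combination of losses can be sketched as follows. This is a 2D illustration under stated assumptions: the boundary mask marks LA pixels with a background 4-neighbour, the SA loss is a masked squared difference between the predicted and reference (normal minus scar) probability maps, and the BCE term enters the total loss with unit weight. The names `boundary_mask`, `sa_loss`, `total_loss` and the weights `lam_la`, `lam_scar`, `lam_m` are hypothetical.

```python
def boundary_mask(la_mask):
    """Binary mask of LA pixels that touch the background (4-connectivity)."""
    rows, cols = len(la_mask), len(la_mask[0])

    def is_background(r, c):
        # Positions outside the grid are treated as background.
        return not (0 <= r < rows and 0 <= c < cols) or not la_mask[r][c]

    return [
        [
            1 if la_mask[r][c] and any(
                is_background(r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
            ) else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def sa_loss(pred_normal, pred_scar, gt_normal, gt_scar, mask):
    """Masked squared error between (normal - scar) maps, on boundary pixels only."""
    total = 0.0
    for r, mask_row in enumerate(mask):
        for c, m in enumerate(mask_row):
            diff_pred = pred_normal[r][c] - pred_scar[r][c]
            diff_gt = gt_normal[r][c] - gt_scar[r][c]
            total += (m * (diff_pred - diff_gt)) ** 2
    return total


def total_loss(bce_la, se_la, se_scar, sa_scar, lam_la, lam_scar, lam_m):
    """Weighted sum of the four loss terms; the weights are caller-supplied."""
    return bce_la + lam_la * se_la + lam_scar * se_scar + lam_m * sa_scar
```

Because the mask zeroes every pixel off the LA boundary, errors in the scar maps away from the wall do not contribute to this term, which is how the attention restricts the scar decoder to the surface.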

3 Experiments

3.1 Materials

3.1.1 Data Acquisition and Pre-processing.

The data are from the MICCAI2018 LA challenge [14]. The 100 LGE MRI training scans, with manual segmentation of the LA, consist of 60 post-ablation and 40 pre-ablation scans. In this work, we chose the 60 post-ablation scans for manual segmentation of the LA scars and employed them for experiments. The LGE MRIs were acquired with a resolution of mm and reconstructed to mm. All images were cropped to a unified size centered on the heart region and normalized using Z-score. We split the images into two sets, i.e., one with 40 images for training and the other with 20 for testing.
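The normalization and split described above can be sketched in a few lines. The text does not say how the 40/20 split was drawn, so the seeded random shuffle below is an assumption, as are the function names `zscore` and `split_cases`; the intensities are passed as a flat list for simplicity.

```python
import random
import statistics


def zscore(values):
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant image carries no contrast; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def split_cases(case_ids, n_train=40, seed=0):
    """Shuffle case identifiers reproducibly and split into training and test sets."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:]
```

Splitting by case identifier rather than by slice keeps all slices of one patient in the same set.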

3.1.2 Gold Standard and Evaluation.

The challenge provides manual LA segmentation for the training data, and the scars of the 60 post-ablation scans were manually delineated by a well-trained expert. These manual segmentations were considered as the gold standard. For LA segmentation evaluation, Dice volume overlap, average surface distance (ASD) and Hausdorff distance (HD) were applied. For scar quantification evaluation, the manual and (semi-) automatic segmentation results were first projected onto the manually segmented LA surface. Then, the Accuracy measurement of the two areas in the projected surface, the Dice of scars and the generalized Dice score were used as indicators of the accuracy of scar quantification.
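The difference between an overlap metric and a distance metric can be made concrete with a small sketch. It uses the standard definitions of Dice, Hausdorff distance and a symmetric average surface distance on sets of pixel coordinates, in pixel units rather than millimetres and without surface extraction, so it is a simplification of the evaluation described above; the function names are illustrative.

```python
import math


def dice(a, b):
    """Dice overlap between two sets of pixel coordinates."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def _nearest(p, points):
    # Distance from point p to the closest point in the other set.
    return min(math.dist(p, q) for q in points)


def hausdorff_distance(a, b):
    """Largest nearest-neighbour distance, taken over both directions."""
    return max(max(_nearest(p, b) for p in a), max(_nearest(q, a) for q in b))


def average_surface_distance(a, b):
    """Mean nearest-neighbour distance, pooled over both directions."""
    dists = [_nearest(p, b) for p in a] + [_nearest(q, a) for q in b]
    return sum(dists) / len(dists)


reference = {(0, 0), (0, 1)}
prediction = {(0, 1), (0, 2)}
# The same prediction with one isolated false positive far from the target.
with_outlier = prediction | {(0, 10)}
```

In this toy case the single far-away pixel lowers Dice only from 0.5 to 0.4 while the Hausdorff distance grows from 1 to 9, which is why HD is the more sensitive indicator of noisy patches.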

3.1.3 Implementation.

The framework was implemented in PyTorch, running on a computer with a 1.90 GHz Intel(R) Xeon(R) E5-2620 CPU and an NVIDIA TITAN X GPU. We used the SGD optimizer to update the network parameters (weight decay=0.0001, momentum=0.9). The initial learning rate was set to 0.001 and divided by 10 every 4000 iterations. The balancing parameters in Section 2.3 were set as follows, , , and , where and was multiplied by 1.1 every 200 iterations. Network inference required about 8 seconds to process one test image.
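The two step schedules described above, the learning rate divided by 10 every 4000 iterations and a balancing weight multiplied by 1.1 every 200 iterations, can be written as plain functions of the iteration index. The function names are illustrative, and `initial_weight` is a placeholder argument because the starting values of the balancing parameters are not recoverable from this text.

```python
def learning_rate(iteration, base_lr=0.001, step=4000, factor=0.1):
    """Step decay: multiply the learning rate by `factor` every `step` iterations."""
    return base_lr * factor ** (iteration // step)


def scheduled_weight(iteration, initial_weight, step=200, growth=1.1):
    """Step growth: multiply a loss weight by `growth` every `step` iterations."""
    return initial_weight * growth ** (iteration // step)
```

In PyTorch, the learning-rate part corresponds to `torch.optim.lr_scheduler.StepLR` with `step_size=4000` and `gamma=0.1`, provided the scheduler is stepped once per iteration rather than once per epoch.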

3.2 Result

Figure 2: Quantitative and qualitative evaluation results of the proposed SE loss for LA segmentation: (a) Dice and HD of the LA segmentation results after combining the SE loss, i.e., U-Net-SE with different values of the DTM hyperparameter; (b) 3D visualization of the LA segmentation results of three typical cases by U-Net-BCE and U-Net-SE.
Method  Dice  ASD (mm)  HD (mm)

Table 1: Summary of the quantitative evaluation results of LA segmentation. Here, U-Net uses the original U-Net architecture for LA segmentation; MTL means that the methods are based on the architecture in Fig. 1 with two decoders; BCE, SE, SA and SESA refer to the different loss functions. The proposed method is denoted as MTL-SESA.

Method  Accuracy   Dice   Dice
LA+Otsu [10]
LA+LearnGC [7]

Table 2: Summary of the quantitative evaluation results of scar quantification. Here, $LA_M$ denotes that scar quantification is based on the manually segmented LA, while $LA_{U\text{-}Net}$ indicates that it is based on the U-Net-BCE segmentation; U-Net is the scar segmentation directly based on the U-Net architecture with different loss functions; the inter-observer variation (Inter-Ob) is calculated from twelve randomly selected subjects.

Figure 3: 3D visualization of the LA scar localization by the eleven methods. The scarring areas are labeled in orange on the LA surface, which is constructed from LA labeled in blue.

3.2.1 Parameter Study.

To explore the effectiveness of the SE loss, we compared the results of the proposed scheme for LA segmentation using different values of the DTM hyperparameter in Eq. (1). Fig. 2 (a) provides the results in terms of Dice and HD, and Fig. 2 (b) visualizes three examples illustrating the difference between the results with and without the SE loss. With the SE loss, U-Net-SE evidently reduced clutter and disconnected parts in the segmentation compared to U-Net-BCE and significantly improved the HD of the resulting segmentation, though the Dice score may not differ much. U-Net-SE also showed stable performance across different values of the hyperparameter, except for extreme values. In the following experiments, the hyperparameter was set to 1.

3.2.2 Ablation Study.

Table 1 and Table 2 present the quantitative results of different methods for LA segmentation and scar quantification, respectively. For LA segmentation, combining the proposed SE loss with the BCE loss performed better than using the BCE loss alone. For scar quantification, the SE loss also showed promising performance compared to the conventional losses in terms of the Dice of scars. LA segmentation and scar quantification both benefited from the proposed MTL scheme compared to performing the two tasks separately. The results were further improved after introducing the newly designed SE and SA losses in terms of the Dice of scars, but with a slightly worse Accuracy and generalized Dice. Fig. 3 visualizes an example illustrating the scar segmentation and quantification results of the methods listed in Table 2. Compared to U-Net-BCE and U-Net-Dice, MTL-BCE improved the performance, thanks to the MTL network architecture. When the proposed SE and SA losses were included, some small and discrete scars were also detected, and end-to-end scar quantification and projection was achieved.

3.2.3 Comparisons with Literature.

Table 2 and Fig. 3 also present the scar quantification results from some state-of-the-art algorithms, i.e., Otsu [10], multi-component GMM (MGMM) [8], LearnGC [7] and U-Net with different loss functions. The three (semi-) automatic methods generally obtained acceptable results, but relied on an accurate initialization of LA. LearnGC had a similar result compared to MGMM in the Dice of scars based on LA, but its Accuracy and generalized Dice were higher. The proposed method performed much better than all the automatic methods in terms of the Dice of scars, with statistical significance. In Fig. 3, one can see that Otsu and U-Net tended to under-segment the scars. Though including the Dice loss could alleviate the class-imbalance problem, the SE loss was evidently more effective, which is consistent with the quantitative results in Table 2. MGMM and LearnGC both detected most of the scars, but LearnGC has a potential advantage in detecting small scars. The proposed method could also detect small scars and obtained a smoother segmentation result.

4 Conclusion

In this work, we have proposed an end-to-end learning framework for simultaneous LA segmentation and scar quantification by combining the SE and SA losses. The proposed algorithm has been applied to 60 image volumes acquired from AF patients and obtained results comparable to the inter-observer variation. The results have demonstrated the effectiveness of the proposed SE and SA losses, and showed the superiority of the segmentation performance over conventional schemes. In particular, the proposed SE loss substantially reduced the outliers that frequently occur in the predictions of DL-based methods. Our technique can be easily extended to other segmentation tasks, especially for discrete and small targets such as lesions. A limitation of this work is that the gold standard was constructed from the manual delineation of only one expert. In addition, this study included only post-ablation AF patients. In future work, we will combine multiple experts to construct the gold standard, and consider both pre- and post-ablation data.

4.0.1 Acknowledgement.

This work was supported by the National Natural Science Foundation of China (61971142), and L. Li was partially supported by the CSC Scholarship.


  • [1] C. Chen, W. Bai, and D. Rueckert (2018) Multi-task learning for left atrial segmentation on GE-MRI. In International Workshop on Statistical Atlases and Computational Models of the Heart, pp. 292–301.
  • [2] S. S. Chugh, R. Havmoeller, K. Narayanan, D. Singh, M. Rienstra, E. J. Benjamin, R. F. Gillum, Y. Kim, J. H. McAnulty Jr, Z. Zheng, et al. (2014) Worldwide epidemiology of atrial fibrillation: a global burden of disease 2010 study. Circulation 129 (8), pp. 837–847.
  • [3] K. Kamnitsas, C. Ledig, V. F. Newcombe, J. P. Simpson, A. D. Kane, D. K. Menon, D. Rueckert, and B. Glocker (2017) Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical image analysis 36, pp. 61–78.
  • [4] R. Karim, A. Arujuna, R. J. Housden, J. Gill, H. Cliffe, K. Matharu, J. Gill, C. A. Rindaldi, M. O’Neill, D. Rueckert, et al. (2014) A method to standardize quantification of left atrial scar from delayed-enhancement MR images. IEEE journal of translational engineering in health and medicine 2, pp. 1–15.
  • [5] R. Karim, L. Blake, J. Inoue, Q. Tao, S. Jia, R. J. Housden, P. Bhagirath, J. Duval, M. Varela, J. M. Behar, et al. (2018) Algorithms for left atrial wall segmentation and thickness–evaluation on an open-source CT and MRI image database. Medical image analysis 50, pp. 36–53.
  • [6] R. Karim, R. J. Housden, M. Balasubramaniam, Z. Chen, D. Perry, A. Uddin, Y. Al-Beyatti, E. Palkhi, P. Acheampong, S. Obom, et al. (2013) Evaluation of current algorithms for segmentation of scar tissue from late gadolinium enhancement cardiovascular magnetic resonance of the left atrium: an open-access grand challenge. Journal of Cardiovascular Magnetic Resonance 15 (1), pp. 105.
  • [7] L. Li, F. Wu, G. Yang, L. Xu, T. Wong, R. Mohiaddin, D. Firmin, J. Keegan, and X. Zhuang (2020) Atrial scar quantification via multi-scale CNN in the graph-cuts framework. Medical Image Analysis 60, pp. 101595.
  • [8] J. Liu, X. Zhuang, L. Wu, D. An, J. Xu, T. Peters, and L. Gu (2017) Myocardium segmentation from DE MRI using multicomponent Gaussian mixture model and coupled level set. IEEE Transactions on Biomedical Engineering 64 (11), pp. 2650–2661.
  • [9] M. Nuñez-Garcia, X. Zhuang, G. Sanroma, L. Li, L. Xu, C. Butakoff, and O. Camara (2018) Left atrial segmentation combining multi-atlas whole heart labeling and shape-based atlas selection. In International Workshop on Statistical Atlases and Computational Models of the Heart, pp. 302–310.
  • [10] D. Ravanelli, E. C. dal Piaz, M. Centonze, G. Casagranda, M. Marini, M. Del Greco, R. Karim, K. Rhode, and A. Valentini (2013) A novel skeleton based quantification and 3-D volumetric visualization of left atrium fibrosis using late gadolinium enhancement magnetic resonance imaging. IEEE transactions on medical imaging 33 (2), pp. 566–576.
  • [11] Z. Xiong, V. V. Fedorov, X. Fu, E. Cheng, R. Macleod, and J. Zhao (2018) Fully automatic left atrium segmentation from late gadolinium enhanced magnetic resonance imaging using a dual fully convolutional neural network. IEEE transactions on medical imaging 38 (2), pp. 515–524.
  • [12] L. Yu, S. Wang, X. Li, C. Fu, and P. Heng (2019) Uncertainty-aware self-ensembling model for semi-supervised 3D left atrium segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 605–613.
  • [13] Q. Zeng, D. Karimi, E. H. Pang, S. Mohammed, C. Schneider, M. Honarvar, and S. E. Salcudean (2019) Liver segmentation in magnetic resonance imaging via mean shape fitting with fully convolutional neural networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 246–254.
  • [14] J. Zhao and Z. Xiong (2018) 2018 atrial segmentation challenge.