Distance Map Loss Penalty Term for Semantic Segmentation

by Francesco Caliva, et al.
UC San Francisco

Convolutional neural networks for semantic segmentation suffer from low performance at object boundaries. In medical imaging, accurate representation of tissue surfaces and volumes is important for tracking of disease biomarkers such as tissue morphology and shape features. In this work, we propose a novel distance map derived loss penalty term for semantic segmentation. We propose to use distance maps, derived from ground truth masks, to create a penalty term, guiding the network's focus towards hard-to-segment boundary regions. We investigate the effects of this penalizing factor against cross-entropy, Dice, and focal loss, among others, evaluating performance on a 3D MRI bone segmentation task from the publicly available Osteoarthritis Initiative dataset. We observe a significant improvement in the quality of segmentation, with better shape preservation at bone boundaries and areas affected by partial volume. We ultimately aim to use our loss penalty term to improve the extraction of shape biomarkers and derive metrics to quantitatively evaluate the preservation of shape.




1 Introduction

The segmentation of medical images enables the quantitative analysis of anatomical structures. In both 2D and 3D medical imaging data, state-of-the-art segmentation performance has been achieved using convolutional neural networks such as U-Net [ronneberger2015u], V-Net [milletari2016v], and variants thereof. In this work, the original V-Net architecture was chosen as the end-to-end encoder-decoder architecture because of its capability of learning a residual function within each down- and up-sampling stage. This alleviates the problems of overfitting and vanishing gradients, with the added benefit of faster convergence [he2016deep]. The loss function proposed in [milletari2016v] is the baseline of our experiments. V-Net aims to minimize the soft Dice loss, derived from the Dice coefficient


D = \frac{2 \sum_{i}^{N} p_i g_i}{\sum_{i}^{N} p_i^2 + \sum_{i}^{N} g_i^2},

where the sum runs over all the N voxels p_i and g_i of the generated segmentation volume and of the relative ground truth mask, respectively.
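As a minimal sketch (NumPy rather than the paper's TensorFlow setup, with hypothetical function names), the Dice coefficient above and the corresponding soft Dice loss can be written as:

```python
import numpy as np

def soft_dice(pred, target, eps=1e-7):
    """Soft Dice coefficient between a predicted probability volume
    and a binary ground-truth mask, summed over all voxels."""
    pred = pred.ravel().astype(np.float64)
    target = target.ravel().astype(np.float64)
    intersection = np.sum(pred * target)
    denom = np.sum(pred ** 2) + np.sum(target ** 2)
    return (2.0 * intersection) / (denom + eps)

def soft_dice_loss(pred, target):
    # Minimizing 1 - D drives the prediction toward the mask.
    return 1.0 - soft_dice(pred, target)
```

The squared terms in the denominator follow the V-Net formulation; other works use plain sums instead.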

We conducted an initial experiment using a V-Net architecture to segment knee bones in 3D MRIs. In agreement with [milletari2016v], we observed superior segmentation performance when using the Dice loss compared to the weighted log-likelihood loss. Irrespective of the choice of loss function, most errors were located in the proximity of bone boundaries. This work proposes a simple strategy to penalize segmentation errors at object boundaries by utilizing distance maps generated from the segmentation ground truth. The approach is similar to [kervadec2018boundary]. Nevertheless, we train with the distance-based loss penalty from the beginning, while [kervadec2018boundary] proposes a fine-tuning-like strategy. Furthermore, we extend the approach to a 3D and multi-class context, and we are driven by different motivations: in [kervadec2018boundary], the goal is to deal with highly imbalanced datasets, whereas our focus is accurate segmentation of object boundaries. We also conduct a more thorough comparison with other state-of-the-art attention-based losses. Finally, application of our method to highly imbalanced datasets is straightforward.

2 Methods and Experiments

The Osteoarthritis Initiative (OAI) dataset comprises knee MR scans from 4,796 unique patients scanned at 10 different time points; the MR acquisition is described in [norman2018use]. Forty unique patients were manually segmented, obtaining ground truth masks for the distal femur, proximal tibia, and patella. These were used to evaluate our proposed method with a 25/5/10 train/validation/test split.


Figure 1: (a) Ground truth segmentation and (b) distance map, bone boundaries in white

Error-penalizing distance maps (Figure 1) were generated by computing the distance transform on the segmentation masks and then inverting it, voxel-wise subtracting each distance value from the mask's overall maximum distance. This procedure yields a distance map in which voxels in the proximity of the bones are weighted more than those located far away. An identical procedure was conducted on the negated segmentation mask to calculate a distance map inside the bones. To account for differences in bone size, the femur being several times larger than the tibia and the patella, inner distance maps for each bone were computed independently and subsequently combined. The generated maps were utilized to penalize prediction errors during training. In practice, the aim is to minimize the "penalized" multi-class cross-entropy loss below,


\mathcal{L} = - \sum_{i} \sum_{j} \left( 1 + \Phi_{i} \right) \odot y_{ij} \log \hat{y}_{ij},

where the two sums run over the i samples and the j classes, \Phi is the error-penalizing distance map, and \odot is the Hadamard product. Adding 1 to \Phi has the effect of mitigating the vanishing-gradient issue. We benchmarked the proposed penalizing term against commonly used loss functions, including the soft Dice loss, the focal loss [lin2017focal], and the confident-prediction-penalizing loss proposed in [pereyra2017regularizing]. A V-Net architecture was trained using mini-batch gradient descent with the Adam optimizer [kingma2014adam] (learning rate ) and random in-plane rotations as augmentations. MATLAB [matlab1760mathworks] and TensorFlow 1.12 [abadi2016tensorflow] were run on an Intel Xeon Gold 6130 CPU @ 2.10GHz with four GPUs and 376GB of RAM.
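The distance-map construction and the penalized cross-entropy described above can be sketched as follows. This is an illustrative NumPy/SciPy version, simplified to a single binary mask and flattened class probabilities; function names are ours, not the paper's, and the exact map combination per bone is omitted:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def penalty_map(mask):
    """Error-penalizing distance map for one binary mask: voxels near
    the object boundary receive the largest weights, distant voxels
    the smallest."""
    mask = mask.astype(bool)
    # Distance to the boundary, outside the object plus inside it.
    dist = distance_transform_edt(~mask) + distance_transform_edt(mask)
    # Invert by subtracting from the overall maximum distance.
    return dist.max() - dist

def penalized_cross_entropy(y_true, y_pred, phi, eps=1e-7):
    """Multi-class cross entropy weighted voxel-wise by (1 + phi).
    y_true, y_pred: (N, C) one-hot labels and predicted probabilities;
    phi: (N,) per-voxel penalty taken from the distance maps."""
    ce = -np.sum(y_true * np.log(y_pred + eps), axis=1)  # per-voxel CE
    return float(np.mean((1.0 + phi) * ce))
```

With phi set to zero everywhere, the loss reduces to the plain mean cross entropy, which is what the "+1" term guarantees: errors far from boundaries still contribute gradient.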


Figure 2: Posterior view of the distal femur and proximal tibia for a single test patient, showing the predicted segmentation's absolute distance from the ground truth; 0 indicates a perfect segmentation.

3 Results and Conclusions

Predicted segmentation masks were post-processed by applying 3D morphological closing and extracting the three largest connected components. To demonstrate the utility of our loss penalty term, we compare it to other successful methods in Figure 2 and Figure 3, using error maps and the following metrics: global Dice score (G-DSC), boundary Dice score (B-DSC), and its relaxed version, which expands boundaries by a given tolerance. Our method produces high-quality segmentations, with accurate results even in regions with significant partial voluming (intercondylar notch, tibial condyles). The B-DSC of our proposed loss shows a significant improvement in boundary delineation over the Dice loss, the loss of [pereyra2017regularizing], and the focal loss, and this superior performance is maintained globally (G-DSC). We observed that guiding the network with a shape-aware loss function is a promising way to improve segmentation performance.
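The boundary Dice score and its relaxed variant can be computed, for instance, by extracting a one-voxel boundary shell from each mask and dilating it by the tolerance. The sketch below is a 2D illustration with names of our choosing; the paper evaluates 3D volumes:

```python
import numpy as np
from scipy.ndimage import binary_erosion, binary_dilation

def boundary_dsc(pred, gt, tolerance=0):
    """Dice score restricted to boundary voxels; tolerance > 0 dilates
    the boundary bands, giving the relaxed B-DSC."""
    def band(mask):
        mask = mask.astype(bool)
        b = mask & ~binary_erosion(mask)   # one-voxel boundary shell
        for _ in range(tolerance):
            b = binary_dilation(b)         # relax the band by one voxel
        return b
    bp, bg = band(pred), band(gt)
    denom = bp.sum() + bg.sum()
    return 2.0 * np.sum(bp & bg) / denom if denom else 1.0
```

Increasing the tolerance forgives boundary placements that are off by a few voxels, which is why the relaxed B-DSC curves in the appendix rise with the tolerance.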

This work was supported by the NIH/NIAMS grant R00AR070902.


Appendix A Additional Results


Figure 3: Performance comparison of the proposed distance-map loss penalty term against the Dice loss, the confident-prediction-penalizing loss, and the focal loss. (a) Global Dice score coefficient (G-DSC), (b) boundary Dice score coefficient (B-DSC), and (c-f) relaxed B-DSC with a tolerance of 1 to 4 voxels are reported.