Log In Sign Up

An Elastic Interaction-Based Loss Function for Medical Image Segmentation

Deep learning techniques have shown their success in medical image segmentation since they are easy to manipulate and robust to various types of datasets. The commonly used loss functions in the deep segmentation task are pixel-wise loss functions. This results in a bottleneck for these models to achieve high precision for complicated structures in biomedical images. For example, the predicted small blood vessels in retinal images are often disconnected or even missed under the supervision of the pixel-wise losses. This paper addresses this problem by introducing a long-range elastic interaction-based training strategy. In this strategy, convolutional neural network (CNN) learns the target region under the guidance of the elastic interaction energy between the boundary of the predicted region and that of the actual object. Under the supervision of the proposed loss, the boundary of the predicted region is attracted strongly by the object boundary and tends to stay connected. Experimental results show that our method is able to achieve considerable improvements compared to commonly used pixel-wise loss functions (cross entropy and dice Loss) and other recent loss functions on three retinal vessel segmentation datasets, DRIVE, STARE and CHASEDB1.


High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

Deep learning methods have contributed substantially to the rapid advanc...

Reducing the Hausdorff Distance in Medical Image Segmentation with Convolutional Neural Networks

The Hausdorff Distance (HD) is widely used in evaluating medical image s...

Region Growing with Convolutional Neural Networks for Biomedical Image Segmentation

In this paper we present a methodology that uses convolutional neural ne...

Region-wise Loss for Biomedical Image Segmentation

We propose Region-wise (RW) loss for biomedical image segmentation. Regi...

Beyond the Pixel-Wise Loss for Topology-Aware Delineation

Delineation of curvilinear structures is an important problem in Compute...

Boundary loss for highly unbalanced segmentation

Widely used loss functions for convolutional neural network (CNN) segmen...

Offset Curves Loss for Imbalanced Problem in Medical Segmentation

Medical image segmentation has played an important role in medical analy...

1 Introduction

The medical image segmentation task is to automatically extract the region of interest, such as organs and lesions in medical images. Segmentation plays a critical role in medical image analysis since the information of the segmented region (such as length, width, angles) can be used for further diagnosis and treatment of related diseases. How to locate the region of interest in an exact manner is a major challenge.

In recent years, convolutional neural networks (CNNs) have shown ground-breaking effects to medical image segmentation task and different approaches have been proposed [11, 18, 17, 12]

. The widely adopted choice of the loss function in CNN methods is pixel-wise loss functions such as cross entropy. Pixel-wise loss functions classify each pixel of an image individually by comparing the corresponding ground truth and learning each pixel in an image equally. As a result, this type of loss functions are not necessarily robust in detecting the details and small structures which are critical in medical image analysis. For instance, in the retinal vessel segmentation task, the long, thin and weak vessels can be disconnected easily or even missed under the supervision of a pixel-wise loss function.

Before the flourishing of CNNs, variational image segmentation approaches [1, 16, 23] based on intensity or textural information were widely used in dealing with the segmentation task. Among them, the active contour methods define an evolving curve under the force derived by some energy functional, and this evolving curve moves to minimize this energy. Different energy functionals have been proposed in the active contour methods, such as the balloon force model [8], region-based methods [13, 2]

, gradient vector flow method (GVF) 

[22] and elastic interaction model [21, 20]. Compared to the pixel-wise loss functions in CNN models, the curve evolution based on energy functionals in the active contour methods considers the topological or physical properties of the target objects rather than learning each pixel individually. However, these methods can only process one image at a time and cannot handle massive images in a collective way as in CNN methods.

Recently, methods have been proposed [3, 10, 9] to train a CNN by minimizing curve-evolving energy functionals. In [3] and [10], the image segmentation task was represented as a minimization problem of the Mumford-Shah functional. While in [9], an objective minimization function based on boundary integral was defined. Despite these efforts, developing robust medical image segmentation methods, especially for long, thin structures in medical images, is still a challenging task.

In this paper, we propose an elastic interaction-based loss function in a deep learning framework specifically for the long, thin structures that are commonly seen in medical images. Under the supervision of this loss function, the predicted region will be attracted strongly by the object boundary through a global elastic interaction. Moreover, the self-interaction of the boundary of the predicted region will tend to smooth its own shape and will keep the predicted boundary connected. Due to these properties, this new loss function shows great advantages in tackling the segmentation of thin, long structures, such as vessels, in medical images. Also, one of the difficulties for most of the energy functional-based loss functions during training is the unstable performance in early steps, because the random initial guess in CNN may cause instabilities. Due to the long range nature, the minimization of the elastic interaction energy functional is not sensitive to the initialization. As a result, the proposed new CNN method with this elastic interaction-based loss function has demonstrated stable performance in the early stage of the training, as shown by the results of experiments in Section 3.

The proposed elastic interaction-based loss function can be implemented efficiently using the Fast Fourier Transform (FFT), making it easy to incorporate in CNN. We examine our new loss function on three retinal image datasets, DRIVE, STARE and CHASEDB1, and the results indicate that our method is able to achieve a considerable improvement compared to commonly used pixel-wise loss functions (cross entropy and dice loss) and other recent loss functions.

2 Methodology

2.1 Review of the framework of Deep Neural Network

In this section, we will briefly review the framework of deep neural network (DNN) under supervised learning. In supervised learning, the objective function (i.e. the loss function ) that we want to minimize is a function between the output of DNN and ground truth. The output of DNN is generated by forward propagation of the input, i.e. the input of DNN is processed by each hidden layer and activation layer in this network in the forward direction. Then the parameters within each hidden layer are updated by gradient descent in back propagation. The entire process of DNN is as follows,


where is the loss function to be minimized. are input and output of network and ground truth, respectively, denotes the number of data and is the number of iterations, are the parameters to be trained, is nonlinear scalar function, means element-wise operation, and is step size in gradient descent. In this work, we will proposed a new objective function for this DNN process.

2.2 Elastic Interaction-Based Loss Function

Our elastic interaction-based loss function is inspired by the elastic interaction between dislocations, which are line defects in crystals [5]. These dislocations are connected lines and the elastic interaction between them is long-range. Inspired by this interaction energy, in deep learning segmentation task, we consider the boundary of the ground truth and that of the prediction as two curves with their own energies, during training process, the evolution of the boundary of predicted region is under the supervision of their elastic interaction energy.

Consider a parameterized curve in the plane, which represents the boundary of a region. The curve has an orientation that could be either clockwise or counterclockwise. The elastic energy of the curve is


where is the curve with another parameter , vector with being the unit tangent vector and being the line element of the curve , and same for of , is the vector between the points on and on , and the distance between these two points is . This is the elastic energy stored in three dimensional space of a dislocation line in the plane. (Note that this is a simplified version of the dislocation interaction energy [5]. Prefactor of the energy has also been omitted in this formulation.)

The notation in Eq. (2) can also be understood as a collection of curves. Especially, for a system of two curves and , the total elastic energy is


In this expression, the first two terms are the self-energies of the two curves, respectively, and the last term is the interaction energy between them. Note that this total energy of two curves and depend on their orientations.

For a system of two coincident curves with opposite orientations, i.e., , the total energy in Eq. (3) vanishes. This means a perfect segmentation result if curve is the boundary of the object (the ground truth) and is the evolving boundary of prediction [21].

Using an artificial time , we describe the evolution of a curve by minimizing the total energy in the steepest descent direction, which is , where is the velocity of the curve and is the force on it. (This is essentially the dynamics of dislocations with mobility in its unit.) The obtained velocity of the curve is


where is the vector between the point and a point on , and is the distance between the two points. and are the unit tangent vector and unit normal vector of at the point , respectively, and and are the unit tangent vector and unit normal vector of at the point . In the derivation of this equation, we have used , , , where is the unit vector in the direction. Note that in Eq. (4), could be a collection of curves.

Therefore, in the image space, derived from (2) by using for the ground truth whose boundary is , and for the moving curve, where and are regularized Dirac delta functions of curves and , the proposed loss function in CNN is,


where is the level set representation of the moving curve [14], and

is a hyperparameter. Here

is a smoothing Heaviside function which controls the width of the contour by regularization parameter ,


The parameter and the Heaviside function control the stregth of the elastic energy generated by the moving curve compared with that generated by the object boundary

. In CNN context, we applied HardTanh activation function whose range is

instead of using the above for convenience. The level set representation can be computed by , where is the softmax output of target class.

Similarly, we describe the velocity of curve (the boundary of predicted region) derived from  (4) as,


The velocity of curve is the gradient of loss with respect to the level representation of predicted curve , and it will be used in back propagation for training.

2.3 Efficient Computation for Loss function and the Backward Gradient

In order to compute the loss function  (5) and gradient  (7) efficiently in CNN, we reformulate them by Fast Fourier Transform which reduces the computational complexity from to .

Assume the Fourier transform of in equation (5) is , where are the frequencies in Fourier space. It can be calculated that out the fourier transform of is . Therefore, by Parseval’s identity, the loss function can be expressed in the Fourier space as


From Eq. (7), the Fourier transform of the gradient of with respect to the output is


According to this equation, we obtain the gradient for back propagation by inverse Fourier transform


2.4 Discussion on connectivity and fast convergence due to the strong long-range attractive interaction

The elastic interaction between two curves and is strongly attractive when the two curves have opposite orientations, as can be seen from the interaction energy given by the last term in Eq. (3). When the moving curve (the predicted boundary) is set to have an opposite orientation with that of the object boundary (ground truth), the moving curve will be attracted to the object boundary to minimize the total energy in Eq. (3), see Fig. 1(a); and when the two curves coincide, the energy minimum state is reached and the object is identified perfectly by the moving curve.

Figure 1: (a) Elastic interaction between the moving curve (the predicted boundary) and the ground truth (object boundary) . Arrows on the moving curve show schematically the directions of the interaction force acting on the moving curve . (b) Schematic illustration of recombination of a disconnected moving curve under elastic interaction in vessel segmentation. The black contour represents the true vessel, and the blue region represents the prediction. The boundaries of the disconnected parts of the prediction will be attracted to each other and recombine under the elastic interaction. Red arrows show the directions of the interaction force.

This interaction between two curves is long-range because the energy density is inversely proportional to the distance between two respective points on them, which decays very slowly as the distance approaches to infinity, see the last term in Eq. (3).

It can be shown that near a curve , the interaction force experienced by a point on another curve given in Eq. (4) has the asymptotic property [5, 21]


where is the distance from the point on curve to the curve . Therefore, the elastic interaction provides a strong attractive force to recombine a disconnected moving curve. See a schematic illustration in vessel segmentation in Fig. 1(b). (Here and are the two branches of the moving curve.)

Moreover, the force generated by the moving curve on itself is proportional to , where is the curvature of the curve, and is the half width of the smooth Heaviside function [5, 21]. This self-force has the effect to smooth the moving curve.

3 Experiment

3.1 Dataset

DRIVE: The Digital Retinal Images for Vessel Extraction (DRIVE) dataset [19] is provided by a diabetic retinopathy screening program in the Netherlands. This dataset includes 40 images ( pixels) which are divided into a training set and test set, both containing 20 images and the corresponding ground truth. In our experiment, we split 5 images from training set as validation set. We train the DNN for vessel segmentation on 15 training data and evaluate our algorithm with 20 test images.

STARE: STARE dataset  [6, 7] consists of 20 color retinal images with size pixels. 15 images were used for training and the remaining were used for validation. The hand labeled ground truth are provided for all images in this dataset.

CHASEDB1: This dataset [4] includes 30 color images with size pixels. We split 5 images for validation, 25 images for training. The ground truth is provided for each image.

3.2 Implementation Details

Data preprocessing: Since the retinal images are low contrast usually, we selected the green channels of images for training because the green channels have higher contrast than the red and blue channels. Then all values in images are normalized into .

Architecture and training: In our experiment, we employed U-Net [17]

as the end-to-end convolutional neural network architecture on our experiment. Based on this architecture, we trained our model 50 epochs with a batch size equal to 6 on training dataset. Adam optimizer was used to optimize this model with learning rate of 0.001. For our elastic interaction loss function in Eq. (

5), we set hyperparameter , and

in the Hardtanh (smoothing Heaviside) function. We implemented our model based on Pytorch

[15] and ran all the experiments by the machine equipped with an TESLA P100 with 16GBs of memory.

Evaluation metrics:

To evaluate the performance of segmentation, sensitivity, specificity, f1 score and AUC (the area under the receiver operating characteristics curve) were calculated.

3.3 Results

Dataset Method Sen Spec F1 score AUC
CE 0.9667 0.7721 0.8025 0.8694
DICE 0.9615 0.7287 0.7658 0.8451
DRIVE AC [3] 0.9647 0.7682 0.7921 0.8664
SL [9] 0.9660 0.7940 0.8043 0.8800
Proposed EL 0.9664 0.8067 0.8093 0.8866
CE 0.9564 0.7296 0.7214 0.8430
DICE 0.9403 0.6588 0.6212 0.7996
STARE AC [3] 0.9419 0.6461 0.6144 0.7940
SL [9] 0.9540 0.7434 0.7146 0.8487
Proposed EL 0.9576 0.7449 0.7304 0.8513
CE 0.9526 0.8408 0.8245 0.8967
DICE 0.9276 0.7243 0.7186 0.8259
CHASEDB1 AC [3] 0.9506 0.8215 0.8145 0.8861
SL [9] 0.9545 0.8207 0.8258 0.8876
Proposed EL 0.9526 0.8428 0.8248 0.8977
Table 1: Results of the proposed elastic loss function (EL) and other loss functions on DRIVE, STARE and CHASEB1. Sen, Spec, F1 score, AUC in the table are Sensitivity, Specificity, F1 score and the area under the receiver operating characteristics curve, respectively.

Quantitative evaluation: We evaluated our model with the elastic interaction-based loss function (EL) on three datasets comparing to models with cross entropy loss function (CE), the dice coefficient loss (DICE), active contour loss (AC) [3] and surface loss [9]. The quantitative results of our proposed loss on three datasets are shown in Table 1. The experimental results show that our proposed loss function has better performance than other loss functions by achieving highest Sensitivity, Specificity, F1 scores and AUC on DRIVE and STARE, highest AUC and Specificity on CHASEDB1.

We examine the stability and convergence of Deep neural network in training steps with our proposed elastic interaction-based loss (EL) compared with those of traditional loss functions (CE and Dice) and other energy functional-based loss functions (AC and SL). In Fig.2, we can observe that the performance of network under the guidance of EL in early training steps is more stable than that of other two energy functional-based loss functions, and it also shows higher F1 scores compared with those of other two loss functions. Similarly, in Fig.3, compared to performance of traditional loss functions, EL shows more stable training performance, faster convergence and better result.

Figure 2: Validation F1 scores durning training steps on DRIVE dataset with three energy functional-based loss functions. EL: red curve, AC: green curve and SL: blue curve.
Figure 3: Validation F1 scores durning training steps on DRIVE dataset with EL and other two traditional loss functions CE and Dice. EL: red curve, CE: green curve and Dice: blue curve.

Qualitative evaluation: Figure 4 shows the qualitative results of different loss functions. From Fig.4, we can see that the results under the proposed loss outperform other loss functions because our method can segment the much more details of vessel among all of the mentioned loss functions. Specifically, in the first row of Fig.4, the result under the supervision of cross entropy can not extract the thin and long vessel within red box and the prediction is disconnected, thus this segmentation lost a lot of details. While in the prediction from our proposed loss (EL), we can see that it segments the full vessels successfully and the predicted vessels are connected. Therefore, we conclude that the elastic interaction loss function has better performance than other three loss functions in extracting details of images (especially for thin and long structures). Same for the second row.

Figure 4: The example test results obtained by our proposed loss and other loss functions. From left to right, original images, ground truth, segmentation result by cross entropy (CE), our proposed loss (EL), surface loss (SL),active contour loss (AC).

Computational time: As we mentioned in Section 2.3, computational complexity of our proposed elastic loss (EL) is . In practice, running time of EL durning the training and test processes on our experimental device mentioned on Section 3.2 is comparable with those of traditional loss functions. For DRIVE dataset, it took s on training and took s to segment one test image. CE and DICE took s and s on training, respectively and they use s on segmenting each test image. While for STARE, the running time of EL (Training: s, Test per image: s) is also very close to that of CE (Training: s, Test per image: s) and DICE (Training: s, Test per image: s). Those loss functions showed similar behavior on CHASEDB1.

4 Conclusions and Discussion

In this paper, we propose a new elastic interaction-based loss function that can connect the whole segmented boundary strongly, thus it has great advantages in the segmentation of details of images in medical image datasets. Experimental results show that the proposed loss function indeed enhances the ability of CNN to segment medical images with complex structures and is able to achieve better performance comparing to the commonly used loss functions and other energy functional based loss functions. We would like to remark that the proposed loss function is developed to detect the details of long, thin structures such as vessels in medical images. When it applies to general images without these structures, the proposed method may not necessarily have obvious advantages.


This work was supported by The Hong Kong University of Science and Technology IEG19SC04.


  • [1] Chan, T.F., Shen, J., Vese, L.: Variational pde models in image processing. Notices of the AMS 50(1), 14–26 (2003)
  • [2] Chan, T.F., Vese, L.A.: Active contours without edges. IEEE Transactions on image processing 10(2), 266–277 (2001)
  • [3]

    Chen, X., Williams, B.M., Vallabhaneni, S.R., Czanner, G., Williams, R., Zheng, Y.: Learning active contour models for medical image segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 11632–11640 (2019)

  • [4] Fraz, M.M., Remagnino, P., Hoppe, A., Uyyanonvara, B., Rudnicka, A.R., Owen, C.G., Barman, S.A.: An ensemble classification-based approach applied to retinal blood vessel segmentation. IEEE Transactions on Biomedical Engineering 59(9), 2538–2548 (2012)
  • [5] Hirth, J.P.: Theory of dislocations. Krieger Pub. Co., Malabar, Fla., 2nd, reprint ed.. edn. (1992)
  • [6] Hoover, A., Kouznetsova, V., Goldbaum, M.: Locating blood vessels in retinal images by piecewise threshold probing of a matched filter response. IEEE Transactions on Medical imaging 19(3), 203–210 (2000)
  • [7] Hoover, A., Goldbaum, M.: Locating the optic nerve in a retinal image using the fuzzy convergence of the blood vessels. IEEE transactions on medical imaging 22(8), 951–958 (2003)
  • [8] Kass, M., Witkin, A., Terzopoulos, D.: Snakes: Active contour models. International journal of computer vision 1(4), 321–331 (1988)
  • [9] Kervadec, H., Bouchtiba, J., Desrosiers, C., Granger, E., Dolz, J., Ayed, I.B.: Boundary loss for highly unbalanced segmentation. arXiv preprint arXiv:1812.07032 (2018)
  • [10] Kim, B., Ye, J.C.: Mumford–shah loss functional for image segmentation with deep learning. IEEE Transactions on Image Processing 29, 1856–1866 (2019)
  • [11] Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., Van Der Laak, J.A., Van Ginneken, B., Sánchez, C.I.: A survey on deep learning in medical image analysis. Medical image analysis 42, 60–88 (2017)
  • [12] Milletari, F., Navab, N., Ahmadi, S.A.: V-net: Fully convolutional neural networks for volumetric medical image segmentation. In: 2016 Fourth International Conference on 3D Vision (3DV). pp. 565–571. IEEE (2016)
  • [13] Mumford, D., Shah, J.: Optimal approximations by piecewise smooth functions and associated variational problems. Communications on pure and applied mathematics 42(5), 577–685 (1989)
  • [14] Osher, S., Sethian, J.A.: Fronts propagating with curvature-dependent speed: algorithms based on hamilton-jacobi formulations. Journal of computational physics 79(1), 12–49 (1988)
  • [15] Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A.: Automatic differentiation in pytorch (2017)
  • [16] Pham, D.L., Xu, C., Prince, J.L.: Current methods in medical image segmentation. Annual review of biomedical engineering 2(1), 315–337 (2000)
  • [17] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. pp. 234–241. Springer (2015)
  • [18] Shen, D., Wu, G., Suk, H.I.: Deep learning in medical image analysis. Annual review of biomedical engineering 19, 221–248 (2017)
  • [19] Staal, J., Abràmoff, M.D., Niemeijer, M., Viergever, M.A., Ginneken, B.V.: Ridge-based vessel segmentation in color images of the retina. IEEE Transactions on Medical Imaging 23(4), 501–509 (2004)
  • [20] Xiang, Y., Chung, A.C., Ye, J.: A new active contour method based on elastic interaction. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05). vol. 1, pp. 452–457. IEEE (2005)
  • [21] Xiang, Y., Chung, A.C., Ye, J.: An active contour model for image segmentation based on elastic interaction. Journal of computational physics 219(1), 455–476 (2006)
  • [22] Xu, C., Prince, J.L.: Snakes, shapes, and gradient vector flow. IEEE Transactions on image processing 7(3), 359–369 (1998)
  • [23] Zhao, F., Xie, X.: An overview of interactive medical image segmentation. Annals of the BMVA 2013(7), 1–22 (2013)