Global and Local Information Based Deep Network for Skin Lesion Segmentation

03/16/2017, by Jin Qi, et al.

With a large influx of dermoscopy images and a growing shortage of dermatologists, automatic dermoscopic image analysis plays an essential role in skin cancer diagnosis. In this paper, a new deep fully convolutional neural network (FCNN) is proposed to automatically segment melanoma from skin images by end-to-end learning, with only pixels and labels as inputs. Our proposed FCNN is capable of using both local and global information to segment melanoma by adopting skip layers. The public benchmark database of the melanoma detection challenge at the International Symposium on Biomedical Imaging (ISBI) 2017, consisting of 2000 training images, 150 validation images, and 600 test images, is used to test the performance of our algorithm. All large images (for example, 4000×6000 pixels) are reduced to much smaller images of 384×384 pixels (more than 10 times smaller). We submitted preliminary results to the challenge without any pre- or post-processing. The performance of our proposed method could be further improved by data augmentation and by avoiding image size reduction.

1 Introduction

Over five million cases of skin cancer are newly diagnosed in the United States annually [1]. Early detection of melanoma, one of the most lethal forms of skin cancer, is critical for finding curable melanomas and increasing the survival rate [2, 3]. With a large influx of dermoscopy images, the arrival of inexpensive consumer dermatoscope attachments for smartphones [4], and a growing shortage of dermatologists [5], automatic dermoscopic image analysis plays an essential role in timely skin cancer diagnosis. In particular, lesion segmentation is one of the most important steps in dermoscopy image analysis.

However, automatic lesion segmentation from dermoscopy images is a challenging task. The relatively low contrast and obscure boundaries between early stage skin lesions and normal skin make it very difficult to distinguish lesion areas from normal skin regions. The situation deteriorates rapidly when skin lesions are blurred or occluded by artifacts such as hairs, veins, ruler marks, color calibration charts, and air bubbles. Some example images from the "ISBI 2017 Skin Lesion Analysis Towards Melanoma Detection Challenge" (ISBI2017 SLATMDC) [6] are shown in Fig. 1.

Figure 1: Difficult example images with low contrast, large intraclass variance, and artifacts. Ground truth contours are marked in blue. First row: low contrast between skin lesion and normal skin regions; second row: large intraclass variance; last row: artifacts in images. All images are from the ISBI2017 SLATMDC [6].

Much work has been done to address this difficult problem. Previously, some researchers proposed to segment lesions based on hand-crafted features [7, 8, 9, 10, 11]. Recently, convolutional neural networks (CNNs) have been adopted to improve medical image segmentation accuracy [12, 13]. Deep CNN based skin cancer recognition has been shown to achieve dermatologist-level classification accuracy [14]. Fully convolutional neural networks (FCNNs) have shown promising segmentation results on natural images (PASCAL VOC) by fusing multi-scale prediction maps from multiple layers of the network [15]. The FCNN is trained by simple end-to-end learning, and multi-scale contextual information is explored automatically within the framework.

Inspired by [15], we propose a new FCNN framework for skin lesion segmentation in this paper. We take the VGG 16-layer net (publicly available from the Caffe model zoo) pretrained on ImageNet, discard its last classification layer, and replace all fully connected layers with convolution layers with randomly initialized weights. The weights in the remaining layers are kept to tackle the small medical image dataset problem through transfer learning. We add skip layers to fuse multi-scale prediction maps from multiple layers. Different from [15], which fuses prediction maps by pixel-wise summation, we treat the multi-scale prediction maps as multi-scale feature maps and fuse them by concatenation. This concatenating fusion strategy keeps the fused features in a higher-dimensional space, so we can capture more distinguishing local and global features. We then add a randomly initialized convolution layer as the last layer to compute the final prediction from the fused features.

In [16], a very deep fully convolutional residual network (FCRN) was proposed for skin lesion segmentation. The authors of [16] focused on residual networks and very deep architectures (more than 50 layers). Our proposed FCNN has a different structure and is relatively shallow (fewer than 20 layers), yet our experimental results are still comparable with those of [16]. Our major contributions in this paper are summarized as follows:

  • We design a novel fully convolutional neural network for skin lesion segmentation in dermoscopic images with end-to-end learning, transfer learning, and pixel-by-pixel prediction.

  • We treat multi-scale prediction maps as feature maps and propose a new fusing layer with a concatenation operation to combine this multi-scale contextual information into more distinguishing features. We also add a convolution layer as the last layer of our FCNN that takes the fused features as inputs to produce the final prediction.

  • We evaluate our proposed FCNN on the public dermoscopy image dataset from the "ISBI 2017 Skin Lesion Analysis Towards Melanoma Detection Challenge" [6]. We submitted preliminary results to the challenge without any pre- or post-processing.

The remainder of this paper is organized as follows. We detail our method in Section 2. Experimental results are reported in Section 3. Section 4 is our discussion and conclusion.

2 Proposed Fully Convolutional Network

In this section, we detail our proposed fully convolutional network for skin lesion segmentation. Due to the relatively small number of medical images and labels available, we start from a network pretrained on an image classification task and design our own dense prediction network on top of it, exploiting transfer learning. Skip layers are added to incorporate multi-scale information into our network. The architecture of our proposed FCNN is illustrated in Fig. 2.

Figure 2: Architecture of our proposed FCNN. Blue vertical bar with solid contour: convolution layer; red vertical bar with dashed contour: pooling layer; blue small square: 1×1 convolution layer; blue horizontal bar: deconvolution layer for upsampling; blue rectangular cube: fusing layer with concatenation operation. Our FCNN learns to use multi-scale information.

2.1 Basic Network Architecture

We take the pretrained VGG 16-layer net (publicly available from the Caffe model zoo), which consists of 13 convolution layers, 3 fully connected layers, 15 ReLU layers, 5 pooling layers, 2 dropout layers, and 1 softmax layer. This VGG net has been trained for natural image classification on ImageNet. For our dense segmentation purpose, we discard its last two classification layers (the last fully connected layer and the softmax layer) and replace the remaining 2 fully connected layers with randomly initialized convolution layers. The learned weights in the remaining layers are kept to tackle the small medical image dataset problem through transfer learning. The modified network is thus a fully convolutional network able to take images of arbitrary size as inputs.
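
As an illustration only, the following minimal sketch shows this backbone construction in PyTorch; our actual implementation uses MatConvNet in MATLAB (see Section 3.2), so all names and layer shapes here are our own assumptions, following the standard VGG-to-FCN conversion of [15].

```python
# Illustrative PyTorch sketch (not our MatConvNet code) of turning the
# pretrained VGG 16-layer net into a fully convolutional backbone.
import torch.nn as nn
from torchvision import models

vgg = models.vgg16(pretrained=True)

# Keep the 13 convolution layers with their ReLU and pooling layers; their
# pretrained ImageNet weights are retained for transfer learning.
features = vgg.features

# Replace the first two fully connected layers with convolution layers with
# randomly initialized weights; the last classification layer and the softmax
# layer are discarded entirely.
fc_as_conv = nn.Sequential(
    nn.Conv2d(512, 4096, kernel_size=7),   # stands in for fc6
    nn.ReLU(inplace=True),
    nn.Dropout(),
    nn.Conv2d(4096, 4096, kernel_size=1),  # stands in for fc7
    nn.ReLU(inplace=True),
    nn.Dropout(),
)

# The result is fully convolutional and accepts images of arbitrary size.
backbone = nn.Sequential(features, fc_as_conv)
```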

2.2 Adding Skip Layers

We add skip layers to fuse multi-scale prediction maps from multiple layers. Different from [15], which fuses prediction maps by pixel-wise summation, we treat the multi-scale prediction maps as multi-scale feature maps and fuse them by concatenation. Specifically, after each pooling layer we add a 1×1 convolution layer to compute a dense prediction map with 2 channels (the number of classes for each pixel in this paper). Each 1×1 convolution layer can be considered a classifier that classifies every pixel independently [15]. A 1×1 convolution layer is also attached to the last convolution layer of the basic network from the above subsection. Thus, 6 convolution layers in total are added to the basic network, and their outputs are 6 dense prediction maps (2 channels each).

The above 6 dense prediction maps have different resolutions. We add multiple deconvolution layers [15] to upsample the 6 dense maps to the original size of the input image. The weights of the deconvolution layers are randomly initialized, and all of them are learned automatically [15]. We then add a concatenation layer to concatenate the 6 upsampled prediction maps (each with the same size as the input image) along the channel (third) dimension to form a final prediction feature map with 12 channels.

Our concatenation-based fusion strategy keeps the fused features in a higher-dimensional space (12 dimensions in this paper), so we can capture more distinguishing local and global features. We then add a randomly initialized 1×1 convolution layer after the concatenation layer as the last layer; it computes the final prediction map from the concatenated feature map. During training, a softmax loss layer is appended to fine-tune our network; at test time, the loss layer is removed.
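
The sketch below illustrates the skip-layer head described in this subsection, again in PyTorch rather than our MatConvNet implementation. The `FusionHead` class and its arguments are hypothetical names for illustration, and bilinear upsampling stands in for the learned deconvolution layers to keep the sketch short.

```python
# Hypothetical PyTorch sketch of the concatenation-based skip-layer fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionHead(nn.Module):
    """Fuses 6 two-channel prediction maps by upsampling and concatenation.

    `feature_channels` lists the channel counts of the 6 tapped layers
    (the 5 pooling layers plus the last backbone convolution layer).
    """

    def __init__(self, feature_channels, num_classes=2, out_size=(384, 384)):
        super().__init__()
        # One 1x1 convolution per tapped layer -> 6 dense prediction maps,
        # each acting as an independent per-pixel classifier.
        self.score = nn.ModuleList(
            [nn.Conv2d(c, num_classes, kernel_size=1) for c in feature_channels]
        )
        self.out_size = out_size
        # Final 1x1 convolution maps the 12-channel fused features
        # (6 maps x 2 channels) to the final 2-channel prediction.
        self.classify = nn.Conv2d(6 * num_classes, num_classes, kernel_size=1)

    def forward(self, taps):
        maps = []
        for tap, score in zip(taps, self.score):
            m = score(tap)  # dense 2-channel prediction at this scale
            # The paper uses learned deconvolution layers [15] for upsampling;
            # bilinear interpolation stands in for them in this sketch.
            m = F.interpolate(m, size=self.out_size, mode="bilinear",
                              align_corners=False)
            maps.append(m)
        fused = torch.cat(maps, dim=1)  # concatenate along channels -> 12
        return self.classify(fused)
```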

3 Experimental Results

3.1 Dataset

We use the public dermoscopy image dataset from the "ISBI 2017 Skin Lesion Analysis Towards Melanoma Detection Challenge" [6] to evaluate the performance of our proposed method. This dataset consists of 2000 training images, 150 validation images, and 600 test images. Ground truth binary mask images from expert manual segmentation are also provided, with pixel values 255 and 0 indicating lesion and skin pixels, respectively.
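
As a small illustrative helper (not part of the challenge toolkit), a ground truth mask in this format can be converted to a 0/1 label map as follows:

```python
# Hypothetical helper: converting a ground truth mask image (255 = lesion,
# 0 = skin) into a 0/1 label map for training.
import numpy as np
from PIL import Image

def load_label_mask(path):
    mask = np.array(Image.open(path).convert("L"))  # grayscale mask
    return (mask == 255).astype(np.int64)           # 1 = lesion, 0 = skin
```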

3.2 Implementation

Our proposed method is implemented in MATLAB using the open source deep learning toolbox MatConvNet [17]. Our computer has an NVIDIA GTX1080 GPU, a Windows 7 system, an Intel i7-7700K processor running at 4.5 GHz, and 64 GB of memory. To tackle the small medical data problem, we use a pretrained VGG 16-layer net (publicly available from the Caffe model zoo) to initialize some weights of our proposed FCNN, following the transfer learning idea. Stochastic gradient descent (SGD) is used to fine-tune our proposed FCNN with batch size 6, momentum 0.9, weight decay 0.0001, and learning rate 0.001.
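
For illustration, this fine-tuning setup corresponds to the following PyTorch sketch; our experiments use MatConvNet with the same hyperparameters, and `model` and `loader` are assumed placeholders for the FCNN and a batch-size-6 data loader.

```python
# Illustrative PyTorch sketch of the fine-tuning configuration; `model` and
# `loader` are hypothetical placeholders, not names from our implementation.
import torch

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.001,            # learning rate
    momentum=0.9,
    weight_decay=0.0001,
)
criterion = torch.nn.CrossEntropyLoss()  # softmax loss over the 2 classes

for images, labels in loader:            # batches of size 6
    optimizer.zero_grad()
    loss = criterion(model(images), labels)  # labels: per-pixel 0/1 map
    loss.backward()
    optimizer.step()
```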

3.3 Evaluation Metrics

We use the same metrics as the challenge organizer to evaluate the performance of our proposed method: sensitivity (SE), specificity (SP), accuracy (AC), Jaccard index (JA), and Dice coefficient (DI).

According to the challenge rules, these criteria are calculated for each test image and then averaged over the whole test set to obtain the final results. Methods are ranked by their respective JA values.
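
For reference, the per-image metrics follow the standard definitions from the confusion matrix of a predicted and a ground truth binary mask; a minimal sketch:

```python
# Sketch of the per-image metrics, using the standard confusion-matrix
# definitions; `pred` and `gt` are binary lesion masks of the same size.
import numpy as np

def metrics(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # true positives
    tn = np.logical_and(~pred, ~gt).sum()  # true negatives
    fp = np.logical_and(pred, ~gt).sum()   # false positives
    fn = np.logical_and(~pred, gt).sum()   # false negatives
    se = tp / (tp + fn)                    # sensitivity
    sp = tn / (tn + fp)                    # specificity
    ac = (tp + tn) / (tp + tn + fp + fn)   # accuracy
    ja = tp / (tp + fp + fn)               # Jaccard index
    di = 2 * tp / (2 * tp + fp + fn)       # Dice coefficient
    return se, sp, ac, ja, di
```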

3.4 Performance of Our Method

We demonstrate the performance of our proposed method with both qualitative segmentation results and quantitative results.

3.4.1 Qualitative Results

To show how our proposed method works on difficult images, we present segmentation results on some challenging images in Fig. 3, where ground truth contours and our segmented contours are marked in blue and red, respectively. Images with low contrast, irregular shapes, and artifacts are shown in the first, second, and last rows of Fig. 3, respectively.

As Fig. 3 shows, our method works well on all of these challenging images.

Figure 3: Example segmentation results on challenging images. Ground truth contours and our detected contours are marked in blue and red, respectively. First row: low contrast images; second row: large intraclass variance; last row: artifacts.

3.4.2 Quantitative Results

We submitted preliminary results to the challenge without any pre- or post-processing.

4 Discussion and Conclusion

In this paper, we build a new FCNN for skin lesion segmentation. We submitted preliminary results to the challenge without any pre- or post-processing. The performance of our proposed method could be further improved by data augmentation and by avoiding image size reduction.

References

  • [1] Rebecca L. Siegel, Kimberly D. Miller, and Ahmedin Jemal, “Cancer statistics, 2016,” CA: A Cancer Journal for Clinicians, vol. 66, no. 1, pp. 7–30, 2016.
  • [2] Kenneth A. Freedberg, Alan C. Geller, Donald R. Miller, Robert A. Lew, and Howard K. Koh, “Screening for malignant melanoma: A cost-effectiveness analysis,” Journal of the American Academy of Dermatology, vol. 41, no. 5, pp. 738–745, 1999.
  • [3] Charles M. Balch, Antonio C. Buzaid, et al., “Final version of the American Joint Committee on Cancer staging system for cutaneous melanoma,” Journal of Clinical Oncology, vol. 19, no. 16, pp. 3635–3648, 2001.
  • [4] MoleScope, https://molescope.com/.
  • [5] Alexa Boer Kimball and Jack S. Resneck Jr., “The US dermatology workforce: A specialty remains in shortage,” Journal of the American Academy of Dermatology, vol. 59, no. 5, pp. 741–745, 2008.
  • [6] “Skin lesion analysis toward melanoma detection: A challenge at the International Symposium on Biomedical Imaging (ISBI) 2017, hosted by the International Skin Imaging Collaboration (ISIC),” 2017.
  • [7] Tatiana Tommasi, Elisabetta La Torre, and Barbara Caputo, “Melanoma recognition using representative and discriminative kernel classifiers,” in Proceedings of the Second ECCV International Conference on Computer Vision Approaches to Medical Image Analysis (CVAMIA’06), Berlin, Heidelberg, 2006, pp. 1–12, Springer-Verlag.
  • [8] M. Emre Celebi, Hassan A. Kingravi, et al., “A methodological approach to the classification of dermoscopy images,” Computerized Medical Imaging and Graphics, vol. 31, no. 6, pp. 362–373, 2007.
  • [9] H. Ganster, P. Pinz, R. Rohrer, et al., “Automated melanoma recognition,” IEEE Transactions on Medical Imaging, vol. 20, no. 3, pp. 233–239, March 2001.
  • [10] Gerald Schaefer, Bartosz Krawczyk, M. Emre Celebi, and Hitoshi Iyatomi, “An ensemble classification approach for melanoma diagnosis,” Memetic Computing, vol. 6, no. 4, pp. 233–240, 2014.
  • [11] M. H. Jafari, S. Samavi, et al., “Set of descriptors for skin cancer diagnosis using non-dermoscopic color images,” in 2016 IEEE International Conference on Image Processing (ICIP), Sept. 2016, pp. 2638–2642.
  • [12] Olaf Ronneberger, Philipp Fischer, and Thomas Brox, “U-Net: Convolutional networks for biomedical image segmentation,” May 2015.
  • [13] Hao Chen, Xiaojuan Qi, et al., “DCAN: Deep contour-aware networks for object instance segmentation from histology images,” Medical Image Analysis, vol. 36, pp. 135–146, 2017.
  • [14] Andre Esteva, Brett Kuprel, Roberto A. Novoa, et al., “Dermatologist-level classification of skin cancer with deep neural networks,” Nature, vol. 542, 2017.
  • [15] Jonathan Long, Evan Shelhamer, and Trevor Darrell, “Fully convolutional networks for semantic segmentation,” CoRR, vol. abs/1411.4038, 2015.
  • [16] L. Yu, H. Chen, Q. Dou, J. Qin, and P. A. Heng, “Automated melanoma recognition in dermoscopy images via very deep residual networks,” IEEE Transactions on Medical Imaging, vol. PP, no. 99, pp. 1–1, 2016.
  • [17] A. Vedaldi and K. Lenc, “MatConvNet: Convolutional neural networks for MATLAB,” in Proceedings of the ACM Int. Conf. on Multimedia, 2015.