Diabetes Mellitus (DM), commonly known as diabetes, is a serious chronic metabolic disease characterized by elevated blood glucose, caused either by insufficient insulin production by the pancreas (Type 1) or by the body's inability to use insulin effectively (Type 2). It can cause major life-threatening complications such as blindness, cardiovascular, peripheral vascular and cerebrovascular diseases, kidney failure, and Diabetic Foot Ulcers (DFU), which can lead to lower limb amputation.
The number of people with diabetes has risen sharply, from 108 million to 422 million worldwide, and low- and middle-income countries are disproportionately affected. In 2012, over 1.5 million deaths were caused by diabetes alone, and 43% of these deaths occurred before the age of 70. It is estimated that by the end of 2035, around 600 million people will be suffering from DM. Every year, more than 1 million patients with diabetes lose part of a leg because DFU is not recognized and treated appropriately.
In current practice, medical experts (DFU specialists and podiatrists) primarily examine and assess DFU patients by visual inspection with manual measurement tools to determine the severity of the DFU. They also use high-resolution images to evaluate the state of the DFU. This evaluation comprises several important tasks in early diagnosis, in tracking development, and in the actions taken to treat and manage the DFU in each particular case: 1) the medical history of the patient is evaluated; 2) a wound or DFU specialist examines the DFU thoroughly; 3) additional tests such as CT scans, MRI or X-ray may be useful to help develop a treatment plan. DFU usually have irregular structures and uncertain outer boundaries. The appearance of a DFU and its surrounding skin varies with the stage of the ulcer, e.g. redness, callus formation, blisters, and significant tissue types such as granulation, slough, bleeding and scaly skin.
The skin surrounding the DFU is very important: its condition indicates whether the DFU is healing, and it is also a vulnerable area for extension of the ulcer [7, 8]. Many factors increase the risk of vulnerable skin, such as ischemia, inflammation, abnormal pressure and maceration from exudates. Conversely, healthy skin around the DFU indicates a good healing process. Surrounding skin is examined by inspection of colour, discharge and texture, and by palpation for warmth, swelling and tenderness. On visual inspection, redness is suggestive of inflammation, which is usually due to wound infection. Black discoloration is suggestive of ischemia. A white and soggy appearance is due to maceration, and a white and dry appearance is usually due to increased pressure. It is important to recognize that skin appearances differ across shades of skin. Lesions that appear red or brown in white skin may appear black or purple in black or brown skin. Mild degrees of redness may be masked completely in dark skin.
Wang et al. [10] used an image capture box to capture image data and determined the area of the DFU using a cascaded two-stage SVM-based classification. Similarly, computer methods based on manually engineered features or image processing approaches were implemented for segmentation of DFU and wounds. The segmentation task was performed by extracting texture and colour descriptors from small patches of wound images, followed by machine learning algorithms that classify the patches as normal or abnormal skin [11, 12, 13, 14]. As in many computer vision systems, the hand-crafted features are affected by skin shade, illumination and image resolution. These techniques also struggled to segment the irregular contours of DFU and wounds. Additionally, because of the complexity of the surrounding skin, it is almost impossible for them to capture a broad description of it. Unsupervised approaches, on the other hand, rely on image processing techniques, edge detection, morphological operations and clustering algorithms in different colour spaces to segment wounds from images [15, 16, 17]. The majority of these methods require manual tuning of parameters for different input images, which is impractical from a clinical perspective. In addition to the limitations of the segmentation algorithms, the state-of-the-art methods were validated on relatively small datasets, ranging from 10 to 172 images.
In this work, we propose automated segmentation of DFU and its surrounding skin using fully convolutional networks. The contributions of this paper include:
To overcome the shortage of DFU datasets in the state of the art, we present the largest DFU dataset along with annotated ground truth.
This is the first attempt in computer vision to segment the significant surrounding skin separately from the DFU.
We propose a two-tier transfer learning method in which the fully convolutional networks (FCNs) are trained on larger image datasets and then used as pre-trained models for the segmentation of DFU and its surrounding skin. The performance is compared with other deep learning frameworks and the state-of-the-art DFU/wound segmentation algorithms on our dataset.
This section describes the preparation of the dataset, including expert labelling of the DFU and surrounding skin on foot images. The segmentation approaches using conventional methods and deep learning methods are then detailed. Finally, the performance metrics used for validation are reported.
II-A DFU Dataset
A DFU dataset was collected over a five period at the Lancashire Teaching Hospitals, and all patients gave written informed consent. A subset of the images was used for this study, comprising 600 DFU images and 105 healthy foot images. We received NHS Research Ethics Committee approval (REC reference number 15/NW/0539) to use these images for our research. The DFU images were captured with a Nikon D3300. Whenever possible, the images were acquired as close-ups of the full foot at a distance of around 30-40 cm, with the camera oriented parallel to the plane of the DFU. The use of flash as the primary light source was avoided; instead, adequate room lighting was used to obtain consistent colours in the images. To ensure close-range focus and avoid blur at close distance, a Nikon AF-S DX Micro NIKKOR 40mm f/2.8G lens was used.
The ground truth annotation of our dataset was performed by a podiatrist specializing in the diabetic foot and validated by a consultant specializing in diabetes. We created ground truth for each DFU image using the annotator of Hewitt et al. For each DFU image (as illustrated in Fig. 1), the expert delineated the region of interest (ROI) as the combination of the DFU and its surrounding skin. Within each ROI, the two classes were then labelled separately and exported to an XML file. These ground truths were further converted into label images in the form of single-channel 8-bit paletted images (commonly known as the Pascal VOC format for semantic segmentation), as shown in Fig. 1. In this format, index 0 (black) represents the background, index 1 (red) represents the surrounding skin and index 2 (green) represents the DFU. From the 600 DFU images in our dataset, we produced 600 ROIs of DFU and 600 ROIs of surrounding skin.
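The index-to-colour mapping of the label images can be illustrated with a short Python sketch. This is not the conversion tool used for the dataset: the names `PALETTE` and `indices_to_rgb` are hypothetical, and the exact RGB triplets are assumed to be the first three entries of the standard Pascal VOC colour map.

```python
# Palette indices as described in the text:
# 0 = background, 1 = surrounding skin (red), 2 = DFU (green).
# The RGB values are an assumption (standard Pascal VOC colour map).
PALETTE = {
    0: (0, 0, 0),      # background, black
    1: (128, 0, 0),    # surrounding skin, red
    2: (0, 128, 0),    # DFU, green
}


def indices_to_rgb(label_rows):
    """Map a 2-D grid of palette indices to a 2-D grid of RGB triplets."""
    return [[PALETTE[index] for index in row] for row in label_rows]


# A toy 3x4 label image: two DFU pixels ringed by surrounding skin.
toy_label = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [0, 1, 1, 0],
]
toy_rgb = indices_to_rgb(toy_label)
```

Storing the class index rather than the colour in each pixel means the label image can be read directly as per-pixel class targets during training; the palette only matters for display.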
II-B Fully Convolutional Networks for DFU segmentation
Deep learning models have proved to be powerful algorithms for learning hierarchies of features across various computer vision tasks. Convolutional neural networks, especially classification networks, have been used to classify objects by assigning a discrete probability distribution over the classes to each image. However, these networks are limited: they cannot classify multiple classes in a single image or determine the position of objects in the image. FCNs address these limitations by producing a pixel-wise prediction rather than a single probability distribution per image, so that every pixel of an image is assigned the class to which it belongs. The way an FCN architecture produces pixel-wise predictions, using supervised pre-training and the ground truth, is illustrated in Fig. 2. Hence, these models are able to predict multiple objects of various classes and the position of each object in an image.
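The difference between image-level classification and pixel-wise prediction can be made concrete with a toy example. The sketch below is purely illustrative: the score values are invented, and the function names are hypothetical rather than part of any FCN implementation.

```python
def argmax(scores):
    """Return the index of the largest score (the first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def classify_image(class_scores):
    """Classification network: one score vector gives one label per image."""
    return argmax(class_scores)


def segment_image(score_map):
    """FCN-style output: one score vector per pixel gives one label per pixel."""
    return [[argmax(pixel_scores) for pixel_scores in row] for row in score_map]


# Toy 2x2 score map over three classes
# (0 = background, 1 = surrounding skin, 2 = DFU); values are made up.
toy_scores = [
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
    [[0.1, 0.6, 0.3], [0.1, 0.2, 0.7]],
]
image_label = classify_image([0.2, 0.3, 0.5])
label_map = segment_image(toy_scores)
```

The classifier returns a single class for the whole image, whereas the segmentation output keeps the spatial layout, so several classes and their positions are recovered at once.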
II-C Transfer Learning
We used two-tier transfer learning for the FCNs to perform more effective segmentation on the DFU dataset. In the first tier, the CNN models from which the FCNs are built are trained on the ImageNet dataset of millions of images, which tunes the weights of the initial convolutional layers. In the second tier, we trained the FCN models on the Pascal VOC segmentation dataset. These pre-trained models are then used to initialize training of the FCN models on the DFU dataset, giving better convergence of the weights in all layers of the network than random initialization. The two-tier transfer learning is illustrated in Fig. 3.
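The idea of initializing each stage from the previous one can be sketched as follows. This is a simplified illustration under stated assumptions, not the Caffe procedure used in this work: the layer names, the flat weight lists and the rule "copy when name and size match, otherwise initialize randomly" are all hypothetical. The 21 output classes for Pascal VOC (20 object classes plus background) and the 3 classes for the DFU task (background, surrounding skin, DFU) determine the size of the final scoring layer.

```python
import random


def init_from_pretrained(layer_sizes, pretrained, seed=0):
    """Copy pre-trained weights where name and size match, else initialize randomly.

    layer_sizes: dict of layer name -> number of weights (hypothetical flat layout)
    pretrained:  dict of layer name -> list of weights from the previous tier
    Returns the new weights and the names of the layers that were copied.
    """
    rng = random.Random(seed)
    weights, copied = {}, []
    for name, size in layer_sizes.items():
        source = pretrained.get(name)
        if source is not None and len(source) == size:
            weights[name] = list(source)
            copied.append(name)
        else:
            weights[name] = [rng.gauss(0.0, 0.01) for _ in range(size)]
    return weights, copied


# Tier 1: stand-in for weights learned by classification training on ImageNet.
imagenet_weights = {"conv1": [0.1] * 4, "conv2": [0.2] * 4}

# Tier 2: the FCN for Pascal VOC starts from the tier-1 weights.
voc_sizes = {"conv1": 4, "conv2": 4, "score": 21}
voc_weights, from_imagenet = init_from_pretrained(voc_sizes, imagenet_weights)
# (training on Pascal VOC would update voc_weights here)

# Final stage: the DFU model starts from the tier-2 weights; its 3-class
# scoring layer differs in size from the 21-class one, so it is re-initialized.
dfu_sizes = {"conv1": 4, "conv2": 4, "score": 3}
dfu_weights, from_voc = init_from_pretrained(dfu_sizes, voc_weights)
```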
II-C1 FCN-AlexNet
FCN-AlexNet is a fully convolutional version of the original AlexNet classification model, obtained by adjusting a few layers of the network for segmentation. This network was originally used for classification of 1000 object classes on the ImageNet dataset. It won the classification category of the ImageNet ILSVRC-2012 competition with a top-5 test error of 15.3%. A few customizations are made to the classification network to convert it into an FCN that carries out dense prediction. In FCN-AlexNet, the earlier CNN layers are kept unchanged to extract features, and the fully connected layers, which discard positional information, are replaced with equivalent convolutional layers whose filter sizes match the size of the input to these layers. After the extraction of coarse, high-level features from the input images, deconvolutional layers produce a prediction for every pixel of the input. These layers perform the reverse of the convolutional layers, and the stride used in them is equal to the scaling factor of the convolutional layers.
The input was 500 × 500 foot images and ground truth images (Pascal VOC format). In the end, the network prediction on test images was very close to the ground truth. We used the Caffe framework to implement FCN-AlexNet. The network was trained on the dataset with the following parameters: 60 epochs, stochastic gradient descent with a learning rate of 0.0001, a step-down policy with a step size of 33%, and a gamma of 0.1. The learning rate was decreased by a factor of 100 because new convolutional layers were introduced in place of the fully connected layers, which improved the performance of FCN-AlexNet and the other FCNs.
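To show what these settings imply for the learning rate over the course of training, the following sketch evaluates a step-down schedule. It assumes the schedule follows the usual "step" rule, in which the rate is multiplied by gamma each time one step size has elapsed, and that the 33% step size is measured relative to the 60 training epochs; the function name and its arguments are hypothetical.

```python
import math


def step_down_lr(epoch, base_lr=0.0001, gamma=0.1,
                 total_epochs=60, step_fraction=0.33):
    """Learning rate at a given epoch under a step-down policy.

    The rate is multiplied by `gamma` each time another `step_fraction`
    of the training run has elapsed (an assumed reading of the settings).
    """
    step_epochs = total_epochs * step_fraction  # about 19.8 epochs per step
    return base_lr * gamma ** math.floor(epoch / step_epochs)


# Learning rate at the start and end of each of the three stages.
schedule = [step_down_lr(e) for e in (0, 19, 20, 39, 40, 59)]
```

Under these assumptions the rate stays at 0.0001 for roughly the first third of training, drops to 0.00001 for the second third and to 0.000001 for the last third.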
II-C2 FCN-32s, FCN-16s, FCN-8s
FCN-32s, FCN-16s and FCN-8s are three models based on VGG-16, a 16-layer CNN architecture that participated in the ImageNet Challenge 2014 and secured first place in localization and second place in classification [24, 21]. These models differ in the upsampling layers that magnify the output of the original VGG-16 CNN model. FCN-32s is the same as FCN-VGG16: the fully connected layers are convolutionalized and end-to-end deconvolution is performed with a 32-pixel stride. FCN-16s and FCN-8s additionally use lower-level features in order to produce more accurate segmentation. In FCN-16s, the final output is the sum of two upsampled layers, i.e. the upsampling of pool4 and the 2× upsampling of convolutional layer 7, whereas in FCN-8s it is the sum of the upsampling of pool3, the 2× upsampling of pool4 and the 4× upsampling of convolutional layer 7. Both models therefore predict at a finer granularity, i.e. 16 × 16 pixel blocks for FCN-16s and 8 × 8 pixel blocks for FCN-8s. A suitable pre-trained model is used in training each network. The same input images and the same parameters as for FCN-AlexNet are used, i.e. 60 epochs, a learning rate of 0.0001 and a gamma of 0.1.
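The fusion of coarse and finer score maps in FCN-16s can be illustrated on toy data. The sketch below is a simplification under stated assumptions: it uses nearest-neighbour repetition in place of the learned deconvolution layers, it handles a single class channel, and the score values and function names are invented for illustration.

```python
def upsample_nearest(grid, factor):
    """Enlarge a 2-D grid by repeating each value `factor` times in both directions."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def add_grids(a, b):
    """Element-wise sum of two equally sized 2-D grids."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Toy single-class score maps for a 32x32 input:
# layer 7 scores at 1/32 resolution (1x1), pool4 scores at 1/16 resolution (2x2).
conv7_scores = [[1.0]]
pool4_scores = [[0.0, 0.5],
                [0.5, 1.0]]

# FCN-16s-style fusion: bring layer 7 to the pool4 resolution (2x) and sum.
fused_16 = add_grids(pool4_scores, upsample_nearest(conv7_scores, 2))

# Upsample the fused map by 16 to recover the input resolution.
output = upsample_nearest(fused_16, 16)
```

Each value of the fused map covers a 16 × 16 block of the output, which is why FCN-16s can follow boundaries more closely than FCN-32s, where a single value covers a 32 × 32 block. FCN-8s repeats the same fusion one level further down with pool3.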
III Experiment and Result
As mentioned previously, we used deep learning models for the segmentation task. The experiments were carried out on the DFU dataset of 600 DFU foot images, which was split into 70% training, 10% validation and 20% testing sets. We adopted 5-fold cross-validation. For training and validation of the deep learning architectures, we used 420 and 60 images respectively from the 600 original DFU images. Finally, we tested our model predictions on the 120 remaining images. Further, we tested the performance of the models on 105 healthy test images.
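One way to obtain the 420/60/120 partition in each of the five folds is sketched below. The text does not state how images were assigned to folds or how the validation images were chosen, so the contiguous blocks used here (and the function name) are assumptions; in practice the indices would typically be shuffled first.

```python
def five_fold_splits(n_images=600, n_folds=5, n_val=60):
    """Yield (train, val, test) index lists for each fold.

    Each fold holds out a different 20% block for testing; the remaining
    images are divided into validation (n_val) and training.
    """
    indices = list(range(n_images))
    fold_size = n_images // n_folds
    for k in range(n_folds):
        test = indices[k * fold_size:(k + 1) * fold_size]
        rest = indices[:k * fold_size] + indices[(k + 1) * fold_size:]
        val, train = rest[:n_val], rest[n_val:]
        yield train, val, test


splits = list(five_fold_splits())
```

With 600 images and five folds, every image appears in exactly one test set of 120 images, and the other 480 images of that fold are divided into 420 for training and 60 for validation.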
The performance of the FCN frameworks on the testing set is evaluated on three different DFU regions, motivated by practical medical applications. The regions are:
The complete area determination (including Ulcer and Surrounding Skin).
The DFU region
The surrounding skin (SS) region
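Given a label map that uses the palette indices of the dataset (1 for surrounding skin, 2 for DFU), the three evaluation regions can be derived as binary masks. The sketch below is illustrative only; the function and variable names are hypothetical.

```python
SURROUNDING_SKIN, DFU = 1, 2  # palette indices from the dataset description


def region_masks(label_map):
    """Derive the three binary evaluation masks from one label map."""
    complete = [[int(v in (SURROUNDING_SKIN, DFU)) for v in row] for row in label_map]
    ulcer = [[int(v == DFU) for v in row] for row in label_map]
    skin = [[int(v == SURROUNDING_SKIN) for v in row] for row in label_map]
    return complete, ulcer, skin


# Toy 2x4 label map: two DFU pixels bordered by surrounding skin.
toy_map = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
]
complete_mask, ulcer_mask, skin_mask = region_masks(toy_map)
```

The same function can be applied to both the ground truth and the predicted label map, so that each region is scored as a separate binary segmentation problem.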
In Table I, we report the Dice Similarity Coefficient (Dice), Sensitivity, Specificity and Matthews Correlation Coefficient (MCC) as our evaluation metrics for segmentation of the DFU regions. In medical imaging, Sensitivity and Specificity are considered reliable evaluation metrics, whereas for segmentation evaluation, Dice is popularly used by researchers.
Here, MP denotes the model predictions of the various FCNs and GT denotes the ground truth labels.
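For reference, the sketch below computes the four metrics from a binary prediction mask MP and a binary ground truth mask GT. It uses the standard definitions, which we assume are the ones intended: Dice = 2|MP ∩ GT| / (|MP| + |GT|) = 2TP / (2TP + FP + FN), Sensitivity = TP / (TP + FN), Specificity = TN / (TN + FP), and MCC = (TP·TN − FP·FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN)). The function names are hypothetical, and undefined ratios are returned as NaN by choice.

```python
import math


def confusion_counts(mp, gt):
    """Count TP, FP, FN, TN between a predicted mask (MP) and the ground truth (GT)."""
    tp = fp = fn = tn = 0
    for row_mp, row_gt in zip(mp, gt):
        for p, g in zip(row_mp, row_gt):
            if p and g:
                tp += 1
            elif p:
                fp += 1
            elif g:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(mp, gt):
    """Dice, Sensitivity, Specificity and MCC for two binary masks."""
    tp, fp, fn, tn = confusion_counts(mp, gt)

    def ratio(num, den):
        # A zero denominator means the metric is undefined for this image.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Toy 2x4 masks: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
gt_mask = [[0, 1, 1, 0], [0, 1, 1, 0]]
mp_mask = [[0, 1, 1, 1], [0, 1, 0, 0]]
scores = segmentation_metrics(mp_mask, gt_mask)
```

When the ground truth contains no lesion pixels and none are predicted, Dice and Sensitivity are undefined in this formulation while Specificity equals 1.0, which is why Specificity is the natural measure for images of healthy feet.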
Across the performance measures, FCN-16s was the best performer and FCN-AlexNet the worst among the FCN architectures for the various evaluation metrics. The FCN architectures achieve comparable results when the complete region is evaluated, but there is a notable difference in their performance when the ulcer and especially the surrounding skin regions are considered. For Dice, FCN-16s achieved the best score of 0.794 (0.104) in the ulcer region and 0.851 (0.148) in the surrounding skin region, whereas FCN-32s achieved the best score of 0.899 (0.072) in the complete area determination. Overall, the FCN models have very high Specificity for all the regions. Assessing the FCNs further, we observed that FCN-16s and FCN-32s are better in Sensitivity. In terms of Sensitivity, Dice and MCC, FCN-16s performed best in the ulcer and surrounding skin regions and FCN-32s performed best in the complete region. The results in Table I show that the complete region segmentation performs better than ulcer and surrounding skin segmentation in terms of Dice and MCC.
Finally, we tested the trained models on healthy foot images. They produced the highest possible Specificity of 1.0, with neither DFU nor surrounding skin detected.
III-A Inaccurate segmentation cases in FCN-AlexNet, FCN-32s, FCN-16s, FCN-8s
Although the results are promising, there are a few inaccurate segmentation cases with very low Dice scores for each trained model, as shown in Fig. 5. In Fig. 4, there are a few instances in which the FCN-AlexNet and FCN-32s models fail to detect small DFU and distinct surrounding skin, or detect only a very small part of them. As discussed earlier, DFU and surrounding skin regions have very irregular outer boundaries. FCN-AlexNet and FCN-32s tend to draw more regular contours and struggle with irregular boundaries, whereas FCN-16s and FCN-8s, with their smaller pixel strides, were able to produce more irregular contours for both DFU and surrounding skin. However, in a few test images, the two categories overlap in some regions because distinct DFU tissues look like surrounding skin and vice versa.
In this work, we developed deep learning approaches to train various FCNs that can automatically detect and segment the DFU and surrounding skin area with a high degree of accuracy. These frameworks will be useful for segmenting other skin lesions such as moles and freckles, spotting marks (extending the work by Alarifi et al.) and pimples, for classifying other wound pathologies, and for infections such as chicken pox or shingles. This work also lays the foundations for technology that may transform the detection and treatment of DFU. It is a step towards future targets that include: 1) determining the various pathologies of DFU as multi-class classification and segmentation; 2) developing an automatic annotator that can delineate and classify the DFU and related pathology; 3) developing user-friendly system tools, including mobile applications for DFU recognition and segmentation [26, 27] and a computer vision assisted remote telemedicine system that detects DFU and provides feedback on different pathologies of diabetic feet. Moreover, this research could be applied to other related medical fields, for example, breast ultrasound lesion segmentation.
-  W. J. Jeffcoate and K. G. Harding, “Diabetic foot ulcers,” The lancet, vol. 361, no. 9368, pp. 1545–1551, 2003.
-  S. Wild, G. Roglic, A. Green, R. Sicree, and H. King, “Global prevalence of diabetes estimates for the year 2000 and projections for 2030,” Diabetes care, vol. 27, no. 5, pp. 1047–1053, 2004.
-  K. Bakker, J. Apelqvist, B. Lipsky, J. Van Netten, and N. Schaper, “The 2015 iwgdf guidance documents on prevention and management of foot problems in diabetes: development of an evidence-based global consensus,” Diabetes/metabolism research and reviews, vol. 32, no. S1, pp. 2–6, 2016.
-  D. G. Armstrong, L. A. Lavery, and L. B. Harkless, “Validation of a diabetic wound classification system: the contribution of depth, infection, and ischemia to risk of amputation,” Diabetes care, vol. 21, no. 5, pp. 855–859, 1998.
-  M. Edmonds, “Diabetic foot ulcers,” Drugs, vol. 66, no. 7, pp. 913–929, 2006.
-  B. A. Lipsky, A. R. Berendt, H. G. Deery, J. M. Embil, W. S. Joseph, A. W. Karchmer, J. L. LeFrock, D. P. Lew, J. T. Mader, C. Norden et al., “Diagnosis and treatment of diabetic foot infections,” Clinical Infectious Diseases, vol. 39, no. 7, pp. 885–910, 2004.
-  D. L. Steed, D. Donohoe, M. W. Webster, and L. Lindsley, “Effect of extensive debridement and treatment on the healing of diabetic foot ulcers. diabetic ulcer study group.” Journal of the American College of Surgeons, vol. 183, no. 1, pp. 61–64, 1996.
-  S. Rajbhandari, N. Harris, M. Sutton, C. Lockett, S. Eaton, M. Gadour, S. Tesfaye, and J. Ward, “Digital imaging: an accurate and easy method of measuring foot ulcers,” Diabetic medicine, vol. 16, no. 4, pp. 339–342, 1999.
-  C. Liu, J. J. van Netten, J. G. Van Baal, S. A. Bus, and F. van Der Heijden, “Automatic detection of diabetic foot complications with infrared thermography by asymmetric analysis,” Journal of biomedical optics, vol. 20, no. 2, pp. 026 003–026 003, 2015.
-  L. Wang, P. Pedersen, E. Agu, D. Strong, and B. Tulu, “Area determination of diabetic foot ulcer images using a cascaded two-stage svm based classification,” IEEE Transactions on Biomedical Engineering, 2016.
-  M. Kolesnik and A. Fexa, “Multi-dimensional color histograms for segmentation of wounds in images,” in International Conference Image Analysis and Recognition. Springer, 2005, pp. 1014–1022.
-  M. Kolesnik and A. Fexa, “How robust is the svm wound segmentation?” in Signal Processing Symposium, 2006. NORSIG 2006. Proceedings of the 7th Nordic. IEEE, 2006, pp. 50–53.
-  E. S. Papazoglou, L. Zubkov, X. Mao, M. Neidrauer, N. Rannou, and M. S. Weingarten, “Image analysis of chronic wounds for determining the surface area,” Wound repair and regeneration, vol. 18, no. 4, pp. 349–358, 2010.
-  F. Veredas, H. Mesa, and L. Morente, “Binary tissue classification on wound images with neural networks and bayesian classifiers,” IEEE transactions on medical imaging, vol. 29, no. 2, pp. 410–427, 2010.
-  M. K. Yadav, D. D. Manohar, G. Mukherjee, and C. Chakraborty, “Segmentation of chronic wound areas by clustering techniques using selected color space,” Journal of Medical Imaging and Health Informatics, vol. 3, no. 1, pp. 22–29, 2013.
-  A. Castro, C. Bóveda, and B. Arcay, “Analysis of fuzzy clustering algorithms for the segmentation of burn wounds photographs,” in International Conference Image Analysis and Recognition. Springer, 2006, pp. 491–501.
-  D. H. Chung and G. Sapiro, “Segmenting skin lesions with partial-differential-equations-based image processing algorithms,” IEEE transactions on Medical Imaging, vol. 19, no. 7, pp. 763–767, 2000.
-  B. Hewitt, M. H. Yap, and R. Grant, “Manual whisker annotator (mwa): A modular open-source tool,” Journal of Open Research Software, vol. 4, no. 1, 2016.
-  O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, “ImageNet Large Scale Visual Recognition Challenge,” International Journal of Computer Vision (IJCV), vol. 115, no. 3, pp. 211–252, 2015.
-  M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, “The pascal visual object classes (voc) challenge,” International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, Jun. 2010.
-  J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.
-  A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.
-  Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the 22nd ACM international conference on Multimedia. ACM, 2014, pp. 675–678.
-  K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
-  J. Alarifi, M. Goyal, A. Davison, D. Dancey, R. Khan, and M. H. Yap, “Facial skin classification using convolutional neural networks,” in Image Analysis and Recognition: 14th International Conference, ICIAR 2017, Montreal, QC, Canada, July 5–7, 2017, Proceedings, vol. 10317. Springer, 2017, p. 479.
-  M. H. Yap, C.-C. Ng, K. Chatwin, C. A. Abbott, F. L. Bowling, A. J. Boulton, and N. D. Reeves, “Computer vision algorithms in the detection of diabetic foot ulceration a new paradigm for diabetic foot care?” Journal of diabetes science and technology, p. 1932296815611425, 2015.
-  M. H. Yap, K. E. Chatwin, C.-C. Ng, C. A. Abbott, F. L. Bowling, S. Rajbhandari, A. J. Boulton, and N. D. Reeves, “ footsnap : A new mobile application for standardizing diabetic foot images,” Journal of Diabetes Science and Technology, p. 1932296817713761, 2017.
-  M. H. Yap, E. A. Edirisinghe, and H. E. Bez, “Fully automatic lesion boundary detection in ultrasound breast images,” SPIE Medical Imaging, vol. 65123I-65123I-8, 2007.