I Motivation and Overview
This paper gives a brief overview of the Araguaia Medical Vision Lab (AMVL) submission to the International Skin Imaging Collaboration (ISIC) 2017 challenge, specifically the skin lesion classification task. Our main objective is the automatic classification of skin lesions in two tasks, Melanoma recognition and Seborrheic Keratosis recognition, using the image dataset made available by ISIC, which was diagnosed by specialists and serves as ground truth. The proposed algorithm combines deep convolutional neural networks (CNNs), GoogleNet and AlexNet, fine-tuned with augmented skin lesion images. The next sections describe the training and evaluation process.
II Image Pre-processing
The original dataset is composed of 2000 images: 374 samples of Melanoma, 254 samples of Seborrheic Keratosis, and the remaining 1372 labeled as Nevus. Image sizes range from 1022 x 767 to 6748 x 4499.
The first step was to set aside a validation set with around 20% of the images from each class, used to evaluate the neural network performance during the training stage.
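A per-class hold-out of this kind can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the function name `stratified_split`, the fixed seed, and the label strings are assumptions; only the class counts and the 20% fraction come from the text.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx), holding out ~val_fraction of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Number of validation samples for this class, rounded to nearest.
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Class counts as reported for the original dataset (label names are illustrative).
labels = ["melanoma"] * 374 + ["seborrheic_keratosis"] * 254 + ["nevus"] * 1372
train_idx, val_idx = stratified_split(labels)
```

Splitting each class separately keeps the class proportions of the validation set close to those of the full dataset, which matters when the positive classes are small.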
All training images passed through a pre-processing step that applied random shear, zoom, vertical and horizontal shifts, and flips. This step was necessary to increase the dataset size (by around 5 times), make it less unbalanced, and improve the neural network accuracy.
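One way to picture this augmentation is as a single random affine transform per generated image. The sketch below is only an illustration under stated assumptions: the parameter ranges (`max_shear_deg`, `zoom_range`, `max_shift`), the 50% flip probability, and the nearest-neighbor resampling are hypothetical choices, not the settings used in our experiments.

```python
import math
import random

def random_affine(rng, max_shear_deg=10.0, zoom_range=0.2, max_shift=0.1):
    """Sample a 3x3 matrix combining shear, zoom, shift and flips.

    Coordinates are centered on the image and normalized to [-0.5, 0.5].
    All ranges are illustrative assumptions.
    """
    shear = math.tan(math.radians(rng.uniform(-max_shear_deg, max_shear_deg)))
    zoom = rng.uniform(1.0 - zoom_range, 1.0 + zoom_range)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    flip_x = -1.0 if rng.random() < 0.5 else 1.0
    flip_y = -1.0 if rng.random() < 0.5 else 1.0
    return [
        [flip_x * zoom, flip_x * zoom * shear, tx],
        [0.0, flip_y * zoom, ty],
        [0.0, 0.0, 1.0],
    ]

def warp(image, matrix, fill=0):
    """Apply an output-to-input affine map with nearest-neighbor sampling."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Pixel center in normalized, centered coordinates.
            x = (c + 0.5) / w - 0.5
            y = (r + 0.5) / h - 0.5
            sx = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
            sy = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
            sc = math.floor((sx + 0.5) * w)
            sr = math.floor((sy + 0.5) * h)
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = image[sr][sc]
    return out

rng = random.Random(0)
image = [[r * 8 + c for c in range(8)] for r in range(8)]  # toy 8x8 "image"
# Five random variants per image, matching the roughly 5x increase in dataset size.
augmented = [warp(image, random_affine(rng)) for _ in range(5)]
```

Generating more variants for the minority classes than for Nevus is one way such a step can also reduce class imbalance.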
III Network Training and Evaluation
III-A Seborrheic Keratosis Task
For the Seborrheic Keratosis classification task, we obtained the best result by resizing all training images to 350 x 350 while preserving their proportions and fine-tuning an AlexNet pre-trained on the ImageNet classification dataset by the Berkeley Vision and Learning Center (ref). The network was trained for 30 epochs with Stochastic Gradient Descent (SGD) in three stages: the first 10 epochs with a learning rate of , then 10 epochs with , and 10 more with . The top accuracy on the validation set was , with on the Seborrheic class and on the non-Seborrheic class. On the ISIC validation dataset, the network achieved an accuracy of and an Area Under the ROC Curve (AUC) of , with a sensitivity of and a specificity of .
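The three-stage schedule amounts to a piecewise-constant learning rate. The sketch below shows only the structure of such a schedule; the function `step_learning_rate` is hypothetical, and the rates passed to it are placeholders, since the values used in training are not given in this text.

```python
def step_learning_rate(epoch, rates, epochs_per_stage=10):
    """Return the learning rate for a 0-based epoch in a staged schedule."""
    # Epochs past the last stage keep the final rate.
    stage = min(epoch // epochs_per_stage, len(rates) - 1)
    return rates[stage]

# Placeholder values only: NOT the learning rates used in the experiments.
placeholder_rates = [1e-2, 1e-3, 1e-4]
schedule = [step_learning_rate(e, placeholder_rates) for e in range(30)]
```

With 30 epochs and three rates, each rate is held for exactly 10 consecutive epochs.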
III-B Melanoma Task
For the Melanoma task, we obtained our best result by averaging three networks, GoogleNet 256, GoogleNet 224, and AlexNet 224, all pre-trained on the ImageNet classification dataset.
GoogleNet 256 was retrained for 72 epochs on images resized to 256 x 256 and randomly cropped to 224 x 224, which is the network's standard input size. Training used SGD with three learning rate steps , , ; the best result was obtained at the last epoch.
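The random crop from 256 x 256 to 224 x 224 can be sketched as below. This is an illustrative stand-in for the cropping done by the training framework; the helper `random_crop` and the toy image of coordinate tuples are assumptions.

```python
import random

def random_crop(image, crop_size, rng):
    """Return a crop_size x crop_size window at a uniformly random offset."""
    h, w = len(image), len(image[0])
    if crop_size > h or crop_size > w:
        raise ValueError("crop size exceeds image size")
    top = rng.randint(0, h - crop_size)    # inclusive bounds
    left = rng.randint(0, w - crop_size)
    return [row[left:left + crop_size] for row in image[top:top + crop_size]]

rng = random.Random(0)
# Toy 256 x 256 image whose pixels record their own (row, col) position.
image = [[(r, c) for c in range(256)] for r in range(256)]
crop = random_crop(image, 224, rng)
```

Because a new offset is drawn each time an image is seen, the network is exposed to slightly different views of the same lesion across epochs.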
GoogleNet 224 was retrained for 50 epochs on images resized to 224 x 224, without the random crop, using SGD with learning rate steps of , , , and . The best result was at epoch 15.
AlexNet 224 was retrained for 30 epochs on 224 x 224 images, using SGD with three learning rate steps , , ; the best result was at epoch 15.
The final result was obtained as the arithmetic mean of the three networks' softmax outputs, which made the predictions more balanced between classes. On the ISIC validation dataset, the ensemble achieved an accuracy of and an AUC of , with a sensitivity of and a specificity of .
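The averaging step is simple enough to state in a few lines. In the sketch below, the function `average_softmax` and the probability values are hypothetical and serve only to illustrate the arithmetic mean of per-class softmax outputs for a single image.

```python
def average_softmax(outputs):
    """Average per-class probabilities across models for one image."""
    n_models = len(outputs)
    n_classes = len(outputs[0])
    return [sum(model[k] for model in outputs) / n_models
            for k in range(n_classes)]

# Made-up softmax outputs [P(non-melanoma), P(melanoma)] for a single image.
outputs = [
    [0.70, 0.30],  # GoogleNet 256
    [0.40, 0.60],  # GoogleNet 224
    [0.55, 0.45],  # AlexNet 224
]
mean_probs = average_softmax(outputs)
predicted_class = mean_probs.index(max(mean_probs))
```

Since each network's output sums to one, the averaged vector is again a valid probability distribution, so it can be thresholded or ranked for AUC in the same way as a single network's output.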
The CNNs achieved good results on both tasks but appear to fall short on sensitivity, probably because the dataset is unbalanced, with few positive images, which makes it harder to generalize the visual features of the lesions.
-  C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” CoRR, vol. abs/1409.4842, 2014. [Online]. Available: http://arxiv.org/abs/1409.4842
-  A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.
-  C.-K. Shie, C.-H. Chuang, C.-N. Chou, M.-H. Wu, and E. Y. Chang, “Transfer representation learning for medical image analysis,” in Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE. IEEE, 2015, pp. 711–714.
-  Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” arXiv preprint arXiv:1408.5093, 2014.