Transfer Learning with Ensembles of Deep Neural Networks for Skin Cancer Classification in Imbalanced Data Sets

03/22/2021 · Aqsa Saeed Qureshi et al. · Helsingin yliopisto

Early diagnosis plays a key role in the prevention and treatment of skin cancer. Several machine learning techniques for accurate classification of skin cancer from medical images have been reported. Many of these techniques are based on pre-trained convolutional neural networks (CNNs), which enable training the models on limited amounts of training data. However, the classification accuracy of these models still tends to be severely limited by the scarcity of representative images from malignant tumours. We propose a novel ensemble-based CNN architecture in which multiple CNN models, some of which are pre-trained and some trained only on the data at hand, are combined together with patient information (metadata) using a meta-learner. The proposed approach improves the model's ability to handle scarce, imbalanced data. We demonstrate the benefits of the proposed technique using a dataset of 33126 dermoscopic images from 2000 patients. We evaluate the performance of the proposed technique in terms of the F1-measure, area under the ROC curve (AUC-ROC), and area under the PR curve (AUC-PR), and compare it with that of seven benchmark methods, including two recent CNN-based techniques. The proposed technique achieves superior performance in terms of all the evaluation metrics (F1-measure 0.53, AUC-PR 0.58, AUC-ROC 0.97).


1 Introduction

Skin cancer is caused by mutations in the DNA of skin cells, which cause abnormal multiplication of those cells (Armstrong and Kricker, 1995; Simões et al., 2015). Exposure of the skin to sunlight is the most common cause of skin cancer, but it can also develop on unexposed areas of skin (Saladi and Persaud, 2005). Besides UV radiation, other factors increase the risk: people with a fair complexion, a history of abnormal moles, a family history of moles, weak immunity, or exposure to certain substances are at higher risk of developing skin cancer (Diepgen and Mahler, 2002; Berlin et al., 2015). Common signs of skin cancer include a change in the size of a mole, an irregularly shaped lesion, a lesion with itching or pain, and brownish spots on the skin.

In the beginning, skin cancer develops in the epidermis (the outer layer of skin). For diagnosis, a specialist dermatologist inspects the lesion, performs a biopsy of a sample, and then declares it cancerous or non-cancerous. Unfortunately, the whole procedure is time-consuming, and the patient may meanwhile progress to a later stage of cancer. It has also been observed that even the best dermatologists diagnose cancer with only about 80% accuracy (Morton and Mackie, 1998), and skilful dermatologists are not available globally. The stage at which skin cancer is detected is important because most patients survive when the cancer is diagnosed early, whereas the survival rate at later stages is much lower (Bray et al., 2018). Since manual methods of skin cancer detection take time and are prone to error, machine learning based techniques that automatically detect skin cancer, and even the type of cancer, have been widely used in the literature (Murugan et al., 2019; Ballerini et al., 2013; Thomas et al., 2021; Lau and Al-Jumaily, 2009; Hosny et al., 2020; Dorj et al., 2018; Mahbod et al., 2019; Guo and Yang, 2018; Li and Shen, 2018; Hirano et al., 2020; Kassem et al., 2020; Szegedy et al., 2015; Saba et al., 2019; Lei et al., 2020). These techniques apply various pre-processing steps to input images generated with a dermatoscope. Murugan et al. (2019) compared the performance of k-nearest neighbour (KNN), random forest (RF), and support vector machine (SVM) classifiers on feature spaces extracted from segmented regions of the input images; the SVM performed best among these classifiers for skin cancer classification. Similarly,

Ballerini et al. (2013) used a KNN-based hierarchical approach for classifying five different types of skin lesions, and Thomas et al. (2021) used deep learning based methods for the classification and segmentation of skin cancer. For the early diagnosis of skin cancer, Lau and Al-Jumaily (2009) introduced a diagnostic system that first enhances the input images so that only the regions holding cancerous cells remain. Information from the enhanced regions is then passed to a multi-layer perceptron (MLP) and an auto-associative neural network (NN); the MLP performed better in terms of accuracy than the auto-associative NN.

Hosny et al. (2020) used transfer learning to classify skin cancer data into seven classes. In their technique, a pre-trained AlexNet is used in which all layers except the last three are initialized from the pre-trained network, whereas the weights of the last layers are initialized randomly.

Similarly, Dorj et al. (2018) used a pre-trained AlexNet for feature extraction, after which an ECOC-SVM is used to classify the images.

Guo and Yang (2018) utilized a multichannel ResNet technique for the analysis of a skin lesion dataset. Similarly, Li and Shen (2018) used a deep learning strategy based on two fully convolutional residual networks for the segmentation and classification of skin lesions, with a convolutional neural network (CNN) used to extract features from the dataset. The dataset used in their technique was highly imbalanced; therefore, data augmentation was performed prior to segmentation and classification.

Hirano et al. (2020) suggested a transfer learning based technique in which hyperspectral data are used for the detection of melanoma; in their technique, a pre-trained GoogLeNet is fine-tuned. Similarly, Kassem et al. (2020) used a pre-trained GoogLeNet (Szegedy et al., 2015) for classifying eight distinct types of skin lesions. Saba et al. (2019) designed a three-stage framework for skin cancer detection: in the first stage, contrast enhancement is performed on the input images; in the second, a color CNN approach extracts the boundary of the skin lesion; and in the third, an Inception V3 model extracts features, from which the best subset is selected and provided to an MLP. Mahbod et al. (2019) proposed an ensemble-based hybrid technique in which pre-processing and data augmentation are first performed on the input data, after which three sets of features are extracted from the augmented data: the first and second feature sets come from the fully connected layers of a pre-trained AlexNet (Srivastava et al., 2014) and VGG16 (Simonyan and Zisserman, 2014), respectively, whereas the third comes from the last fully connected and convolutional layers of a pre-trained ResNet18 (He et al., 2016). Three different SVMs are then trained on the extracted feature spaces, and their predictions are provided to a simple logistic regression unit which performs the final classification. In another transfer learning based approach, Esteva et al. (2017) fine-tuned a pre-trained InceptionV3 network on a skin cancer classification dataset; during training, all images are resized to make them compatible with the input dimensions of the InceptionV3 architecture.

All the previously reported techniques use data augmentation to deal with the class imbalance problem, and most fine-tune pre-trained networks to improve the classification accuracy of cancerous vs. non-cancerous data. A disadvantage of using pre-trained networks is that they are trained on a generic set of images, which may lead to negative transfer when predicting cancerous images in the target domain. Moreover, all previously reported skin cancer classification techniques use only the images; no auxiliary patient information is used alongside them. The motivation behind the proposed technique is to exploit diversity at various levels during the training of the ensemble. The dataset used in the proposed technique is the International Skin Imaging Collaboration (ISIC) 2020 dataset, which is highly imbalanced (Rotemberg et al., 2021). In the proposed ensemble-based technique, the base-learners are first trained on input images of different dimensions; two of the six base-learners are CNNs pre-trained on a balanced skin cancer dataset that is not part of the ISIC 2020 dataset. In the second step, the predictions from all base-learners, along with the metadata provided with the input images, are given to an SVM, which finally classifies each image as cancerous or non-cancerous. The proposed technique thus explores diversity at several levels: first, instead of training a single classifier, different base-learners are trained in separate ways on different dimensions of the input data; second, the base-learner predictions together with the metadata provide diverse information that improves the ability of the SVM to distinguish between cancerous and non-cancerous images (Dietterich et al., 2002; Polikar, 2012).

2 Proposed technique for skin cancer classification

The proposed technique is an ensemble-based technique in which different CNNs are used as base-learners. After the base-learners are trained, their predictions, along with the metadata (which contains information related to the images), are provided to an SVM, which finally classifies each image in the dataset as malignant or benign. Figure 1 shows the flowchart of the proposed technique. Because the input images have different dimensions, all images are resized to a common dimension before each base-learner is trained. Six base-learners are used, and each is trained on input data of a different dimension. Four base-learners are trained from scratch on images taken from the ISIC 2020 dataset (Rotemberg et al., 2021), whereas the remaining two are trained on balanced malignant and benign skin cancer images that are not part of the ISIC 2020 dataset. After all six base-learners are trained, their predictions, together with the metadata, are provided to the SVM, which finally classifies the input image as malignant or benign.
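The stacking step described above can be sketched in a few lines; this is a minimal illustration only, and the function and feature names are assumptions rather than the paper's actual implementation:

```python
# Stacking step of the proposed ensemble (illustrative sketch):
# the six base-learner malignancy scores for an image are concatenated
# with its encoded metadata to form the meta-learner's input vector.

def build_meta_features(base_scores, metadata):
    """base_scores: six per-image predictions, one per base-learner;
    metadata: encoded patient features, e.g. [age, sex, anatomic site]."""
    return list(base_scores) + list(metadata)

# One image: six base-learner outputs plus hypothetical encoded metadata.
meta_x = build_meta_features([0.9, 0.8, 0.7, 0.85, 0.6, 0.75], [45.0, 1.0, 3.0])
```

The resulting vector is what the SVM meta-learner consumes in the second step.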

Figure 1: Block diagram of the proposed technique

2.1 Dataset

The dataset used in the proposed technique contains 33126 dermoscopic images collected from 2000 patients. All images are labelled as benign or malignant skin lesions using histopathology and expert opinion. The dataset was generated by ISIC, and the images were taken from the sources listed below (Rotemberg et al., 2021).

  • Hospital Clínic de Barcelona and Medical University of Vienna.

  • University of Athens Medical School.

  • Melanoma Institute Australia.

All images are in JPG format, with varying dimensions, and come with metadata. Figure 2 shows example images from the dataset, and Table 1 lists the metadata features recorded for each image. The ISIC 2020 dataset is highly imbalanced: of the 33126 images, only 584 are malignant.

Figure 2: Benign vs malignant images
Metadata
Patient ID
Age
Sex
General anatomic site
Table 1: Metadata

2.2 Data preprocessing

In the proposed technique, some pre-processing was performed on the images and on the metadata provided in the ISIC 2020 dataset. Because different base-learners take inputs of different dimensions, all input images in the dataset are resized to the input dimensions required by the respective base-learners. In the metadata, some feature values were missing; each missing value is replaced by the average value of that feature.
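The mean imputation of missing metadata values can be sketched in plain Python; this is a simplified stand-in for the paper's preprocessing, with `None` marking a missing entry:

```python
def impute_mean(values):
    """Replace missing entries (None) in a metadata column with the
    mean of the observed values, as described above."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

# Example: a column of patient ages with one missing value.
ages = [40.0, None, 50.0, 60.0]
print(impute_mean(ages))  # [40.0, 50.0, 50.0, 60.0]
```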

2.3 Division of dataset

In the proposed technique, hold-out cross-validation is used to divide the dataset: 10% of the total data (D) is kept as test data, and the remaining data is used to train and validate the parameters of the classifiers used in the proposed technique. The data is divided using a stratified group cross-validation (SGC) technique, in which the training, validation, and test sets have the same percentage of malignant and benign samples as the original distribution of the data, and no patient's data is repeated across the sets. Figure 3 shows the dataset division used in the proposed technique.
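The grouping constraint of the split (no patient shared between sets) can be illustrated with the following sketch; it performs a patient-level hold-out only, and the class-stratification part of SGC is omitted for brevity:

```python
import random

def group_holdout_split(patient_ids, test_frac=0.10, seed=0):
    """Hold out roughly test_frac of the data at the patient level so
    that no patient's images appear in both sets."""
    patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, round(test_frac * len(patients)))
    test_patients = set(patients[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in test_patients]
    test_idx = [i for i, p in enumerate(patient_ids) if p in test_patients]
    return train_idx, test_idx

# Toy patient IDs: images of the same patient share an ID.
ids = ["p1", "p1", "p2", "p3", "p3", "p4", "p5", "p6", "p7", "p8"]
train_idx, test_idx = group_holdout_split(ids, test_frac=0.2)
```

In practice a routine such as scikit-learn's StratifiedGroupKFold adds the stratification constraint on top of this grouping behaviour.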

Figure 3: Division of data set

2.4 Deep CNN as a base learner

A CNN can be considered a regularized version of an MLP and is mostly used to analyze image datasets (Srivastava et al., 2014). A CNN contains multiple hidden layers between the input and output layers. The distinct types of layers in a CNN are discussed below.

  • Convolutional layer

In a CNN, a convolution operation is used instead of general matrix multiplication. Each convolutional layer convolves the input from the earlier layer with a kernel to generate a feature map as output. Convolution is a linear operation in which the kernel weights are multiplied element-wise with a specific region (the receptive field) of the input and summed to produce a single output value. The kernel is slid across the image, generating a two-dimensional feature map; because multiple kernels are convolved over the whole image, each convolutional layer produces multiple feature maps. Equation 1 gives the mathematical description of the convolution operation.

R(X, Y) = Σ_{i=1}^{x} Σ_{j=1}^{y} K(i, j) · I(X + i − 1, Y + j − 1)    (1)

In the above equation, K, I, and R denote the kernel, the input, and the output of the convolutional layer, respectively. The kernel size is x×y, while X and Y denote the starting position of the receptive field of the image over which the kernel is applied.
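Equation 1 can be checked numerically with a plain-Python sketch of the "valid" convolution (no padding, stride 1); the function name is illustrative:

```python
def conv2d_valid(image, kernel):
    """Slide an x-by-y kernel over the image ('valid' positions only)
    and sum the element-wise products, as in equation (1)."""
    x, y = len(kernel), len(kernel[0])
    H, W = len(image), len(image[0])
    out = []
    for X in range(H - x + 1):        # top-left row of the receptive field
        row = []
        for Y in range(W - y + 1):    # top-left column of the receptive field
            s = sum(kernel[i][j] * image[X + i][Y + j]
                    for i in range(x) for j in range(y))
            row.append(s)
        out.append(row)
    return out

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
print(conv2d_valid(image, kernel))  # [[6, 8], [12, 14]]
```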

  • Pooling layer

A pooling layer is added to streamline the results and to reduce the dimensionality of the input data, by either averaging or taking the maximum over groups of neurons from the earlier layer. The pooling layer thus reduces the number of parameters in the CNN and makes it more invariant to small changes in the input.
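A minimal sketch of non-overlapping max pooling (window and stride of 2, the setting used throughout the base-learner tables) illustrates the dimensionality reduction:

```python
def max_pool2d(fmap, size=2, stride=2):
    """Non-overlapping max pooling: each output value is the maximum
    of a size-by-size window of the input feature map."""
    H, W = len(fmap), len(fmap[0])
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, W - size + 1, stride)]
            for r in range(0, H - size + 1, stride)]

fmap = [[1, 3, 2, 4],
        [5, 6, 7, 8],
        [9, 2, 1, 0],
        [3, 4, 5, 6]]
print(max_pool2d(fmap))  # [[6, 8], [9, 6]]
```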

  • Fully connected layer

A fully connected layer connects each neuron in one layer to every neuron in the next layer, just as in an MLP.

In the proposed technique, six different CNNs are used as base-learners. Four of the six are trained from scratch on the ISIC 2020 dataset, while two are pre-trained on a balanced dataset of benign and malignant skin moles collected from the ISIC Archive. Since CNNs are well suited to image data, the CNN-based base-learners extract features from the input images and aid the meta-learner in classifying each input image as malignant or benign.

2.4.1 Transfer learning during training of base learners

Transfer learning is used to transfer knowledge extracted from one machine learning problem to another (Torrey and Shavlik, 2010). The domain from which information is extracted is known as the source domain, whereas the domain where the extracted information is applied is called the target domain. The benefit of transfer learning is that it not only saves the time needed to train a network from scratch but also helps improve performance in the target domain. There are diverse types of transfer learning approaches, each depending on the way knowledge is transferred from the source domain to the target domain (Pan and Yang, 2009).

In the proposed technique, the idea of transfer learning is exploited during the training phase of the base-learners. In machine learning, class imbalance is often handled by down-sampling, in which the majority class is reduced to balance the dataset for training; the disadvantage is that valuable information may be lost. Another way to balance the dataset is to augment the minority-class samples, but augmented samples carry only limited information and are not as precise as real data. Therefore, instead of losing any information, all base-learners in the proposed technique are trained on real data samples. Because the ISIC 2020 dataset is highly imbalanced, the pre-trained base-learners are trained on a balanced dataset collected from the ISIC Archive, whereas the rest of the base-learners are trained on the ISIC 2020 dataset from scratch. All base-learners are trained differently, and the features extracted by the pre-trained base-learners and by those trained from scratch on the ISIC 2020 dataset provide a diverse feature space to the meta-classifier.

2.5 SVM as a meta classifier

In the proposed technique, an SVM is used as the meta-learner. After the base-learners are trained, their predicted values, along with the metadata, are fed as input to the SVM, which then predicts the label of the input data as malignant or benign. Because the SVM receives diverse information from the different base-learners along with this side information, it generalizes well.

3 Experimental details

All computations were performed on the Puhti supercomputer, an Atos BullSequana X400 cluster of Intel CPUs comprising roughly 700 nodes with a range of storage and memory options. The deep learning models were implemented with the Keras framework, using Python 3 as the programming language.

3.1 Parameters of deep CNN used as base-learner in the proposed technique

Tables 2–7 show the parameters set during the training phase of the base-learners. All parameters were selected on the basis of the performance of the trained base-learners on the validation data.

Layer Type of layer Kernel size Feature maps and neurons Stride Activation function
0 Input
1 Convolutional [1 ,1] Relu
2 Max pooling [2 ,2]
3 Dropout
4 Convolutional [1 ,1] Relu
5 Max Pooling [2 ,2]
6 Convolutional [1 ,1] Relu
7 Max pooling [2 ,2]
8 Convolutional [1 ,1] Relu
9 Max Pooling [2 ,2]
10 Fully connected 512 Sigmoid
11 Dropout 512
12 Fully connected 1 Sigmoid
Table 2: Parameter setting of
Layer Type of layer Kernel size Feature maps and neurons Stride Activation function
0 Input
1 Convolutional [1 ,1] Relu
2 Max pooling [2 ,2]
3 Dropout
4 Convolutional [1 ,1] Relu
5 Max Pooling [2 ,2]
6 Convolutional [1 ,1] Relu
7 Max pooling [2 ,2]
8 Convolutional [1 ,1] Relu
9 Max Pooling [2 ,2]
10 Fully connected 512 Sigmoid
11 Dropout 512
12 Fully connected 1 Sigmoid
Table 3: Parameter setting of
Layer Type of layer Kernel size No of feature maps and neurons Stride Activation function
0 Input
1 Convolutional [1 ,1] Relu
2 Max pooling [2 ,2]
3 Dropout
4 Convolutional [1 ,1] Relu
5 Max Pooling [2 ,2]
6 Convolutional [1 ,1] Relu
7 Max pooling [2 ,2]
8 Convolutional [1 ,1] Relu
9 Max Pooling [2 ,2]
10 Fully connected 512 Sigmoid
11 Dropout 512
12 Fully connected 1 Sigmoid
Table 4: Parameter setting of
Layer Type of layer Kernel size No of feature maps and neurons Stride Activation function
0 Input
1 Convolutional [1 ,1] Relu
2 Max pooling [2 ,2]
3 Dropout
4 Convolutional [1 ,1] Relu
5 Max Pooling [2 ,2]
6 Convolutional [1 ,1] Relu
7 Max pooling [2 ,2]
8 Convolutional [1 ,1] Relu
9 Max Pooling [2 ,2]
10 Fully connected 512 Sigmoid
11 Dropout 512
12 Fully connected 100 Sigmoid
13 Fully connected 1 Sigmoid
Table 5: Parameter setting of
Layer Type of layer Kernel size No of feature maps and neurons Stride Activation function
0 Input
1 Convolutional [1 ,1] Relu
2 Max pooling [2 ,2]
3 Dropout
4 Convolutional [1 ,1] Relu
5 Max Pooling [2 ,2]
6 Convolutional [1 ,1] Relu
7 Max pooling [2 ,2]
8 Convolutional [1 ,1] Relu
9 Max Pooling [2 ,2]
10 Fully connected 1024 Sigmoid
11 Dropout 1024
12 Fully connected 1 Sigmoid
Table 6: Parameter setting of pre-trained
Layer Type of layer Kernel size No of feature maps and neurons Stride Activation function
0 Input
1 Convolutional [1 ,1] Relu
2 Max pooling [2 ,2]
3 Dropout
4 Convolutional [1 ,1] Relu
5 Max Pooling [2 ,2]
6 Convolutional [1 ,1] Relu
7 Max pooling [2 ,2]
8 Convolutional [1 ,1] Relu
9 Max Pooling [2 ,2]
10 Fully connected 700 Sigmoid
11 Dropout 700
12 Fully connected 1 Sigmoid
Table 7: Parameter setting of pre-trained

3.2 Parameters of SVM used as meta classifier in the proposed technique

In the proposed technique, the predictions from all base-learners, along with the metadata, are provided as input to the SVM, which finally classifies the data as malignant or benign. Table 8 shows the parameters of the SVM used during the training phase.

Kernel Degree of kernel C Gamma
Polynomial 2 0.001 0.0009
Table 8:

Parameters of the meta-learner (Support Vector Machine)
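Assuming a scikit-learn-style implementation (the paper names Keras for the CNNs but does not specify the SVM library), the Table 8 configuration could be instantiated as follows; the toy meta-features are purely illustrative:

```python
from sklearn.svm import SVC

# Meta-learner with the Table 8 settings: polynomial kernel of degree 2,
# C = 0.001, gamma = 0.0009.
meta_clf = SVC(kernel="poly", degree=2, C=0.001, gamma=0.0009)

# Toy stand-in for the real meta-features (in the paper, six base-learner
# predictions would be joined with the encoded metadata per image).
X_meta = [[0.9, 0.8, 45.0], [0.1, 0.2, 30.0], [0.8, 0.7, 60.0], [0.2, 0.1, 25.0]]
y = [1, 0, 1, 0]
meta_clf.fit(X_meta, y)
```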

3.3 Evaluation metrics

In the proposed technique, the F1-measure, area under the ROC curve (AUC-ROC), and area under the PR curve (AUC-PR) are used as evaluation measures. The F1-measure is the harmonic mean of precision and recall. AUC-ROC is the area under the curve of the true positive rate (TPR) against the false positive rate (FPR) over different values of the classification threshold, whereas AUC-PR is the area under the precision–recall curve over different threshold values.

Precision = TP / (TP + FP)    (2)

Recall = TPR = TP / P    (3)

F1 = 2 · Precision · Recall / (Precision + Recall)    (4)

FPR = FP / N    (5)

In the above equations, TP is the number of positive samples correctly predicted by the classifier, FP is the number of falsely predicted positive samples, P is the total number of positive samples, and N is the total number of negative samples.
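These measures can be computed directly from the confusion-matrix counts; the following small sketch is consistent with the definitions above (FN denotes false negatives, so TP + FN equals the total number of positives P):

```python
def f1_from_counts(tp, fp, fn):
    """F1 as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # equals TP / P, i.e. the TPR
    return 2 * precision * recall / (precision + recall)

def tpr_fpr(tp, fn, fp, n_negative):
    """True and false positive rates, the axes of the ROC curve."""
    return tp / (tp + fn), fp / n_negative

# Example: 40 true positives, 10 false positives, 20 false negatives.
f1 = f1_from_counts(tp=40, fp=10, fn=20)   # precision 0.8, recall 2/3
```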

4 Experimental Results

The proposed technique is an ensemble learning based approach in which the base-learners are trained first, after which the meta-classifier predicts the final class label from the information collected from the metadata and the base-learners. To validate the performance of the proposed technique, hold-out cross-validation is used. To assess the performance of simple baseline classifiers on the ISIC 2020 data, the same training, validation, and test split is used as in the proposed technique. Table 9 shows the performance of MLP, SVM, KNN, RF, and a deep auto-encoder (DAE) in terms of F1-measure, AUC-ROC, and AUC-PR over ten independent runs. The results show that the SVM performs best among the baseline classifiers.

Classifier F1-measure AUC-PR AUC-ROC
KNN
RF
Deep-AE
MLP
SVM
Table 9: Performance of base-line classifiers

4.1 Performance of base-learners and meta-classifier in terms of F1-measure, AUC-PR, and AUC-ROC

The proposed technique is comprised of two steps: in the first step the base-learners are trained, and in the second step the meta-learner is trained on top of the base-learners. It is therefore important to compare the performance of the base-learners and the meta-classifier. Table 10 shows the performance of the base-learners alongside the meta-classifier. Of the six base-learners, four are trained from scratch on the ISIC 2020 dataset, while the remaining two are pre-trained on skin cancer images that are not part of the ISIC 2020 dataset. The comparison shows that the meta-classifier improves on all the base-learners. The purpose of the base-learners is to provide a diverse set of features to the meta-learner, which is why the generalization capability of the meta-learner is better than that of the individual base-learners. Table 10 also shows that each base-learner differs in performance, and this difference of opinion diversifies the feature space provided to the meta-classifier, which therefore performs significantly better. Figures 4 and 5 show the AUC-ROC and AUC-PR over ten independent runs.

Performance F1-Measure AUC-PR AUC-ROC
Pre-trained
Pre-trained
SVM(meta-classifier)
Table 10: Performance comparison of base-learners and meta-classifier
Figure 4: AUC-ROC for ten independent runs
Figure 5: AUC-PR for ten independent runs

4.2 Significance of transfer learning based proposed technique

In the proposed technique, six different CNNs are used as base-learners. Four of the six are trained from scratch, whereas two are pre-trained on images collected from the ISIC Archive containing an equal number of malignant and benign images. Because the ISIC dataset is highly imbalanced, including CNNs pre-trained on a balanced dataset diversifies the set of predictions provided by the base-learners. Table 11 compares the performance of the proposed technique when the pre-trained base-learners are included versus when all base-learners are trained from scratch only on the ISIC 2020 dataset. Table 11 shows that the pre-trained base-learners help improve the performance of the technique: they improve not only the F1-measure, AUC-ROC, and AUC-PR but also the standard deviation of the error over ten independent runs. Moreover, the pre-trained base-learners save the time that would be needed to train those base-learners from scratch on the ISIC 2020 dataset.

The proposed technique F1-measure AUC-PR AUC-ROC
Without transfer learning
With transfer learning
Table 11: Performance comparison of the proposed technique with or without pre-trained base-learners

4.3 Comparison of the proposed technique with other commonly used classifiers

Table 12 shows the comparison of the proposed technique with different baseline classifiers and state-of-the-art techniques. The results show that the proposed technique is better in terms of all the evaluation measures. To check the stability of all the techniques mentioned in Table 12, the results are reported as the mean and standard deviation of the error over ten independent runs. The proposed technique achieved an F1-measure, AUC-PR, and AUC-ROC of 0.5283, 0.5770, and 0.9708, respectively, which are the highest among all methods reported in Table 12. Figures 6, 7, and 8 show the graphical comparison of the proposed technique with all the other reported methods in terms of F1-measure, AUC-ROC, and AUC-PR, respectively.

Technique F1-measure AUC-PR AUC-ROC
KNN
RF
Deep-AE
MLP
SVM
Esteva et al. (2017)
Mahbod et al. (2019)
The proposed technique
Table 12: Comparison of the proposed technique with other methods
Figure 6: Comparison of F1-measure over ten independent runs
Figure 7: Comparison of AUC-ROC over ten independent runs
Figure 8: Comparison of AUC-PR over ten independent runs

5 Conclusion

For accurate skin cancer classification, an ensemble learning approach is proposed. In the first step, the idea of transfer learning is exploited during the training of the base-learners, which not only saves time but also diversifies the feature space they generate. In the second step, all the extracted features, along with the metadata, are provided as input to an SVM, which finally classifies the input data. The ISIC 2020 dataset used in the proposed technique is highly imbalanced. Instead of relying on a single classifier, which on an imbalanced dataset becomes biased toward the majority class, the proposed technique tackles this problem using the concept of ensemble learning.

In the proposed technique, two base-learners are pre-trained on balanced skin cancer data collected from the ISIC Archive (these images are not part of the ISIC 2020 dataset), while the rest of the base-learners are trained from scratch on the imbalanced ISIC 2020 dataset. Most often, data augmentation or down-sampling is used for balancing data. Data augmentation generates artificial samples that carry limited information and are not as precise as real data, whereas down-sampling discards information that the original data contains. In contrast, in the proposed technique there is no loss of information: the base-learners trained from scratch on ISIC 2020 data extract useful information, and this knowledge is further boosted by the information extracted from the pre-trained base-learners. All base-learners therefore generate different hypothesis spaces, which helps the meta-learner (SVM) make a robust decision for skin cancer classification.

Acknowledgement

This work was supported by the Academy of Finland (Project TensorML #311277 and the FCAI Flagship).

References

  • B. K. Armstrong and A. Kricker (1995) Skin cancer. Dermatologic Clinics 13 (3), pp. 583–594. Cited by: §1.
  • L. Ballerini, R. B. Fisher, B. Aldridge, and J. Rees (2013) A color and texture based hierarchical k-nn approach to the classification of non-melanoma skin lesions. In Color Medical Image Analysis, pp. 63–86. External Links: Document Cited by: §1.
  • N. L. Berlin, B. Cartmel, D. J. Leffell, A. E. Bale, S. T. Mayne, and L. M. Ferrucci (2015) Family history of skin cancer is associated with early-onset basal cell carcinoma independent of mc1r genotype. Cancer epidemiology 39 (6), pp. 1078–1083. External Links: Document Cited by: §1.
  • F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre, and A. Jemal (2018) Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: A Cancer Journal for Clinicians 68 (6), pp. 394–424. External Links: Document Cited by: §1.
  • T. L. Diepgen and V. Mahler (2002) The epidemiology of skin cancer. Br J Dermatol 146 (Suppl 61), pp. 1–6. External Links: Document Cited by: §1.
  • T. G. Dietterich et al. (2002) Ensemble learning. The handbook of brain theory and neural networks 2, pp. 110–125. Cited by: §1.
  • U. Dorj, K. Lee, J. Choi, and M. Lee (2018) The skin cancer classification using deep convolutional neural network. Multimedia Tools and Applications 77 (8), pp. 9909–9924. External Links: Document Cited by: §1, §1.
  • A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun (2017) Dermatologist-level classification of skin cancer with deep neural networks. nature 542 (7639), pp. 115–118. External Links: Document Cited by: §1, Table 12.
  • S. Guo and Z. Yang (2018) Multi-channel-resnet: an integration framework towards skin lesion analysis. Informatics in Medicine Unlocked 12, pp. 67–74. External Links: Document Cited by: §1, §1.
  • K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. External Links: Document Cited by: §1.
  • G. Hirano, M. Nemoto, Y. Kimura, Y. Kiyohara, H. Koga, N. Yamazaki, G. Christensen, C. Ingvar, K. Nielsen, A. Nakamura, et al. (2020) Automatic diagnosis of melanoma using hyperspectral data and googlenet. Skin Research and Technology 26 (6), pp. 891–897. External Links: Document Cited by: §1, §1.
  • K. M. Hosny, M. A. Kassem, and M. M. Fouad (2020) Classification of skin lesions into seven classes using transfer learning with alexnet. Journal of Digital Imaging 33 (5), pp. 1325–1334. External Links: Document Cited by: §1.
  • M. A. Kassem, K. M. Hosny, and M. M. Fouad (2020) Skin lesions classification into eight classes for isic 2019 using deep convolutional neural network and transfer learning. IEEE Access 8, pp. 114822–114832. External Links: Document Cited by: §1, §1.
  • H. T. Lau and A. Al-Jumaily (2009) Automatically early detection of skin cancer: study based on nueral netwok classification. In 2009 International Conference of Soft Computing and Pattern Recognition, pp. 375–380. External Links: Document Cited by: §1.
  • B. Lei, Z. Xia, F. Jiang, X. Jiang, Z. Ge, Y. Xu, J. Qin, S. Chen, T. Wang, and S. Wang (2020) Skin lesion segmentation via generative adversarial networks with dual discriminators. Medical Image Analysis 64, pp. 101716. External Links: Document Cited by: §1.
  • Y. Li and L. Shen (2018) Skin lesion analysis towards melanoma detection using deep learning network. Sensors 18 (2), pp. 556. External Links: Document Cited by: §1, §1.
  • A. Mahbod, G. Schaefer, C. Wang, R. Ecker, and I. Ellinge (2019) Skin lesion classification using hybrid deep neural networks. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1229–1233. External Links: Document Cited by: §1, §1, Table 12.
  • C. Morton and R. Mackie (1998) Clinical accuracy of the diagnosis of cutaneous malignant melanoma.. The British journal of dermatology 138 (2), pp. 283–287. External Links: Document Cited by: §1.
  • A. Murugan, S. A. H. Nair, and K. S. Kumar (2019) Detection of skin cancer using svm, random forest and knn classifiers. Journal of medical systems 43 (8), pp. 1–9. External Links: Document Cited by: §1.
  • S. J. Pan and Q. Yang (2009) A survey on transfer learning. IEEE Transactions on knowledge and data engineering 22 (10), pp. 1345–1359. External Links: Document Cited by: §2.4.1.
  • R. Polikar (2012) Ensemble learning. In Ensemble machine learning, pp. 1–34. External Links: Document Cited by: §1.
  • V. Rotemberg, N. Kurtansky, B. Betz-Stablein, L. Caffery, E. Chousakos, N. Codella, M. Combalia, S. Dusza, P. Guitera, D. Gutman, et al. (2021) A patient-centric dataset of images and metadata for identifying melanomas using clinical context. Scientific Data 8 (1), pp. 1–8. External Links: Document Cited by: §1, §2.1, §2.
  • T. Saba, M. A. Khan, A. Rehman, and S. L. Marie-Sainte (2019) Region extraction and classification of skin cancer: a heterogeneous framework of deep cnn features fusion and reduction. Journal of medical systems 43 (9), pp. 1–19. External Links: Document Cited by: §1, §1.
  • R. N. Saladi and A. N. Persaud (2005) The causes of skin cancer: a comprehensive review. Drugs of Today 41 (1), pp. 37–54. External Links: Document Cited by: §1.
  • M. F. Simões, J. S. Sousa, and A. C. Pais (2015) Skin cancer and new treatment perspectives: a review. Cancer letters 357 (1), pp. 8–42. External Links: Document Cited by: §1.
  • K. Simonyan and A. Zisserman (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556. Cited by: §1.
  • N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov (2014) Dropout: a simple way to prevent neural networks from overfitting. The journal of machine learning research 15 (1), pp. 1929–1958. Cited by: §1, §2.4.
  • C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich (2015) Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1–9. External Links: Document Cited by: §1, §1.
  • S. M. Thomas, J. G. Lefevre, G. Baxter, and N. A. Hamilton (2021) Interpretable deep learning systems for multi-class segmentation and classification of non-melanoma skin cancer. Medical Image Analysis 68, pp. 101915. External Links: Document Cited by: §1.
  • L. Torrey and J. Shavlik (2010) Transfer learning. In Handbook of research on machine learning applications and trends: algorithms, methods, and techniques, pp. 242–264. External Links: Document Cited by: §2.4.1.