Deep Transfer Learning Methods for Colon Cancer Classification in Confocal Laser Microscopy Images

05/20/2019 ∙ by Nils Gessert, et al.

Purpose: The gold standard for colorectal cancer metastases detection in the peritoneum is histological evaluation of a removed tissue sample. For feedback during interventions, real-time in-vivo imaging with confocal laser microscopy has been proposed for differentiation of benign and malignant tissue by manual expert evaluation. Automatic image classification could improve the surgical workflow further by providing immediate feedback. Methods: We analyze the feasibility of classifying tissue from confocal laser microscopy in the colon and peritoneum. For this purpose, we adopt both classical and state-of-the-art convolutional neural networks to directly learn from the images. As the available dataset is small, we investigate several transfer learning strategies including partial freezing variants and full fine-tuning. We address the distinction of different tissue types, as well as benign and malignant tissue. Results: We present a thorough analysis of transfer learning strategies for colorectal cancer with confocal laser microscopy. In the peritoneum, metastases are classified with an AUC of 97.1 and in the colon, the primarius is classified with an AUC of 73.1. In general, transfer learning substantially improves performance over training from scratch. We find that the optimal transfer learning strategy differs for models and classification tasks. Conclusions: We demonstrate that convolutional neural networks and transfer learning can be used to identify cancer tissue with confocal laser microscopy. We show that there is no generally optimal transfer learning strategy and model as well as task-specific engineering is required. Given the high performance for the peritoneum, even with a small dataset, application for intraoperative decision support could be feasible.




1 Introduction

Colorectal cancer is very common and it is often associated with metastatic spread torre2015global . In particular, peritoneal carcinomatosis (PC) can arise in later stages of development which often shortens patient survival times substantially verwaal2005long ; franko2012treatment . Thus, early and reliable detection of metastases is crucial. Diagnosis with typical external imaging techniques such as computed tomography (CT) and magnetic resonance imaging (MRI) is difficult for PC as a very high resolution is required. For example, preoperative CT has been shown to be ineffective to detect individual peritoneal tumor deposits and the interobserver variability among experts was significant de2004peritoneal . Also, integrated PET/CT did not provide sufficient information for accurate assessment dromain2008staging . For MRI, studies have shown improvement over assessment with CT only low2000extrahepatic ; iafrate2012peritoneal but overall, its resolution is still a limitation gonzalez2009imaging . Therefore, exploratory laparoscopy is generally employed to investigate the presence of PC ishigami2014clinical .

Recently, a new intraoperative device using confocal laser microscopy (CLM) has been introduced which provides submicrometer image resolution ellebrecht2018confocal . In the study, ten rats received colon carcinoma cell implants in the colon and peritoneum. After a growth period, laparotomy with in-vivo CLM was performed. CLM images of healthy and malignant colon tissue, as well as healthy and malignant peritoneum, were acquired. It was shown that experts are able to distinguish different tissue types as well as healthy and malignant tissue from CLM. This raises the question of whether image processing techniques can be used to automatically classify different tissue types. This could enable faster and improved intraoperative decision support with CLM.

Recently, automatic tissue characterization has been successfully addressed using deep learning methods such as convolutional neural networks (CNNs) for semantic segmentation and classification Litjens.2017 ; goceri2017deep . For example, skin cancer classification at dermatologist-level performance was achieved esteva2017dermatologist . However, the datasets for this and related studies are large, whereas datasets for medical learning tasks are commonly small shen2017deep . This can be problematic as insufficient data for optimal training might lead to overfitting and limited generalization. This is particularly important for deep learning models which can be prone to overfitting due to their large number of trainable parameters. To overcome this issue, transfer learning methods have been proposed where a deep learning model is first pretrained on a different, large dataset bengio2012deep . Then, information from the source domain can be transferred to the (medical) target domain using strategies such as "off-the-shelf" features, partial layer freezing, or full fine-tuning hoo2016deep . While this has been successfully applied for medical learning tasks gessert2018automatic , there is no single solution for all problems and the optimal transfer learning strategy is highly dependent on the imaging modality and dataset size tajbakhsh2016convolutional .

Automatic analysis of CLM images has been proposed for different tissue types such as human skin rajadhyaksha1995vivo , the cornea niederer2007age or the oral cavity aubreville2017automatic . Recently, deep learning methods have been applied to CLM and similar modalities. For example, CNNs have been used for oral squamous cell carcinoma classification aubreville2017automatic and motion artifact detection with CLM aubreville2018deep . Similarly, skin images from CLM have been used with CNN-based classification wiltgen2016automatic . For the gastrointestinal tract, CNNs have been used to distinguish three classes of Barrett's esophagus hong2017convolutional . Also, brain tumor classification with CNNs and CLM has shown promising results Izadyyazdanabadi.2018 . For example, a CNN has been used to differentiate CLM images with and without diagnostic value for a physician during surgery izadyyazdanabadi2017improving . Also, weakly-supervised localization has been used to derive local information in CLM images from image-level labels only izadyyazdanabadi2018weakly .

So far, deep learning-based classification of colorectal cancer from CLM images has not been addressed. Also, while several approaches have used CLM and CNNs for other problems izadyyazdanabadi2018convolutional , there is no analysis of transfer learning properties for colorectal cancer with CLM. Therefore, we study deep learning-based colon cancer classification from CLM images with a variety of transfer learning methods from the ImageNet dataset. We consider training from scratch, partial layer freezing, "off-the-shelf" features and full fine-tuning to investigate how transferable ImageNet features are to CLM. We perform this study with the classic models VGG-16 simonyan2014very and Inception-V3 szegedy2016rethinking as well as the state-of-the-art architectures Densenet Huang.2016 and squeeze-and-excitation networks hu2018squeeze to analyze the consistency of transfer strategies across architectures. We consider the classes healthy colon (HC), malignant colon (MC), healthy peritoneum (HP) and malignant peritoneum (MP). Based on these classes, we address three binary classification tasks with CLM. First, we consider the differentiation of organs (HP vs. HC). Then, we study the detection of malignant tissue in two types of organs (HP vs. MP and HC vs. MC). This allows us to study variations across different classification tasks for CLM.

A preliminary version of this paper was presented at the BVM Workshop 2019 gessert2019colon . We substantially revised the paper, extended the review of the literature and performed more experiments with additional transfer strategies and more architectures. This paper is structured as follows. First, we describe our models, our transfer learning strategies and the dataset we use in Section 2. Then we report our results in Section 3 and discuss them in Section 4. Last, we conclude in Section 5.

2 Methods

2.1 Model Architectures and Training Strategies

Figure 1: The building blocks of the CNN architectures we use, as indicated in Figure 2. We employ VGG-16 (a), Inception-V3 (b), Densenet121 (c) and SE-Resnext50 (d). The symbol shown in each block denotes the number of feature maps in that block. The Conv blocks also contain ReLU activations and batch normalization for VGG-16 (a). SE-Resnext50 is shown in simplified form without its bottleneck in the SE module. FC denotes a fully-connected layer with sigmoid activation. C. is an abbreviation for convolutional layers. Note that Inception-V3 employs multiple block variants and we show one example.

First, we consider the classic model VGG-16 simonyan2014very with the addition of batch normalization which enables faster training of the architecture by reducing the internal covariate shift Ioffe.2015 . The model itself is simple as it consists of several stacked convolutional layers without further augmentation. In between blocks of two to three convolutional layers, max pooling reduces the spatial dimensions. Subsequent convolutions double the number of feature maps. A building block of the architecture is shown in Figure 1 (top left). Due to its simple structure, the architecture can serve as a baseline.

Second, we employ Inception-V3 szegedy2016rethinking . The model consists of multiple Inception blocks which follow two core design principles. First, the blocks have a multi-path structure, i.e., the input feature maps are processed in parallel by different convolution and pooling operations. At the block's output, the feature maps from all paths are concatenated. Second, the convolutional paths perform a reduction operation that downsizes the feature map dimension with 1×1 kernels. Then, computationally more expensive convolutions with larger kernels process the lower dimensional representations. The output feature map size is increased if the spatial dimensions are reduced inside the block, which avoids representational limitations.
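To make the benefit of the reduction step concrete, the following sketch counts convolution weights with and without a preceding 1×1 reduction. The channel counts and the 5×5 kernel are illustrative assumptions and are not the values used in Inception-V3.

```python
def conv_weights(c_in: int, c_out: int, k: int) -> int:
    """Number of weights in a k x k convolution (bias terms ignored)."""
    return c_in * c_out * k * k


def direct_path(c_in: int, c_out: int, k: int) -> int:
    # A single expensive k x k convolution applied to all input channels.
    return conv_weights(c_in, c_out, k)


def reduced_path(c_in: int, c_mid: int, c_out: int, k: int) -> int:
    # 1 x 1 reduction to c_mid channels, followed by the k x k convolution.
    return conv_weights(c_in, c_mid, 1) + conv_weights(c_mid, c_out, k)


# Hypothetical channel counts, chosen only for illustration.
direct = direct_path(256, 256, 5)
reduced = reduced_path(256, 64, 256, 5)
print(direct, reduced)
```

With these assumed sizes, the reduced path needs roughly a quarter of the weights of the direct path, which is the motivation for processing lower dimensional representations with the larger kernels.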

The idea of reduction and expansion has also found its way into the Resnet architecture He.2016 which is a core component of the next two models. Resnets learn a residual instead of a full feature transformation by using skip connections. In detail, a Resnet block (ResBlock) computes

$$y = \sigma\big(\mathcal{F}(x, W) + x\big)$$

where $y$ is the block output, $x$ is the block input, $\sigma$ is a ReLU activation nair2010rectified and $\mathcal{F}$ represents two convolutional layers with parameters $W$. The skip connection enables better gradient propagation for improved training.
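The residual computation described above can be sketched on plain vectors. This is a minimal illustration only: the function `toy_f` is a hypothetical stand-in for the two convolutional layers and is not part of the original models.

```python
from typing import Callable, List

Vector = List[float]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def res_block(x: Vector, f: Callable[[Vector], Vector]) -> Vector:
    """Residual block: ReLU applied to f(x) plus the skip connection x."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("f must preserve the input size for the skip connection")
    return relu([a + b for a, b in zip(fx, x)])


# Toy stand-in for the two convolutional layers (an assumption for illustration).
def toy_f(x: Vector) -> Vector:
    return [-2.0 * v for v in x]


print(res_block([1.0, -3.0, 0.5], toy_f))
```

If the learned transformation outputs zeros, the block reduces to a ReLU of its input, which illustrates why the block only has to learn a residual.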

Third, we consider Densenet121 Huang.2016 , a state-of-the-art architecture which strives for more efficiency by introducing extensive feature reuse. In particular, within one DenseBlock, features computed in previous layers are also fed into the subsequent layers. To keep the feature map sizes moderate, compression blocks reduce the feature maps between DenseBlocks. The DenseBlock is shown in Figure 1 (bottom left).
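As a rough sketch of the feature reuse described above, the following code tracks how the number of input feature maps grows inside a DenseBlock and how a compression step reduces it afterwards. The initial channel count, growth rate, number of layers and compression factor are illustrative assumptions.

```python
from typing import List


def dense_block_inputs(c_in: int, growth: int, num_layers: int) -> List[int]:
    """Input channels seen by each layer when all earlier outputs are concatenated."""
    return [c_in + i * growth for i in range(num_layers)]


def dense_block_output(c_in: int, growth: int, num_layers: int) -> int:
    """Channels leaving the block: the input plus the output of every layer."""
    return c_in + num_layers * growth


def compress(channels: int, factor: float) -> int:
    """Compression between DenseBlocks reduces the number of feature maps."""
    return int(channels * factor)


# Hypothetical configuration, chosen only for illustration.
inputs = dense_block_inputs(64, 32, 6)
out = dense_block_output(64, 32, 6)
print(inputs, out, compress(out, 0.5))
```

Each layer only adds a small number of new feature maps, while the concatenation makes all earlier features available to later layers.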

Fourth, we adopt the architecture SE-Resnext50 hu2018squeeze . At its core, the model uses Resnext blocks Xie.2017 which are an extension of Resnet. Here, the single convolutional path is split into multiple paths with individual layers which increases representational power. The key addition in SE-Resnext50 is the use of squeeze-and-excitation (SE) modules which recalibrate the feature maps learned by Resnext blocks. These modules have shown improved performance with only a minimal increase in the number of parameters. The concept is shown in Figure 1 (bottom right).
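The recalibration performed by an SE module can be sketched in plain Python. This follows the simplified form without a bottleneck, i.e., a single fully-connected layer with sigmoid activation; the weights and feature maps below are hypothetical values for illustration.

```python
import math
from typing import List

FeatureMaps = List[List[List[float]]]  # channels x height x width


def squeeze(maps: FeatureMaps) -> List[float]:
    """Global average pooling: one scalar per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in maps]


def excite(z: List[float], weights: List[List[float]]) -> List[float]:
    """Single fully-connected layer with sigmoid activation (bottleneck omitted)."""
    return [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, z))))
            for row in weights]


def se_recalibrate(maps: FeatureMaps, weights: List[List[float]]) -> FeatureMaps:
    """Scale every channel by its gate value in (0, 1)."""
    gates = excite(squeeze(maps), weights)
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, maps)]


maps = [[[1.0, 3.0]], [[2.0, 2.0]]]   # two channels of size 1 x 2
weights = [[0.0, 0.0], [0.0, 0.0]]    # hypothetical weights: every gate is 0.5
print(se_recalibrate(maps, weights))
```

The only additional parameters are those of the fully-connected layer, which is consistent with the small parameter overhead of SE modules.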

Figure 2: The different transfer learning scenarios we investigate. Model Block refers to one of the blocks shown in Figure 1. Green indicates that blocks are retrained. Red indicates that blocks are frozen with their weights having been trained on ImageNet.

Due to the small dataset size, we study several transfer learning strategies where the above-mentioned models are pretrained on ImageNet. We cut off the last layer of each model and replace it with a fully-connected layer with two outputs for binary classification. We apply a softmax layer on top and the final classification output is the class with the highest probability. We train a separate model for each of our binary classification tasks.
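The decision rule of the two-output classifier can be sketched as follows. The logits are hypothetical example values; in the models they would be the outputs of the new fully-connected layer.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits: List[float]) -> int:
    """Return the index of the class with the highest probability."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


print(softmax([2.0, -1.0]), predict([2.0, -1.0]))
```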

As a baseline, we consider training from scratch, i.e., all weights are randomly initialized. Then, we use several different transfer learning strategies illustrated in Figure 2. The first transfer approach follows the "off-the-shelf" features idea. Here, only the new classifier is retrained on features extracted by the pretrained CNN. We also consider two partial freezing methods, where an initial part of the network remains frozen and the part closer to the classifier is retrained. We chose the freezing points block-wise, i.e., we do not cut into building blocks. Last, we consider full fine-tuning where all weights in the network are retrained with a small learning rate. The different strategies represent different abstractions of feature transfer between ImageNet and CLM images.
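The five training scenarios can be summarized as a choice of which blocks receive weight updates. The sketch below is a hypothetical helper, not the authors' implementation: the strategy names and the freezing points `freeze1_at` and `freeze2_at` are assumptions, since the exact freezing points are not stated in the text.

```python
from typing import List

STRATEGIES = ("scratch", "classifier", "freeze1", "freeze2", "finetune")


def trainable_blocks(num_blocks: int, strategy: str,
                     freeze1_at: int, freeze2_at: int) -> List[bool]:
    """Return one flag per block (input to output); True means the block is retrained.

    freeze1_at / freeze2_at give the number of leading blocks kept frozen.
    They are hypothetical parameters. The new classifier layer is always
    trained and is not part of the returned list.
    """
    if strategy in ("scratch", "finetune"):
        frozen = 0                  # every block is updated
    elif strategy == "classifier":
        frozen = num_blocks         # "off-the-shelf" features: all blocks frozen
    elif strategy == "freeze1":
        frozen = freeze1_at
    elif strategy == "freeze2":
        frozen = freeze2_at
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    if not 0 <= frozen <= num_blocks:
        raise ValueError("freezing point outside the network")
    return [i >= frozen for i in range(num_blocks)]


def pretrained_init(strategy: str) -> bool:
    """Only training from scratch starts from randomly initialized weights."""
    return strategy != "scratch"


for s in STRATEGIES:
    print(s, pretrained_init(s), trainable_blocks(4, s, freeze1_at=1, freeze2_at=3))
```

In a PyTorch model, such flags would typically be applied by disabling gradient computation for the parameters of the frozen blocks.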

To further improve generalization, we employ online data augmentation with random image flipping and random changes in brightness and contrast. Furthermore, we use random cropping, with crops taken from the full images; Inception-V3 uses a different crop size than the other models. We use the Adam algorithm for optimization. We adapt learning rates and the number of training epochs for the different transfer scenarios. We use a cross-entropy loss function with additional weighting to account for the slight class imbalance. In detail, we multiply the loss of a training example of class $i$ by

$$w_i = \frac{N}{N_i}$$

where $N$ is the total number of training examples in the current fold and $N_i$ is the number of examples belonging to class $i$ in the current fold. In this way, underrepresented classes receive a higher weighting in the loss function. During evaluation, we use multi-crop evaluation with crops spread evenly over the images. This ensures that all image regions are covered with large overlaps between crops. The final predictions are averaged over all crops. We implement our models in PyTorch.
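The loss weighting and the multi-crop evaluation can be sketched as follows. This is a minimal illustration, assuming that the weight of a class is the ratio of the total number of training examples to the number of examples in that class; the image size, crop size and number of crops in the example call are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def class_weights(labels: Sequence[int]) -> Dict[int, float]:
    """Assumed weight N / N_i for each class i, with N training examples in the fold."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / k for c, k in counts.items()}


def crop_offsets(image_size: int, crop_size: int, num_crops: int) -> List[int]:
    """Evenly spaced crop offsets along one axis, reaching both image borders."""
    if crop_size > image_size or num_crops < 1:
        raise ValueError("invalid crop configuration")
    if num_crops == 1:
        return [(image_size - crop_size) // 2]
    span = image_size - crop_size
    return [round(i * span / (num_crops - 1)) for i in range(num_crops)]


def average_predictions(per_crop: Sequence[Sequence[float]]) -> List[float]:
    """Average the class probabilities over all crops of one image."""
    n = len(per_crop)
    return [sum(p[c] for p in per_crop) / n for c in range(len(per_crop[0]))]


print(class_weights([0, 0, 0, 1]))
print(crop_offsets(100, 40, 3))
print(average_predictions([[0.2, 0.8], [0.4, 0.6]]))
```

With three crops of size 40 on an axis of length 100, neighbouring crops overlap by 10 pixels, so every image region is seen at least once.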

2.2 Dataset and Experiments

Figure 3: Examples for the different classes. Malignant colon tissue, healthy colon tissue, malignant peritoneum tissue and healthy peritoneum tissue are shown from left to right.

The dataset was collected in a previous study conducted at the University Hospital Schleswig-Holstein in Lübeck where expert assessment of CLM images in the colon area was evaluated ellebrecht2018confocal . A custom intraoperative device with integrated CLM (Karl Storz GmbH & Co KG, Tuttlingen, Germany) was built. The image resolution was pixels which covers a field of view of . In the study, ten rats received colon adenocarcinoma cell implantation in the colon and peritoneum with a growth time of seven days. Then, laparotomy was conducted and images of healthy colon tissue, malignant colon tissue, healthy peritoneum tissue and malignant peritoneum tissue were obtained. Example CLM images for each tissue type are shown in Figure 3. After removal of low quality images, 1577 images remained with 533 belonging to class HC, 309 belonging to class MC, 343 belonging to class HP and 392 belonging to class MP. Note that some subjects are missing classes such that, on average, six subjects per class remain. Ground-truth annotation of all images was obtained by tissue removal of the scanned areas and subsequent histological evaluation.

Due to the small dataset size, we chose a cross-validation scheme where images from one subject are left out for evaluation and training is performed on the remaining ones. Since, on average, six subjects per class remain, all reported results are the mean over, on average, six folds. Based on the four classes, we address three binary classification problems. First, we consider HC vs. HP, i.e., we investigate the feasibility of distinguishing the different organs in CLM. Then, we consider the differentiation of healthy and malignant tissue with the two binary classification problems HP vs. MP and HC vs. MC. We report the accuracy, sensitivity, specificity, F1-score and AUC. We use the AUC as the main metric as it is threshold-independent.
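The subject-wise cross-validation scheme can be sketched as follows. The subject identifiers are hypothetical; the point of the sketch is that all images of one subject are held out together, so no subject contributes to both training and evaluation within a fold.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_subject_out(
    subjects: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) for each fold.

    subjects[i] is the subject identifier of image i.
    """
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        test = [i for i, s in enumerate(subjects) if s == held_out]
        yield held_out, train, test


def mean_over_folds(fold_scores: Dict[str, float]) -> float:
    """Reported value: the mean of a metric over all folds."""
    return sum(fold_scores.values()) / len(fold_scores)


# Hypothetical subject identifiers for five images.
subjects = ["rat1", "rat1", "rat2", "rat3", "rat3"]
for held_out, train, test in leave_one_subject_out(subjects):
    print(held_out, train, test)
```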

3 Results

Figure 4: AUC values of all applied architectures for the different classification problems. We evaluate the following training types: (1) retrain classifier, (2) partial freeze 1, (3) partial freeze 2, (4) full fine-tuning, (5) training from scratch. For each value the standard deviation over multiple folds is represented by an error bar.

First, we compare the different transfer learning scenarios described in Section 2 across all architectures for each classification scenario, see Figure 4. In general, the AUC is very high for the differentiation of different healthy tissue types and healthy and malignant peritoneum tissue. The AUC for classifying malignant colon tissue is substantially lower. Also, the standard deviation is higher for this task. Training from scratch performs worst for all architectures and classification scenarios.

Regarding the transfer learning scenarios, only retraining the classifier shows substantially lower performance than the other transfer scenarios for two of the three classification tasks. There are no clear trends between the partial freezing and fine-tuning scenarios.

Figure 5: ROC curve for the different architectures and the different training types, shown for the classification of HP vs. MP.

Second, we go into more detail for the classification task HP vs. MP. Figure 5 shows the ROC curves for all models with all transfer learning scenarios for this task. Operating points with a good trade-off in the upper left corner vary for each model. For VGG-16, retraining the classifier only stands out. For Densenet121, partial freezing performs well. For Inception-V3 and SE-Resnext50, partial freezing and fine-tuning perform similarly.

Model      Type         Accuracy  Sensitivity  Specificity  F1-Score  AUC

HC vs. HP
Inception  Freeze 1     87.7      79.9         94.4         90.4      95.7
Densenet   Freeze 1     91.2      82.8         95.3         91.9      92.6
SE-RX50    Freeze 1     85.8      78.5         96.3         91.3      91.9
VGG-16     Freeze 1     82.5      74.9         91.8         87.2      91.6

HP vs. MP
Inception  Freeze 2     85.9      86.6         87.0         86.8      95.6
Densenet   Freeze 2     83.3      84.6         83.2         84.0      91.9
SE-RX50    Freeze 1     81.7      84.6         83.2         84.0      90.9
VGG-16     Classifier   88.0      91.0         84.6         87.9      97.1

HC vs. MC
Inception  Fine-Tuning  63.1      71.0         57.0         63.7      68.0
Densenet   Freeze 1     70.0      72.9         64.1         69.1      73.1
SE-RX50    Fine-Tuning  63.7      66.7         65.9         69.1      71.8
VGG-16     Freeze 2     63.5      67.6         64.2         68.1      72.0

Table 1: The best performing transfer learning method for each model and classification task. Densenet refers to the Densenet121 model, SE-RX50 refers to the SE-Resnext50 model. For each training scenario, the best performing configuration is marked bold. All values are given in percent. The sensitivity is given with respect to the cancer class and for the case of organ differentiation it is given with respect to the peritoneum class.
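The reported metrics can be computed from labels, hard predictions and scores as sketched below. This is a generic illustration, assuming that label 1 denotes the positive class (the cancer class, or the peritoneum class for organ differentiation) and that both classes are present; the example values are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and F1 with label 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Unlike the other four metrics, the AUC is computed from the scores alone and does not depend on a decision threshold.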

Third, an overview of the best performing transfer strategies is shown in Table 1. Comparing individual results for each architecture, no model clearly outperforms the others consistently. In general, Densenet121 performs slightly better across the tasks. The optimal transfer strategy differs across models and classification tasks. For HC vs. HP and for Densenet121 in general, the partial freezing method performs best.

Figure 6: Training times for 90 epochs of all applied architectures for the different training scenarios for the classification task HC vs. HP. Note that training from scratch updates the same number of parameters as full fine-tuning. Thus, training times are equivalent for the two cases.

Last, we provide training times for all architectures and training scenarios, see Figure 6. In general, freezing more weights during training reduces the overall training time. Furthermore, training time loosely scales with the number of trainable parameters as VGG-16 contains the most parameters and shows the longest training times, followed by SE-Resnext50.

4 Discussion

We study deep transfer learning methods for CLM images for three binary classification problems. Automatic decision support with CLM during interventions could improve workflow with immediate feedback on the tissue properties. For this purpose we investigate the use of CNNs with four different architectures and five training scenarios.

The three classification tasks. As a baseline, differentiating healthy colon and peritoneum tissue works well, with a high AUC for partial freezing across all models, see Figure 4; the best configuration of each model in Table 1 reaches an AUC between 91.6 and 95.7. This indicates that discriminative features for different organs can be learned from CLM images. Similarly, for classification of metastases in the peritoneum the AUC is high for all transfer learning scenarios, with the best values per model between 90.9 and 97.1 (Table 1). However, classifying healthy and malignant colon tissue performs substantially worse, with the best AUC values per model between 68.0 and 73.1 for partial freezing and fine-tuning (Table 1). The task appears to be more difficult, which is also reflected in a slightly higher standard deviation. This indicates higher uncertainty of model predictions. This could be caused by the heterogeneous appearance of colon tissue in different parts of the colon, which complicates the learning task in conjunction with the small dataset size. Furthermore, during development, colon carcinoma cells transform from a healthy stage to adenoma and then carcinoma. At earlier stages, healthy and malignant cells can still have a similar appearance, which complicates the learning task.

Transfer learning scenarios. Figure 4 also provides an overview of the transfer strategies across all models. Clearly, transfer learning substantially outperforms training from scratch across all classification tasks which supports the effectiveness of transfer learning for medical image classification problems Shin.2016 . The results indicate that meaningful feature transfer from the natural image domain to CLM images is possible, although the images have a vastly different appearance. However, comparing transfer strategies, only retraining the classifier performs worse than other scenarios in two out of three classification tasks. This agrees with results of a previous study on transfer learning with CLM images in neurosurgery izadyyazdanabadi2018convolutional . Here, the authors found that full fine-tuning outperforms retraining of the classifier only. However, in our case, retraining the classifier only also shows a high performance for the task HP vs. MP. This could be caused by fragile co-adaptation of weights yosinski2014transferable which leads to large performance differences between the different classification tasks. For some tasks (e.g. HP vs. MP) recovery and reuse of potentially co-adapted weights might be feasible while reuse is impaired for other tasks (e.g. HC vs. MC). The partial freezing and fine-tuning strategies appear to be more consistent across tasks, however, the optimal strategy still differs. Overall, our results indicate that the transferability of features not only depends on the imaging modality but also the classification task. This adds to previous insights on transfer learning in the medical domain where the optimal transfer strategy was found to be modality and dataset size dependent tajbakhsh2016convolutional . Comparing the partial freezing and fine-tuning strategies, performance is very close and there is no optimal strategy for each of the tasks. 
However, training times are also an aspect to consider for the different transfer learning strategies. As shown in Figure 6, freezing more parameters inside the architecture leads to reduced training times. Thus, partial freezing can generally be seen as advantageous as it often achieves similar performance to full fine-tuning while requiring less training time. For application, this insight could be useful when adopting and retraining models for cancer classification in other organs or when newer architectures are introduced.

Different architectures for CLM. To analyze the different transfer strategies further, we consider the ROC curves of each architecture for the HP vs. MP task, see Figure 5. For this task, using "off-the-shelf" features and only retraining the classifier performed considerably better than for the other tasks. As discussed before, this indicates that transfer learning scenarios are classification task-dependent. In detail, the ROC curves reveal that VGG-16 stands out in particular, where retraining the classifier only performs best out of all transfer strategies. In transfer learning research, VGG-16 is still a popular general purpose feature extractor for numerous tasks herath2017going ; Litjens.2017 . For the other architectures, the optimal strategy differs. For example, for Densenet121, the partial freezing methods show good operating points in the upper left corner of the ROC curve. For Inception-V3 and SE-Resnext50, partial freezing and fine-tuning perform similarly with no clearly superior method. This indicates that the choice of transfer learning method depends on the architecture. This should be expected, as the models have very different block types and each freezing type fixes a different number of parameters. The detailed results in Table 1 with additional metrics underline this insight. There is no single optimal transfer learning strategy and the best performing strategy varies for different architectures and classification tasks. Overall, we demonstrate that transfer learning has an impact on performance; however, there is no simple rule of thumb for optimal transfer learning with CLM. Our results show that examining different freezing strategies can considerably improve performance for individual models.

5 Conclusion

We investigate the feasibility of colon cancer classification in CLM images using CNNs and multiple transfer learning scenarios. Using in-vivo images of healthy and malignant colon and peritoneum tissue obtained from ten subjects, we adopt four architectures and five transfer learning scenarios for three classification problems with CLM. Our results show that different organs as well as healthy and malignant peritoneum tissue can be classified with deep transfer learning. We show that transfer learning from ImageNet is successful with CLM but the transferability of features is limited. We find that there is no single optimal model or transfer strategy for all CLM classification problems and that task-specific engineering is likely required for application. For future work, our results could be extended to more classification problems with CLM.

Compliance with Ethical Standards

Funding: The authors have no funding to declare.

Conflict of Interest: The authors declare that they have no conflict of interest.

Ethical Approval: All procedures performed in studies involving animals were in accordance with the ethical standards of the institution or practice at which the studies were conducted.

Informed Consent: Informed consent was obtained from all individual participants included in the study.


References

  • (1) Torre, L.A., Bray, F., Siegel, R.L., Ferlay, J., Lortet-Tieulent, J., Jemal, A. (2015) Global cancer statistics, 2012. CA: a cancer journal for clinicians 65(2), 87–108
  • (2) Verwaal, V.J., van Ruth, S., Witkamp, A., Boot, H., van Slooten, G., Zoetmulder, F.A. (2005) Long-term survival of peritoneal carcinomatosis of colorectal origin. Annals of surgical oncology 12(1), 65–71
  • (3) Franko, J., Shi, Q., Goldman, C.D., Pockaj, B.A., Nelson, G.D., Goldberg, R.M., Pitot, H.C., Grothey, A., Alberts, S.R., Sargent, D.J. (2012) Treatment of colorectal peritoneal carcinomatosis with systemic chemotherapy: a pooled analysis of north central cancer treatment group phase iii trials n9741 and n9841. Journal of Clinical Oncology 30(3), 263
  • (4) de Bree, E., Koops, W., Kröger, R., van Ruth, S., Witkamp, A.J., Zoetmulder, F.A. (2004) Peritoneal carcinomatosis from colorectal or appendiceal origin: correlation of preoperative ct with intraoperative findings and evaluation of interobserver agreement. Journal of surgical oncology 86(2), 64–73
  • (5) Dromain, C., Leboulleux, S., Auperin, A., Goere, D., Malka, D., Lumbroso, J., Schumberger, M., Sigal, R., Elias, D. (2008) Staging of peritoneal carcinomatosis: enhanced ct vs. pet/ct. Abdominal imaging 33(1), 87–93
  • (6) Low, R.N., Semelka, R.C., Worawattanakul, S., Alzate, G.D. (2000) Extrahepatic abdominal imaging in patients with malignancy: comparison of mr imaging and helical ct in 164 patients. Journal of Magnetic Resonance Imaging: An Official Journal of the International Society for Magnetic Resonance in Medicine 12(2), 269–277
  • (7) Iafrate, F., Ciolina, M., Sammartino, P., Baldassari, P., Rengo, M., Lucchesi, P., Sibio, S., Accarpio, F., Di Giorgio, A., Laghi, A. (2012) Peritoneal carcinomatosis: imaging with 64-mdct and 3t mri with diffusion-weighted imaging. Abdominal imaging 37(4), 616–627
  • (8) González-Moreno, S., González-Bayón, L., Ortega-Pérez, G., González-Hernando, C. (2009) Imaging of peritoneal carcinomatosis. The Cancer Journal 15(3), 184–189
  • (9) Ishigami, S., Uenosono, Y., Arigami, T., Yanagita, S., Okumura, H., Uchikado, Y., Kita, Y., Kurahara, H., Kijima, Y., Nakajo, A., Maemura, K., Natsugoe, S. (2014) Clinical utility of perioperative staging laparoscopy for advanced gastric cancer. World journal of surgical oncology 12(1), 350
  • (10) Ellebrecht, D.B., Kuempers, C., Horn, M., Keck, T., Kleemann, M. (2018) Confocal laser microscopy as novel approach for real-time and in-vivo tissue examination during minimal-invasive surgery in colon cancer. Surgical endoscopy pp. 1–7
  • (11) Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., van der Laak, J.A., Van Ginneken, B., Sánchez, C.I. (2017) A survey on deep learning in medical image analysis. Medical image analysis 42, 60–88
  • (12) Goceri, E., Goceri, N. (2017) Deep learning in medical image analysis: recent advances and future trends. In: International Conferences Computer Graphics, Visualization, Computer Vision and Image Processing, pp. 305–311

  • (13) Esteva, A., Kuprel, B., Novoa, R.A., Ko, J., Swetter, S.M., Blau, H.M., Thrun, S. (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature 542(7639), 115
  • (14) Shen, D., Wu, G., Suk, H.I. (2017) Deep learning in medical image analysis. Annual review of biomedical engineering 19, 221–248
  • (15) Bengio, Y. (2012) Deep learning of representations for unsupervised and transfer learning. In: Proceedings of ICML Workshop on Unsupervised and Transfer Learning, pp. 17–36
  • (16) Hoo-Chang, S., Roth, H.R., Gao, M., Lu, L., Xu, Z., Nogues, I., Yao, J., Mollura, D., Summers, R.M. (2016) Deep convolutional neural networks for computer-aided detection: Cnn architectures, dataset characteristics and transfer learning. IEEE transactions on medical imaging 35(5), 1285
  • (17) Gessert, N., Lutz, M., Heyder, M., Latus, S., Leistner, D.M., Abdelwahed, Y.S., Schlaefer, A. (2019) Automatic plaque detection in ivoct pullbacks using convolutional neural networks. IEEE transactions on medical imaging 38(2), 426–434
  • (18) Tajbakhsh, N., Shin, J.Y., Gurudu, S.R., Hurst, R.T., Kendall, C.B., Gotway, M.B., Liang, J. (2016) Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE transactions on medical imaging 35(5), 1299–1312
  • (19) Rajadhyaksha, M., Grossman, M., Esterowitz, D., Webb, R.H., Anderson, R.R. (1995) In vivo confocal scanning laser microscopy of human skin: melanin provides strong contrast. Journal of Investigative Dermatology 104(6), 946–952
  • (20) Niederer, R.L., Perumal, D., Sherwin, T., McGhee, C.N. (2007) Age-related differences in the normal human cornea: a laser scanning in vivo confocal microscopy study. British Journal of Ophthalmology
  • (21) Aubreville, M., Knipfer, C., Oetter, N., Jaremenko, C., Rodner, E., Denzler, J., Bohr, C., Neumann, H., Stelzle, F., Maier, A. (2017) Automatic classification of cancerous tissue in laserendomicroscopy images of the oral cavity using deep learning. Scientific reports 7(1), 11979
  • (22) Aubreville, M., Stoeve, M., Oetter, N., Goncalves, M., Knipfer, C., Neumann, H., Bohr, C., Stelzle, F., Maier, A. (2018) Deep learning-based detection of motion artifacts in probe-based confocal laser endomicroscopy images. International journal of computer assisted radiology and surgery pp. 1–12
  • (23) Wiltgen, M., Bloice, M. (2016) Automatic interpretation of melanocytic images in confocal laser scanning microscopy. In: Microscopy and Analysis. InTech
  • (24) Hong, J., Park, B.y., Park, H. (2017) Convolutional neural network classifier for distinguishing Barrett’s esophagus and neoplasia endomicroscopy images. In: Engineering in Medicine and Biology Society (EMBC), 2017 39th Annual International Conference of the IEEE, pp. 2892–2895. IEEE
  • (25) Izadyyazdanabadi, M., Belykh, E., Mooney, M.A., Eschbacher, J.M., Nakaji, P., Yang, Y., Preul, M.C. (2018) Prospects for theranostics in neurosurgical imaging: Empowering confocal laser endomicroscopy diagnostics via deep learning. Frontiers in Oncology 8, 240
  • (26) Izadyyazdanabadi, M., Belykh, E., Martirosyan, N., Eschbacher, J., Nakaji, P., Yang, Y., Preul, M.C. (2017) Improving utility of brain tumor confocal laser endomicroscopy: objective value assessment and diagnostic frame detection with convolutional neural networks. In: Medical Imaging 2017: Computer-Aided Diagnosis, vol. 10134, p. 101342J. International Society for Optics and Photonics
  • (27) Izadyyazdanabadi, M., Belykh, E., Cavallo, C., Zhao, X., Gandhi, S., Moreira, L.B., Eschbacher, J., Nakaji, P., Preul, M.C., Yang, Y. (2018) Weakly-supervised learning-based feature localization for confocal laser endomicroscopy glioma images. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 300–308. Springer
  • (28) Izadyyazdanabadi, M., Belykh, E., Mooney, M., Martirosyan, N., Eschbacher, J., Nakaji, P., Preul, M.C., Yang, Y. (2018) Convolutional neural networks: ensemble modeling, fine-tuning and unsupervised semantic localization for neurosurgical CLE images. Journal of Visual Communication and Image Representation 54, 10–20
  • (29) Simonyan, K., Zisserman, A. (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
  • (30) Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z. (2016) Rethinking the inception architecture for computer vision. In: CVPR, pp. 2818–2826
  • (31) Huang, G., Liu, Z., Weinberger, K.Q., van der Maaten, L. (2016) Densely connected convolutional networks. arXiv preprint arXiv:1608.06993
  • (32) Hu, J., Shen, L., Sun, G. (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141
  • (33) Gessert, N., Wittig, L., Drömann, D., Keck, T., Schlaefer, A., Ellebrecht, D.B. (2019) Feasibility of colon cancer detection in confocal laser microscopy images using convolution neural networks. In: Bildverarbeitung für die Medizin 2019
  • (34) Ioffe, S., Szegedy, C. (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: ICML
  • (35) He, K., Zhang, X., Ren, S., Sun, J. (2016) Deep residual learning for image recognition. In: CVPR, pp. 770–778
  • (36) Nair, V., Hinton, G.E. (2010) Rectified linear units improve restricted Boltzmann machines. In: ICML, pp. 807–814
  • (37) Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K. (2017) Aggregated residual transformations for deep neural networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5987–5995. IEEE
  • (38) Shin, H.C., Roth, H.R., Gao, M., Lu, L., Xu, Z., Nogues, I., Yao, J., Mollura, D., Summers, R.M. (2016) Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE transactions on medical imaging 35(5), 1285–1298
  • (39) Yosinski, J., Clune, J., Bengio, Y., Lipson, H. (2014) How transferable are features in deep neural networks? In: Advances in neural information processing systems, pp. 3320–3328
  • (40) Herath, S., Harandi, M., Porikli, F. (2017) Going deeper into action recognition: A survey. Image and vision computing 60, 4–21