Deep learning has become a leading tool for analyzing medical images, with digital pathology as one of its major application areas (Litjens et al., 2017). The main practical issues for current deep learning methods in medical imaging are the low number of recorded cases, the large size of the images (slides), and the low availability of diagnoses with pixel-level annotation (a.k.a. weakly labeled data). These problems lead to severe overfitting, impractical computations (training on large images already requires a considerable amount of computational resources), and difficulties in propagating information from a single label to a large image. We propose to handle these issues by introducing a framework that processes a medical image as a collection of small patches using a single, shared neural network. The final diagnosis is obtained by combining the scores of individual patches. In the machine learning community such an approach is called multi-instance learning (MIL) (Maron and Lozano-Pérez, 1998).
MIL methods, however, were mainly used for already pre-processed data. Recently, interest in applying MIL to medical imaging and, especially, to histopathology has increased. One of the first such methods used SVM and boosting to cluster and classify colon cancer images (Xu et al., 2012). More recently, a single neural network with a MIL-pooling layer was used to classify and segment microscopy images with populations of cells (Kraus et al., 2016). A method closely related to our approach used a neural network to process small patches in the first stage of training and the Expectation-Maximization algorithm to determine latent labels of the patches in the second stage (Hou et al., 2016). In contrast, our model is trained end-to-end by backpropagation.
A classical supervised learning problem aims at finding a model that takes an object, $x$, and predicts the value of a target variable, $y \in \{0, 1\}$. In the multi-instance learning problem, however, there is a bag of objects, $X = \{x_1, \ldots, x_K\}$, that exhibit neither dependency nor ordering among each other. There is also a single label $Y$ associated with this bag. We assume that $K$ could vary for different bags. We do not have access to the individual labels of the objects within the bag, i.e., we assume $y_1, \ldots, y_K$ are unknown, but we know that the label of the bag is $1$ if at least one object is labeled $1$, i.e., $Y = 1$ if and only if $y_k = 1$ for some $k$. This statement is equivalent to the logical OR operator and can be re-formulated as the maximum operator:
$$Y = \max_{k} \{ y_k \}.$$
The max-operator is permutation-invariant, which is an important property since objects within a bag are independent.
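As a small illustration of the bag-labeling assumption, the sketch below checks on a toy bag that the maximum over binary instance labels coincides with the logical OR and does not depend on the order of the instances. The function name `bag_label` and the example labels are hypothetical.

```python
import itertools


def bag_label(instance_labels):
    """Bag label under the MIL assumption: 1 if any instance label is 1."""
    return max(instance_labels)


# Hypothetical bag with three instances; only the second one is positive.
labels = [0, 1, 0]

# For binary labels the maximum coincides with the logical OR.
assert bag_label(labels) == int(any(labels))

# Reordering the instances never changes the bag label.
assert all(
    bag_label(p) == bag_label(labels)
    for p in itertools.permutations(labels)
)
```

In practice the instance labels are unobserved, so this check only illustrates the assumption and is not something that can be evaluated on real data.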
Training a bag-level classifier requires a permutation-invariant combination of the individual labels given by an instance-level (shared) classifier. In this paper, we propose to train the model using the likelihood approach. We take the Bernoulli distribution for the bag label:
$$p(Y \mid X) = \theta(X)^{Y} \, \bigl(1 - \theta(X)\bigr)^{1 - Y}, \qquad (1)$$
where $\theta(X) \in [0, 1]$ is the probability of $Y = 1$ given the bag of objects $X$. Further, we consider a shared instance-level classifier (a neural network) with parameters $w$, $f_w$, that returns a score for the $k$-th object, $p_k = f_w(x_k)$ and $p_k \in [0, 1]$. Then, the parameter $\theta(X)$ is modeled using a permutation-invariant operator $g$, i.e., $\theta(X) = g(p_1, \ldots, p_K)$.
Obviously, we can choose the max-operator as the permutation-invariant operator $g$, but it is not necessarily well suited for training neural networks with backpropagation. Alternatively, we consider differentiable operators, including Noisy-or and log-sum-exp (LSE).
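As a hedged illustration, the sketch below implements the max-operator next to three smooth permutation-invariant operators in their standard forms from the MIL literature: Noisy-or, integrated segmentation and recognition (ISR), and log-sum-exp (LSE). Whether these match the exact parameterisation used in this work is an assumption, as are the function names, the LSE sharpness `r`, and the clipping constant `eps`.

```python
import math


def max_pool(p):
    """Maximum instance score."""
    return max(p)


def noisy_or(p):
    """Noisy-or: 1 - prod_k (1 - p_k)."""
    prod = 1.0
    for p_k in p:
        prod *= 1.0 - p_k
    return 1.0 - prod


def isr(p, eps=1e-7):
    """ISR: s / (1 + s), where s = sum_k p_k / (1 - p_k)."""
    total = 0.0
    for p_k in p:
        p_k = min(p_k, 1.0 - eps)  # avoid division by zero when p_k == 1
        total += p_k / (1.0 - p_k)
    return total / (1.0 + total)


def lse(p, r=5.0):
    """LSE: (1 / r) * log((1 / K) * sum_k exp(r * p_k))."""
    return math.log(sum(math.exp(r * p_k) for p_k in p) / len(p)) / r


# Hypothetical instance scores for one bag.
scores = [0.1, 0.8, 0.3]
for g in (max_pool, noisy_or, isr, lse):
    print(g.__name__, round(g(scores), 3))
```

All four functions depend only on the set of scores and not on their order. Unlike the maximum, the three smooth operators pass a non-zero gradient to every instance score.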
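Once the operator is chosen, we train the model by minimizing the negative log-likelihood using (1):
$$\mathcal{L}(w) = -Y \log \theta(X) - (1 - Y) \log\bigl(1 - \theta(X)\bigr),$$
where $\theta(X)$ is the bag-level probability of $Y = 1$ from (1) and $w$ denotes the parameters of the instance-level classifier.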
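The Bernoulli negative log-likelihood for a single bag can be sketched as follows. The function name `bag_nll` and the clipping constant `eps` are hypothetical, and the sketch only evaluates the loss; actual training would rely on an automatic-differentiation framework to backpropagate through the operator and the shared network.

```python
import math


def bag_nll(theta, y, eps=1e-7):
    """Negative log-likelihood of bag label y (0 or 1) under Bernoulli(theta)."""
    theta = min(max(theta, eps), 1.0 - eps)  # keep both logarithms finite
    return -(y * math.log(theta) + (1 - y) * math.log(1.0 - theta))


# A confident, correct prediction is penalised less than a confident, wrong one.
print(bag_nll(0.9, 1), bag_nll(0.9, 0))
```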
In our framework the input is a slide or a patch from a needle biopsy stained with Hematoxylin & Eosin (H&E). We divide the input into small patches (e.g., 96 × 96 pixels, as in our experiments). Each small patch is processed by a shared neural network, $f_w$, which consists of several convolutional layers and fully-connected layers with dropout and returns a score for each small patch, $p_k$. A large score indicates a Region of Interest (ROI) that could later be presented to a human doctor. Finally, applying a permutation-invariant operator to the scores yields the probability of a diagnosis, e.g., benign or malignant tumor. The proposed framework is presented in Figure 1.
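One simple way to turn patch scores into candidate ROIs is to rank the patches by score. The sketch below assumes the scores have already been produced by the shared network; the function name `top_regions` and the cut-off `k` are hypothetical.

```python
def top_regions(patch_scores, k=3):
    """Return the indices of the k highest-scoring patches, best first."""
    order = sorted(
        range(len(patch_scores)),
        key=lambda i: patch_scores[i],
        reverse=True,
    )
    return order[:k]


# Hypothetical scores for four patches of one image.
print(top_regions([0.1, 0.9, 0.4, 0.7], k=2))
```

The returned indices can be mapped back to patch coordinates to highlight the corresponding areas of the slide.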
In the experiments we used a dataset that consists of 58 H&E-stained histopathology image excerpts (896 × 768 pixels) taken from 32 benign and 26 malignant breast cancer patients (Gelasca et al., 2008). Due to the limited size of the dataset, 4-fold cross-validation is used, as in (Kandemir et al., 2014). For images in the training set, we select eight overlapping 768 × 768 subimages. For images in the test set, however, we select a single 768 × 768 subimage from the center of the image. During training, we use 10% of the training set for validation and for monitoring training progress. Subsequently, each subimage is divided into patches of 96 × 96 pixels. A patch is discarded if more than 75% of its pixels are white.
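The patch extraction step can be sketched as follows, assuming non-overlapping tiles and a simple intensity threshold for what counts as a white pixel. Neither assumption is stated in the text, and the names `tile_origins`, `is_mostly_white`, and `white_level` are hypothetical.

```python
def tile_origins(height, width, patch=96):
    """Top-left corners (row, col) of non-overlapping patch x patch tiles."""
    return [
        (r, c)
        for r in range(0, height - patch + 1, patch)
        for c in range(0, width - patch + 1, patch)
    ]


def is_mostly_white(pixels, white_level=230, max_white_fraction=0.75):
    """Check whether more than max_white_fraction of the pixels are white.

    pixels: iterable of (R, G, B) tuples with values in 0..255.
    white_level is an assumed threshold: a pixel counts as white when
    all three channels reach it.
    """
    pixels = list(pixels)
    white = sum(1 for (r, g, b) in pixels if min(r, g, b) >= white_level)
    return white > max_white_fraction * len(pixels)


# A 768 x 768 subimage yields an 8 x 8 grid of 96 x 96 patches.
print(len(tile_origins(768, 768)))
```

Patches for which `is_mostly_white` returns `True` would be dropped from the bag before training.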
In every training iteration we perform data augmentation to prevent overfitting. We randomly adjust the amount of H&E by decomposing the RGB color of the tissue into the H&E color space (Ruifrok and Johnston, 2001) and then multiplying the magnitude of H&E for a pixel by two i.i.d. Gaussian random variables with expectation equal to one. We also randomly rotate and mirror every patch. Lastly, we blur the patch using a Gaussian blur filter with a randomly chosen blur radius. See Figure 2 for examples of data augmentation transformations.
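Parts of the augmentation can be sketched as follows, under several assumptions: rotations are restricted to multiples of 90 degrees, the stain magnitudes are taken to be already available in the H&E color space, and the standard deviation `sigma` is an arbitrary illustrative value. The color deconvolution itself and the Gaussian blur are omitted, and all function names are hypothetical.

```python
import random


def rotate90(patch):
    """Rotate a 2-D patch (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*patch[::-1])]


def mirror(patch):
    """Flip a patch left to right."""
    return [row[::-1] for row in patch]


def augment_geometry(patch, rng):
    """Apply a random number of quarter turns and an optional mirror."""
    for _ in range(rng.randrange(4)):
        patch = rotate90(patch)
    if rng.random() < 0.5:
        patch = mirror(patch)
    return patch


def perturb_stains(h, e, rng, sigma=0.05):
    """Scale the H and E magnitudes of one pixel by two independent
    Gaussian factors with mean one (sigma is an assumed value)."""
    return h * rng.gauss(1.0, sigma), e * rng.gauss(1.0, sigma)


rng = random.Random(0)  # fixed seed only to make the example repeatable
print(augment_geometry([[1, 2], [3, 4]], rng))
print(perturb_stains(0.6, 0.3, rng))
```

The geometric transformations only rearrange pixels, so they leave the set of pixel values in a patch unchanged.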
Results and discussion
First, we notice that the proposed approach achieved performance similar to that of Gaussian process-based methods in terms of AUC. Second, the LSE operator failed to obtain high accuracy and F-score, but it still resulted in a high AUC. Comparing all operators, we believe that Noisy-or is the most promising, but some form of regularization is required to obtain even better results. A possible extension of the presented work would be to apply Bayesian learning, similarly to (Raykar et al., 2008). We leave investigating these issues for further research.
| GPMIL (Kandemir et al., 2014) | N/A | N/A | N/A | N/A | 0.86 |
| RGPMIL (Kandemir et al., 2014) | N/A | N/A | N/A | N/A | 0.90 |
Jakub M. Tomczak was funded by the European Commission within the Marie Skłodowska-Curie Individual Fellowship (Grant No. 702666, “Deep Learning and Bayesian Inference for Medical Imaging”). Maximilian Ilse was funded by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Grant “DLMedIa: Deep Learning for Medical Image Analysis”).
- Chen and Srihari (2014) G. Chen and S. N. Srihari. A noisy-or discriminative restricted Boltzmann machine for recognizing handwriting style development. In Int. Conf. on Frontiers in Handwriting Recognition, pages 714–719, 2014.
- Gelasca et al. (2008) E. D. Gelasca, J. Byun, B. Obara, and B. Manjunath. Evaluation and benchmark for biological image segmentation. In IEEE Int. Conf. on Image Processing, pages 1816–1819, 2008.
- Halpern and Sontag (2013) Y. Halpern and D. Sontag. Unsupervised learning of noisy-or Bayesian networks. arXiv preprint arXiv:1309.6834, 2013.
- Hou et al. (2016) L. Hou, D. Samaras, T. M. Kurc, Y. Gao, J. E. Davis, and J. H. Saltz. Patch-based convolutional neural network for whole slide tissue image classification. In CVPR, pages 2424–2433, 2016.
- Kandemir et al. (2014) M. Kandemir, C. Zhang, and F. A. Hamprecht. Empowering multiple instance histopathology cancer diagnosis by cell graphs. In MICCAI, pages 228–235, 2014.
- Keeler et al. (1991) J. D. Keeler, D. E. Rumelhart, and W. K. Leow. Integrated segmentation and recognition of hand-printed numerals. In NIPS, pages 557–563, 1991.
- Kraus et al. (2016) O. Z. Kraus, J. L. Ba, and B. J. Frey. Classifying and segmenting microscopy images with deep multiple instance learning. Bioinformatics, 32(12):i52–i59, 2016.
- Litjens et al. (2017) G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. van der Laak, B. van Ginneken, and C. I. Sánchez. A survey on deep learning in medical image analysis. arXiv preprint arXiv:1702.05747, 2017.
- Maron and Lozano-Pérez (1998) O. Maron and T. Lozano-Pérez. A framework for multiple-instance learning. In NIPS, pages 570–576, 1998.
- Ramon and De Raedt (2000) J. Ramon and L. De Raedt. Multi instance neural networks. In ICML Workshop on Attribute-value and Relational Learning, pages 53–60, 2000.
- Raykar et al. (2008) V. C. Raykar, B. Krishnapuram, J. Bi, M. Dundar, and R. B. Rao. Bayesian multiple instance learning: automatic feature selection and inductive transfer. In ICML, pages 808–815, 2008.
- Ruifrok and Johnston (2001) A. C. Ruifrok and D. A. Johnston. Quantification of histochemical staining by color deconvolution. Analytical and Quantitative Cytology and Histology, 23(4):291–299, 2001.
- Xu et al. (2012) Y. Xu, J.-Y. Zhu, E. Chang, and Z. Tu. Multiple clustered instance learning for histopathology cancer image classification, segmentation and clustering. In CVPR, pages 964–971, 2012.
- Zhang et al. (2006) C. Zhang, J. C. Platt, and P. A. Viola. Multiple instance boosting for object detection. In NIPS, pages 1417–1424, 2006.