Cluster-to-Conquer: A Framework for End-to-End Multi-Instance Learning for Whole Slide Image Classification

03/19/2021
by Yash Sharma, et al.

In recent years, the availability of digitized Whole Slide Images (WSIs) has enabled the use of deep learning-based computer vision techniques for automated disease diagnosis. However, WSIs present unique computational and algorithmic challenges. WSIs are gigapixel-sized (on the order of 100K pixels per side), making it infeasible to use them directly for training deep neural networks. In addition, often only slide-level labels are available for training, because detailed annotations are tedious and time-consuming for experts to produce. Approaches using multiple-instance learning (MIL) frameworks have been shown to overcome these challenges. Current state-of-the-art approaches divide the learning framework into two decoupled parts: a convolutional neural network (CNN) for encoding the patches, followed by an independent aggregation approach for slide-level prediction. In this setup, the aggregation step has no bearing on the representations learned by the CNN encoder. We propose Cluster-to-Conquer (C2C), an end-to-end framework that clusters the patches from a WSI into k groups, samples k' patches from each group for training, and uses an adaptive attention mechanism for slide-level prediction. We demonstrate that dividing a WSI into clusters can improve model training by exposing the model to diverse discriminative features extracted from the patches. We regularize the clustering mechanism by introducing a KL-divergence loss between the attention weights of patches in a cluster and the uniform distribution. The framework is optimized end-to-end on slide-level cross-entropy, patch-level cross-entropy, and KL-divergence loss (Implementation: https://github.com/YashSharma/C2C).
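
The two steps that the abstract describes most concretely, sampling k' patches from each cluster and penalizing within-cluster attention that departs from uniform, can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names `sample_per_cluster` and `cluster_attention_kl` are hypothetical, and two details are assumptions because the abstract does not specify them: attention weights are renormalized within each cluster, and the divergence is taken as KL(attention || uniform), averaged over clusters.

```python
import numpy as np


def sample_per_cluster(cluster_ids, k_prime, rng):
    """Return patch indices, taking up to k_prime patches from each cluster."""
    chosen = []
    for c in np.unique(cluster_ids):
        members = np.flatnonzero(cluster_ids == c)
        take = min(k_prime, members.size)  # small clusters contribute all members
        chosen.append(rng.choice(members, size=take, replace=False))
    return np.concatenate(chosen)


def cluster_attention_kl(attention, cluster_ids, eps=1e-12):
    """Mean over clusters of KL(p_c || uniform).

    Assumption: p_c is the attention renormalized to sum to 1 within
    cluster c; the direction of the divergence is also an assumption.
    """
    kls = []
    for c in np.unique(cluster_ids):
        a = attention[cluster_ids == c]
        p = a / a.sum()
        n = p.size
        kls.append(np.sum(p * (np.log(p + eps) - np.log(1.0 / n))))
    return float(np.mean(kls))


# Toy example: 9 patches already assigned to 3 clusters.
rng = np.random.default_rng(0)
cluster_ids = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
sampled = sample_per_cluster(cluster_ids, k_prime=2, rng=rng)

# Attention over the sampled patches (softmax of random scores).
scores = rng.normal(size=sampled.size)
attention = np.exp(scores) / np.exp(scores).sum()
kl_term = cluster_attention_kl(attention, cluster_ids[sampled])
```

In this sketch the KL term is zero when every patch in a cluster receives the same attention and grows as attention concentrates on a few patches, which is the regularizing effect the abstract attributes to the loss. How this term is weighted against the two cross-entropy losses is not stated in the abstract and is left out here.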

06/13/2021

HistoTransfer: Understanding Transfer Learning for Histopathology

Advancement in digital pathology and artificial intelligence has enabled...
09/21/2022

PicT: A Slim Weakly Supervised Vision Transformer for Pavement Distress Classification

Automatic pavement distress classification facilitates improving the eff...
09/23/2020

Whole Slide Images based Cancer Survival Prediction using Attention Guided Deep Multiple Instance Learning Networks

Traditional image-based survival prediction models rely on discriminativ...
07/27/2017

Representation-Aggregation Networks for Segmentation of Multi-Gigapixel Histology Images

Convolutional Neural Network (CNN) models have become the state-of-the-a...
04/17/2019

Aggregation Cross-Entropy for Sequence Recognition

In this paper, we propose a novel method, aggregation cross-entropy (ACE...
12/12/2021

Magnifying Networks for Images with Billions of Pixels

The shift towards end-to-end deep learning has brought unprecedented adv...