Masked Pre-Training of Transformers for Histology Image Analysis

04/14/2023
by Shuai Jiang, et al.

In digital pathology, whole slide images (WSIs) are widely used for applications such as cancer diagnosis and prognosis prediction. Vision transformer models have recently emerged as a promising method for encoding large regions of WSIs while preserving spatial relationships among patches. However, due to the large number of model parameters and the limited labeled data, applying transformer models to WSIs remains challenging. To address this problem, inspired by masked language models, we propose a pretext task for training the transformer model without labeled data. Our model, MaskHIT, uses the transformer output to reconstruct masked patches and learns representative histological features based on their positions and visual features. The experimental results demonstrate that MaskHIT surpasses various multiple instance learning approaches on cancer subtype classification tasks, and also outperforms two of the most recent state-of-the-art transformer-based methods. Finally, a comparison between the attention maps generated by the MaskHIT model and pathologists' annotations indicates that the model can accurately identify clinically relevant histological structures in each task.
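The pretext task described above, reconstructing masked patches from their positions and the visible context, can be sketched in a few lines of PyTorch. This is a minimal illustration of the general masked-patch idea, not the authors' implementation: the class name, dimensions, masking ratio, and loss are all assumptions for the sketch.

```python
# Minimal sketch of a masked-patch pretext task for pre-extracted WSI patch
# features. All names and hyperparameters here are illustrative assumptions,
# not the MaskHIT authors' actual architecture.
import torch
import torch.nn as nn

class MaskedPatchPretrainer(nn.Module):
    def __init__(self, feat_dim=128, n_heads=4, n_layers=2, max_patches=256):
        super().__init__()
        # Learnable token that stands in for a masked patch.
        self.mask_token = nn.Parameter(torch.zeros(feat_dim))
        # Learned positional embedding so reconstruction can use patch position.
        self.pos = nn.Parameter(torch.zeros(1, max_patches, feat_dim))
        layer = nn.TransformerEncoderLayer(feat_dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(feat_dim, feat_dim)  # reconstruct patch features

    def forward(self, patches, mask):
        # patches: (B, N, D) pre-extracted patch features; mask: (B, N) bool.
        b, n, d = patches.shape
        # Replace masked positions with the mask token, keep visible patches.
        x = torch.where(mask.unsqueeze(-1),
                        self.mask_token.expand(b, n, d),
                        patches)
        x = x + self.pos[:, :n]
        recon = self.head(self.encoder(x))
        # Reconstruction loss is computed only on the masked positions.
        return ((recon - patches)[mask] ** 2).mean()

torch.manual_seed(0)
model = MaskedPatchPretrainer()
patches = torch.randn(2, 16, 128)   # 2 regions, 16 patches each
mask = torch.rand(2, 16) < 0.25     # mask roughly 25% of patches
loss = model(patches, mask)
```

During pre-training, `loss.backward()` would drive the encoder to infer masked patch features from their neighbors, which is what lets the model learn histological structure without labels.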

