Multi-scale Efficient Graph-Transformer for Whole Slide Image Classification

05/25/2023
by Saisai Ding, et al.

Multi-scale information in whole slide images (WSIs) is essential for cancer diagnosis. Although existing multi-scale vision Transformers are effective at learning multi-scale image representations, they do not work well on gigapixel WSIs because of the extremely large image sizes. To this end, we propose a novel Multi-scale Efficient Graph-Transformer (MEGT) framework for WSI classification. The key idea of MEGT is to adopt two independent Efficient Graph-based Transformer (EGT) branches to process the low-resolution and high-resolution patch embeddings (i.e., tokens in a Transformer) of WSIs, respectively, and then fuse these tokens via a multi-scale feature fusion module (MFFM). Specifically, we design the EGT to efficiently learn the local-global information of patch tokens by integrating a graph representation into the Transformer, which captures the spatial information of WSIs. Meanwhile, we propose a novel MFFM to alleviate the semantic gap between patches of different resolutions during feature fusion; it creates a non-patch token for each branch that acts as an agent and exchanges information with the other branch via cross-attention. In addition, to expedite network training, a novel token pruning module is developed in the EGT to remove redundant tokens. Extensive experiments on the TCGA-RCC and CAMELYON16 datasets demonstrate the effectiveness of the proposed MEGT.
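
The cross-attention exchange described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `cross_attend`, the single attention head, the residual update, the shared random projection matrices, and the token counts are all assumptions chosen for illustration. Each branch holds one non-patch (agent) token in row 0, and that token queries the patch tokens of the other branch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(agent, other_patches, w_q, w_k, w_v):
    """Update one branch's agent token using the other branch's patch tokens.

    agent:         (d,)   non-patch token of this branch (hypothetical layout)
    other_patches: (n, d) patch tokens of the other branch
    w_q, w_k, w_v: (d, d) projection matrices (random here, learned in practice)
    """
    q = agent @ w_q                          # (d,)   query from the agent token
    k = other_patches @ w_k                  # (n, d) keys from the other branch
    v = other_patches @ w_v                  # (n, d) values from the other branch
    scores = k @ q / np.sqrt(q.shape[-1])    # (n,)   scaled dot-product scores
    weights = softmax(scores)                # (n,)   attention over other-branch patches
    return agent + weights @ v               # residual update (an assumption)

rng = np.random.default_rng(0)
d = 16
# Row 0 of each array is the agent token; the remaining rows are patch tokens.
low = rng.normal(size=(1 + 8, d))    # low-resolution branch: 8 patch tokens
high = rng.normal(size=(1 + 32, d))  # high-resolution branch: 32 patch tokens
# Shared projections for both directions, purely to keep the sketch short.
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

low_agent = cross_attend(low[0], high[1:], w_q, w_k, w_v)
high_agent = cross_attend(high[0], low[1:], w_q, w_k, w_v)

# One simple way to form a slide-level feature: concatenate both agent tokens.
fused = np.concatenate([low_agent, high_agent])
print(fused.shape)
```

Because only the single agent token acts as the query, the cost of each exchange grows linearly with the number of patch tokens in the other branch rather than quadratically, which is the usual motivation for this kind of design on very large inputs.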
