Cascaded Cross-Attention Networks for Data-Efficient Whole-Slide Image Classification Using Transformers

05/11/2023
by Firas Khader, et al.

Whole-slide imaging allows high-resolution images of histological specimens to be captured and digitized, and automated analysis of such images with deep learning models is therefore in high demand. The transformer architecture has been proposed as a candidate for effectively leveraging this high-resolution information: the whole-slide image is partitioned into smaller image patches, and feature tokens are extracted from each patch. However, while the conventional transformer can process a large set of input tokens simultaneously, its computational demand scales quadratically with the number of input tokens and thus quadratically with the number of image patches. To address this problem, we propose a novel cascaded cross-attention network (CCAN) built on the cross-attention mechanism, whose cost scales linearly with the number of extracted patches. Our experiments demonstrate that this architecture is at least on par with, and in some cases outperforms, other attention-based state-of-the-art methods on two public datasets: on lung cancer (TCGA NSCLC) our model reaches a mean area under the receiver operating characteristic curve (AUC) of 0.970 ± 0.008, and on renal cancer (TCGA RCC) a mean AUC of 0.985 ± 0.004. Furthermore, we show that our proposed model remains effective in low-data regimes, making it a promising approach for analyzing whole-slide images in resource-limited settings. To foster research in this direction, we make our code publicly available on GitHub: XXX.
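The scaling argument above can be made concrete with a minimal NumPy sketch of the cross-attention idea: a small, fixed set of latent tokens queries the full set of patch tokens, so the attention matrix has shape (L, N) rather than (N, N) and the cost grows linearly with the number of patches N. This is an illustration of the general mechanism, not the authors' exact CCAN; the latent count, dimensions, and random projections (standing in for learned weights) are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latents, patches, d_k=64, seed=0):
    """Latent tokens (L, d) attend to patch tokens (N, d).

    The attention matrix is (L, N), so with L fixed the cost is
    O(N) in the number of patches, versus O(N^2) for self-attention.
    """
    rng = np.random.default_rng(seed)
    d = latents.shape[1]
    # Random projections stand in for learned Q/K/V weight matrices.
    Wq = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Q = latents @ Wq                          # (L, d_k)
    K = patches @ Wk                          # (N, d_k)
    V = patches @ Wv                          # (N, d_k)
    attn = softmax(Q @ K.T / np.sqrt(d_k))    # (L, N): linear in N
    return attn @ V                           # (L, d_k)

rng = np.random.default_rng(42)
latents = rng.standard_normal((16, 128))     # 16 hypothetical latent tokens
patches = rng.standard_normal((5000, 128))   # 5000 patch feature tokens
out = cross_attention(latents, patches)
print(out.shape)  # (16, 64)
```

Doubling the number of patches here doubles the size of the (L, N) attention matrix, whereas standard self-attention over the patches would quadruple its (N, N) matrix.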

