Prior Knowledge-Guided Attention in Self-Supervised Vision Transformers

09/07/2022
by Kevin Miao, et al.

Recent trends in self-supervised representation learning have focused on removing inductive biases from training pipelines. However, inductive biases can be useful in settings where limited data are available, or can provide additional insight into the underlying data distribution. We present spatial prior attention (SPAN), a framework that takes advantage of consistent spatial and semantic structure in unlabeled image datasets to guide Vision Transformer attention. SPAN operates by regularizing attention masks from separate transformer heads to follow various priors over semantic regions. These priors can be derived from data statistics or a single labeled sample provided by a domain expert. We study SPAN through several detailed real-world scenarios, including medical image analysis and visual quality assurance. We find that the resulting attention masks are more interpretable than those derived from domain-agnostic pretraining. SPAN produces a 58.7 mAP improvement for lung and heart segmentation. We also find that our method yields a 2.2 mAUC improvement compared to domain-agnostic pretraining when transferring the pretrained model to a downstream chest disease classification task. Lastly, we show that SPAN pretraining leads to higher downstream classification performance in low-data regimes compared to domain-agnostic pretraining.
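The abstract describes SPAN as regularizing the attention masks of individual transformer heads toward priors over semantic regions, but does not spell out the loss here. The snippet below is a minimal sketch, assuming a PyTorch ViT whose per-head [CLS]-to-patch attention maps are exposed, of what such a prior-matching regularizer could look like; the function name, tensor shapes, and KL-based penalty are illustrative assumptions, not the authors' implementation.

```python
import torch

def span_attention_prior_loss(attn: torch.Tensor, priors: torch.Tensor) -> torch.Tensor:
    """Hypothetical regularizer: pull per-head [CLS]-to-patch attention toward prior masks.

    attn:   (batch, heads, num_patches) -- [CLS] attention over patches for each head.
    priors: (heads, num_patches)        -- one non-negative spatial prior per head,
            e.g. a coarse lung or heart region mask derived from data statistics
            or a single expert-labeled sample.

    Returns KL(prior || attention), averaged over batch and heads.
    """
    # Normalize both maps into distributions over the patch dimension.
    attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
    priors = priors / priors.sum(dim=-1, keepdim=True).clamp_min(1e-8)
    priors = priors.unsqueeze(0).expand_as(attn)  # broadcast the priors over the batch
    # KL divergence penalizes heads whose attention ignores the prior region.
    kl = (priors * (priors.clamp_min(1e-8).log() - attn.clamp_min(1e-8).log())).sum(dim=-1)
    return kl.mean()

# Example usage: a 12-head ViT with 14x14 = 196 patches.
attn = torch.rand(8, 12, 196).softmax(dim=-1)   # stand-in for real attention maps
priors = torch.rand(12, 196)                    # stand-in for per-head region priors
loss = span_attention_prior_loss(attn, priors)  # added to the self-supervised objective
```

In this reading, each head is assigned its own prior, so different heads are encouraged to attend to different semantic regions, which is consistent with the abstract's claim that the resulting attention masks are more interpretable.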


