SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading

10/20/2022
by Yijin Huang, et al.

Self-supervised learning (SSL) has been widely applied to learn image representations by exploiting unlabeled images. However, it has not been fully explored in medical image analysis. In this work, we propose the Saliency-guided Self-Supervised image Transformer (SSiT) for diabetic retinopathy (DR) grading from fundus images. We introduce saliency maps into SSL, with the goal of guiding self-supervised pre-training with domain-specific prior knowledge. Specifically, two saliency-guided learning tasks are employed in SSiT: (1) We conduct saliency-guided contrastive learning based on momentum contrast, wherein we use the saliency maps of fundus images to remove trivial patches from the input sequences of the momentum-updated key encoder. The key encoder is thus constrained to provide target representations that focus on salient regions, guiding the query encoder to capture salient features. (2) We train the query encoder to predict the saliency segmentation, encouraging the learned representations to preserve fine-grained information. Extensive experiments are conducted on four publicly accessible fundus image datasets. The proposed SSiT significantly outperforms other representative state-of-the-art SSL methods on all datasets and under various evaluation settings, establishing the effectiveness of the representations learned by SSiT. The source code is available at https://github.com/YijinHuang/SSiT.
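
To make the first task more concrete, the following is a minimal sketch of saliency-guided patch removal and of a momentum update, written in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names (`patch_saliency`, `select_salient_patches`, `momentum_update`), the rule of keeping a fixed top fraction of patches (`keep_ratio`), and the momentum value `m` are all hypothetical choices, and the abstract does not specify how trivial patches are identified. The repository linked above contains the actual method.

```python
from typing import List, Sequence


def patch_saliency(saliency: Sequence[Sequence[float]], patch: int) -> List[float]:
    """Mean saliency of each non-overlapping patch, in row-major order."""
    height, width = len(saliency), len(saliency[0])
    if height % patch or width % patch:
        raise ValueError("saliency map size must be divisible by the patch size")
    scores = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            total = sum(
                saliency[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            )
            scores.append(total / (patch * patch))
    return scores


def select_salient_patches(
    saliency: Sequence[Sequence[float]], patch: int, keep_ratio: float = 0.5
) -> List[int]:
    """Indices of the patches kept for the key encoder.

    Assumption: keep the top `keep_ratio` fraction of patches by mean
    saliency; the remaining ("trivial") patches are dropped from the sequence.
    """
    scores = patch_saliency(saliency, patch)
    num_keep = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Return kept indices in their original sequence order.
    return sorted(ranked[:num_keep])


def momentum_update(
    key_params: Sequence[float], query_params: Sequence[float], m: float = 0.99
) -> List[float]:
    """Exponential moving average update of key-encoder parameters,
    in the style of momentum contrast: k <- m * k + (1 - m) * q."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


# A toy 4x4 saliency map split into four 2x2 patches.
toy_map = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.2, 0.2],
]
kept = select_salient_patches(toy_map, patch=2, keep_ratio=0.5)
```

In this toy example the top-right and bottom-left patches have the highest mean saliency, so only their indices would be passed to the key encoder, while the query encoder would still see the full patch sequence.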

research
03/09/2022

3SD: Self-Supervised Saliency Detection With No Labels

We present a conceptually simple self-supervised method for saliency det...
research
11/27/2020

Self supervised contrastive learning for digital histopathology

Unsupervised learning has been a long-standing goal of machine learning ...
research
07/17/2021

Lesion-based Contrastive Learning for Diabetic Retinopathy Grading from Fundus Images

Manually annotating medical images is extremely expensive, especially fo...
research
02/22/2023

Saliency Guided Contrastive Learning on Scene Images

Self-supervised learning holds promise in leveraging large numbers of un...
research
06/17/2022

Rectify ViT Shortcut Learning by Visual Saliency

Shortcut learning is common but harmful to deep learning models, leading...
research
10/26/2022

Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input

Masked Autoencoders is a simple yet powerful self-supervised learning me...
research
08/11/2022

On the Pros and Cons of Momentum Encoder in Self-Supervised Visual Representation Learning

Exponential Moving Average (EMA or momentum) is widely used in modern se...
