Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification

03/09/2022
by Zhiyuan Cai, et al.

Large-scale labeled datasets are a key factor in the success of supervised deep learning in computer vision. However, annotated data are often scarce, especially in ophthalmic image analysis, since manual annotation is time-consuming and labor-intensive. Self-supervised learning (SSL) methods offer a way to better exploit unlabeled data, as they do not require massive annotations. To use as many unlabeled ophthalmic images as possible, it is necessary to break the dimension barrier and make use of both 2D and 3D images simultaneously. In this paper, we propose a universal self-supervised Transformer framework, named Uni4Eye, to discover inherent image properties and capture domain-specific feature embeddings in ophthalmic images. Uni4Eye serves as a global feature extractor, built on a Masked Image Modeling task with a Vision Transformer (ViT) architecture. We employ a Unified Patch Embedding module in place of the original patch embedding module in ViT, so that 2D and 3D input images can be processed jointly. In addition, we design a dual-branch multitask decoder that simultaneously performs two reconstruction tasks, one on the input image and one on its gradient map, yielding more discriminative representations and better convergence. We evaluate our pre-trained Uni4Eye encoder by fine-tuning it on six downstream ophthalmic image classification tasks. The superiority of Uni4Eye is established through comparisons with other state-of-the-art SSL pre-training methods.
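The key to breaking the dimension barrier is the Unified Patch Embedding module, which lets one ViT encoder consume both 2D images and 3D volumes. The following is a minimal PyTorch sketch of that idea, assuming parallel 2D and 3D convolutional stems that project patches into a shared token space; the module name, parameter choices, and shared channel count are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class UnifiedPatchEmbed(nn.Module):
    """Sketch of a unified patch-embedding layer (hypothetical names):
    a 2D and a 3D convolutional stem project patches from either input
    type into the same token space, so one ViT encoder can take both."""

    def __init__(self, embed_dim=768, patch_size=16, in_chans=3):
        super().__init__()
        # 2D branch: (B, C, H, W) -> (B, E, H', W')
        self.proj_2d = nn.Conv2d(in_chans, embed_dim,
                                 kernel_size=patch_size, stride=patch_size)
        # 3D branch: (B, C, D, H, W) -> (B, E, D', H', W')
        self.proj_3d = nn.Conv3d(in_chans, embed_dim,
                                 kernel_size=patch_size, stride=patch_size)

    def forward(self, x):
        if x.dim() == 4:        # batch of 2D images
            x = self.proj_2d(x)
        elif x.dim() == 5:      # batch of 3D volumes
            x = self.proj_3d(x)
        else:
            raise ValueError("expected a 4D (2D image) or 5D (3D volume) batch")
        return x.flatten(2).transpose(1, 2)  # (B, N, embed_dim) tokens

# Usage: both modalities yield token sequences of the same width,
# so the downstream Transformer encoder is shared.
embed = UnifiedPatchEmbed()
fundus = torch.randn(2, 3, 224, 224)       # e.g. 2D fundus photograph
volume = torch.randn(2, 3, 32, 224, 224)   # e.g. 3D OCT volume
print(embed(fundus).shape, embed(volume).shape)  # (2, 196, 768) (2, 392, 768)
```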
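The abstract also describes a dual-branch decoder that reconstructs both the input image and its gradient map. Below is a sketch of how such a gradient-map target could be computed, assuming a Sobel operator for illustration; the paper's exact gradient filter and loss weighting are not specified here.

```python
import torch
import torch.nn.functional as F

def sobel_gradient_map(img):
    """Gradient-magnitude target for the second decoder branch
    (illustrative; assumes a Sobel operator).
    img: (B, C, H, W) -> gradient map of the same shape."""
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]], device=img.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)  # Sobel y-kernel is the transpose of the x-kernel
    c = img.shape[1]
    # Depthwise convolution: apply one Sobel filter per channel.
    gx = F.conv2d(img, kx.expand(c, 1, 3, 3), padding=1, groups=c)
    gy = F.conv2d(img, ky.expand(c, 1, 3, 3), padding=1, groups=c)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

# A dual-branch multitask decoder would then optimize two reconstruction
# losses, e.g.:
#   loss = mse(pred_img, img) + mse(pred_grad, sobel_gradient_map(img))
```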


