ViT-AE++: Improving Vision Transformer Autoencoder for Self-supervised Medical Image Representations

01/18/2023
by Chinmay Prabhakar, et al.

Self-supervised learning has attracted increasing attention because it learns data-driven representations without annotations. The vision transformer-based autoencoder (ViT-AE) of He et al. (2021) is a recent self-supervised technique that employs a patch-masking strategy to learn a meaningful latent space. In this paper, we focus on improving ViT-AE (nicknamed ViT-AE++) for more effective representations of both 2D and 3D medical images. We propose two new loss functions to enhance the representation learned during training. The first loss term improves self-reconstruction by taking structured dependencies between pixels into account, and hence indirectly improves the representation. The second loss term leverages a contrastive loss to directly optimize the representation from two randomly masked views of the same image. As an independent contribution, we extend ViT-AE++ to 3D for volumetric medical images. We evaluate ViT-AE++ extensively on both natural and medical images, demonstrating consistent improvement over the vanilla ViT-AE and its superiority over other contrastive learning approaches.
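The abstract does not spell out the two loss terms, but a minimal PyTorch sketch conveys the idea. Assumptions: the "structured dependencies" term is instantiated here as an SSIM-style loss on the reconstruction, and the cross-view term as an NT-Xent (InfoNCE) contrastive loss between pooled latents of two independently masked views; all function names and weights (lambda_ssim, lambda_con) are illustrative, not the authors' code.

```python
import torch
import torch.nn.functional as F

def ssim_loss(x, y, win=7, c1=0.01 ** 2, c2=0.03 ** 2):
    """Structured-dependency reconstruction term: 1 - SSIM between the
    reconstruction and the target, with a uniform local window.
    Inputs are (B, C, H, W) tensors scaled to [0, 1]; for the 3D
    extension, swap avg_pool2d for avg_pool3d on (B, C, D, H, W) volumes."""
    pad = win // 2
    mu_x = F.avg_pool2d(x, win, stride=1, padding=pad)
    mu_y = F.avg_pool2d(y, win, stride=1, padding=pad)
    var_x = F.avg_pool2d(x * x, win, stride=1, padding=pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, stride=1, padding=pad) - mu_y ** 2
    cov = F.avg_pool2d(x * y, win, stride=1, padding=pad) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1.0 - ssim.mean()

def nt_xent_loss(z1, z2, temperature=0.1):
    """Contrastive term: latents of two independently masked views of the
    same image are positives; other images in the batch are negatives.
    z1, z2: (B, D) pooled encoder outputs."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature  # (B, B) cosine-similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def vit_ae_pp_loss(recon1, recon2, target, z1, z2,
                   lambda_ssim=1.0, lambda_con=0.1):
    """Total loss for one batch: pixel reconstruction from each masked view,
    the SSIM term, and the cross-view contrastive term (weights hypothetical)."""
    rec = F.mse_loss(recon1, target) + F.mse_loss(recon2, target)
    struct = ssim_loss(recon1, target) + ssim_loss(recon2, target)
    return rec + lambda_ssim * struct + lambda_con * nt_xent_loss(z1, z2)
```

In this reading, the SSIM term pushes reconstructions to match local structure rather than only per-pixel intensities, while the contrastive term directly shapes the latent space so that the two masked views of one image agree.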


Related research

A Review of Predictive and Contrastive Self-supervised Learning for Medical Images (02/10/2023)
Over the last decade, supervised deep learning on manually annotated big...

Advancing Volumetric Medical Image Segmentation via Global-Local Masked Autoencoder (06/15/2023)
Masked autoencoder (MAE) has emerged as a promising self-supervised pret...

A New Perspective to Boost Vision Transformer for Medical Image Classification (01/03/2023)
Transformer has achieved impressive successes for various computer visio...

Masked Image Modeling Advances 3D Medical Image Analysis (04/25/2022)
Recently, masked image modeling (MIM) has gained considerable attention ...

RegCLR: A Self-Supervised Framework for Tabular Representation Learning in the Wild (11/02/2022)
Recent advances in self-supervised learning (SSL) using large models to ...

Towards Self-Supervised Learning of Global and Object-Centric Representations (03/11/2022)
Self-supervision allows learning meaningful representations of natural i...

Unified 2D and 3D Pre-training for Medical Image classification and Segmentation (12/17/2021)
Self-supervised learning (SSL) opens up huge opportunities for better ut...
