Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training

Harnessing the power of pre-training on large-scale datasets like ImageNet forms a fundamental building block for the progress of representation learning-driven solutions in computer vision. Medical images are inherently different from natural images: they are acquired in many modalities (CT, MR, PET, ultrasound, etc.) and contain fine-grained information such as tissues, lesions, and organs. These characteristics demand special attention to learning features that represent local context. In this work, we focus on designing an effective pre-training framework for 3D radiology images. First, we propose a new masking strategy called local masking, where masking is performed across channel embeddings instead of tokens, improving the learning of local feature representations. We combine this with classical low-level perturbations, such as adding noise and downsampling, to further encourage low-level representation learning. Building on these components, we introduce Disruptive Autoencoders, a pre-training framework that reconstructs the original image from disruptions created by a combination of local masking and low-level perturbations. Additionally, we devise a cross-modal contrastive loss (CMCL) to accommodate the pre-training of multiple modalities in a single framework. We curate a large-scale dataset to enable pre-training on 3D medical radiology images (MRI and CT). The proposed pre-training framework is evaluated across multiple downstream tasks and achieves state-of-the-art performance. Notably, our method tops the public test leaderboard of the BTCV multi-organ segmentation challenge.
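The abstract does not spell out implementation details, but the local-masking and low-level-perturbation ideas map naturally onto a short sketch. The PyTorch code below is a minimal illustration assuming a standard (B, N, C) patch-embedding layout and a single-channel 3D volume input; the function names, the 50% channel-mask ratio, the noise scale, and the downsample factor are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def local_mask(tokens: torch.Tensor, ratio: float = 0.5) -> torch.Tensor:
    """Mask across channel embeddings instead of tokens.

    tokens: (B, N, C) patch embeddings. A standard MAE drops whole
    tokens (rows); here a random subset of the C channel dimensions is
    zeroed independently per token, so every spatial location retains
    partial information -- encouraging local feature learning.
    The ratio and zero-fill are illustrative choices, not from the paper.
    """
    keep = torch.rand_like(tokens) > ratio
    return tokens * keep

def low_level_perturb(vol: torch.Tensor,
                      noise_std: float = 0.1,
                      down: int = 2) -> torch.Tensor:
    """Classical low-level disruptions on a 3D volume (B, 1, D, H, W):
    additive Gaussian noise followed by downsample->upsample blurring."""
    noisy = vol + noise_std * torch.randn_like(vol)
    small = F.interpolate(noisy, scale_factor=1.0 / down,
                          mode="trilinear", align_corners=False)
    return F.interpolate(small, size=vol.shape[2:],
                         mode="trilinear", align_corners=False)

# Schematic training step: disrupt the input, mask channel embeddings,
# and reconstruct the clean volume. `patch_embed`, `encoder`, and
# `decoder` are placeholders for any 3D ViT-style backbone.
# x = low_level_perturb(volume)
# z = local_mask(patch_embed(x))
# loss = F.mse_loss(decoder(encoder(z)), volume)
```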
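Similarly, the cross-modal contrastive loss (CMCL) is described only at a high level. One common way to realize such an objective is an InfoNCE/SupCon-style loss over pooled encoder features, with modality labels defining the positive pairs; the sketch below assumes that form. Whether the paper treats same-modality or cross-modality pairs as positives, and the temperature value, are assumptions here.

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(feats: torch.Tensor,
                                 modality_ids: torch.Tensor,
                                 temperature: float = 0.07) -> torch.Tensor:
    """Supervised-contrastive-style loss over pooled features.

    feats:        (B, D) pooled encoder features for a mixed CT/MRI batch.
    modality_ids: (B,)   integer modality label per volume.
    Volumes sharing a modality form positives; other modalities act as
    negatives. The paper's exact formulation may differ; this follows
    the standard SupCon/InfoNCE template.
    """
    z = F.normalize(feats, dim=1)
    sim = (z @ z.t()) / temperature                 # (B, B) similarities
    B = sim.size(0)
    self_mask = torch.eye(B, dtype=torch.bool, device=sim.device)
    pos = (modality_ids[:, None] == modality_ids[None, :]) & ~self_mask

    sim = sim.masked_fill(self_mask, float("-inf"))  # drop self-similarity
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # avoid -inf * 0 = nan
    pos_counts = pos.sum(dim=1).clamp(min=1)         # guard lone samples
    return -((log_prob * pos).sum(dim=1) / pos_counts).mean()
```

In a multi-modal batch this term would be added to the reconstruction loss with some weighting; the weighting scheme is not specified in the abstract.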
