Joint Learning of Localized Representations from Medical Images and Reports

12/06/2021
by Philip Müller, et al.

Contrastive learning has proven effective for pre-training image models on unlabeled data, with promising results for tasks such as medical image classification. Using paired text and images (such as radiological reports and images) during pre-training improved the results even further. Still, most existing methods target image classification as the downstream task and may not be optimal for localized tasks like semantic segmentation or object detection. We therefore propose Localized representation learning from Vision and Text (LoVT), to the best of our knowledge the first text-supervised pre-training method that targets localized medical imaging tasks. Our method combines instance-level image-report contrastive learning with local contrastive learning on image region and report sentence representations. We evaluate LoVT and commonly used pre-training methods on a novel evaluation framework consisting of 18 localized tasks on chest X-rays from five public datasets. While there is no single best method, LoVT performs best on 11 of the 18 studied tasks, making it the preferred method for localized tasks.
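
As a rough illustration of the idea of combining an instance-level image-report contrastive loss with a local contrastive loss, the following minimal sketch uses a generic symmetric InfoNCE objective in plain Python. It is not the authors' implementation: the function names, the temperature, the `local_weight` parameter, and the assumption that each image region is already paired one-to-one with a report sentence are all hypothetical simplifications made for this example.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(queries, keys, temperature=0.1):
    """Mean cross-entropy of matching queries[i] to keys[i] among all keys."""
    q = [normalize(x) for x in queries]
    k = [normalize(x) for x in keys]
    total = 0.0
    for i, qi in enumerate(q):
        logits = [dot(qi, kj) / temperature for kj in k]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(q)


def symmetric_contrastive_loss(a, b, temperature=0.1):
    """Average of the a-to-b and b-to-a InfoNCE losses."""
    return 0.5 * (info_nce(a, b, temperature) + info_nce(b, a, temperature))


def combined_loss(image_emb, report_emb, region_emb, sentence_emb,
                  local_weight=1.0, temperature=0.1):
    """Instance-level loss plus a weighted local loss.

    Assumption (hypothetical): region_emb[i] and sentence_emb[i] are already
    aligned pairs; how such pairs are obtained is outside this sketch.
    """
    global_loss = symmetric_contrastive_loss(image_emb, report_emb, temperature)
    local_loss = symmetric_contrastive_loss(region_emb, sentence_emb, temperature)
    return global_loss + local_weight * local_loss


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    reports = [[0.9, 0.1], [0.2, 0.8]]
    print(combined_loss(images, reports, images, reports))
```

In this sketch, correctly paired embeddings yield a lower loss than mismatched ones, which is the property a contrastive objective relies on; the relative weighting of the two terms is a free choice here.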

research
11/14/2022

The Role of Local Alignment and Uniformity in Image-Text Contrastive Learning on Medical Images

Image-text contrastive learning has proven effective for pretraining med...
research
07/24/2023

PRIOR: Prototype Representation Joint Learning from Medical Images and Reports

Contrastive learning based vision-language joint pre-training has emerge...
research
07/31/2023

Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training

Harnessing the power of pre-training on large-scale datasets like ImageN...
research
07/12/2023

Unified Medical Image-Text-Label Contrastive Learning With Continuous Prompt

Contrastive language-image Pre-training (CLIP) [13] can leverage large d...
research
08/26/2021

LocTex: Learning Data-Efficient Visual Representations from Localized Textual Supervision

Computer vision tasks such as object detection and semantic/instance seg...
research
10/18/2022

MedCLIP: Contrastive Learning from Unpaired Medical Images and Text

Existing vision-text contrastive learning like CLIP aims to match the pa...
research
12/17/2021

Unified 2D and 3D Pre-training for Medical Image classification and Segmentation

Self-supervised learning (SSL) opens up huge opportunities for better ut...
