PRIOR: Prototype Representation Joint Learning from Medical Images and Reports

07/24/2023
by Pujin Cheng, et al.

Contrastive-learning-based vision-language joint pre-training has emerged as a successful representation learning strategy. In this paper, we present a prototype representation learning framework that incorporates both global and local alignment between medical images and reports. In contrast to standard global multi-modality alignment methods, we employ a local alignment module for fine-grained representation. Furthermore, a cross-modality conditional reconstruction module is designed to exchange information across modalities during training by reconstructing masked images and reports. To reconstruct long reports, a sentence-wise prototype memory bank is constructed, enabling the network to focus on low-level localized visual features and high-level clinical linguistic features. Additionally, a non-auto-regressive generation paradigm is proposed for reconstructing non-sequential reports. Experimental results on five downstream tasks (supervised classification, zero-shot classification, image-to-text retrieval, semantic segmentation, and object detection) show that the proposed method outperforms other state-of-the-art methods across multiple datasets and under different dataset size settings. The code is available at https://github.com/QtacierP/PRIOR.
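
To make the two ideas in the abstract concrete, here is a minimal sketch of a symmetric global image-report contrastive loss and of a nearest-prototype lookup for a sentence embedding. It is not taken from the PRIOR code base: the function names, the temperature of 0.07, the cosine-similarity lookup, and the toy 2-D embeddings are all assumptions chosen for illustration, and the prototypes in the paper are learned during training rather than fixed as they are here.

```python
import math

def normalize(v):
    # L2-normalize a vector (assumes it is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_softmax(row):
    # Numerically stable log-softmax over one list of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def global_contrastive_loss(image_embs, report_embs, temperature=0.07):
    # Symmetric InfoNCE-style loss: the k-th image and k-th report are
    # the positive pair, every other pairing in the batch is a negative.
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in report_embs]
    n = len(imgs)
    logits = [[dot(i, t) / temperature for t in txts] for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]
    image_to_text = -sum(log_softmax(logits[k])[k] for k in range(n)) / n
    text_to_image = -sum(log_softmax(logits_t[k])[k] for k in range(n)) / n
    return 0.5 * (image_to_text + text_to_image)

def nearest_prototype(sentence_emb, prototypes):
    # Index of the prototype with the highest cosine similarity to the
    # sentence embedding (hypothetical stand-in for a memory-bank lookup).
    s = normalize(sentence_emb)
    sims = [dot(s, normalize(p)) for p in prototypes]
    return max(range(len(sims)), key=sims.__getitem__)

# Toy usage with made-up 2-D embeddings.
images = [[1.0, 0.0], [0.0, 1.0]]
matched_reports = [[0.9, 0.1], [0.1, 0.9]]
swapped_reports = [[0.1, 0.9], [0.9, 0.1]]
loss_matched = global_contrastive_loss(images, matched_reports)
loss_swapped = global_contrastive_loss(images, swapped_reports)
prototype_index = nearest_prototype([0.2, 0.9], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy setup the correctly paired batch gives a lower loss than the swapped batch, which is the behavior global alignment relies on. The local alignment and cross-modality reconstruction modules described in the abstract are not covered by this sketch.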

research
12/06/2021

Joint Learning of Localized Representations from Medical Images and Reports

Contrastive learning has proven effective for pre-training image models ...
research
10/18/2022

MedCLIP: Contrastive Learning from Unpaired Medical Images and Text

Existing vision-text contrastive learning like CLIP aims to match the pa...
research
10/12/2022

Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning

Learning medical visual representations directly from paired radiology r...
research
07/17/2022

Gigapixel Whole-Slide Images Classification using Locally Supervised Learning

Histopathology whole slide images (WSIs) play a very important role in c...
research
11/21/2018

Unsupervised Multimodal Representation Learning across Medical Images and Reports

Joint embeddings between medical imaging modalities and associated radio...
research
08/18/2023

Label-Free Event-based Object Recognition via Joint Learning with Image Reconstruction from Events

Recognizing objects from sparse and noisy events becomes extremely diffi...
research
03/19/2022

Domain Adaptation Meets Zero-Shot Learning: An Annotation-Efficient Approach to Multi-Modality Medical Image Segmentation

Due to the lack of properly annotated medical data, exploring the genera...
