Towards a Visual-Language Foundation Model for Computational Pathology

07/24/2023
by Ming Y. Lu, et al.
The accelerated adoption of digital pathology and advances in deep learning have enabled the development of powerful models for various pathology tasks across a diverse array of diseases and patient cohorts. However, model training is often difficult due to label scarcity in the medical domain, and a model's use is limited by the specific task and disease for which it is trained. Additionally, most models in histopathology leverage only image data, in stark contrast to how humans teach each other and reason about histopathologic entities. We introduce CONtrastive learning from Captions for Histopathology (CONCH), a visual-language foundation model developed using diverse sources of histopathology images, biomedical text, and notably over 1.17 million image-caption pairs via task-agnostic pretraining. Evaluated on a suite of 13 diverse benchmarks, CONCH can be transferred to a wide range of downstream tasks involving histopathology images, text, or both, achieving state-of-the-art performance on histology image classification, segmentation, captioning, and text-to-image and image-to-text retrieval. CONCH represents a substantial leap over concurrent visual-language pretrained systems for histopathology, with the potential to directly facilitate a wide array of machine learning-based workflows requiring minimal or no further supervised fine-tuning.
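The abstract does not spell out the training objective, so the following is only a generic illustration of what contrastive learning on image-caption pairs usually means: a CLIP-style symmetric InfoNCE loss, plus the zero-shot classification it enables by comparing an image embedding against text-prompt embeddings. It is not CONCH's actual implementation. The function names, the toy two-dimensional embeddings, and the temperature value of 0.07 are all assumptions chosen for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss; pair i of image_embs matches pair i of text_embs.

    The temperature default is an illustrative assumption, not a value from the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j]: temperature-scaled cosine similarity of image i and caption j.
    logits = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-text: each image should score its own caption highest (rows).
    i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image: each caption should score its own image highest (columns).
    t2i = -sum(
        log_softmax([logits[i][j] for i in range(n)])[j] for j in range(n)
    ) / n
    return 0.5 * (i2t + t2i)


def zero_shot_classify(image_emb, class_text_embs):
    """Return the index of the class-prompt embedding most similar to the image."""
    img = l2_normalize(image_emb)
    sims = [
        sum(a * b for a, b in zip(img, l2_normalize(t))) for t in class_text_embs
    ]
    return max(range(len(sims)), key=sims.__getitem__)


if __name__ == "__main__":
    # Toy embeddings standing in for encoder outputs (hypothetical values).
    images = [[1.0, 0.0], [0.0, 1.0]]
    captions = [[1.0, 0.0], [0.0, 1.0]]
    print("loss on matched pairs:", contrastive_loss(images, captions))
    print("predicted class:", zero_shot_classify([0.9, 0.1], captions))
```

In this setup, a model pretrained with such a loss can classify an image without task-specific labels by embedding one text prompt per class and picking the closest one, which is one common way to realize the "minimal or no further supervised fine-tuning" use described above.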


