Medical Image Understanding with Pretrained Vision Language Models: A Comprehensive Study

09/30/2022
by Ziyuan Qin, et al.

Large-scale pre-trained vision language models (VLMs) have shown remarkable domain transfer capability on natural images. However, it remains unknown whether this capability also applies to the medical image domain. This paper thoroughly studies the knowledge transferability of pre-trained VLMs to the medical domain and shows that well-designed medical prompts are the key to eliciting knowledge from pre-trained VLMs. We demonstrate that by prompting with expressive attributes that are shared between domains, the VLM can carry knowledge across domains and improve its generalization. This mechanism enables VLMs to recognize novel objects with few or no image samples. Furthermore, to avoid the laborious manual design process, we develop three approaches for the automatic generation of medical prompts, which can inject expert-level medical knowledge and image-specific information into the prompts for fine-grained grounding. We conduct extensive experiments on thirteen medical datasets across various modalities, showing that our well-designed prompts greatly improve zero-shot performance compared to the default prompts, and that our fine-tuned models surpass the supervised models by a significant margin.
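
To make the idea of attribute-based prompting concrete, here is a minimal sketch of how a text prompt might be composed from a target name plus attributes that also occur in natural images (color, shape, location). The function `build_prompt`, its template, and the example attribute values are hypothetical illustrations, not the prompts or the generation approaches used in the paper.

```python
def build_prompt(name, color="", shape="", location=""):
    """Compose a text prompt from an object name and optional attributes.

    Hypothetical template for illustration only; the paper's actual
    prompt formats are not specified in this abstract.
    """
    # Attributes that precede the object name, skipping any left empty.
    head = " ".join(word for word in (color, shape, name) if word)
    # The location phrase, if given, follows the object name.
    return f"{head} {location}".strip()


# A default prompt uses only the object name.
default_prompt = build_prompt("polyp")

# An attribute prompt adds descriptors shared with natural images
# (the values below are illustrative assumptions).
attribute_prompt = build_prompt(
    "polyp", color="pink", shape="oval", location="in rectum"
)

print(default_prompt)
print(attribute_prompt)
```

Either string could then be passed as the text query to a grounding-capable VLM; the abstract's claim is that the attribute-rich form gives the model more cross-domain cues than the bare object name.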

research
08/15/2023

Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models

Medical Image Segmentation is crucial in various clinical applications w...
research
08/15/2023

A Foundation LAnguage-Image model of the Retina (FLAIR): Encoding expert knowledge in text supervision

Foundation vision-language models are currently transforming computer vi...
research
04/28/2023

Segment Anything Model for Medical Images?

The Segment Anything Model (SAM) is the first foundation model for gener...
research
07/05/2023

A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis

Zero-shot medical image classification is a critical process in real-wor...
research
03/15/2023

Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!

Large Language Models (LLMs) have made remarkable strides in various tas...
research
02/13/2023

A Comprehensive Study of Modern Architectures and Regularization Approaches on CheXpert5000

Computer aided diagnosis (CAD) has gained an increased amount of attenti...
research
05/14/2023

Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity

We present a comprehensive evaluation of Parameter-Efficient Fine-Tuning...
