Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models

08/15/2023
by Kanchan Poudel, et al.

Medical image segmentation is crucial to many clinical applications. While state-of-the-art segmentation models have proven effective, integrating textual guidance to enhance visual features for this task remains an area with limited progress. Existing segmentation models that use textual guidance are primarily trained on open-domain images, raising concerns about their direct applicability in the medical domain without manual intervention or fine-tuning. To address these challenges, we propose using multimodal vision-language models to capture semantic information from images and their descriptions, enabling the segmentation of diverse medical images. This study comprehensively evaluates existing vision-language models across multiple datasets to assess their transferability from the open domain to the medical field. Furthermore, we introduce variations of image descriptions for previously unseen images in the dataset, revealing notable variations in model performance depending on the generated prompts. Our findings highlight the distribution shift between open-domain images and the medical domain and show that segmentation models trained on open-domain images are not directly transferable to the medical field. However, their performance can be improved by fine-tuning them on medical datasets. We report the zero-shot and fine-tuned segmentation performance of 4 vision-language models (VLMs) on 11 medical datasets using 9 types of prompts derived from 14 attributes.
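To make the idea of prompt types derived from attributes more concrete, the following is a minimal sketch of how several description variants could be assembled from one set of per-image attributes. The attribute names, the template names, and the helper functions `build_prompt` and `prompt_variants` are all hypothetical and chosen only for illustration; the abstract does not list the actual 14 attributes or 9 prompt types.

```python
from typing import Dict, Sequence


def build_prompt(attributes: Dict[str, str], keys: Sequence[str]) -> str:
    """Join the selected attribute values into one description.

    Attributes that are missing or empty are skipped, so the same
    template still works for images that lack some attributes.
    """
    parts = [attributes[k] for k in keys if attributes.get(k)]
    return " ".join(parts)


def prompt_variants(
    attributes: Dict[str, str],
    templates: Dict[str, Sequence[str]],
) -> Dict[str, str]:
    """Build one prompt per template; each template is an ordered list of attribute keys."""
    return {name: build_prompt(attributes, keys) for name, keys in templates.items()}


# Hypothetical attributes for a single image (not taken from the paper).
example_attributes = {
    "size": "small",
    "shape": "round",
    "color": "pink",
    "class_name": "polyp",
    "location": "in the center of the image",
}

# Hypothetical prompt types, from least to most descriptive.
templates = {
    "class_only": ["class_name"],
    "shape_class": ["shape", "class_name"],
    "full": ["size", "shape", "color", "class_name", "location"],
}

variants = prompt_variants(example_attributes, templates)
for name, text in variants.items():
    print(f"{name}: {text}")
```

Each variant could then be paired with the same image and passed to a text-guided segmentation model, so that differences in the resulting masks can be attributed to the prompt alone.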


