Bi-VLGM: Bi-Level Class-Severity-Aware Vision-Language Graph Matching for Text Guided Medical Image Segmentation

05/20/2023
by Chen Wenting, et al.

Medical reports carry substantial information that naturally complements medical images in computer vision tasks, and the modality gap between vision and language can be bridged by vision-language matching (VLM). However, current vision-language models distort the intra-modal relation and mainly include class information in prompt learning, which is insufficient for segmentation tasks. In this paper, we introduce a Bi-level class-severity-aware Vision-Language Graph Matching (Bi-VLGM) for text guided medical image segmentation, composed of a word-level VLGM module and a sentence-level VLGM module, to exploit the class-severity-aware relation among visual-textual features. In word-level VLGM, to mitigate the distorted intra-modal relation during VLM, we reformulate VLM as a graph matching problem and introduce vision-language graph matching (VLGM) to exploit the high-order relation among visual-textual features. Then, we perform VLGM between the local features of each class region and the class-aware prompts to bridge their gap. In sentence-level VLGM, to provide disease severity information for the segmentation task, we introduce severity-aware prompting to quantify the severity level of retinal lesions, and perform VLGM between the global features and the severity-aware prompts. By exploiting the relation between the local (global) and class (severity) features, the segmentation model can selectively learn the class-aware and severity-aware information to improve performance. Extensive experiments demonstrate the effectiveness of our method and its superiority to existing methods. Source code is to be released.
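
The abstract does not spell out the matching objective, so the following is only a rough sketch of what matching a visual graph against a textual graph could look like, not the authors' implementation. It assumes one feature vector per class on each side, cosine similarity for intra-modal edges, a Sinkhorn-normalized soft assignment between nodes, and a simple node-plus-edge loss. All function names, the temperature `tau`, and the loss form are hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def _logsumexp(a, axis):
    # Numerically stable log-sum-exp along one axis.
    m = a.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))


def sinkhorn(scores, n_iters=20, tau=0.1):
    # Assumption: a soft node correspondence is obtained by alternately
    # normalizing rows and columns of the temperature-scaled scores.
    log_p = scores / tau
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)  # rows sum to 1
        log_p = log_p - _logsumexp(log_p, axis=0)  # columns sum to 1
    return np.exp(log_p)


def graph_matching_loss(visual, text):
    # visual, text: (K, D) arrays, one node per class; row i of each is
    # assumed to describe the same class (hypothetical setup).
    v, t = l2_normalize(visual), l2_normalize(text)
    assign = sinkhorn(v @ t.T)   # cross-modal soft assignment (K x K)
    edges_v = v @ v.T            # intra-modal relations, visual graph
    edges_t = t @ t.T            # intra-modal relations, textual graph
    # Node term: visual node i should be assigned to prompt i.
    node_loss = -np.mean(np.log(np.diag(assign) + 1e-8))
    # Edge term: the visual graph, mapped through the assignment,
    # should agree with the textual graph (second-order consistency).
    edge_loss = np.mean((assign.T @ edges_v @ assign - edges_t) ** 2)
    return node_loss + edge_loss, assign


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local_feats = rng.normal(size=(4, 16))    # stand-in for class-region features
    prompt_feats = rng.normal(size=(4, 16))   # stand-in for class-aware prompt features
    loss, assign = graph_matching_loss(local_feats, prompt_feats)
    print(float(loss), assign.shape)
```

In this sketch the edge term is what distinguishes graph matching from plain pairwise matching: it penalizes assignments that align individual nodes while disturbing the relations among nodes within each modality. The same function could be called once with local features and class-aware prompts, and once with global features and severity-aware prompts, under the assumptions stated above.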

research
06/07/2023

Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation

Vision-Language Pretraining (VLP) has demonstrated remarkable capabiliti...
research
08/15/2023

Exploring Transfer Learning in Medical Image Segmentation using Vision-Language Models

Medical Image Segmentation is crucial in various clinical applications w...
research
11/20/2019

Hierarchical Attention Networks for Medical Image Segmentation

The medical image is characterized by the inter-class indistinction, hig...
research
01/05/2021

Similarity Reasoning and Filtration for Image-Text Matching

Image-text matching plays a critical role in bridging the vision and lan...
research
03/02/2023

ConTEXTual Net: A Multimodal Vision-Language Model for Segmentation of Pneumothorax

Clinical imaging databases contain not only medical images but also text...
research
08/08/2023

Class-level Structural Relation Modelling and Smoothing for Visual Representation Learning

Representation learning for images has been advanced by recent progress ...
research
09/11/2020

Multimodal Depression Severity Prediction from medical bio-markers using Machine Learning Tools and Technologies

Depression has been a leading cause of mental-health illnesses across th...
