Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering

07/22/2023
by   Xinyue Hu, et al.

To advance automated medical vision-language modeling, we propose a novel Chest X-ray Difference Visual Question Answering (VQA) task. Given a pair of main and reference images, the task is to answer several questions about the diseases shown and, more importantly, about the differences between the two images. This mirrors radiologists' diagnostic practice of comparing the current image with a reference image before finalizing the report. We collect a new dataset, MIMIC-Diff-VQA, comprising 700,703 QA pairs from 164,324 pairs of main and reference images. Unlike existing medical VQA datasets, our questions are tailored to the Assessment-Diagnosis-Intervention-Evaluation treatment procedure used by clinical professionals. We also propose a novel expert knowledge-aware graph representation learning model as a baseline for this task. The model leverages expert knowledge, such as anatomical structure priors, semantic knowledge, and spatial knowledge, to construct a multi-relationship graph that represents the differences between the two images. The dataset and code are available at https://github.com/Holipori/MIMIC-Diff-VQA. We believe this work will further advance medical vision-language models.
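To make the task format described above more concrete, the following is a minimal Python sketch of how one difference-VQA record could be represented and grouped by image pair. The class name, field names, and sample records are hypothetical illustrations and are not the actual schema of MIMIC-Diff-VQA. Only the two dataset counts come from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffVQARecord:
    """One QA pair over a (main, reference) image pair.

    All field names are hypothetical; the released dataset may differ.
    """
    main_image_id: str
    reference_image_id: str
    question: str
    answer: str


def group_by_image_pair(records):
    """Collect the QA records that belong to each (main, reference) pair."""
    grouped = defaultdict(list)
    for record in records:
        key = (record.main_image_id, record.reference_image_id)
        grouped[key].append(record)
    return dict(grouped)


# Counts reported in the abstract.
NUM_QA_PAIRS = 700_703
NUM_IMAGE_PAIRS = 164_324


def average_questions_per_pair():
    """Average number of QA pairs per image pair implied by the counts."""
    return NUM_QA_PAIRS / NUM_IMAGE_PAIRS


# Invented records, for illustration only.
records = [
    DiffVQARecord("main_001", "ref_001",
                  "what has changed compared to the reference image?",
                  "placeholder answer"),
    DiffVQARecord("main_001", "ref_001",
                  "what abnormalities are seen in the main image?",
                  "placeholder answer"),
    DiffVQARecord("main_002", "ref_002",
                  "what has changed compared to the reference image?",
                  "placeholder answer"),
]

if __name__ == "__main__":
    grouped = group_by_image_pair(records)
    print(len(grouped), "image pairs in the toy sample")
    print(f"{average_questions_per_pair():.2f} QA pairs per image pair on average")
```

Dividing the reported counts gives roughly 4.26 questions per image pair, which is consistent with the abstract's statement that each pair is asked several questions.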


research
02/19/2023

Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning

Medical visual question answering (VQA) aims to answer clinically releva...
research
08/10/2022

Aesthetic Visual Question Answering of Photographs

Aesthetic assessment of images can be categorized into two main forms: n...
research
07/03/2023

Localized Questions in Medical Visual Question Answering

Visual Question Answering (VQA) models aim to answer natural language qu...
research
03/06/2022

Dynamic Key-value Memory Enhanced Multi-step Graph Reasoning for Knowledge-based Visual Question Answering

Knowledge-based visual question answering (VQA) is a vision-language tas...
research
08/24/2022

Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task

VQA is an ambitious task aiming to answer any image-related question. Ho...
research
12/21/2022

UnICLAM: Contrastive Representation Learning with Adversarial Masking for Unified and Interpretable Medical Vision Question Answering

Medical Visual Question Answering (Medical-VQA) aims to answer clinic...
research
08/28/2021

QACE: Asking Questions to Evaluate an Image Caption

In this paper, we propose QACE, a new metric based on Question Answering...
