PathVQA: 30000+ Questions for Medical Visual Question Answering

03/07/2020
by Xuehai He, et al.

Is it possible to develop an "AI Pathologist" that can pass the board certification examination of the American Board of Pathology? A first step toward this goal is to create a visual question answering (VQA) dataset in which an AI agent is presented with a pathology image together with a question and is asked to give the correct answer. Our work makes the first attempt to build such a dataset. Unlike general-domain VQA datasets, for which images are widely accessible and many crowdsourcing workers are available and capable of generating question-answer pairs, a medical VQA dataset is much more challenging to develop. First, due to privacy concerns, pathology images are usually not publicly available. Second, only well-trained pathologists can understand pathology images, and they rarely have time to help create datasets for AI research. To address these challenges, we resort to pathology textbooks and online digital libraries. We develop a semi-automated pipeline that extracts pathology images and captions from textbooks and generates question-answer pairs from the captions using natural language processing. We collect 32,799 open-ended questions from 4,998 pathology images, and each question is manually checked to ensure correctness. To the best of our knowledge, this is the first dataset for pathology VQA. Our dataset will be released publicly to promote research in medical VQA.
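
To make the caption-to-question step more concrete, the following is a minimal sketch of how a single caption could be turned into question-answer pairs with a hand-written rule. It is an illustration only and not the pipeline described in the abstract: the function name `caption_to_questions`, the "subject showing finding" pattern, and the example caption are all assumptions introduced here.

```python
import re

# Hypothetical toy rule (an assumption for illustration, not the authors' pipeline):
# captions of the form "<subject> showing <finding>" yield one open-ended
# question and one yes/no question.
CAPTION_PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?:showing|shows)\s+(?P<finding>.+)$",
    flags=re.IGNORECASE,
)


def caption_to_questions(caption: str) -> list[tuple[str, str]]:
    """Return (question, answer) pairs for a caption, or [] if the rule does not apply."""
    text = caption.strip().rstrip(".")
    match = CAPTION_PATTERN.match(text)
    if match is None:
        return []
    subject = match.group("subject")
    finding = match.group("finding")
    # Lowercase only the first character so the subject reads naturally
    # mid-sentence; this naive step would mangle captions that start with a
    # proper noun or an acronym.
    subject = subject[0].lower() + subject[1:]
    return [
        (f"What does the {subject} show?", finding),
        (f"Does the {subject} show {finding}?", "yes"),
    ]


if __name__ == "__main__":
    for question, answer in caption_to_questions("Liver biopsy showing fatty change."):
        print(question, "->", answer)
```

A rule this simple covers only one caption shape and can produce ungrammatical or incorrect pairs, which is one reason a manual check of every generated question, as the abstract describes, remains necessary.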

