SciGraphQA: A Large-Scale Synthetic Multi-Turn Question-Answering Dataset for Scientific Graphs

08/07/2023
by   Shengzhi Li, et al.
0

In this work, we present SciGraphQA, a synthetic multi-turn question-answer dataset related to academic graphs. SciGraphQA is 13 times larger than ChartVQA, the previously largest chart-visual question-answering dataset. It is also the largest open-sourced chart VQA dataset with non-synthetic charts. To build our dataset, we selected 290,000 Computer Science or Machine Learning ArXiv papers published between 2010 and 2020, and then used Palm-2 to generate 295K samples of open-vocabulary multi-turn question-answering dialogues about the graphs. As context, we provided the text-only Palm-2 with paper title, abstract, paragraph mentioning the graph, and rich text contextual data from the graph itself, obtaining dialogues with an average 2.23 question-answer turns for each graph. We asked GPT-4 to assess the matching quality of our question-answer turns given the paper's context, obtaining an average rating of 8.7/10 on our 3K test set. We evaluated the 0-shot capability of the most popular MLLM models such as LLaVa, mPLUGowl, BLIP-2, and openFlamingo's on our dataset, finding LLaVA-13B being the most performant with a CIDEr score of 0.08. We further enriched the question prompts for LLAVA by including the serialized data tables extracted from the graphs using the DePlot model, boosting LLaVA's 0-shot CIDEr to 0.15. To verify the validity of our dataset, we also fine-tuned LLaVa using our dataset, reaching a substantially higher CIDEr score of 0.26. We anticipate further accuracy improvement by including segmentation mask tokens and leveraging larger LLM backbones coupled with emergent prompting techniques. Our code and data are open-sourced.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/23/2023

DBLP-QuAD: A Question Answering Dataset over the DBLP Scholarly Knowledge Graph

In this work we create a question answering dataset over the DBLP schola...
research
06/29/2023

Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering

We study visual question answering in a setting where the answer has to ...
research
05/24/2023

Extracting Psychological Indicators Using Question Answering

In this work, we propose a method for extracting text spans that may ind...
research
03/13/2021

ParaQA: A Question Answering Dataset with Paraphrase Responses for Single-Turn Conversation

This paper presents ParaQA, a question answering (QA) dataset with multi...
research
11/18/2022

FiE: Building a Global Probability Space by Leveraging Early Fusion in Encoder for Open-Domain Question Answering

Generative models have recently started to outperform extractive models ...
research
08/10/2018

Lingke: A Fine-grained Multi-turn Chatbot for Customer Service

Traditional chatbots usually need a mass of human dialogue data, especia...

Please sign up or login with your details

Forgot password? Click here to reset