LEAF-QA: Locate, Encode & Attend for Figure Question Answering

07/30/2019
by Ritwick Chaudhry, et al.

We introduce LEAF-QA, a comprehensive dataset of 250,000 densely annotated figures/charts, constructed from real-world open data sources, along with 2 million question-answer (QA) pairs querying the structure and semantics of these charts. LEAF-QA highlights the problem of multimodal QA, which is notably different from conventional visual QA (VQA) and has recently gained interest in the community. Furthermore, LEAF-QA is significantly more complex than previous attempts at chart QA, viz. FigureQA and DVQA, which present only limited variations in chart data. Because LEAF-QA is constructed from real-world sources, it requires a novel architecture to enable question answering. To this end, we propose LEAF-Net, a deep architecture comprising chart element localization, question and answer encoding in terms of chart elements, and an attention network. Experiments are conducted to demonstrate the challenges of QA on LEAF-QA. The proposed architecture, LEAF-Net, also considerably advances the current state of the art on FigureQA and DVQA.
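
The abstract does not spell out how questions and answers are encoded "in terms of chart elements", so the following is only a minimal sketch of one plausible reading: strings found in the chart are swapped for placeholder tokens tied to their element role, so that a model sees a fixed vocabulary instead of open-ended chart text. The element roles, the `<role_index>` token format, and the function names `build_token_map`, `encode_question`, and `decode_answer` are all hypothetical and are not taken from the paper.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical output of a localization + text-recognition stage:
# each entry is (element role, recognized text). Not the paper's format.
ChartElement = Tuple[str, str]


def build_token_map(elements: List[ChartElement]) -> Dict[str, str]:
    """Map each detected string to a placeholder such as <legend_0>."""
    counts: Dict[str, int] = {}
    token_map: Dict[str, str] = {}
    for role, text in elements:
        index = counts.get(role, 0)
        counts[role] = index + 1
        token_map[text] = f"<{role}_{index}>"
    return token_map


def encode_question(question: str, token_map: Dict[str, str]) -> str:
    """Replace chart-specific strings in the question with their placeholders."""
    if not token_map:
        return question
    # Longest strings first, so a short label cannot match inside a longer one.
    alternatives = sorted(token_map, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(text) for text in alternatives))
    # A single pass avoids re-matching text inside already inserted tokens.
    return pattern.sub(lambda match: token_map[match.group(0)], question)


def decode_answer(answer_token: str, token_map: Dict[str, str]) -> str:
    """Turn a predicted placeholder back into chart text; pass other answers through."""
    inverse = {token: text for text, token in token_map.items()}
    return inverse.get(answer_token, answer_token)


# Made-up chart elements, for illustration only.
elements = [
    ("title", "Rainfall by city"),
    ("legend", "Pune"),
    ("legend", "Delhi"),
    ("x_label", "Year"),
]
token_map = build_token_map(elements)
encoded = encode_question("Which city has more rainfall in 2010, Pune or Delhi?", token_map)
decoded = decode_answer("<legend_1>", token_map)
```

Under this assumed scheme, an answer that names a chart string is predicted as a placeholder and mapped back with `decode_answer`, while generic answers such as "Yes" pass through unchanged.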
