Co-VQA : Answering by Interactive Sub Question Sequence

04/02/2022
by Ruonan Wang, et al.

Most existing approaches to Visual Question Answering (VQA) answer questions directly. People, however, usually decompose a complex question into a sequence of simple sub-questions and obtain the answer to the original question only after answering the sub-question sequence (SQS). Simulating this process, this paper proposes a conversation-based VQA (Co-VQA) framework consisting of three components: Questioner, Oracle, and Answerer. Questioner raises sub-questions using an extended HRED model, and Oracle answers them one by one. An Adaptive Chain Visual Reasoning Model (ACVRM) is also proposed for Answerer, in which each question-answer pair is used to update the visual representation sequentially. To perform supervised learning for each model, we introduce a well-designed method to build an SQS for each question in the VQA 2.0 and VQA-CP v2 datasets. Experimental results show that our method achieves state-of-the-art performance on VQA-CP v2. Further analyses show that SQSs help build direct semantic connections between questions and images, provide question-adaptive variable-length reasoning chains, and offer explicit interpretability as well as error traceability.
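
The abstract describes a pipeline in which each answered sub-question refines the visual representation before the next reasoning step. The sketch below is a minimal, hypothetical illustration of that sequential-update idea only; the module names, dimensions, and the gated-fusion choice are assumptions for illustration and are not taken from the paper's ACVRM implementation.

```python
# Minimal sketch (assumptions, not the paper's implementation) of the
# "question-answer pair updates the visual representation sequentially"
# idea described in the Co-VQA abstract. Module names, sizes, and the
# gated fusion used here are hypothetical.

import torch
import torch.nn as nn


class SequentialVisualUpdater(nn.Module):
    """Refines region features once per answered sub-question (hypothetical sketch)."""

    def __init__(self, qa_dim: int = 512, vis_dim: int = 512):
        super().__init__()
        self.qa_proj = nn.Linear(qa_dim, vis_dim)    # project a QA-pair embedding
        self.gate = nn.Linear(2 * vis_dim, vis_dim)  # per-region update gate
        self.fuse = nn.Linear(2 * vis_dim, vis_dim)  # candidate updated features

    def forward(self, visual: torch.Tensor, qa_embeds: list) -> torch.Tensor:
        # visual: (num_regions, vis_dim); qa_embeds: one (qa_dim,) vector per answered sub-question
        for qa in qa_embeds:
            q = self.qa_proj(qa).unsqueeze(0).expand_as(visual)  # broadcast to every region
            g = torch.sigmoid(self.gate(torch.cat([visual, q], dim=-1)))
            cand = torch.tanh(self.fuse(torch.cat([visual, q], dim=-1)))
            visual = g * cand + (1.0 - g) * visual               # gated, step-by-step refinement
        return visual


if __name__ == "__main__":
    updater = SequentialVisualUpdater()
    regions = torch.randn(36, 512)                    # e.g. 36 detected region features
    qa_chain = [torch.randn(512) for _ in range(3)]   # embeddings of 3 answered sub-questions (SQS)
    refined = updater(regions, qa_chain)
    print(refined.shape)                              # torch.Size([36, 512])
```

After the chain of sub-questions is consumed, the refined visual features would feed whatever answer classifier the Answerer uses; that part is omitted here since the abstract does not specify it.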


Related research

06/03/2018  On the Flip Side: Identifying Counterexamples in Visual Question Answering
Visual question answering (VQA) models respond to open-ended natural lan...

04/09/2019  Multi-Target Embodied Question Answering
Embodied Question Answering (EQA) is a relatively new task where an agen...

10/20/2020  SOrT-ing VQA Models: Contrastive Gradient Learning for Improved Consistency
Recent research in Visual Question Answering (VQA) has revealed state-of...

04/04/2023  Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder
Medical Visual Question Answering (VQA) systems play a supporting role t...

06/27/2016  Revisiting Visual Question Answering Baselines
Visual question answering (VQA) is an interesting learning setting for e...

01/20/2020  SQuINTing at VQA Models: Interrogating VQA Models with Sub-Questions
Existing VQA datasets contain questions with varying levels of complexit...

11/23/2019  Unsupervised Keyword Extraction for Full-sentence VQA
In existing studies on Visual Question Answering (VQA), which aims to tr...
