The Promise of Premise: Harnessing Question Premises in Visual Question Answering

05/01/2017
by   Aroma Mahendru, et al.
0

In this paper, we make a simple observation that questions about images often contain premises - objects and relationships implied by the question - and that reasoning about premises can help Visual Question Answering (VQA) models respond more intelligently to irrelevant or previously unseen questions. When presented with a question that is irrelevant to an image, state-of-the-art VQA models will still answer purely based on learned language biases, resulting in non-sensical or even misleading answers. We note that a visual question is irrelevant to an image if at least one of its premises is false (i.e. not depicted in the image). We leverage this observation to construct a dataset for Question Relevance Prediction and Explanation (QRPE) by searching for false premises. We train novel question relevance detection models and show that models that reason about premises consistently outperform models that do not. We also find that forcing standard VQA models to reason about premises during training can lead to improvements on tasks requiring compositional reasoning.

READ FULL TEXT

page 1

page 5

page 12

research
06/21/2016

Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions

Visual Question Answering (VQA) is the task of answering natural-languag...
research
04/05/2022

SwapMix: Diagnosing and Regularizing the Over-Reliance on Visual Context in Visual Question Answering

While Visual Question Answering (VQA) has progressed rapidly, previous w...
research
04/03/2022

Question-Driven Graph Fusion Network For Visual Question Answering

Existing Visual Question Answering (VQA) models have explored various vi...
research
03/15/2022

Can you even tell left from right? Presenting a new challenge for VQA

Visual Question Answering (VQA) needs a means of evaluating the strength...
research
05/11/2023

Overinformative Question Answering by Humans and Machines

When faced with a polar question, speakers often provide overinformative...
research
06/10/2021

Supervising the Transfer of Reasoning Patterns in VQA

Methods for Visual Question Anwering (VQA) are notorious for leveraging ...
research
04/04/2023

SC-ML: Self-supervised Counterfactual Metric Learning for Debiased Visual Question Answering

Visual question answering (VQA) is a critical multimodal task in which a...

Please sign up or login with your details

Forgot password? Click here to reset