Logically Consistent Loss for Visual Question Answering

11/19/2020
by   Anh Cat Le Ngo, et al.

Given an image, background knowledge, and a set of questions about an object, human learners answer the questions very consistently regardless of question form and semantic task. Current neural-network-based Visual Question Answering (VQA) methods, despite their impressive performance, cannot ensure such consistency because of the independent and identically distributed (i.i.d.) assumption. We propose a new model-agnostic logic constraint to tackle this issue by formulating a logically consistent loss in the multi-task learning framework, together with data organisations called family-batch and hybrid-batch. To demonstrate the usefulness of this proposal, we train and evaluate MAC-net based VQA machines with and without the proposed logically consistent loss and the proposed data organisation. The experiments confirm that the proposed loss formulae and the introduction of hybrid-batch lead to more consistency as well as better performance. Though the proposed approach is tested with MAC-net, it can be utilised in any other QA method whenever logical consistency between answers exists.

research
09/10/2019

Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation

While models for Visual Question Answering (VQA) have steadily improved ...
research
03/16/2023

Logical Implications for Visual Question Answering Consistency

Despite considerable recent progress in Visual Question Answering (VQA) ...
research
08/29/2018

From VQA to Multimodal CQA: Adapting Visual QA Models for Community QA Tasks

In this work, we present novel methods to adapt visual QA models for com...
research
04/10/2020

Rephrasing visual questions by specifying the entropy of the answer distribution

Visual question answering (VQA) is a task of answering a visual question...
research
07/03/2020

Visual Question Answering as a Multi-Task Problem

Visual Question Answering (VQA) is a highly complex problem set, relying ...
research
08/13/2017

A Cost-Sensitive Visual Question-Answer Framework for Mining a Deep And-OR Object Semantics from Web Images

This paper presents a cost-sensitive Question-Answering (QA) framework f...
research
06/07/2020

Robust Learning Through Cross-Task Consistency

Visual perception entails solving a wide set of tasks, e.g., object dete...
