Learning content and context with language bias for Visual Question Answering

12/21/2020
by Chao Yang, et al.

Visual Question Answering (VQA) is a challenging multimodal task in which a model answers questions about an image. Many works concentrate on reducing language bias, which leads models to answer questions while ignoring visual content and language context. However, reducing language bias also weakens the ability of VQA models to learn context priors. To address this issue, we propose a novel learning strategy named CCB, which forces VQA models to answer questions relying on Content and Context with language Bias. Specifically, CCB establishes Content and Context branches on top of a base VQA model and forces them to focus on local key content and global effective context, respectively. Moreover, a joint loss function is proposed to reduce the importance of biased samples while retaining their beneficial influence on answering questions. Experiments show that CCB outperforms state-of-the-art methods in terms of accuracy on VQA-CP v2.
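
The idea of a joint loss that down-weights biased samples without discarding them can be illustrated with a small sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: the function name `joint_loss`, the weighting rule `1 - bias_probs[target]`, and the use of a question-only prior `bias_probs` are hypothetical and should not be read as the authors' method.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Negative log-likelihood of the target answer index."""
    return -math.log(probs[target] + 1e-12)


def joint_loss(content_logits, context_logits, bias_probs, target):
    """Hypothetical joint loss over two branches (an assumption, not the paper's).

    content_logits / context_logits: answer scores from the two branches.
    bias_probs: answer distribution from a question-only prior (assumed given).
    target: index of the ground-truth answer.
    """
    content_probs = softmax(content_logits)
    context_probs = softmax(context_logits)
    # Assumed weighting: the more the language prior alone already predicts
    # the answer, the less this sample contributes to the content term.
    weight = 1.0 - bias_probs[target]
    content_term = weight * cross_entropy(content_probs, target)
    # The context term is left unweighted, so biased samples still contribute.
    context_term = cross_entropy(context_probs, target)
    return content_term + context_term


logits = [2.0, 0.5, -1.0]
biased = joint_loss(logits, logits, [0.9, 0.05, 0.05], target=0)
unbiased = joint_loss(logits, logits, [0.1, 0.45, 0.45], target=0)
print(biased, unbiased)
```

In this sketch, a sample whose answer is strongly predicted by the language prior produces a smaller total loss than an otherwise identical sample with a weak prior, yet its context term is still retained in full.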

