LPF: A Language-Prior Feedback Objective Function for De-biased Visual Question Answering

05/29/2021
by   Zujie Liang, et al.
Most existing Visual Question Answering (VQA) systems tend to over-rely on language biases and hence fail to reason from the visual clues. To address this issue, we propose a novel Language-Prior Feedback (LPF) objective function that re-balances each answer's contribution to the total VQA loss. LPF first computes a modulating factor that quantifies the language bias via a question-only branch, then assigns a self-adaptive weight to each training sample. With this reweighting mechanism, LPF reshapes the total VQA loss into a more balanced form, so that samples whose answers require visual information are used more effectively during training. Our method is simple to implement, model-agnostic, and end-to-end trainable. Extensive experiments show that LPF (1) brings significant improvements over various VQA models and (2) achieves competitive performance on the bias-sensitive VQA-CP v2 benchmark.
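The reweighting idea described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes a focal-style modulating factor of the form (1 - p_bias)^gamma, where p_bias is the probability the question-only branch assigns to the ground-truth answer; the function name `lpf_loss` and the exponent `gamma` are illustrative choices.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def lpf_loss(vqa_logits, q_only_logits, targets, gamma=2.0):
    """Sketch of a Language-Prior-Feedback-style reweighted loss.

    vqa_logits:    (N, C) logits from the base VQA model
    q_only_logits: (N, C) logits from the question-only branch
    targets:       (N,)   ground-truth answer indices
    gamma:         focusing exponent (assumed hyperparameter) controlling
                   how strongly bias-predictable samples are down-weighted
    """
    idx = np.arange(len(targets))
    # Probability the question-only branch assigns to the true answer:
    # high probability means the answer is predictable from language alone.
    p_bias = softmax(q_only_logits)[idx, targets]
    # Self-adaptive weight: down-weight samples the language prior solves.
    weights = (1.0 - p_bias) ** gamma
    # Per-sample cross-entropy of the base VQA model, reweighted.
    ce = -np.log(softmax(vqa_logits)[idx, targets] + 1e-12)
    return (weights * ce).mean()
```

Intuitively, a sample whose answer the question-only branch already predicts with high confidence gets a weight near zero, while samples that genuinely require visual evidence keep a weight near one, which is exactly the re-balancing effect the abstract describes.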
