Overcoming Language Bias in Remote Sensing Visual Question Answering via Adversarial Training

06/01/2023
by Zhenghang Yuan, et al.

Visual Question Answering (VQA) systems offer a user-friendly interface for human-computer interaction. However, VQA models commonly suffer from language bias, which arises when a model learns superficial correlations between questions and answers instead of grounding its answers in the image. To address this issue, we present a novel framework for reducing language bias in VQA for remote sensing data (RSVQA). Specifically, we add an adversarial branch to the original VQA framework and, based on this branch, introduce two regularizers that constrain the training process against language bias. Furthermore, to evaluate performance in terms of language bias, we propose a new metric that combines standard accuracy with the performance drop observed when the model is given the question together with a random image. Experimental results demonstrate the effectiveness of our method. We believe that our method can shed light on future work on reducing language bias in the RSVQA task.
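
The abstract does not state how the proposed metric combines its two parts, so the following is only a minimal sketch of the ingredients it names: standard accuracy, and the accuracy drop when each question is paired with an unrelated image. All names here (`accuracy`, `with_random_images`, `bias_ingredients`, the toy models and the toy samples) are hypothetical and chosen for illustration; they are not taken from the paper, and the final combination into a single score is deliberately left out.

```python
import random

def accuracy(predict, samples):
    """Fraction of (image, question, answer) samples answered correctly."""
    correct = sum(predict(img, q) == ans for img, q, ans in samples)
    return correct / len(samples)

def with_random_images(samples, seed=0):
    """Pair every question with an image taken from a different sample."""
    rng = random.Random(seed)
    images = [img for img, _, _ in samples]
    shuffled = []
    for i, (_, question, answer) in enumerate(samples):
        # Pick any index except i, so the image is unrelated to the question.
        j = rng.choice([k for k in range(len(samples)) if k != i])
        shuffled.append((images[j], question, answer))
    return shuffled

def bias_ingredients(predict, samples, seed=0):
    """Return standard accuracy, random-image accuracy, and their difference.

    How the paper merges these into one metric is not given in the abstract,
    so this function stops at the raw quantities (an assumption of this sketch).
    """
    acc = accuracy(predict, samples)
    acc_rand = accuracy(predict, with_random_images(samples, seed))
    return {"accuracy": acc, "random_image_accuracy": acc_rand, "drop": acc - acc_rand}

# Toy data: each "image" is a dict with a distinct building count (hypothetical).
samples = [({"buildings": n}, "How many buildings are there?", str(n)) for n in (1, 2, 3, 4)]

def language_only_model(image, question):
    # Ignores the image entirely: a fully language-biased predictor.
    return "3"

def grounded_model(image, question):
    # Reads the answer from the image content.
    return str(image["buildings"])

biased = bias_ingredients(language_only_model, samples)
grounded = bias_ingredients(grounded_model, samples)
print(biased)
print(grounded)
```

In this toy setting, the language-only model shows no drop because it never looks at the image, while the grounded model loses accuracy once the image no longer matches the question. A large drop therefore suggests that the model relies on visual evidence rather than on question-answer priors.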

research
05/06/2022

From Easy to Hard: Learning Language-guided Curriculum for Visual Question Answering on Remote Sensing Data

Visual question answering (VQA) for remote sensing scene has great poten...
research
09/24/2021

How to find a good image-text embedding for remote sensing visual question answering?

Visual question answering (VQA) has recently been introduced to remote s...
research
10/08/2018

Overcoming Language Priors in Visual Question Answering with Adversarial Regularization

Modern Visual Question Answering (VQA) models have been shown to rely he...
research
05/29/2021

LPF: A Language-Prior Feedback Objective Function for De-biased Visual Question Answering

Most existing Visual Question Answering (VQA) systems tend to overly rel...
research
06/01/2023

LiT-4-RSVQA: Lightweight Transformer-based Visual Question Answering in Remote Sensing

Visual question answering (VQA) methods in remote sensing (RS) aim to an...
research
06/25/2023

Visual Question Answering in Remote Sensing with Cross-Attention and Multimodal Information Bottleneck

In this research, we deal with the problem of visual question answering ...
research
05/17/2023

An Empirical Study on the Language Modal in Visual Question Answering

Generalization beyond in-domain experience to out-of-distribution data i...
