Knowledge-Based Counterfactual Queries for Visual Question Answering

03/05/2023
by Theodoti Stoikou, et al.

Visual Question Answering (VQA) is a popular task at the intersection of vision and language, with numerous implementations in the literature. Although several works address explainability and robustness in VQA models, very few employ counterfactuals as a model-agnostic means of probing these challenges. In this work, we propose a systematic method for explaining the behavior and investigating the robustness of VQA models through counterfactual perturbations. To this end, we exploit structured knowledge bases to perform deterministic, optimal, and controllable word-level replacements targeting the linguistic modality, and we then evaluate the model's responses to these counterfactual inputs. Finally, we qualitatively extract local and global explanations from the counterfactual responses, which prove insightful for interpreting VQA model behavior. By applying a variety of perturbation types that target different parts of speech in the input question, and comparing the model's responses under these adversarial conditions, we gain insight into its reasoning. Overall, our analysis reveals possible biases in the model's decision-making process, as well as expected and unexpected patterns that affect its performance both quantitatively and qualitatively.
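As a rough illustration of the kind of knowledge-based, word-level perturbation described above, the sketch below uses WordNet (via NLTK) as the structured knowledge source to swap a single noun in a question with antonyms or co-hyponyms, and then compares a VQA model's answers on the original and perturbed questions. This is not the authors' implementation: the `answer_question` stub and the candidate-selection heuristics are assumptions made purely for illustration.

```python
"""Minimal sketch of knowledge-based counterfactual question perturbation.

Not the paper's code: it only illustrates the general idea of replacing a
word in a VQA question with semantically related alternatives drawn from a
structured knowledge base (here WordNet via NLTK), then checking whether
the model's answer changes.
"""
import nltk
from nltk.corpus import wordnet as wn

# nltk.download("wordnet")  # run once if the WordNet corpus is missing


def answer_question(question: str) -> str:
    """Hypothetical stand-in for any black-box VQA model.

    A real implementation would run the model on an (image, question) pair.
    """
    return "unknown"


def wordnet_replacements(word: str, pos=wn.NOUN, max_candidates: int = 5):
    """Collect counterfactual substitutes for `word`: antonyms first,
    then co-hyponyms (siblings under the same direct hypernym)."""
    candidates = []
    for synset in wn.synsets(word, pos=pos):
        # Antonyms of any lemma in the synset (strong counterfactuals).
        for lemma in synset.lemmas():
            for ant in lemma.antonyms():
                candidates.append(ant.name().replace("_", " "))
        # Co-hyponyms: other children of the synset's direct hypernyms.
        for hyper in synset.hypernyms():
            for sibling in hyper.hyponyms():
                if sibling != synset:
                    candidates.append(sibling.lemmas()[0].name().replace("_", " "))
    # Deduplicate, preserving order, and drop the original word itself.
    seen, unique = set(), []
    for cand in candidates:
        if cand.lower() != word.lower() and cand not in seen:
            seen.add(cand)
            unique.append(cand)
    return unique[:max_candidates]


def perturb_question(question: str, target_word: str, pos=wn.NOUN):
    """Yield (counterfactual question, substitute word) pairs."""
    for substitute in wordnet_replacements(target_word, pos=pos):
        yield question.replace(target_word, substitute), substitute


if __name__ == "__main__":
    question = "What color is the dog on the sofa?"
    original_answer = answer_question(question)
    for cf_question, word in perturb_question(question, "dog"):
        cf_answer = answer_question(cf_question)
        flipped = cf_answer != original_answer
        print(f"{word:>12}: {cf_question!r} -> {cf_answer} (flipped={flipped})")
```

Because the replacements come from an explicit knowledge base rather than random sampling, the perturbations are deterministic and controllable, which is what makes it possible to compare the model's responses across perturbation types and parts of speech.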



Related research

- Question-Conditioned Counterfactual Image Generation for VQA (11/14/2019)
- Counterfactual Samples Synthesizing for Robust Visual Question Answering (03/14/2020)
- Learning from Lexical Perturbations for Consistent Visual Question Answering (11/26/2020)
- COIN: Counterfactual Image Generation for VQA Interpretation (01/10/2022)
- Counterfactual Samples Synthesizing and Training for Robust Visual Question Answering (10/03/2021)
- How Transferable are Reasoning Patterns in VQA? (04/08/2021)
