Two Causal Principles for Improving Visual Dialog

11/24/2019
by   Jiaxin Qi, et al.
10

This paper is a winner report from team MReaL-BDAI for Visual Dialog Challenge 2019. We present two causal principles for improving Visual Dialog (VisDial). By "improving", we mean that they can promote almost every existing VisDial model to the state-of-the-art performance on Visual Dialog 2019 Challenge leader-board. Such a major improvement is only due to our careful inspection on the causality behind the model and data, finding that the community has overlooked two causalities in VisDial. Intuitively, Principle 1 suggests: we should remove the direct input of the dialog history to the answer model, otherwise the harmful shortcut bias will be introduced; Principle 2 says: there is an unobserved confounder for history, question, and answer, leading to spurious correlations from training data. In particular, to remove the confounder suggested in Principle 2, we propose several causal intervention algorithms, which make the training fundamentally different from the traditional likelihood estimation. Note that the two principles are model-agnostic, so they are applicable in any VisDial model.

READ FULL TEXT

page 7

page 8

page 14

page 15

research
02/26/2019

Image-Question-Answer Synergistic Network for Visual Dialog

The image, question (combined with the history for de-referencing), and ...
research
05/08/2020

History for Visual Dialog: Do we really need it?

Visual Dialog involves "understanding" the dialog history (what has been...
research
12/04/2017

Examining Cooperation in Visual Dialog Models

In this work we propose a blackbox intervention method for visual dialog...
research
05/29/2022

VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution

The visual dialog task requires an AI agent to interact with humans in m...
research
01/15/2020

Ensemble based discriminative models for Visual Dialog Challenge 2018

This manuscript describes our approach for the Visual Dialog Challenge 2...
research
08/22/2022

Neuro-Symbolic Visual Dialog

We propose Neuro-Symbolic Visual Dialog (NSVD) -the first method to comb...
research
04/23/2022

Supplementing Missing Visions via Dialog for Scene Graph Generations

Most current AI systems rely on the premise that the input visual data a...

Please sign up or login with your details

Forgot password? Click here to reset