Iterative Context-Aware Graph Inference for Visual Dialog

04/05/2020
by Dan Guo, et al.

Visual dialog is a challenging task that requires comprehending the semantic dependencies among implicit visual and textual contexts. The task can be cast as relation inference in a graphical model with sparse contexts and an unknown graph structure (relation descriptor), so how to model the underlying context-aware relation inference is critical. To this end, we propose a novel Context-Aware Graph (CAG) neural network. Each node in the graph corresponds to a joint semantic feature that includes both object-based (visual) and history-related (textual) context representations. The graph structure (the relations in the dialog) is iteratively updated using an adaptive top-K message passing mechanism. Specifically, in every message passing step, each node selects its K most relevant nodes and receives messages only from them. After the update, we impose graph attention on all the nodes to obtain the final graph embedding and infer the answer. In CAG, each node has dynamic relations in the graph (a different set of K related neighbor nodes), and only the most relevant nodes contribute to the context-aware relational graph inference. Experimental results on the VisDial v0.9 and v1.0 datasets show that CAG outperforms comparative methods. Visualization results further validate the interpretability of our method.
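The top-K message passing and graph attention steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product relevance score, the fixed 0.5/0.5 node update, the query-driven attention, and all function names (`top_k_neighbors`, `message_passing_step`, `graph_embedding`, `cag_sketch`) are assumptions made for illustration. The actual model presumably uses learned parameters throughout.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a non-empty list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_neighbors(nodes, i, k):
    """Indices of the k nodes most relevant to node i (node i excluded).

    Relevance is a plain dot product here (an assumption for illustration).
    Ties are broken by the lower node index.
    """
    scored = [(dot(nodes[i], nodes[j]), j) for j in range(len(nodes)) if j != i]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scored[:k]]


def message_passing_step(nodes, k):
    """One update: each node receives messages only from its top-k neighbors.

    Assumes 1 <= k and at least two nodes. The 0.5/0.5 mix of the old
    feature and the aggregated message is a hypothetical update rule.
    """
    updated = []
    for i, h in enumerate(nodes):
        nbrs = top_k_neighbors(nodes, i, k)
        weights = softmax([dot(h, nodes[j]) for j in nbrs])
        msg = [sum(w * nodes[j][d] for w, j in zip(weights, nbrs))
               for d in range(len(h))]
        updated.append([0.5 * a + 0.5 * b for a, b in zip(h, msg)])
    return updated


def graph_embedding(nodes, query):
    """Attention over all nodes, weighted by similarity to a query vector."""
    weights = softmax([dot(query, h) for h in nodes])
    return [sum(w * h[d] for w, h in zip(weights, nodes))
            for d in range(len(query))]


def cag_sketch(nodes, query, k=2, steps=2):
    """Iterate top-k message passing, then pool with graph attention.

    Neighbors are re-selected at every step, so the graph structure
    changes as the node features change.
    """
    for _ in range(steps):
        nodes = message_passing_step(nodes, k)
    return graph_embedding(nodes, query)


if __name__ == "__main__":
    # Four toy 2-d node features and a toy query vector (made-up values).
    toy_nodes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(top_k_neighbors(toy_nodes, 0, 2))
    print(cag_sketch(toy_nodes, [1.0, 0.0], k=2, steps=2))
```

Because the neighbor set is recomputed from the current node features in each call to `message_passing_step`, a node's K neighbors can differ from one iteration to the next, which mirrors the "dynamic relations" the abstract describes.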


