Answer-Driven Visual State Estimator for Goal-Oriented Visual Dialogue

10/01/2020
by Zipeng Xu, et al.

A goal-oriented visual dialogue involves multi-turn interactions between two agents, Questioner and Oracle. During the dialogue, the answer given by Oracle is of great significance, as it provides a golden response to what Questioner is concerned with. Based on the answer, Questioner updates its belief about the target visual content and then raises another question. Notably, different answers lead to different visual beliefs and future questions. However, existing methods always indiscriminately encode answers after much longer questions, resulting in weak utilization of answers. In this paper, we propose an Answer-Driven Visual State Estimator (ADVSE) to impose the effects of different answers on visual states. First, we propose Answer-Driven Focusing Attention (ADFA) to capture the answer-driven effect on visual attention by sharpening question-related attention and adjusting it with an answer-based logical operation at each turn. Then, based on the focusing attention, we obtain the visual state estimation by Conditional Visual Information Fusion (CVIF), where overall information and difference information are fused conditioned on the question-answer state. We evaluate the proposed ADVSE on both the question generator and guesser tasks on the large-scale GuessWhat?! dataset and achieve state-of-the-art performance on both tasks. The qualitative results indicate that ADVSE helps the agent generate highly efficient questions and obtain reliable visual attention during reasonable question generation and guessing processes.
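The abstract does not spell out the sharpening step or the answer-based logical operation, so the following is only a rough sketch of the general idea, not the ADFA module itself. It assumes that sharpening is a low-temperature softmax over per-object question scores, that a "no" answer inverts the attention while a "yes" keeps it, and that turns are accumulated by element-wise product. The function names, the temperature value, and the treatment of "n/a" are all hypothetical.

```python
import math

def sharpen(scores, temperature=0.5):
    """Softmax with temperature < 1 (assumed form of sharpening):
    concentrates attention mass on the highest-scoring objects."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def answer_adjust(attention, answer, eps=1e-8):
    """Hypothetical answer-based logical operation.
    'yes' keeps the attention, 'no' inverts it, anything else
    (e.g. 'n/a') is treated as uninformative (uniform)."""
    if answer == "yes":
        adjusted = [a + eps for a in attention]
    elif answer == "no":
        adjusted = [1.0 - a + eps for a in attention]
    else:
        adjusted = [1.0] * len(attention)
    total = sum(adjusted)
    return [a / total for a in adjusted]

def update_focus(prev_focus, question_scores, answer, temperature=0.5):
    """Combine the previous focus with this turn's answer-adjusted
    attention by element-wise product, then renormalize (assumed rule)."""
    turn_attention = answer_adjust(sharpen(question_scores, temperature), answer)
    combined = [p * t for p, t in zip(prev_focus, turn_attention)]
    total = sum(combined)
    return [c / total for c in combined]

# Three candidate objects, uniform prior; the question mostly concerns object 0.
prior = [1.0 / 3] * 3
scores = [2.0, 0.5, 0.1]
focus_if_yes = update_focus(prior, scores, "yes")
focus_if_no = update_focus(prior, scores, "no")
```

Under these assumptions, the same question moves the focus toward object 0 after a "yes" and away from it after a "no", which is the sense in which different answers lead to different visual beliefs.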


research
11/12/2019

Visual Dialogue State Tracking for Question Generation

GuessWhat?! is a visual dialogue task between a guesser and an oracle. T...
research
06/29/2021

Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue

Building an interactive artificial intelligence that can ask questions a...
research
03/27/2019

Information Maximizing Visual Question Generation

Though image-to-sequence generation models have become overwhelmingly po...
research
11/15/2017

Good and safe uses of AI Oracles

An Oracle is a design for potentially high power artificial intelligence...
research
02/24/2020

Guessing State Tracking for Visual Dialogue

The Guesser plays an important role in GuessWhat?! like visual dialogues...
research
11/21/2017

Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning

The Visual Dialogue task requires an agent to engage in a conversation a...
research
11/21/2017

Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards

Despite significant progress in a variety of vision-and-language problem...
