Guessing State Tracking for Visual Dialogue

02/24/2020
by Wei Pang, et al.

The Guesser plays an important role in GuessWhat?!-like visual dialogues. It locates the target object in an image, chosen by the Oracle, over a question-answer dialogue between a Questioner and the Oracle. Most existing guessers make one and only one guess after receiving all question-answer pairs in a dialogue with a predefined number of rounds. This paper proposes the guessing state for the Guesser and regards guessing as a process in which the guessing state changes over the course of a dialogue. A guess model based on guessing state tracking is therefore proposed. The guessing state is defined as a distribution over the candidate objects in the image. A state update algorithm comprising three modules is given: UoVR updates the representation of the image according to the current guessing state, QAEncoder encodes the question-answer pairs, and UoGS updates the guessing state by combining information from both the image and the dialogue history. With the guessing state in hand, two loss functions are defined as supervision for model training: early supervision brings supervision to the Guesser at early rounds, and incremental supervision brings monotonicity to the guessing state. Experimental results on the GuessWhat?! dataset show that our model significantly outperforms previous models and achieves a new state of the art; in particular, its guessing success rate of 83.3% approaches the human-level performance of 84.4%.
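The idea of a guessing state that changes round by round can be illustrated with a small sketch. This is not the paper's model: the update rule below (reweighting the previous distribution by a softmax over per-round scores) is a stand-in assumed purely for illustration, the per-round scores are made-up numbers standing in for whatever UoVR, QAEncoder, and UoGS would produce, and `monotonicity_penalty` is a hypothetical way to measure the property that incremental supervision is said to encourage. All function names are invented for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def update_state(state, round_scores):
    """Assumed update rule (not the paper's UoGS): reweight the previous
    guessing state by a softmax over this round's scores, then renormalize."""
    weights = softmax(round_scores)
    unnormalized = [p * w for p, w in zip(state, weights)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def track(num_objects, scores_per_round):
    """Start from a uniform distribution over candidate objects and
    return the guessing state after every round (including round 0)."""
    state = [1.0 / num_objects] * num_objects
    history = [state]
    for round_scores in scores_per_round:
        state = update_state(state, round_scores)
        history.append(state)
    return history


def monotonicity_penalty(history, target):
    """Hypothetical measure: total drop in the target's probability
    between consecutive rounds (zero if it never decreases)."""
    return sum(
        max(0.0, prev[target] - cur[target])
        for prev, cur in zip(history, history[1:])
    )


# Made-up scores for 3 candidate objects over 2 dialogue rounds.
history = track(3, [[2.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
guess = max(range(3), key=lambda i: history[-1][i])
```

Because a state is available after every round, a guess can be read off at any point in the dialogue rather than only at the end, which is what makes per-round supervision possible.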

