Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue

06/29/2021
by Shoya Matsumori, et al.

Building an interactive artificial intelligence that can ask questions about the real world is one of the biggest challenges in vision-and-language research. In particular, goal-oriented visual dialogue, in which an agent seeks information by asking questions during a turn-taking dialogue, has recently been gaining scholarly attention. Several models based on the GuessWhat?! dataset have been proposed, but in these models the Questioner typically asks simple category-based questions or absolute spatial questions. This can be problematic in complex scenes where objects share attributes, or in cases where descriptive questions are required to distinguish objects. In this paper, we propose a novel Questioner architecture, called Unified Questioner Transformer (UniQer), for descriptive question generation with referring expressions. In addition, we build a goal-oriented visual dialogue task called CLEVR Ask, which synthesizes complex scenes that require the Questioner to generate descriptive questions. We train our model on two variants of the CLEVR Ask dataset. The results of the quantitative and qualitative evaluations show that UniQer outperforms the baseline.
