What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues

09/01/2019
by Xintong Yu, et al.

Grounding a pronoun to the visual object it refers to requires complex reasoning over various information sources, especially in conversational scenarios. For example, when people in a conversation talk about something all speakers can see, they often use a pronoun (e.g., it) to refer to it directly, without introducing it first. This poses a major challenge for modern natural language understanding systems, particularly conventional context-based pronoun coreference models. To tackle this challenge, in this paper, we formally define the task of visual-aware pronoun coreference resolution (PCR) and introduce VisPro, a large-scale dialogue PCR dataset, to investigate whether and how visual information can help resolve pronouns in dialogues. We then propose a novel visual-aware PCR model, VisCoref, for this task and conduct comprehensive experiments and case studies on our dataset. Results demonstrate the importance of visual information in this PCR setting and show the effectiveness of the proposed model.

research
06/04/2019

The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue

This paper introduces the PhotoBook dataset, a large-scale collection of...
research
10/01/2021

TEACh: Task-driven Embodied Agents that Chat

Robots operating in human spaces must be able to engage in natural langu...
research
09/10/2021

Exophoric Pronoun Resolution in Dialogues with Topic Regularization

Resolving pronouns to their referents has long been studied as a fundame...
research
03/14/2021

Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images

Grounding referring expressions in RGBD image has been an emerging field...
research
03/12/2019

Scaling Multi-Domain Dialogue State Tracking via Query Reformulation

We present a novel approach to dialogue state tracking and referring exp...
research
11/18/2019

An Annotated Corpus of Reference Resolution for Interpreting Common Grounding

Common grounding is the process of creating, repairing and updating mutu...
research
08/14/2019

FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension

Conversational machine comprehension requires deep understanding of the ...
