
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts

by Ece Takmaz, et al.

Dialogue participants often refer to entities or situations repeatedly within a conversation, which contributes to its cohesiveness. Subsequent references exploit the common ground accumulated by the interlocutors and hence have several interesting properties: they tend to be shorter and to reuse expressions that were effective in previous mentions. In this paper, we tackle the generation of first and subsequent references in visually grounded dialogue. We propose a generation model that produces referring utterances grounded in both the visual and the conversational context. To assess the referring effectiveness of its output, we also implement a reference resolution system. Our experiments and analyses show that the model produces better, more effective referring utterances than a model not grounded in the dialogue context, and generates subsequent references that exhibit linguistic patterns akin to those of humans.




DIALKI: Knowledge Identification in Conversational Systems through Dialogue-Document Contextualization

Identifying relevant knowledge to be used in conversational systems that...

The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue

This paper introduces the PhotoBook dataset, a large-scale collection of...

Reference-Centric Models for Grounded Collaborative Dialogue

We present a grounded neural dialogue model that successfully collaborat...

Learning to Map Context-Dependent Sentences to Executable Formal Queries

We propose a context-dependent model to map utterances within an interac...

Reference Resolution and Context Change in Multimodal Situated Dialogue for Exploring Data Visualizations

Reference resolution, which aims to identify entities being referred to ...

Joint Retrieval and Generation Training for Grounded Text Generation

Recent advances in large-scale pre-training such as GPT-3 allow seemingl...

Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations

A major challenge in visually grounded language generation is to build r...