MeetUp! A Corpus of Joint Activity Dialogues in a Visual Environment

07/11/2019
by Nikolai Ilinykh, et al.

Building computer systems that can converse about their visual environment is one of the oldest concerns of research in Artificial Intelligence and Computational Linguistics (see, for example, Winograd's 1972 SHRDLU system). Only recently, however, have methods from computer vision and natural language processing become powerful enough to make this goal seem attainable. Driven especially by developments in computer vision, many datasets and collection environments have recently been published that bring together verbal interaction and visual processing. Here, we argue that these datasets tend to oversimplify the dialogue part, and we propose a task, MeetUp!, that requires both visual and conversational grounding and that makes stronger demands on representations of the discourse. MeetUp! is a two-player coordination game in which players move through a visual environment with the objective of finding each other. To do so, they must talk about what they see and reach mutual understanding. We describe a data collection and show that the resulting dialogues do exhibit the dialogue phenomena of interest, while also challenging the language and vision aspect.


