Multimodal Dialogs (MMD): A large-scale dataset for studying multimodal domain-aware conversations

04/01/2017
by   Amrita Saha, et al.

While multimodal conversation agents are gaining importance in several domains such as retail and travel, deep learning research in this area has been limited, primarily due to the lack of large-scale, open chatlogs. To overcome this bottleneck, we introduce the task of multimodal, domain-aware conversations and propose the MMD benchmark dataset. The dataset was gathered in close coordination with a large number of domain experts in the retail domain and consists of over 150K conversation sessions between shoppers and sales agents, with over 6.5 million utterances. With this dataset, we propose 5 new sub-tasks for multimodal conversations along with their evaluation methodology. We also propose two novel multimodal neural models in the encode-attend-decode paradigm and demonstrate their performance on two of the sub-tasks, namely text response generation and best image response selection. These experiments establish baseline performance and open new research directions for each of these sub-tasks.


research
09/04/2021

Towards Expressive Communication with Internet Memes: A New Multimodal Conversation Dataset and Benchmark

As a kind of new expression elements, Internet memes are popular and ext...
research
05/01/2018

Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems

Intelligent personal assistant systems with either text-based or voice-b...
research
04/18/2021

SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations

We present a new corpus for the Situated and Interactive Multimodal Conv...
research
12/07/2020

A Taxonomy of Empathetic Response Intents in Human Social Conversations

Open-domain conversational agents or chatbots are becoming increasingly ...
research
06/02/2020

Situated and Interactive Multimodal Conversations

Next generation virtual assistants are envisioned to handle multimodal i...
research
10/23/2022

McQueen: a Benchmark for Multimodal Conversational Query Rewrite

The task of query rewrite aims to convert an in-context query to its ful...
research
01/18/2021

MONAH: Multi-Modal Narratives for Humans to analyze conversations

In conversational analyses, humans manually weave multimodal information...
