The JDDC 2.0 Corpus: A Large-Scale Multimodal Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service

09/27/2021
by Nan Zhao, et al.

With the development of the Internet, more and more people have become accustomed to online shopping. When communicating with customer service, users may express their requirements through text, images, and videos, so automatic customer service systems need to understand this multimodal information. Images usually serve to distinguish product models or to indicate product failures, both of which play important roles in the E-commerce scenario. On the other hand, the detailed information that images provide is limited, and customer service systems typically cannot understand users' intents without the accompanying text. Bridging the gap between image and text is therefore crucial for the multimodal dialogue task. To address this problem, we construct JDDC 2.0, a large-scale multimodal multi-turn dialogue dataset collected from a mainstream Chinese E-commerce platform (JD.com), containing about 246 thousand dialogue sessions, 3 million utterances, and 507 thousand images, along with product knowledge bases and image category annotations. We also present the solutions of the top-5 teams that participated in the JDDC multimodal dialogue challenge based on this dataset, which provide valuable insights for further research on the multimodal dialogue task.

research
11/22/2019

The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service

Human conversations in real scenarios are complicated and building a hum...
research
08/31/2022

Unified Knowledge Prompt Pre-training for Customer Service Dialogues

Dialogue bots have been widely applied in customer service scenarios to ...
research
04/18/2021

DCH-2: A Parallel Customer-Helpdesk Dialogue Corpus with Distributions of Annotators' Labels

We introduce a data set called DCH-2, which contains 4,390 real customer...
research
12/08/2016

Discovering Conversational Dependencies between Messages in Dialogs

We investigate the task of inferring conversational dependencies between...
research
06/30/2015

The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems

This paper introduces the Ubuntu Dialogue Corpus, a dataset containing a...
research
05/12/2015

Turn Segmentation into Utterances for Arabic Spontaneous Dialogues and Instance Messages

Text segmentation task is an essential processing task for many of Natur...
