DuRecDial 2.0: A Bilingual Parallel Corpus for Conversational Recommendation

09/18/2021
by Zeming Liu, et al.

In this paper, we present DuRecDial 2.0, a bilingual parallel human-to-human recommendation dialog dataset that enables researchers to explore the challenging task of multilingual and cross-lingual conversational recommendation. Unlike existing conversational recommendation datasets, which are built in a single-language setting, each data item (Profile, Goal, Knowledge, Context, Response) in DuRecDial 2.0 is annotated in two languages, English and Chinese. We collect 8.2k dialogs aligned across English and Chinese (16.5k dialogs and 255k utterances in total), annotated by crowdsourced workers under a strict quality control procedure. We then build monolingual, multilingual, and cross-lingual conversational recommendation baselines on DuRecDial 2.0. Experimental results show that using additional English data improves performance on Chinese conversational recommendation, indicating the benefits of DuRecDial 2.0. Finally, this dataset provides a challenging testbed for future studies of monolingual, multilingual, and cross-lingual conversational recommendation.
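
To make the parallel annotation concrete, the five-part data item named in the abstract (Profile, Goal, Knowledge, Context, Response) could be represented roughly as follows. This is only an illustrative sketch: the class names, field names, field types, and the `is_aligned` check are assumptions made here, not the dataset's actual schema or release format, and the sample values are invented placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class DialogSide:
    """One language's view of a data item (all field names are hypothetical)."""
    profile: Dict[str, str]                 # user profile attributes
    goal: str                               # dialog goal description
    knowledge: List[Tuple[str, str, str]]   # assumed (subject, relation, object) triples
    context: List[str]                      # previous utterances
    response: str                           # utterance to be produced


@dataclass
class ParallelItem:
    """The same data item annotated in English and in Chinese."""
    en: DialogSide
    zh: DialogSide


def is_aligned(item: ParallelItem) -> bool:
    """Hypothetical sanity check: both sides have the same number of
    context turns and knowledge triples."""
    return (len(item.en.context) == len(item.zh.context)
            and len(item.en.knowledge) == len(item.zh.knowledge))


# Invented placeholder values, for illustration only.
example = ParallelItem(
    en=DialogSide({"name": "User A"}, "recommend a movie",
                  [("Movie X", "genre", "drama")],
                  ["Hello!"], "Would you like a drama?"),
    zh=DialogSide({"name": "用户A"}, "推荐电影",
                  [("电影X", "类型", "剧情")],
                  ["你好！"], "想看剧情片吗？"),
)
```

Keeping both languages inside one record, rather than in two separate files, is one way to make the dialog-level alignment explicit when building multilingual or cross-lingual training pairs.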

research
03/17/2020

XPersona: Evaluating Multilingual Personalized Chatbot

Personalized dialogue systems are an essential step toward better human-...
research
10/11/2019

BiPaR: A Bilingual Parallel Dataset for Multilingual and Cross-lingual Reading Comprehension on Novels

This paper presents BiPaR, a bilingual parallel novel-style machine read...
research
05/20/2022

Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog

Research on (multi-domain) task-oriented dialog (TOD) has predominantly ...
research
04/03/2023

Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning

Cross-lingual transfer of language models trained on high-resource langu...
research
04/17/2021

Crossing the Conversational Chasm: A Primer on Multilingual Task-Oriented Dialogue Systems

Despite the fact that natural language conversations with machines repre...
research
04/11/2022

Zero-shot Cross-lingual Conversational Semantic Role Labeling

While conversational semantic role labeling (CSRL) has shown its usefuln...
research
09/16/2020

Multilingual Music Genre Embeddings for Effective Cross-Lingual Music Item Annotation

Annotating music items with music genres is crucial for music recommenda...
