CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation

11/21/2022
by Yinpei Dai, et al.

Practical dialog systems need to deal with various knowledge sources, noisy user expressions, and a shortage of annotated data. To better address these problems, we propose CGoDial, a new, challenging, and comprehensive Chinese benchmark for multi-domain Goal-oriented Dialog evaluation. It contains 96,763 dialog sessions and 574,949 dialog turns in total, covering three datasets with different knowledge sources: 1) a slot-based dialog (SBD) dataset with table-formed knowledge, 2) a flow-based dialog (FBD) dataset with tree-formed knowledge, and 3) a retrieval-based dialog (RBD) dataset with candidate-formed knowledge. To bridge the gap between academic benchmarks and spoken dialog scenarios, we either collect data from real conversations or add spoken features to existing datasets via crowd-sourcing. The proposed experimental settings combine training on either the entire training set or a few-shot training set with testing on either the standard test set or a hard test subset, which makes it possible to assess model capabilities in terms of general prediction, fast adaptability, and reliable robustness.
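The experimental settings described above form a two-by-two grid of training regimes and test sets. A minimal sketch of that grid follows; the label strings and the function name are illustrative assumptions and do not come from the CGoDial release.

```python
from itertools import product

# Illustrative labels only; these are not identifiers from the CGoDial release.
TRAIN_REGIMES = ("full", "few_shot")   # entire training set vs. few-shot training set
TEST_SETS = ("standard", "hard")       # standard test set vs. hard test subset


def evaluation_settings():
    """Return every (training regime, test set) pair named in the abstract."""
    return list(product(TRAIN_REGIMES, TEST_SETS))


if __name__ == "__main__":
    for train, test in evaluation_settings():
        print(f"train={train} test={test}")
```

Each of the four pairs corresponds to one train/test combination a model would be evaluated under.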
