MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines

by Xiaoxue Zang, et al.

MultiWOZ is a well-known task-oriented dialogue dataset containing over 10,000 annotated dialogues spanning 8 domains. It is extensively used as a benchmark for dialogue state tracking. However, recent works have reported the presence of substantial noise in the dialogue state annotations. MultiWOZ 2.1 identified and fixed many of these erroneous annotations and user utterances, resulting in an improved version of this dataset. This work introduces MultiWOZ 2.2, yet another improved version of this dataset. Firstly, we identify and fix dialogue state annotation errors across 17.3% of the utterances on top of MultiWOZ 2.1. Secondly, we redefine the ontology by disallowing vocabularies of slots with a large number of possible values (e.g., restaurant name, time of booking). In addition, we introduce slot span annotations for these slots to standardize them across recent models, which previously used custom string matching heuristics to generate them. We also benchmark a few state-of-the-art dialogue state tracking models on the corrected dataset to facilitate comparison for future work. Finally, we discuss best practices for dialogue data collection that can help avoid annotation errors.
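To make the idea of a slot span annotation concrete, the following is a minimal sketch of the kind of string matching heuristic the abstract says earlier models relied on. It locates a slot value in an utterance and records it as character offsets. The function name `find_span`, the `SlotSpan` field names, and the example utterance are illustrative assumptions and are not taken from the dataset's actual schema.

```python
from typing import Optional, TypedDict


class SlotSpan(TypedDict):
    # Field names are hypothetical, chosen for illustration only.
    slot: str
    value: str
    start: int          # inclusive character offset into the utterance
    exclusive_end: int  # exclusive character offset into the utterance


def find_span(utterance: str, slot: str, value: str) -> Optional[SlotSpan]:
    """Locate `value` in `utterance` by case-insensitive exact matching.

    Returns None when the value does not appear verbatim, which is the
    typical failure case of this kind of heuristic.
    """
    start = utterance.lower().find(value.lower())
    if start == -1:
        return None
    end = start + len(value)
    # Store the surface form exactly as it appears in the utterance.
    return {
        "slot": slot,
        "value": utterance[start:end],
        "start": start,
        "exclusive_end": end,
    }


utterance = "I need a table at Pizza Hut City Centre at 19:30."
span = find_span(utterance, "restaurant-name", "pizza hut city centre")
missing = find_span(utterance, "restaurant-name", "Nandos")
```

A heuristic like this fails on paraphrases and typos, and different models may resolve such cases differently, which is one reason shared span annotations help standardize comparisons.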


