MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling

by Paweł Budzianowski, et al.
University of Cambridge

Although machine learning has come to dominate dialogue research, real breakthroughs have been held back by the limited scale of available data. To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully labelled collection of human-human written conversations spanning multiple domains and topics. At 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora. Apart from the open-sourced dataset, which is labelled with dialogue belief states and dialogue actions, the contribution of this work is two-fold. First, we provide a detailed description of the data collection procedure, along with a summary of the data structure and analysis; the proposed data-collection pipeline is based entirely on crowd-sourcing and requires no professional annotators. Second, we report a set of benchmark results for belief tracking, dialogue act and response generation, which demonstrate the usability of the data and set a baseline for future studies.


Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing

Robust dialogue belief tracking is a key component in maintaining good q...

MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines

MultiWOZ is a well-known task-oriented dialogue dataset containing over ...

Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking

Annotating task-oriented dialogues is notorious for the expensive and di...

A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets

In recent years, interest has arisen in using machine learning to improv...

SalesBot: Transitioning from Chit-Chat to Task-Oriented Dialogues

Dialogue systems are usually categorized into two types, open-domain and...

ScriptWriter: Narrative-Guided Script Generation

It is appealing to have a system that generates a story or scripts autom...

Code Repositories


A parser of the Multi-Domain Wizard-of-Oz dataset (MultiWOZ)
