ViWOZ: A Multi-Domain Task-Oriented Dialogue Systems Dataset For Low-resource Language

03/15/2022
by   Phi Nguyen Van, et al.

Most current task-oriented dialogue (ToD) systems, despite promising results, are designed for a handful of languages such as Chinese and English. Their performance in low-resource languages therefore remains a significant problem, owing to the absence of a standard dataset and evaluation policy. To address this problem, we propose ViWOZ, a fully annotated Vietnamese task-oriented dialogue dataset. ViWOZ is the first multi-turn, multi-domain task-oriented dataset in Vietnamese, a low-resource language. The dataset consists of 5,000 dialogues comprising 60,946 fully annotated utterances. Furthermore, we provide a comprehensive benchmark of both modular and end-to-end models in low-resource language scenarios. With these characteristics, the ViWOZ dataset enables future studies on building multilingual task-oriented dialogue systems.

research
12/20/2022

MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue

Task-oriented dialogue (TOD) systems have been applied in a range of dom...
research
05/25/2023

Multijugate Dual Learning for Low-Resource Task-Oriented Dialogue System

Dialogue data in real scenarios tend to be sparsely available, rendering...
research
02/27/2021

A Simple But Effective Approach to n-shot Task-Oriented Dialogue Augmentation

The collection and annotation of task-oriented conversational data is a ...
research
10/28/2020

Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation

Utterance classification performance in low-resource dialogue systems is...
research
05/12/2022

A Chit-Chats Enhanced Task-Oriented Dialogue Corpora for Fuse-Motive Conversation Systems

The goal of building intelligent dialogue systems has largely been separ...
research
12/23/2021

Investigating Effect of Dialogue History in Multilingual Task Oriented Dialogue Systems

While the English virtual assistants have achieved exciting performance ...
research
04/07/2023

BenCoref: A Multi-Domain Dataset of Nominal Phrases and Pronominal Reference Annotations

Coreference Resolution is a well studied problem in NLP. While widely st...
