Investigating Effect of Dialogue History in Multilingual Task Oriented Dialogue Systems

12/23/2021
by Michael Sun, et al.

While English virtual assistants have achieved strong performance thanks to an enormous amount of training resources, the needs of non-English speakers have not been well served. As of December 2021, Alexa, one of the most popular smart speakers in the world, supports 9 languages [1], whereas there are thousands of languages in the world, 91 of which are spoken by more than 10 million people according to statistics published in 2019 [2]. Training a virtual assistant in a language other than English is often more difficult, especially for low-resource languages: the lack of high-quality training data restricts model performance and results in poor user satisfaction. We therefore devise an efficient and effective training solution for multilingual task-oriented dialogue systems, using the same dataset generation pipeline and end-to-end dialogue system architecture as BiToD [5]. BiToD adopts a minimalistic natural language design in which formal dialogue states are used in place of natural language inputs. This reduces the room for error introduced by weaker natural language models and ensures that the model can correctly extract the essential slot values needed for dialogue state tracking (DST). Our goal is to reduce the amount of natural language encoded at each turn, and the key parameter we investigate is the number of turns (H) fed to the model as history. We first look for the turning point at which increasing H begins to yield diminishing returns on overall performance. We then examine whether the examples that a model with small H gets wrong can be categorized in a way that lets the model be few-shot fine-tuned on them. Lastly, we explore the limitations of this approach and whether there is a certain type of example that it cannot resolve.
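To make the role of H concrete, the following is a minimal sketch of how a model input might be assembled from a formal dialogue state plus only the last H turns of history. The function name `build_model_input`, the `<state>` and `<history>` markers, and the slot serialization are illustrative assumptions, not the format used by BiToD or by the authors.

```python
from typing import Dict, List, Tuple

# A turn is a (speaker, utterance) pair; this representation is an assumption.
Turn = Tuple[str, str]


def build_model_input(state: Dict[str, str], turns: List[Turn], h: int) -> str:
    """Serialize the formal dialogue state followed by the last h turns.

    The marker tokens and the "slot = value" layout are hypothetical.
    """
    if h < 0:
        raise ValueError("h must be non-negative")
    # Keep only the most recent h turns; h == 0 means no natural language history.
    recent = turns[-h:] if h > 0 else []
    # Sort slots so the serialization is deterministic.
    state_str = " ; ".join(f"{slot} = {value}" for slot, value in sorted(state.items()))
    history_str = " ".join(f"<{speaker}> {utterance}" for speaker, utterance in recent)
    return f"<state> {state_str} <history> {history_str}".strip()


if __name__ == "__main__":
    dialogue = [
        ("user", "I want a cheap restaurant."),
        ("system", "Which area?"),
        ("user", "The north, please."),
    ]
    current_state = {"price": "cheap"}
    for h in (0, 1, 3):
        print(h, "->", build_model_input(current_state, dialogue, h))
```

With a small H the input consists almost entirely of the formal state, so the amount of natural language the model must encode per turn stays bounded regardless of dialogue length.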

