MULTI3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue

12/20/2022
by   Nikita Moghe, et al.

Task-oriented dialogue (TOD) systems have been applied in a range of domains to help human users achieve specific goals. Systems are typically constructed for a single domain or language and do not generalise well beyond it. Their extension to other languages in particular is restricted by the lack of available training data for many of the world's languages. To support work on Natural Language Understanding (NLU) in TOD across multiple languages and domains simultaneously, we constructed MULTI3NLU++, a multilingual, multi-intent, multi-domain dataset. MULTI3NLU++ extends the English-only NLU++ dataset with manual translations into a range of high-, medium- and low-resource languages (Spanish, Marathi, Turkish and Amharic) in two domains (banking and hotels). MULTI3NLU++ inherits the multi-intent property of NLU++, where an utterance may be labelled with multiple intents, providing a more realistic representation of a user's goals and aligning with the more complex tasks that commercial systems aim to model. We use MULTI3NLU++ to benchmark state-of-the-art multilingual language models, as well as Machine Translation and Question Answering systems, on the NLU task of intent detection for TOD systems in the multilingual setting. The results demonstrate the challenging nature of the dataset, particularly in the low-resource language setting.
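To make the multi-intent property concrete, the following is a minimal sketch of how an utterance carrying several intents at once might be encoded and scored. The intent names, the `to_multi_hot` and `micro_f1` helpers, and the choice of micro-averaged F1 are all assumptions made for illustration; none of them is taken from the abstract or confirmed as the dataset's actual label set or evaluation protocol.

```python
from typing import Iterable, List

# Hypothetical intent inventory, for illustration only
# (not the dataset's actual label set).
INTENTS: List[str] = ["change", "booking", "cancel", "card", "when"]


def to_multi_hot(labels: Iterable[str], inventory: List[str] = INTENTS) -> List[int]:
    """Encode a set of intent labels as a 0/1 vector over the inventory."""
    active = set(labels)
    unknown = active - set(inventory)
    if unknown:
        raise ValueError(f"unknown intents: {sorted(unknown)}")
    return [1 if intent in active else 0 for intent in inventory]


def micro_f1(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Micro-averaged F1 over every (utterance, intent) decision."""
    tp = fp = fn = 0
    for gold_row, pred_row in zip(gold, pred):
        for g, p in zip(gold_row, pred_row):
            if g == 1 and p == 1:
                tp += 1
            elif g == 0 and p == 1:
                fp += 1
            elif g == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# One utterance labelled with two intents; the prediction recovers only one.
gold = [to_multi_hot(["cancel", "booking"])]
pred = [to_multi_hot(["cancel"])]
score = micro_f1(gold, pred)
```

Under this encoding a single-label classifier cannot represent the gold annotation at all, which is why multi-intent data is usually treated as a multi-label problem with one independent decision per intent.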


