Cross-lingual Data Augmentation for Document-grounded Dialog Systems in Low Resource Languages

05/24/2023
by Qi Gou, et al.

This paper proposes a framework to address data scarcity in Document-Grounded Dialogue Systems (DGDS). Our model leverages high-resource languages to improve dialogue generation in low-resource languages. Specifically, we present a novel pipeline, CLEM (Cross-Lingual Enhanced Model), which combines retrieval with adversarial training (a Retriever and a Re-ranker) and a FiD (Fusion-in-Decoder) generator. To further leverage high-resource languages, we also propose an architecture that aligns different languages through translated training. Extensive experimental results demonstrate the effectiveness of our model, which achieved 4th place in the DialDoc 2023 Competition. CLEM can therefore serve as a solution to resource scarcity in DGDS and provide useful guidance for multilingual alignment tasks.
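The data flow of a retrieve, re-rank, then generate pipeline of the kind named in the abstract can be sketched as below. This is a toy illustration and not the authors' implementation: the token-overlap scores stand in for learned Retriever and Re-ranker models, adversarial training and cross-lingual alignment are omitted, and every function name (`overlap_score`, `retrieve`, `rerank`, `fid_inputs`) is hypothetical. The last step only builds the per-passage inputs that a Fusion-in-Decoder generator would encode separately before its decoder attends over all of them jointly.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def overlap_score(query, passage):
    """Toy relevance score: number of tokens shared by query and passage.
    A stand-in for a learned retriever score, not the paper's model."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    return sum((q & p).values())


def retrieve(query, passages, k):
    """Stage 1: keep the k passages with the highest coarse score."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def rerank(query, candidates, k):
    """Stage 2: rescore the retrieved candidates with a finer (hypothetical)
    score, here token overlap normalised by passage length."""
    def fine_score(p):
        return overlap_score(query, p) / (len(tokenize(p)) or 1)
    return sorted(candidates, key=fine_score, reverse=True)[:k]


def fid_inputs(query, passages):
    """Stage 3 input: one string per passage, each paired with the query.
    In Fusion-in-Decoder these are encoded independently and fused in the decoder."""
    return [f"question: {query} context: {p}" for p in passages]


if __name__ == "__main__":
    docs = [
        "how to renew a passport online",
        "weather today is sunny",
        "passport renewal fees and forms to renew",
    ]
    question = "renew passport"
    top = rerank(question, retrieve(question, docs, k=2), k=1)
    for line in fid_inputs(question, top):
        print(line)
```

Splitting retrieval into a cheap first stage and a finer second stage keeps the expensive scoring limited to a few candidates; in a real system both scorers would be trained models.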


research
09/06/2019

A systematic comparison of methods for low-resource dependency parsing on genuinely low-resource languages

Parsers are available for only a handful of the world's languages, since...
research
01/27/2021

An Empirical Study of Cross-Lingual Transferability in Generative Dialogue State Tracker

There has been a rapid development in data-driven task-oriented dialogue...
research
12/15/2021

DG2: Data Augmentation Through Document Grounded Dialogue Generation

Collecting data for training dialog systems can be extremely expensive d...
research
04/17/2022

Ìtàkúròso: Exploiting Cross-Lingual Transferability for Natural Language Generation of Dialogues in Low-Resource, African Languages

We investigate the possibility of cross-lingual transfer from a state-of...
research
05/02/2023

Turning Flowchart into Dialog: Plan-based Data Augmentation for Low-Resource Flowchart-grounded Troubleshooting Dialogs

Flowchart-grounded troubleshooting dialogue (FTD) systems, which follow ...
research
06/28/2021

Efficient Dialogue State Tracking by Masked Hierarchical Transformer

This paper describes our approach to DSTC 9 Track 2: Cross-lingual Multi...
research
05/19/2020

Cross-lingual Transfer Learning for Dialogue Act Recognition

This paper deals with cross-lingual transfer learning for dialogue act (...
