Towards Multilingual Automatic Dialogue Evaluation

08/31/2023
by John Mendonça, et al.

The main limiting factors in the development of robust multilingual dialogue evaluation metrics are the lack of multilingual data and the limited availability of open-source multilingual dialogue systems. In this work, we propose a workaround for this lack of data: leveraging a strong multilingual pretrained LLM and augmenting existing English dialogue data using Machine Translation (MT). We empirically show that the naive approach of finetuning a pretrained multilingual encoder model on translated data is insufficient to outperform the strong baseline of finetuning a multilingual model on source data only. Instead, the best approach is to carefully curate the translated data using MT Quality Estimation (QE) metrics, excluding low-quality translations that hinder the model's performance.
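The curation step described above can be sketched as a simple filter over translated examples. This is only an illustrative sketch, not the authors' implementation: it assumes each translated example already carries a QE score where higher means better, and that curation keeps a top-ranked fraction of the data. The names `TranslatedDialogue`, `filter_by_qe`, and `keep_fraction`, as well as the example sentences and scores, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TranslatedDialogue:
    source: str        # original English text
    translation: str   # machine-translated text
    qe_score: float    # precomputed QE score (assumed: higher is better)


def filter_by_qe(
    examples: List[TranslatedDialogue], keep_fraction: float
) -> List[TranslatedDialogue]:
    """Keep the top `keep_fraction` of examples, ranked by QE score.

    Ranking by fraction is an assumption made for this sketch; a fixed
    score threshold would be an equally plausible curation rule.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not examples:
        return []
    ranked = sorted(examples, key=lambda ex: ex.qe_score, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Hypothetical data: the scores are made up for illustration only.
examples = [
    TranslatedDialogue("How are you?", "Wie geht es dir?", 0.91),
    TranslatedDialogue("See you later.", "Sehen dich spaeter.", 0.42),
    TranslatedDialogue("I love hiking.", "Ich liebe Wandern.", 0.85),
    TranslatedDialogue("That's cool!", "Das ist kalt!", 0.30),
]

kept = filter_by_qe(examples, keep_fraction=0.5)
for ex in kept:
    print(f"{ex.qe_score:.2f}  {ex.translation}")
```

In this sketch only the retained examples would be added to the finetuning set alongside the English source data; how much translated data to keep is a tunable choice that the abstract does not specify.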

research
06/22/2023

Overview of Robust and Multilingual Automatic Evaluation Metrics for Open-Domain Dialogue Systems at DSTC 11 Track 4

The advent and fast development of neural networks have revolutionized t...
research
08/17/2020

BUT-FIT at SemEval-2020 Task 4: Multilingual commonsense

This paper describes work of the BUT-FIT's team at SemEval 2020 Task 4 -...
research
08/31/2023

Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation

Despite significant research effort in the development of automatic dial...
research
06/06/2021

The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

One of the biggest challenges hindering progress in low-resource and mul...
research
11/04/2021

Contextual Semantic Parsing for Multilingual Task-Oriented Dialogues

Robust state tracking for task-oriented dialogue systems currently remai...
research
07/03/2020

El Departamento de Nosotros: How Machine Translated Corpora Affects Language Models in MRC Tasks

Pre-training large-scale language models (LMs) requires huge amounts of ...
research
05/22/2023

The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning

Multilingual semantic parsing aims to leverage the knowledge from the hi...