DiaBLa: A Corpus of Bilingual Spontaneous Written Dialogues for Machine Translation

05/30/2019
by Rachel Bawden, et al.

We present a new English-French test set for the evaluation of Machine Translation (MT) for informal, written bilingual dialogue. The test set contains 144 spontaneous dialogues (5,700+ sentences) between native English and French speakers, mediated by one of two neural MT systems in a range of role-play settings. The dialogues are accompanied by fine-grained sentence-level judgments of MT quality, produced by the dialogue participants themselves, as well as by manually normalised versions and reference translations produced a posteriori. The motivation for the corpus is two-fold: to provide (i) a unique resource for evaluating MT models, and (ii) a corpus for the analysis of MT-mediated communication. We provide a preliminary analysis of the corpus to confirm that the participants' judgments reveal perceptible differences in MT quality between the two MT systems used.

