
Findings of the First Shared Task on Machine Translation Robustness

06/27/2019
by Xian Li, et al.

We share the findings of the first shared task on improving the robustness of Machine Translation (MT). The task provides a testbed representing the challenges facing MT models deployed in the real world, and facilitates new approaches to improving models' robustness to noisy input and domain mismatch. We focus on two language pairs (English-French and English-Japanese), and the submitted systems are evaluated on a blind test set consisting of noisy comments from Reddit and professionally sourced translations. In its first edition, the task received 23 submissions from 11 participating teams from universities, companies, national labs, etc. All submitted systems achieved large improvements over the baselines, with the largest gain being +22.33 BLEU. We evaluated submissions with both human judgment and automatic evaluation (BLEU), and the two show high correlation (Pearson's r = 0.94 and 0.95). Furthermore, we conducted a qualitative analysis of the submitted systems using compare-mt, which revealed salient differences in how they handle the challenges of this task. Such analysis provides additional insight in the occasional cases where human judgment and BLEU disagree, e.g. systems better at producing colloquial expressions received higher scores from human judges.
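As a rough illustration of how agreement between BLEU and human judgment can be quantified, the sketch below computes Pearson's r over per-system scores. The function name `pearson_r` and the score lists are hypothetical placeholders chosen for illustration; they are not the task's data or the organizers' evaluation code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values only (assumption): one entry per submitted system.
    bleu_scores = [10.0, 20.0, 30.0, 40.0]
    human_scores = [2.0, 2.5, 3.5, 4.0]
    print(f"Pearson's r = {pearson_r(bleu_scores, human_scores):.2f}")
```

A value close to 1 means the two metrics rank and space the systems similarly; it does not rule out disagreement on individual systems, which is where a qualitative analysis becomes useful.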

07/15/2019

Naver Labs Europe's Systems for the WMT19 Machine Translation Robustness Task

This paper describes the systems that we submitted to the WMT19 Machine ...
09/02/2018

MTNT: A Testbed for Machine Translation of Noisy Text

Noisy or non-standard input text can cause disastrous mistranslations in...
11/30/2022

Findings of the WMT 2022 Shared Task on Translation Suggestion

We report the result of the first edition of the WMT shared task on Tran...
10/31/2019

Machine Translation of Restaurant Reviews: New Corpus for Domain Adaptation and Robustness

We share a French-English parallel corpus of Foursquare restaurant revie...
04/30/2020

Explicit Representation of the Translation Space: Automatic Paraphrasing for Machine Translation Evaluation

Following previous work on automatic paraphrasing, we assess the feasibi...
01/30/2019

Reference-less Quality Estimation of Text Simplification Systems

The evaluation of text simplification (TS) systems remains an open chall...
05/19/2016

Automatic TM Cleaning through MT and POS Tagging: Autodesk's Submission to the NLP4TM 2016 Shared Task

We describe a machine learning based method to identify incorrect entrie...