How Grammatical is Character-level Neural Machine Translation? Assessing MT Quality with Contrastive Translation Pairs

12/14/2016
by   Rico Sennrich, et al.

Analysing translation quality with regard to specific linguistic phenomena has historically been difficult and time-consuming. Neural machine translation (NMT) has the attractive property that it can produce scores for arbitrary translations, and we propose a novel method to assess how well NMT systems model specific linguistic phenomena such as agreement over long distances, the production of novel words, and the faithful translation of polarity. The core idea is that we measure whether a reference translation is more probable under an NMT model than a contrastive translation which introduces a specific type of error. We present LingEval97, a large-scale data set of 97,000 contrastive translation pairs based on the WMT English→German translation task, with errors automatically created with simple rules. We report results for a number of systems, and find that recently introduced character-level NMT systems perform better at transliteration than models with byte-pair encoding (BPE) segmentation, but perform more poorly at morphosyntactic agreement and at translating discontiguous units of meaning.
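
The core idea in the abstract, checking whether a model scores the reference above a contrastive translation, can be illustrated with a minimal Python sketch. It is not the authors' code: the `ContrastivePair` type, the `contrastive_accuracy` function, the example sentences, and the scores in `TOY_SCORES` are all hypothetical, and `toy_score` merely stands in for whatever log-probability a real NMT system would assign to a translation given its source.

```python
from typing import Callable, Dict, Iterable, NamedTuple


class ContrastivePair(NamedTuple):
    """One test item (hypothetical structure, not the LingEval97 file format)."""
    source: str
    reference: str
    contrastive: str  # reference with one deliberately introduced error
    error_type: str


def contrastive_accuracy(
    score: Callable[[str, str], float],
    pairs: Iterable[ContrastivePair],
) -> Dict[str, float]:
    """Per error type, the fraction of pairs where the reference scores higher.

    `score(source, translation)` is assumed to return a model score such as a
    log-probability, where higher means more probable. Ties count as failures.
    """
    totals: Dict[str, int] = {}
    correct: Dict[str, int] = {}
    for pair in pairs:
        totals[pair.error_type] = totals.get(pair.error_type, 0) + 1
        if score(pair.source, pair.reference) > score(pair.source, pair.contrastive):
            correct[pair.error_type] = correct.get(pair.error_type, 0) + 1
    return {etype: correct.get(etype, 0) / n for etype, n in totals.items()}


# Made-up scores standing in for a real NMT model's log-probabilities.
TOY_SCORES = {
    "die Katzen schlafen": -1.0,
    "die Katzen schläft": -3.0,   # agreement error: model prefers the reference
    "ich weiß es nicht": -2.5,
    "ich weiß es": -2.0,          # polarity error: model wrongly prefers this
}


def toy_score(source: str, translation: str) -> float:
    """Hypothetical scorer: looks up a fixed score and ignores the source."""
    return TOY_SCORES[translation]


pairs = [
    ContrastivePair("the cats sleep", "die Katzen schlafen",
                    "die Katzen schläft", "agreement"),
    ContrastivePair("I do not know", "ich weiß es nicht",
                    "ich weiß es", "polarity"),
]

accuracy = contrastive_accuracy(toy_score, pairs)
print(accuracy)
```

With these invented scores the sketch reports an accuracy of 1.0 for the agreement pair and 0.0 for the polarity pair. Reporting accuracy per error type is what would let systems be compared phenomenon by phenomenon in an evaluation of this kind.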

research
08/08/2019

A Test Suite and Manual Evaluation of Document-Level NMT at WMT19

As the quality of machine translation rises and neural machine translati...
research
05/18/2018

Combining Advanced Methods in Japanese-Vietnamese Neural Machine Translation

Neural machine translation (NMT) systems have recently obtained state-of...
research
02/13/2018

Examining the Tip of the Iceberg: A Data Set for Idiom Translation

Neural Machine Translation (NMT) has been widely used in recent years wi...
research
11/04/2020

PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents

Neural Machine Translation (NMT) has shown drastic improvement in its qu...
research
07/17/2023

Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training

Supervised learning in Neural Machine Translation (NMT) typically follow...
research
05/26/2023

CODET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation

Neural machine translation (NMT) systems exhibit limited robustness in h...
research
03/03/2022

As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning

Omission and addition of content is a typical issue in neural machine tr...
