It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information

05/05/2020
by Emanuele Bugliarello, et al.

The performance of neural machine translation systems is commonly evaluated in terms of BLEU. However, due to its reliance on target language properties and generation, the BLEU metric does not allow an assessment of which translation directions are more difficult to model. In this paper, we propose cross-mutual information (XMI): an asymmetric information-theoretic metric of machine translation difficulty that exploits the probabilistic nature of most neural machine translation models. XMI allows us to better evaluate the difficulty of translating text into the target language while controlling for the difficulty of the target-side generation component independent of the translation task. We then present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems. Code for replicating our experiments is available online at https://github.com/e-bug/nmt-difficulty.
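The idea of controlling for target-side generation difficulty can be illustrated with a small sketch. The abstract does not spell out the formula, so the code below assumes XMI is the cross-entropy of a target-only language model minus the cross-entropy of the translation model on the same target sentences, both in bits. The function names and the input format (one natural-log probability per sentence from each model) are hypothetical and are not taken from the authors' repository.

```python
import math
from typing import Sequence


def cross_entropy_bits(sentence_logprobs: Sequence[float]) -> float:
    """Average negative log-probability per sentence, converted from nats to bits."""
    if not sentence_logprobs:
        raise ValueError("need at least one sentence log-probability")
    return -sum(sentence_logprobs) / (len(sentence_logprobs) * math.log(2))


def xmi(lm_logprobs: Sequence[float], mt_logprobs: Sequence[float]) -> float:
    """Assumed definition: H_LM(T) - H_MT(T | S), estimated on held-out text.

    lm_logprobs: log q_LM(t) for each target sentence t (target-only model).
    mt_logprobs: log q_MT(t | s) for the same sentences given their sources.
    """
    if len(lm_logprobs) != len(mt_logprobs):
        raise ValueError("both models must score the same target sentences")
    return cross_entropy_bits(lm_logprobs) - cross_entropy_bits(mt_logprobs)


# Toy scores: the language model assigns each sentence probability 0.25,
# the translation model (which also sees the source) assigns 0.5.
lm_scores = [math.log(0.25), math.log(0.25)]
mt_scores = [math.log(0.5), math.log(0.5)]
print(xmi(lm_scores, mt_scores))
```

Under this assumed definition, a larger value means the source sentence removes more uncertainty about the target, so the direction is easier to translate. Swapping the source and target languages generally gives a different value, which is what makes the metric asymmetric.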

