Train, Sort, Explain: Learning to Diagnose Translation Models

03/28/2019
by Robert Schwarzenberg, et al.

Evaluating translation models is a trade-off between effort and detail. At one end of the spectrum are automatic count-based methods such as BLEU; at the other end are linguistic evaluations by humans, which are arguably more informative but also require disproportionately high effort. To narrow the spectrum, we propose a general approach for automatically exposing systematic differences between human and machine translations to human experts. Inspired by adversarial settings, we train a neural text classifier to distinguish human from machine translations. A classifier that performs and generalizes well after training should recognize systematic differences between the two classes, which we uncover with neural explainability methods. Our proof-of-concept implementation, DiaMaT, is open source. Applied to a dataset translated by a state-of-the-art neural Transformer model, DiaMaT achieves a classification accuracy of 75% and uncovers systematic differences between humans and the Transformer, a relevant finding amidst the current discussion about human parity.
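To make the train/sort/explain idea concrete, here is a minimal sketch, not the DiaMaT implementation: it trains a small PyTorch classifier to separate human from machine translations, sorts sentences by the classifier's confidence that they are machine-made, and uses a simple gradient-times-embedding attribution as a stand-in for the neural explainability methods mentioned above, so the tokens driving the decision can be surfaced to a human expert. The BagClassifier model, the toy four-sentence corpus, and the attribution choice are all illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the train / sort / explain pipeline on toy data.
# NOTE: illustrative only; the classifier, corpus and attribution method
# below are assumptions, not the DiaMaT implementation.
import torch
import torch.nn as nn


class BagClassifier(nn.Module):
    """Bag-of-embeddings classifier: class 0 = human translation, class 1 = machine translation."""

    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, 2)

    def forward(self, token_ids):
        e = self.emb(token_ids)             # (seq_len, dim)
        return self.out(e.mean(dim=0)), e   # logits plus embeddings for attribution


# Toy corpus of (sentence, label) pairs; label 1 marks a synthetic "machine" translation.
corpus = [
    ("the committee adopted the report unanimously", 0),
    ("the committee has adopted unanimously the report", 1),
    ("we welcome the proposal of the commission", 0),
    ("we welcome the proposal of of the commission", 1),
]
vocab = {w: i for i, w in enumerate(sorted({w for s, _ in corpus for w in s.split()}))}
encode = lambda s: torch.tensor([vocab[w] for w in s.split()])

model = BagClassifier(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

# 1) Train: learn to distinguish human from machine translations.
for _ in range(200):
    for sent, label in corpus:
        logits, _ = model(encode(sent))
        loss = loss_fn(logits.unsqueeze(0), torch.tensor([label]))
        opt.zero_grad()
        loss.backward()
        opt.step()

# 2) Sort: rank sentences by the classifier's confidence that they are machine-made.
scored = []
for sent, label in corpus:
    logits, _ = model(encode(sent))
    scored.append((torch.softmax(logits, dim=-1)[1].item(), sent, label))
scored.sort(reverse=True)

# 3) Explain: gradient-times-embedding relevance for the top-ranked sentence,
#    so a human expert can see which tokens drive the "machine" decision.
prob, sent, _ = scored[0]
logits, emb = model(encode(sent))
emb.retain_grad()
logits[1].backward()
relevance = (emb.grad * emb).sum(dim=-1)    # one relevance score per token
print(f"p(machine) = {prob:.2f} for: {sent!r}")
for tok, r in zip(sent.split(), relevance.tolist()):
    print(f"  {tok:12s} {r:+.3f}")
```

In practice one would train on a large parallel corpus, check generalization on a held-out split, and inspect the highest-confidence machine-labelled sentences first, since that is where systematic differences between the two classes are most likely to surface.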

Related research

08/30/2018: Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation
We reassess a recent study (Hassan et al., 2018) that claimed that machi...

03/15/2018: Achieving Human Parity on Automatic Chinese to English News Translation
Machine translation has made rapid advances in recent years. Millions of...

11/04/2019: Analysing Coreference in Transformer Outputs
We analyse coreference phenomena in three neural machine translation sys...

06/22/2022: Comparing Formulaic Language in Human and Machine Translation: Insight from a Parliamentary Corpus
A recent study has shown that, compared to human translations, neural ma...

05/03/2018: A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation
We present an approach to interactive-predictive neural machine translat...

09/16/2021: Does Summary Evaluation Survive Translation to Other Languages?
The creation of a large summarization quality dataset is a considerable,...

05/06/2021: Quantitative Evaluation of Alternative Translations in a Corpus of Highly Dissimilar Finnish Paraphrases
In this paper, we present a quantitative evaluation of differences betwe...
