Variance-Aware Machine Translation Test Sets

11/07/2021
by   Runzhe Zhan, et al.
0

We release 70 small and discriminative test sets for machine translation (MT) evaluation called variance-aware test sets (VAT), covering 35 translation directions from WMT16 to WMT20 competitions. VAT is automatically created by a novel variance-aware filtering method that filters the indiscriminative test instances of the current MT test sets without any human labor. Experimental results show that VAT outperforms the original WMT test sets in terms of the correlation with human judgement across mainstream language pairs and test sets. Further analysis on the properties of VAT reveals the challenging linguistic features (e.g., translation of low-frequency words and proper nouns) for competitive MT systems, providing guidance for constructing future MT test sets. The test sets and the code for preparing variance-aware MT test sets are freely available at https://github.com/NLP2CT/Variance-Aware-MT-Test-Sets .

READ FULL TEXT
research
07/30/2021

Difficulty-Aware Machine Translation Evaluation

The high-quality translation results produced by machine translation (MT...
research
10/20/2014

Using Mechanical Turk to Build Machine Translation Evaluation Sets

Building machine translation (MT) test sets is a relatively expensive ta...
research
10/12/2020

It's not a Non-Issue: Negation as a Source of Error in Machine Translation

As machine translation (MT) systems progress at a rapid pace, questions ...
research
03/19/2019

compare-mt: A Tool for Holistic Comparison of Language Generation Systems

In this paper, we describe compare-mt, a tool for holistic analysis and ...
research
05/26/2021

The statistical advantage of automatic NLG metrics at the system level

Estimating the expected output quality of generation systems is central ...
research
05/04/2022

Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance

Human-translated text displays distinct features from naturally written ...
research
10/13/2021

Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits

Training data for machine translation (MT) is often sourced from a multi...

Please sign up or login with your details

Forgot password? Click here to reset