Removing Biases from Trainable MT Metrics by Using Self-Training

08/10/2015
by Miloš Stanojević, et al.

Most trainable machine translation (MT) metrics train their weights on human judgments of the outputs of state-of-the-art MT systems. This makes trainable metrics biased in many ways, one of which is a preference for longer translations. When used for tuning, these biased metrics have to evaluate a different type of translation: n-best lists of translations with very diverse quality. Systems tuned with these metrics tend to produce overly long translations that are preferred by the metric but not by humans. This is usually solved by manually tweaking the metric's weights to value recall and precision equally. Our solution is more general: (1) it addresses not only the recall bias but also all other biases that might be present in the data, and (2) it does not require any knowledge of the types of features used, which is useful in cases when manual tuning of the metric's weights is not possible. This is accomplished by self-training on unlabeled n-best lists, using a metric that was initially trained on standard human judgments. One way of looking at this is as domain adaptation from the domain of state-of-the-art MT translations to diverse n-best list translations.
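The self-training loop described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear metric over feature vectors, a perceptron-style pairwise ranking learner, random pair sampling from each n-best list, and a confidence margin for keeping pseudo-labels. All function names and parameters (`train_pairwise`, `pseudo_label`, `self_train`, `margin`, `pairs_per_list`) are hypothetical.

```python
import random


def dot(w, x):
    """Linear metric score: weighted sum of feature values."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise(pairs, dim, epochs=20, lr=0.1):
    """Perceptron-style ranking (an assumed learner, not the paper's).

    Each pair is (better_features, worse_features).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            if dot(w, better) - dot(w, worse) <= 0:  # misranked or tied
                for i in range(dim):
                    w[i] += lr * (better[i] - worse[i])
    return w


def pseudo_label(w, nbest_lists, margin=0.05, pairs_per_list=10, rng=None):
    """Label sampled n-best pairs with the current metric.

    Only pairs whose score difference exceeds `margin` are kept; the
    margin value is an assumption and depends on the scale of `w`.
    """
    rng = rng or random.Random(0)
    pairs = []
    for nbest in nbest_lists:
        if len(nbest) < 2:
            continue
        for _ in range(pairs_per_list):
            a, b = rng.sample(nbest, 2)
            diff = dot(w, a) - dot(w, b)
            if diff > margin:
                pairs.append((a, b))
            elif diff < -margin:
                pairs.append((b, a))
    return pairs


def self_train(human_pairs, nbest_lists, dim, rounds=3, margin=0.05):
    """Train on human judgments, then retrain on human + pseudo-labeled pairs."""
    w = train_pairwise(human_pairs, dim)
    for _ in range(rounds):
        pseudo = pseudo_label(w, nbest_lists, margin=margin)
        # Retrain from scratch on the union of labeled and pseudo-labeled data.
        w = train_pairwise(human_pairs + pseudo, dim)
    return w
```

Here `human_pairs` stands for pairwise human judgments of state-of-the-art system outputs and `nbest_lists` for unlabeled n-best lists, each hypothesis represented by its feature vector. Note that the learner never inspects what the features mean, which mirrors the abstract's point that no knowledge of the feature types is required.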
