KoBE: Knowledge-Based Machine Translation Evaluation

09/23/2020
by Zorik Gekhman et al.

We propose a simple and effective method for machine translation evaluation which does not require reference translations. Our approach is based on (1) grounding the entity mentions found in each source sentence and candidate translation against a large-scale multilingual knowledge base, and (2) measuring the recall of the grounded entities found in the candidate vs. those found in the source. Our approach achieves the highest correlation with human judgements on 9 out of the 18 language pairs from the WMT19 benchmark for evaluation without references, which is the largest number of wins for a single evaluation method on this task. On 4 language pairs, we also achieve higher correlation with human judgements than BLEU. To foster further research, we release a dataset containing 1.8 million grounded entity mentions across 18 language pairs from the WMT19 metrics track data.
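The core metric described above, recall of grounded entities in the candidate versus the source, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes entity linking has already been run and each sentence is represented as a list of knowledge-base identifiers (e.g. Wikidata QIDs), and the clipped-count matching is an assumption about the exact formulation.

```python
from collections import Counter

def entity_recall(source_entities, candidate_entities):
    """Fraction of grounded entity mentions in the source whose KB
    identifier also appears in the candidate translation.
    Both arguments are lists of KB entity IDs (e.g. Wikidata QIDs);
    repeated mentions are matched with clipped counts (an assumption,
    not necessarily the paper's exact formulation)."""
    if not source_entities:
        return 1.0  # no entities to preserve
    src = Counter(source_entities)
    cand = Counter(candidate_entities)
    matched = sum(min(count, cand[eid]) for eid, count in src.items())
    return matched / sum(src.values())

# Hypothetical example: the source mentions Paris (Q90) twice and
# France (Q142) once; the candidate preserves each entity once.
print(entity_recall(["Q90", "Q90", "Q142"], ["Q90", "Q142"]))
```

Because the comparison happens in the space of language-agnostic KB identifiers rather than surface strings, the score needs no reference translation: only the source sentence and the candidate are grounded.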


Related research

KG-BERTScore: Incorporating Knowledge Graph into BERTScore for Reference-Free Machine Translation Evaluation (01/30/2023)
BERTScore is an effective and robust automatic metric for reference-based...

BLEU Meets COMET: Combining Lexical and Neural Metrics Towards Robust Machine Translation Evaluation (05/30/2023)
Although neural-based machine translation evaluation metrics, such as CO...

BLEU might be Guilty but References are not Innocent (04/13/2020)
The quality of automatic metrics for machine translation has been increa...

DEEP: DEnoising Entity Pre-training for Neural Machine Translation (11/14/2021)
It has been shown that machine translation models usually generate poor ...

Consistent Human Evaluation of Machine Translation across Language Pairs (05/17/2022)
Obtaining meaningful quality scores for machine translation systems thro...

The ITU Faroese Pairs Dataset (06/17/2022)
This article documents a dataset of sentence pairs between Faroese and D...

Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation (04/29/2021)
Human evaluation of modern high-quality machine translation systems is a...
