
LEPOR: An Augmented Machine Translation Evaluation Metric

by Aaron Li-Feng Han, et al.

Machine translation (MT) has become one of the most active research topics in the natural language processing (NLP) literature. One important issue in MT is how to evaluate an MT system reasonably and determine whether the translation system has actually improved. Traditional manual judgment methods are expensive, time-consuming, unrepeatable, and sometimes suffer from low inter-annotator agreement. On the other hand, the popular automatic MT evaluation methods have several weaknesses. Firstly, they tend to perform well on language pairs with English as the target language, but poorly when English is the source. Secondly, some methods rely on many additional linguistic features to achieve good performance, which makes the metrics difficult to replicate and to apply to other language pairs. Thirdly, some popular metrics rely on an incomplete set of factors, which results in low performance on some practical tasks. In this thesis, to address these problems, we design novel MT evaluation methods and investigate their performance on different languages. Firstly, we design augmented factors to yield highly accurate evaluation. Secondly, we design a tunable evaluation model in which the weighting of factors can be optimised according to the characteristics of each language. Thirdly, in the enhanced version of our methods, we design concise linguistic features using part-of-speech (POS) information to show that our methods can yield even higher performance when some external linguistic resources are used. Finally, we report the practical performance of our metrics in the ACL-WMT workshop shared tasks, which shows that the proposed methods are robust across different languages.
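As a rough illustration of the factor-combination design described above, the following Python sketch computes a simplified LEPOR-style score from three components: a sentence-length penalty, a word-position-difference penalty, and a weighted harmonic mean of unigram precision and recall. The function names, the first-match position heuristic, and the default alpha/beta weights are illustrative assumptions for this sketch, not the thesis's exact formulation.

```python
import math
from collections import Counter

def length_penalty(hyp_len, ref_len):
    # Penalise hypotheses that are shorter or longer than the reference.
    if hyp_len == ref_len:
        return 1.0
    if hyp_len < ref_len:
        return math.exp(1 - ref_len / hyp_len)
    return math.exp(1 - hyp_len / ref_len)

def position_penalty(hyp, ref):
    # Normalised position difference: for each hypothesis token that also
    # appears in the reference, compare its relative position in the two
    # sentences; unmatched tokens contribute the maximum difference of 1.
    # (Real LEPOR uses a context-aware alignment; first match is a
    # simplification here.)
    total = 0.0
    for i, tok in enumerate(hyp):
        if tok in ref:
            j = ref.index(tok)
            total += abs(i / len(hyp) - j / len(ref))
        else:
            total += 1.0
    npd = total / len(hyp)
    return math.exp(-npd)

def harmonic_precision_recall(hyp, ref, alpha=1.0, beta=1.0):
    # Weighted harmonic mean of unigram precision and recall,
    # computed over the multiset of shared tokens.
    common = sum((Counter(hyp) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(hyp)
    recall = common / len(ref)
    return (alpha + beta) * precision * recall / (alpha * precision + beta * recall)

def lepor_score(hyp_tokens, ref_tokens):
    # Overall score: product of the three factors.
    return (length_penalty(len(hyp_tokens), len(ref_tokens))
            * position_penalty(hyp_tokens, ref_tokens)
            * harmonic_precision_recall(hyp_tokens, ref_tokens))

hyp = "the cat sat on the mat".split()
ref = "the cat sat on a mat".split()
print(round(lepor_score(hyp, ref), 3))  # → 0.746
```

The tunable weights mentioned in the abstract correspond, in this simplified form, to the alpha/beta parameters of the harmonic mean; the thesis further tunes the weighting among the factors themselves per language pair.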



