
Fine-grained human evaluation of neural versus phrase-based machine translation

by Filip Klubička, et al.
University of Zagreb

We compare three approaches to statistical machine translation (pure phrase-based, factored phrase-based and neural) by performing a fine-grained manual evaluation via error annotation of the systems' outputs. The error types in our annotation are compliant with the multidimensional quality metrics (MQM), and the annotation is performed by two annotators. Inter-annotator agreement is high for such a task, and results show that the best performing system (neural) reduces the errors produced by the worst system (phrase-based) by 54%.
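Inter-annotator agreement for categorical error annotation of this kind is commonly quantified with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The following is a minimal sketch of that computation; the error labels shown are hypothetical illustrations, not the paper's actual MQM annotations or agreement figures.

```python
from collections import Counter


def cohens_kappa(ann_a, ann_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the chance agreement implied by each annotator's
    marginal label distribution.
    """
    assert len(ann_a) == len(ann_b) and ann_a, "need paired, non-empty labels"
    n = len(ann_a)
    p_observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    freq_a, freq_b = Counter(ann_a), Counter(ann_b)
    p_expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical MQM-style error categories assigned by two annotators
# to six translation segments (illustrative data only).
annotator_1 = ["Accuracy", "Fluency", "Accuracy", "Terminology", "Fluency", "Accuracy"]
annotator_2 = ["Accuracy", "Fluency", "Fluency", "Terminology", "Fluency", "Accuracy"]

print(round(cohens_kappa(annotator_1, annotator_2), 3))  # → 0.739
```

Here 5 of 6 labels match (raw agreement 0.833), but chance agreement is 0.361, giving kappa ≈ 0.739, which is conventionally read as substantial agreement.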
