Macro-Average: Rare Types Are Important Too

04/12/2021
by   Thamme Gowda, et al.
2

While traditional corpus-level evaluation metrics for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy. Model-based MT metrics trained on segment-level human judgments have emerged as an attractive replacement due to strong correlation results. These models, however, require potentially expensive re-training for new domains and languages. Furthermore, their decisions are inherently non-transparent and appear to reflect unwelcome biases. We explore the simple type-based classifier metric, MacroF1, and study its applicability to MT evaluation. We find that MacroF1 is competitive on direct assessment, and outperforms others in indicating downstream cross-lingual information retrieval task performance. Further, we show that MacroF1 can be used to effectively compare supervised and unsupervised neural machine translation, and reveal significant qualitative differences in the methods' outputs.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/20/2022

Extrinsic Evaluation of Machine Translation Metrics

Automatic machine translation (MT) metrics are widely used to distinguis...
research
09/18/2020

COMET: A Neural Framework for MT Evaluation

We present COMET, a neural framework for training multilingual machine t...
research
02/21/2022

USCORE: An Effective Approach to Fully Unsupervised Evaluation Metrics for Machine Translation

The vast majority of evaluation metrics for machine translation are supe...
research
01/31/2023

Machine Translation Impact in E-commerce Multilingual Search

Previous work suggests that performance of cross-lingual information ret...
research
06/13/2023

Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment

Cross-lingual Machine Translation (MT) quality estimation plays a crucia...
research
12/20/2022

IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages

The rapid growth of machine translation (MT) systems has necessitated co...
research
09/15/2021

Regressive Ensemble for Machine Translation Quality Evaluation

This work introduces a simple regressive ensemble for evaluating machine...

Please sign up or login with your details

Forgot password? Click here to reset