IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages

12/20/2022
by Ananya B. Sai, et al.

The rapid growth of machine translation (MT) systems has necessitated comprehensive studies that meta-evaluate the evaluation metrics in use, enabling a better selection of the metrics that best reflect MT quality. Unfortunately, most of this research focuses on high-resource languages, mainly English, and its observations may not always apply to other languages. Indian languages, which have over a billion speakers, are linguistically different from English, and to date there has been no systematic study of how to evaluate MT systems translating from English into Indian languages. In this paper, we fill this gap by creating an MQM dataset consisting of 7000 fine-grained annotations, spanning 5 Indian languages and 7 MT systems, and use it to establish correlations between annotator scores and the scores obtained from existing automatic metrics. Our results show that pre-trained metrics, such as COMET, have the highest correlations with annotator scores. Additionally, we find that the metrics do not adequately capture fluency-based errors in Indian languages, and that there is a need to develop metrics focused on Indian languages. We hope that our dataset and analysis will help promote further research in this area.
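
To make the idea of "establishing correlations between annotator scores and automatic metric scores" concrete, the following is a minimal sketch of segment-level meta-evaluation. It is not the authors' pipeline: the abstract does not say which correlation coefficients are used, so Pearson's r and Kendall's tau are assumed here because both are common choices in metric meta-evaluation. The score lists are hypothetical toy values, not data from the paper, and the function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy data: one human score and one metric score per segment.
# Assumption: higher means better on both scales.
human_scores = [25.0, 18.5, 22.0, 10.0, 15.5]
metric_scores = [0.82, 0.61, 0.70, 0.35, 0.64]

print("Pearson r:  ", round(pearson(human_scores, metric_scores), 3))
print("Kendall tau:", round(kendall_tau(human_scores, metric_scores), 3))
```

Under this setup, a metric whose scores rank segments in nearly the same order as the annotators gets a tau close to 1, which is the sense in which one metric can be said to correlate better with human judgment than another.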


