MT Metrics Correlate with Human Ratings of Simultaneous Speech Translation

11/16/2022
by   Dominik Macháček, et al.

There have been several studies on the correlation between human ratings and metrics such as BLEU, chrF2, and COMET in machine translation. Most, if not all, consider full-sentence translation. It is unclear whether Continuous Rating (CR), a human rating of simultaneous speech translation, correlates with these metrics. Therefore, we conduct an extensive correlation analysis of CR and the aforementioned automatic metrics on evaluations of candidate systems in the English-German simultaneous speech translation task at IWSLT 2022. Our study reveals that the offline MT metrics correlate with CR and can be reliably used for evaluating machine translation in the simultaneous mode, with some limitations on the test set size. This implies that automatic metrics can be used as proxies for CR, thereby alleviating the need for human evaluation.
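As a rough illustration of what a correlation analysis between human ratings and an automatic metric involves, the sketch below computes Pearson and Spearman coefficients between two paired lists of scores. It is a minimal sketch, not the procedure used in the paper: the function names (`pearson`, `average_ranks`, `spearman`) are hypothetical, and the numbers in the usage example are placeholder values chosen only to exercise the code, not CR or metric scores from the IWSLT 2022 evaluation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    # Placeholder scores, one pair per system or document (illustrative only).
    human_scores = [1.0, 2.0, 3.0, 4.0]
    metric_scores = [1.0, 4.0, 9.0, 16.0]
    print("Pearson: ", pearson(human_scores, metric_scores))
    print("Spearman:", spearman(human_scores, metric_scores))
```

Pearson measures linear agreement, while Spearman only checks whether the two score lists order the items the same way, so the two can differ when the relationship is monotonic but not linear.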

