It is Not as Good as You Think! Evaluating Simultaneous Machine Translation on Interpretation Data

10/11/2021
by Jinming Zhao, et al.

Most existing simultaneous machine translation (SiMT) systems are trained and evaluated on offline translation corpora. We argue that SiMT systems should be trained and tested on real interpretation data. To illustrate this argument, we propose an interpretation test set and conduct a realistic evaluation of SiMT systems trained on offline translations. Our results, on our test set along with three existing smaller-scale language pairs, highlight a difference of up to 13.83 BLEU when SiMT models are evaluated on translation vs. interpretation data. In the absence of interpretation training data, we propose a translation-to-interpretation (T2I) style transfer method that converts existing offline translations into interpretation-style data, leading to an improvement of up to 2.8 BLEU. However, the evaluation gap remains notable, calling for the construction of large-scale interpretation corpora better suited for evaluating and developing SiMT systems.
