Re-translation versus Streaming for Simultaneous Translation

04/07/2020
by Naveen Arivazhagan, et al.

There has been great progress in improving streaming machine translation, a simultaneous paradigm where the system appends to a growing hypothesis as more source content becomes available. We study a related problem in which revisions to the hypothesis beyond strictly appending words are permitted. This is suitable for applications such as live captioning of an audio feed. In this setting, we compare custom streaming approaches to re-translation, a straightforward strategy where each new source token triggers a distinct translation from scratch. We find re-translation to be as good as or better than state-of-the-art streaming systems, even when operating under constraints that allow very few revisions. We attribute much of this success to a previously proposed data-augmentation technique that adds prefix-pairs to the training data, which, alongside wait-k inference, forms a strong baseline for streaming translation. We also highlight re-translation's ability to wrap arbitrarily powerful MT systems with an experiment showing large improvements from an upgrade to its base model.
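
The re-translation strategy described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not the authors' implementation: `retranslate`, `common_prefix_len`, the toy lexicon, and `toy_translate` are hypothetical, and counting a revision as each previously displayed token that falls outside the prefix shared with the new hypothesis is just one simple way to measure how much the output changes.

```python
from typing import Callable, List, Tuple

def common_prefix_len(a: List[str], b: List[str]) -> int:
    """Number of leading tokens shared by two token lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def retranslate(
    source_tokens: List[str],
    translate: Callable[[List[str]], List[str]],
) -> Tuple[List[List[str]], int]:
    """Translate every source prefix from scratch.

    Returns all intermediate hypotheses plus a revision count: the number
    of previously shown target tokens that had to be deleted (an assumed,
    simplified measure of instability).
    """
    hypotheses: List[List[str]] = []
    revisions = 0
    prev: List[str] = []
    for i in range(1, len(source_tokens) + 1):
        hyp = translate(source_tokens[:i])  # no state carried over between calls
        revisions += len(prev) - common_prefix_len(prev, hyp)
        hypotheses.append(hyp)
        prev = hyp
    return hypotheses, revisions

# Hypothetical toy "MT system": word-by-word lookup with one context rule.
TOY_LEXICON = {"ich": "i", "habe": "have", "hunger": "hunger"}

def toy_translate(prefix: List[str]) -> List[str]:
    out = [TOY_LEXICON.get(tok, tok) for tok in prefix]
    # Once the full phrase is visible, the toy system revises its earlier output.
    if out[-2:] == ["have", "hunger"]:
        out[-2:] = ["am", "hungry"]
    return out

hypotheses, revisions = retranslate(["ich", "habe", "hunger"], toy_translate)
```

Because `translate` is passed in as a plain function, any sentence-level MT system could be substituted without changing the loop, which is the wrapping property the abstract highlights. In this toy run, the second hypothesis ends in "have", which the third hypothesis replaces, so one displayed token is revised.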

research
03/04/2022

From Simultaneous to Streaming Machine Translation by Leveraging Streaming History

Simultaneous Machine Translation is the task of incrementally translatin...
research
04/18/2021

Stream-level Latency Evaluation for Simultaneous Machine Translation

Simultaneous machine translation has recently gained traction thanks to ...
research
05/31/2019

Thinking Slow about Latency Evaluation for Simultaneous Machine Translation

Simultaneous machine translation attempts to translate a source sentence...
research
12/29/2020

Faster Re-translation Using Non-Autoregressive Model For Simultaneous Neural Machine Translation

Recently, simultaneous translation has gathered a lot of attention since...
research
03/17/2022

Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework

Simultaneous machine translation (SiMT) starts translating while receivi...
research
05/29/2019

Unsupervised Paraphrasing without Translation

Paraphrasing exemplifies the ability to abstract semantic content from s...