Neural Machine Translation (NMT) has made considerable progress in recent years Bahdanau et al. (2015); Gehring et al. (2017); Vaswani et al. (2017). Traditional NMT has relied solely on parallel sentence pairs for training data, which can be an expensive and scarce resource. This motivates the use of monolingual data, usually more abundant Lambert et al. (2011). Approaches using monolingual data for machine translation include language model fusion for both phrase-based Brants et al. (2007); Koehn (2009) and neural MT Gülçehre et al. (2015, 2017), back-translation Sennrich et al. (2016); Poncelas et al. (2018), unsupervised machine translation Lample et al. (2018a); Artetxe et al. (2018), dual learning Cheng et al. (2016); Di He and Ma (2016); Xia et al. (2017), and multi-task learning Domhan and Hieber (2017).
We focus on back-translation (BT), which, despite its simplicity, has thus far been the most effective technique Sennrich et al. (2017); Ha et al. (2017); García-Martínez et al. (2017). Back-translation entails training an intermediate target-to-source model on genuine bitext, and using this model to translate a large monolingual corpus from the target into the source language. This allows training a source-to-target model on a mixture of genuine parallel data and synthetic pairs from back-translation.
We build upon Edunov et al. (2018) and Imamura et al. (2018), who investigate BT at the scale of hundreds of millions of sentences. Their work studies different decoding/generation methods for back-translation: in addition to regular beam search, they consider sampling and adding noise to the one-best hypothesis produced by beam search. They show that sampled BT and noised-beam BT significantly outperform standard BT, and attribute this success to increased source-side diversity (sections 5.2 and 4.4).
Our work investigates noised-beam BT (NoisedBT) and questions the role noise is playing. Rather than increasing source diversity, our work instead suggests that the performance gains come simply from signaling to the model that the source side is back-translated, allowing it to treat the synthetic parallel data differently than the natural bitext. We hypothesize that BT introduces both helpful signal (strong target-language signal and weak cross-lingual signal) and harmful signal (amplifying the biases of machine translation). Indicating to the model whether a given training sentence is back-translated should allow the model to separate the helpful and harmful signal.
To support this hypothesis, we first demonstrate that the permutation and word-dropping noise used by Edunov et al. (2018) do not improve or significantly degrade NMT accuracy, corroborating that noise might act as an indicator that the source is back-translated, without much loss in mutual information between the source and target. We then train models on WMT English-German (EnDe) without BT noise, and instead explicitly tag the synthetic data with a reserved token. We call this technique “Tagged Back-Translation” (TaggedBT). These models achieve performance equal to or slightly higher than the noised variants. We repeat these experiments with WMT English-Romanian (EnRo), where NoisedBT underperforms standard BT and TaggedBT improves over both techniques. We demonstrate that TaggedBT also allows for effective iterative back-translation with EnRo, a technique which saw quality losses when applied with standard back-translation.
To further our understanding of TaggedBT, we investigate the biases encoded in models by comparing the entropy of their attention matrices, and look at the attention weight on the tag. We conclude by investigating the effects of the back-translation tag at decoding time.
2 Related Work
This section describes prior work exploiting target-side monolingual data and discusses related work tagging NMT training data.
2.1 Leveraging Monolingual Data for NMT
Monolingual data can provide valuable information to improve translation quality. Various methods for using target-side LMs have proven effective for NMT He et al. (2016); Gülçehre et al. (2017), but have tended to be less successful than back-translation – for example, Gülçehre et al. (2017) report under +0.5 Bleu over their baseline on EnDe newstest14, whereas Edunov et al. (2018) report over +4.0 Bleu on the same test set. Furthermore, there is no straightforward way to incorporate source-side monolingual data into a neural system with an LM.
Back-translation was originally introduced for phrase-based systems Bertoldi and Federico (2009); Bojar and Tamchyna (2011), but flourished in NMT after the work of Sennrich et al. (2016). Several approaches have explored iterative forward-translation and BT experiments (using source-side monolingual data), including Cotterell and Kreutzer (2018), Hoang et al. (2018), and Niu et al. (2018). Recently, iterative back-translation in both directions has been devised as a way to address unsupervised machine translation Lample et al. (2018b); Artetxe et al. (2018).
Recent work has focused on the importance of diversity and complexity in synthetic training data. Fadaee and Monz (2018) find that BT benefits difficult-to-translate words the most, and select from the back-translated corpus by oversampling words with high prediction loss. Imamura et al. (2018) argue that in order for BT to enhance the encoder, it must have a more diverse source side, and sample several back-translated source sentences for each monolingual target sentence. Our work follows most closely Edunov et al. (2018), who investigate alternative decoding schemes for BT. Like Imamura et al. (2018), they argue that BT through beam or greedy decoding leads to an overly regular domain on the source side, which poorly represents the diverse distribution of natural text.
2.2 Training Data Tagging for NMT
Tags have been used for various purposes in NMT. Tags on the source sentence can indicate the target language in multi-lingual models Johnson et al. (2016). Yamagishi et al. (2016) use tags in a similar fashion to control the formality of a translation from English to Japanese. Kuczmarski and Johnson (2018) use tags to control gender in translation. Most relevant to our work, Kobus et al. (2016) use tags to mark source sentence domain in a multi-domain setting.
3 Experimental Setup
This section presents our datasets, evaluation protocols and model architectures. It also describes our back-translation procedure, as well as noising and tagging strategies.
We perform our experiments on WMT18 EnDe bitext, WMT16 EnRo bitext, and WMT15 EnFr bitext. We use WMT Newscrawl for monolingual data (2007-2017 for De, 2016 for Ro, 2007-2013 for En, and 2007-2014 for Fr). For bitext, we filter out empty sentences and sentences longer than 250 subwords. We remove pairs whose whitespace-tokenized length ratio is greater than 2. This results in about 5.0M pairs for EnDe, and 0.6M pairs for EnRo. We do not filter the EnFr bitext, resulting in 41M sentence pairs.
For monolingual data, we deduplicate and filter sentences with more than 70 tokens or 500 characters. Furthermore, after back-translation, we remove any sentence pairs where the back-translated source is longer than 75 tokens or 550 characters. This results in 216.5M sentences for EnDe, 2.2M for EnRo, 149.9M for RoEn, and 39M for EnFr. For monolingual data, all tokens are defined by whitespace tokenization, not wordpieces.
The DeEn model used to generate BT data has 28.6 SacreBleu on newstest12, the RoEn model used for BT has a test SacreBleu of 31.9 (see Table 4.b), and the FrEn model used to generate the BT data has 39.2 SacreBleu on newstest14.
Although Bleu is well established, any slight difference in post-processing and Bleu computation can have a dramatic impact on reported values Post (2018). For example, Lample and Conneau (2019) report 33.3 Bleu on EnRo using unsupervised NMT, which at first seems comparable to our reported 33.4 SacreBleu from iterative TaggedBT. However, when we use their preprocessing scripts and evaluation protocol, our system achieves 39.2 Bleu on the same data, which is close to 6 points higher than the same model evaluated by SacreBleu.
We therefore report only SacreBleu (signature: BLEU + case.mixed + lang.LANGUAGE_PAIR + numrefs.1 + smooth.exp + test.SET + tok.13a + version.1.2.15), using the reference implementation from Post (2018), which aims to standardize Bleu evaluation.
We use the transformer-base and transformer-big architectures Vaswani et al. (2017) implemented in lingvo Shen et al. (2019). Transformer-base is used for the bitext noising experiments and the EnRo experiments, whereas transformer-big is used for the EnDe tasks with BT. Both use a vocabulary of 32k subword units. As an alternative to the checkpoint averaging used by Edunov et al. (2018), we train with exponentially weighted moving average (EMA) decay Buduma and Locascio (2017).
Transformer-base models are trained on 16 GPUs with synchronous gradient updates and a per-GPU batch size of 4,096 tokens, for an effective batch size of 64k tokens/step. Training lasts 400k steps, passing over 24B tokens. For the final EnDe TaggedBT model, we train transformer-big similarly but on 128 GPUs, for an effective batch size of 512k tokens/step. A training run of 300k steps therefore sees about 150B tokens. We select checkpoints using newstest2012 for EnDe and newsdev2016 for EnRo.
We focus on noised-beam BT, the most effective noising approach according to Edunov et al. (2018). Before training, we noise the decoded data Lample et al. (2018a) by applying 10% word dropout, 10% word blanking, and a 3-constrained permutation (a permutation such that no token moves further than 3 tokens from its original position). We refer to data generated this way as NoisedBT. Additionally, we experiment with using only the 3-constrained permutation and no word dropout/blanking, which we abbreviate as P3BT.
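The noising procedure can be sketched in a few lines. This is a minimal re-implementation following the description above and Lample et al. (2018a); the parameter names and the `<BLANK>` filler token are illustrative, not taken from the authors' code.

```python
import random

def noise_source(tokens, drop=0.1, blank=0.1, max_shift=3, rng=random):
    """Noise a back-translated source sentence: word dropout, word
    blanking, and a k-constrained permutation (k = max_shift)."""
    # 1) Word dropout: delete each token with probability `drop`.
    kept = [t for t in tokens if rng.random() >= drop]
    if not kept:  # never emit an empty source sentence
        kept = [rng.choice(tokens)]
    # 2) Word blanking: replace each surviving token with a filler
    #    token with probability `blank`.
    blanked = ["<BLANK>" if rng.random() < blank else t for t in kept]
    # 3) k-constrained permutation: sorting positions by
    #    index + Uniform(0, k+1) guarantees that no token moves
    #    further than k positions from its original spot.
    keys = [i + rng.uniform(0, max_shift + 1) for i in range(len(blanked))]
    order = sorted(range(len(blanked)), key=keys.__getitem__)
    return [blanked[i] for i in order]
```

Setting `drop=0` and `blank=0` recovers the P3BT variant, which applies only the constrained permutation.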
We tag our BT training data by prepending a reserved token to the input sequence, which is then treated in the same way as any other token. We also experiment with both noising and tagging together, which we call Tagged Noised Back-Translation, or TaggedNoisedBT. This consists simply of prepending the BT tag to each noised training example.
An example training sentence for each of these set-ups can be seen in Table 1. We do not tag the bitext, and always train on a mix of back-translated data and (untagged) bitext unless explicitly stated otherwise.
Table 1: Example training sentence for each set-up.

| Noise type | Example sentence |
|---|---|
| [no noise] | Raise the child, love the child. |
| P3BT | child Raise the, love child the. |
| NoisedBT | Raise child love child, the. |
| TaggedBT | BT Raise the child, love the child. |
| TaggedNoisedBT | BT Raise, the child the love. |
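The tagging scheme itself is a one-line transformation; a minimal sketch (the literal tag string `<BT>` is an assumption — any reserved token absent from the subword vocabulary works):

```python
def tag_back_translated(source_tokens, bt_tag="<BT>"):
    """Prepend a reserved tag token to a back-translated source.
    Downstream, the tag is treated like any other vocabulary token."""
    return [bt_tag] + source_tokens

def build_training_corpus(bitext, bt_pairs):
    """Mix genuine bitext (untagged) with tagged back-translated pairs;
    each pair is a (source_tokens, target_tokens) tuple."""
    tagged_bt = [(tag_back_translated(src), tgt) for src, tgt in bt_pairs]
    return bitext + tagged_bt
```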
4 Results

This section studies the impact of training data noise on translation quality, and then presents our results with TaggedBT on EnDe and EnRo.
4.1 Noising Parallel Bitext
We first show that noising EnDe bitext sources does not seriously impact the translation quality of the transformer-base baseline. For each sentence pair in the corpus, we flip a coin and noise the source sentence with probability p. We then train a model from scratch on this partially noised dataset. Table 2 shows results for various values of p. Specifically, it presents the somewhat unexpected finding that even when noising 100% of the source bitext (so the model has never seen well-formed English), Bleu on well-formed test data only drops by 2.5.
This result prompts the following line of reasoning about the role of noise in BT: (i) By itself, noising does not add meaningful signal (or else it would improve performance); (ii) It also does not damage the signal much; (iii) In the context of back-translation, the noise could therefore signal whether a sentence was back-translated, without significantly degrading performance.
| % noised | Newstest ’12 | Newstest ’17 |
|---|---|---|
4.2 Tagged Back-Translation for EnDe
We compare the results of training on a mixture of bitext and a random sample of 24M back-translated sentences in Table 3.a, for the various set-ups of BT described in sections 3.4 and 3.5. Like Edunov et al. (2018), we confirm that BT improves over bitext alone, and noised BT improves over standard BT by about the same margin. All methods of marking the source text as back-translated (NoisedBT, P3BT, TaggedBT, and TaggedNoisedBT) perform about equally, with TaggedBT having the highest average Bleu by a small margin. Tagging and noising together (TaggedNoisedBT) does not improve over either tagging or noising alone, supporting the conclusion that tagging and noising are not orthogonal signals but rather different means to the same end.
a. Results on 24M BT Set

| | AVG 13-18 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
|---|---|---|---|---|---|---|---|---|---|---|
| Noised(BT + Bitext) | 32.07 | 24.2 | 22.1 | 23.5 | 26.2 | 29.7 | 30.1 | 35.1 | 29.4 | 41.9 |
| + Tag on BT | 33.53 | 25.5 | 22.8 | 24.5 | 27.6 | 30.3 | 31.9 | 36.9 | 30.4 | 44.1 |

b. Results on 216M BT Set

| | AVG 13-18 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 |
|---|---|---|---|---|---|---|---|---|
| Edunov et al. (2018) | 35.28 | 25.0 | 29.0 | 33.8 | 34.4 | 37.5 | 32.4 | 44.6 |
Table 3.b verifies our result at scale, applying TaggedBT on the full BT dataset (216.5M sentences) and upsampling the bitext so that each batch contains an expected 20% of bitext. As in the smaller scenario, TaggedBT matches or slightly out-performs NoisedBT, with an advantage on seven test sets and a disadvantage on one. We also compare our results to the best-performing model from Edunov et al. (2018). Our model is on par with or slightly superior to their result (SacreBleu computed for the released WMT-18 model at github.com/pytorch/fairseq), out-performing it on four test sets and under-performing it on two, with the largest advantage on Newstest2018 (+1.4 Bleu).
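The 20% bitext share per batch implies a specific upsampling factor, obtained by solving f·n_bitext / (f·n_bitext + n_BT) = 0.2 for f. A small sketch using the corpus sizes reported in the experimental setup (the function name is ours):

```python
def bitext_upsample_factor(n_bitext, n_bt, bitext_share=0.2):
    """Factor by which to upsample the bitext so that it makes up
    `bitext_share` of each training batch in expectation.
    Derived from f*n_bitext / (f*n_bitext + n_bt) = bitext_share."""
    return bitext_share * n_bt / ((1.0 - bitext_share) * n_bitext)

# EnDe: ~5.0M bitext pairs mixed with 216.5M back-translated pairs
factor = bitext_upsample_factor(5.0e6, 216.5e6)  # ~10.8x upsampling
```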
As a supplementary experiment, we consider training only on BT data, with no bitext. We compare this to training only on NoisedBT data. If noising in fact increases the quality or diversity of the data, one would expect the NoisedBT data to yield higher performance than training on unaltered BT data, when in fact it has about 1 Bleu lower performance (Table 3.a, “BT alone” and “NoisedBT alone”).
We also compare NoisedBT versus TaggedNoisedBT in a set-up where the bitext itself is noised. In this scenario, the noise can no longer be used by the model as an implicit tag to differentiate between bitext and synthetic BT data, so we expect the TaggedNoisedBT variant to perform better than NoisedBT by a similar margin to NoisedBT’s improvement over BT in the unnoised-bitext setting. The last sub-section of Table 3.a confirms this.
4.3 Tagged Back-Translation for EnRo
We repeat these experiments for WMT EnRo (Table 4). This is a much lower-resource task than EnDe, and thus can benefit more from monolingual data. In this case, NoisedBT is actually harmful, lagging standard BT by -0.6 Bleu. TaggedBT closes this gap and passes standard BT by +0.4 Bleu, for a total gain of +1.0 Bleu over NoisedBT.
a. Forward models (EnRo)

| Gehring et al. (2017) | 29.9 |
| Sennrich 2016 (BT) | 29.3 | 28.1 |

b. Reverse models (RoEn)
4.4 Tagged Back-Translation for EnFr
We performed a minimal set of experiments on WMT EnFr, which are summarized in Table 5. This is a much higher-resource language pair than either EnRo or EnDe, but Edunov et al. (2018) demonstrate that noised BT (using sampling) can still help in this set-up. In this case, we see that BT alone hurts performance compared to the strong bitext baseline, but NoisedBT indeed surpasses the bitext model. TaggedBT out-performs all other methods, beating NoisedBT by an average of +0.3 Bleu over all test sets.
It is worth noting that our numbers are lower than those reported by Edunov et al. (2018) for the years they report (36.1, 43.8, and 40.9 on 2013, 2014, and 2015, respectively). We did not investigate this discrepancy. We suspect it reflects an error or suboptimality in our set-up, as we did not tune these models and ran only one experiment for each of the four set-ups. Alternatively, sampling could simply outperform noising in the large-data regime.
4.5 Iterative Tagged Back-Translation
We further investigate the effects of TaggedBT by performing one round of iterative back-translation Cotterell and Kreutzer (2018); Hoang et al. (2018); Niu et al. (2018), and find another difference between the different varieties of BT: NoisedBT and TaggedBT allow the model to bootstrap improvements from an improved reverse model, whereas standard BT does not. This is consistent with our argument that data tagging allows the model to extract information out of each data set more effectively.
For the purposes of this paper we call a model trained with standard back-translation an Iteration-1 BT model, where the back-translations were generated by a model trained only on bitext. We inductively define the Iteration-k BT model as the model trained on BT data generated by an Iteration-(k-1) BT model, for k > 1. Unless otherwise specified, any BT model mentioned in this paper is an Iteration-1 BT model.
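The iteration indexing above can be sketched as a recursion. The stubs below are purely illustrative stand-ins for full NMT training and decoding runs (their names and return values are ours), shown only to make the Iteration-k definition concrete:

```python
def train(parallel_pairs):
    """Hypothetical stand-in for a full NMT training run."""
    return {"examples": len(parallel_pairs)}

def back_translate(model, mono_targets):
    """Hypothetical stand-in for decoding: each monolingual target
    sentence receives a synthetic source."""
    return [("<synthetic source>", tgt) for tgt in mono_targets]

def iteration_k_model(k, bitext, rev_bitext, mono_tgt, mono_src):
    """Iteration-1: BT data comes from a reverse model trained on
    bitext alone.  Iteration-k (k > 1): BT data comes from an
    Iteration-(k-1) model trained in the reverse direction."""
    if k == 1:
        reverse_model = train(rev_bitext)
    else:
        # Swap directions: the reverse model's bitext and monolingual
        # data are the mirror images of the forward model's.
        reverse_model = iteration_k_model(
            k - 1, rev_bitext, bitext, mono_src, mono_tgt)
    bt_pairs = back_translate(reverse_model, mono_tgt)
    return train(bitext + bt_pairs)
```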
We perform these experiments on the English-Romanian dataset, which is smaller and thus better suited for this computationally expensive process. We used the (Iteration-1) TaggedBT model to generate the RoEn back-translated training data. Using this we trained a superior RoEn model, mixing 80% BT data with 20% bitext. Using this Iteration-2 RoEn model, we generated new EnRo BT data, which we used to train the Iteration-3 EnRo models. SacreBleu scores for all these models are displayed in Table 4.
We find that the Iteration-3 BT models improve over their Iteration-1 counterparts only for NoisedBT (+1.0 Bleu, dev+test avg) and TaggedBT (+0.7 Bleu, dev+test avg), whereas the Iteration-3 standard-BT model shows no improvement over its Iteration-1 counterpart (-0.1 Bleu, dev+test avg). In other words, both techniques that (explicitly or implicitly) tag synthetic data benefit from iterative BT. We speculate that this separation of the synthetic and natural domains allows the model to bootstrap more effectively from the increasing quality of the back-translated data while not being damaged by its quality issues, whereas the simple BT model cannot make this distinction, and is equally “confused” by the biases in higher or lower-quality BT data.
An identical experiment with EnDe did not see either gains or losses in Bleu from iteration-3 TaggedBT. This is likely because there is less room to bootstrap with the larger-capacity model. This said, we do not wish to read too deeply into these results, as the effect size is not large, and neither is the number of experiments. A more thorough suite of experiments is warranted before any strong conclusions can be made on the implications of tagging on iterative BT.
5 Analysis

In an attempt to gain further insight into TaggedBT as it compares with standard BT or NoisedBT, we examine attention matrices in the presence of the back-translation tag and measure the impact of the tag at decoding time.
5.1 Attention Entropy and Sink-Ratio
To understand how the model treats the tag and what biases it learns from the data, we investigate the entropy of the attention probability distribution, as well as the attention captured by the tag.
We examine decoder attention (at the top layer) on the first source token. We define the Attention Sink Ratio for source index j (ASR_j) as the averaged attention on the j-th source token, normalized by uniform attention, i.e.

$$\mathrm{ASR}_j = \frac{1}{|\hat{y}|} \sum_{i=1}^{|\hat{y}|} \frac{\alpha_{ij}}{\tilde{\alpha}}$$

where \alpha_{ij} is the attention value for target token i in hypothesis \hat{y} over source token j, and \tilde{\alpha} = 1/|x| corresponds to uniform attention over a source sentence x. We examine ASR on text that has been noised and/or tagged (depending on the model), to understand how BT sentences are treated during training. For the tagged variants, there is heavy attention on the tag when it is present (Table 6), indicating that the model relies on the information signalled by the tag.
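Under this definition (average attention on source token j, normalized by uniform attention 1/|x|), ASR can be computed directly from an attention matrix. A pure-Python sketch for a single hypothesis, rather than averaged over a test corpus:

```python
def attention_sink_ratio(attn, j=0):
    """Compute ASR_j for one hypothesis.  `attn[i][j]` is the
    attention that target token i places on source token j (each
    row is a distribution over source tokens).  The average
    attention on token j is normalized by uniform attention 1/|x|."""
    num_tgt, num_src = len(attn), len(attn[0])
    avg_attention_on_j = sum(row[j] for row in attn) / num_tgt
    uniform = 1.0 / num_src
    return avg_attention_on_j / uniform
```

A uniform attention matrix gives ASR_j = 1; a matrix whose mass sits entirely on token j gives ASR_j = |x|.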
Our second analysis probes word-for-word translation bias through the average source-token entropy of the attention distribution when decoding natural text. Table 6 reports the average length-normalized Shannon entropy:

$$\mathcal{H}_{\alpha} = -\frac{1}{|\hat{y}|} \sum_{i=1}^{|\hat{y}|} \sum_{j=1}^{|x|} \alpha_{ij} \log \alpha_{ij}$$
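A numeric sketch of this entropy measure, under the assumption that the per-target-position attention entropies are summed and normalized by the hypothesis length:

```python
import math

def attention_entropy(attn):
    """Length-normalized Shannon entropy of an attention matrix:
    each target position i attends with distribution attn[i] over
    source tokens; the row entropies are averaged over the
    hypothesis length."""
    def shannon(row):
        return -sum(p * math.log(p) for p in row if p > 0.0)
    return sum(shannon(row) for row in attn) / len(attn)
```

Uniform attention over |x| tokens yields log|x| (the maximum); a sharply diagonal, word-for-word matrix yields an entropy near zero.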
The entropy of the attention probabilities from the model trained on BT data is the clear outlier. This low entropy corresponds to a concentrated attention matrix, which we observed to be concentrated on the diagonal (see Figure 1(a) and 1(d)). This could indicate the presence of word-by-word translation, a consequence of the harmful part of the signal from back-translated data. The entropy on parallel data from the NoisedBT model is much higher, corresponding to more diffuse attention, which we see in Figure 1(b) and 1(e). In other words, the word-for-word translation biases in BT data, which were incorporated into the BT model, have been manually undone by the noise, so the model’s understanding of how to decode parallel text is not corrupted. We see that TaggedBT leads to a similarly high entropy, indicating that the model has learnt this without needing the noise to manually “break” the literal-translation bias. As a sanity check, we see that the entropy of the P3BT model’s attention is also high, but lower than that of the NoisedBT model, because P3 noise is less destructive. The one surprising entry in this table is the low entropy of TaggedNoisedBT. Our best explanation is that TaggedNoisedBT puts disproportionately high attention on the sentence-end token, with 1.4x the attention mass that TaggedBT puts there, naturally leading to lower entropy.
5.2 Decoding with and without a tag
| Model | Decode type | AVG 13-17 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 |
|---|---|---|---|---|---|---|---|---|---|---|
| TaggedBT | as BT (tagged) | 30.30 | 24.3 | 22.2 | 23.4 | 26.6 | 30.0 | 30.5 | 34.2 | 30.2 |
| NoisedBT | as BT (noised) | 10.66 | 8.1 | 6.5 | 7.5 | 8.2 | 11.1 | 10.0 | 12.7 | 11.3 |
| Model | Decode type | Output |
|---|---|---|
| TaggedBT | standard | Wie der Reverend Martin Luther King Jr. vor fünfzig Jahren sagte: |
| TaggedBT | as-if-BT (tagged) | Wie sagte der Reverend Martin Luther King jr. Vor fünfzig Jahren: |
| NoisedBT | standard | Wie der Reverend Martin Luther King Jr. vor fünfzig Jahren sagte: |
| NoisedBT | as-if-BT (noised) | Als Luther King Reverend Jr. vor fünfzig Jahren: |
| Source | | As the Reverend Martin Luther King Jr. said fifty years ago: |
| Reference | | Wie Pastor Martin Luther King Jr. vor fünfzig Jahren sagte: |
In this section we look at what happens when we decode with a model on newstest data as if it were back-translated. This means that for the TaggedBT model we tag the true source, and for the NoisedBT model, we noise the true source. These “as-if-BT” decodings contrast with “standard decode”, or decoding with the true source. An example sentence from newstest2015 is shown in Table 8, decoded by both models both in the standard fashion and in the “as-if-BT” fashion. The Bleu scores of each decoding method are presented in Table 7.
The noised decode – decoding newstest sentences with the NoisedBT model after noising the source – yields poor performance. This is unsurprising given the severity of the noise model used (recall Table 1). The tagged decode, however, yields only somewhat lower performance than the standard decode on the same model (-2.9 Bleu on average). There is no clear reason for this quality drop – the model correctly omits the tag in its outputs, but simply produces slightly lower-quality hypotheses. The only noticeable difference in decoding outputs between the two systems is that the tagged decode produces about double the quantity of English outputs (2.7% vs. 1.2% over newstest2010-newstest2017, using a language-ID classifier).
That the tagged-decode Bleu is still quite reasonable tells us that the model has not simply learned to ignore the source sentence when it encounters the input tag, suggesting that the signal is still useful to the model, as Sennrich et al. (2016) also demonstrated. The tag might then be functioning as a domain tag, causing the model to emulate the domain of the BT data – including both the desirable target-side news domain and the MT biases inherent in BT data.
To poke at the intuition that the quality drop comes in part from emulating the NMT biases in the synthetic training data, we probe a particular shortcoming of NMT: the copy rate. We quantify the copy rate as the unigram overlap between source and target, as a percentage of tokens on the target side, and compare these statistics to the bitext and the back-translated data (Table 9). We notice that the increase in unigram overlap with the tagged decode corresponds to the increased copy rate for the back-translated data (reaching the same value of 11%), supporting the hypothesis that the tag helps the model separate the domain of the parallel versus the back-translated data. Under this lens, quality gains from TaggedBT/NoisedBT could be re-framed as transfer learning from a multi-task set-up, where one task is to translate simpler “translationese” Gellerstam (1986); Freitag et al. (2019) source text, and the other is to translate true bitext.
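A sketch of this copy-rate statistic; whether per-token counts are clipped (as in n-gram precision) is our assumption, not stated in the text:

```python
from collections import Counter

def copy_rate(src_tokens, tgt_tokens):
    """Unigram overlap between source and target, expressed as a
    percentage of target-side tokens (counts clipped to the source
    count, as in n-gram precision)."""
    src_counts = Counter(src_tokens)
    overlap = sum(min(n, src_counts[tok])
                  for tok, n in Counter(tgt_tokens).items())
    return 100.0 * overlap / len(tgt_tokens)
```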
| Data | src-tgt unigram overlap |
|---|---|
| TaggedBT (standard decode) | 8.9% |
| TaggedBT (tagged decode) | 10.7% |
| BT Data | 11.4% |
6 Negative Results
In addition to tagged back-translation, we tried several tagging-related experiments that did not work as well. We experimented with tagged forward-translation (TaggedFT), and found that the tag made no substantial difference, often lagging behind untagged forward-translation (FT) by a small margin (~0.2 Bleu). For EnDe, (Tagged)FT underperformed the bitext baseline; for EnRo, (Tagged)FT performed about the same as BT. Combining BT and FT had additive effects, yielding results slightly higher than Iteration-3 TaggedBT (Table 4), at 33.9 SacreBleu on test; but tagging did not help in this set-up. We furthermore experimented with year-specific tags on the BT data, using a different tag for each of the ten years of Newscrawl. The model trained on these data performed identically to the normal TaggedBT model. Using this model we replicated the “as-if-BT” experiments from Table 8 with year-specific tags, and although there was a slight correlation between the year tag and that year’s dataset, the standard decode still resulted in the highest Bleu.
7 Conclusion

In this work we develop TaggedBT, a novel technique for using back-translation in the context of NMT, which improves over the current state-of-the-art method of Noised Back-Translation, while also being simpler and more robust. We demonstrate that while Noised Back-Translation and standard Back-Translation are more or less effective depending on the task (low-resource, mid-resource, iterative BT), TaggedBT performs well on all tasks.
On WMT16 EnRo, TaggedBT improves on vanilla BT by 0.4 Bleu. Our best Bleu score of 33.4 Bleu, obtained using Iterative TaggedBT, shows a gain of +3.5 Bleu over the highest previously published result on this test-set that we are aware of. We furthermore match or out-perform the highest published results we are aware of on WMT EnDe that use only back-translation, with higher or equal Bleu on five of seven test sets.
In addition, we conclude that noising in the context of back-translation acts merely as an indicator to the model that the source is back-translated, allowing the model to treat it as a different domain and separate the helpful signal from the harmful signal. We support this hypothesis with experimental results showing that heuristic noising techniques like those discussed here, although they produce text that may seem like a nigh unintelligible mangling to humans, have a relatively small impact on the cross-lingual signal. Our analysis of attention and tagged decoding provides further supporting evidence for these conclusions.
8 Future Work
A natural extension of this work is to investigate a more fine-grained application of tags to both natural and synthetic data, for both back-translation and forward-translation, using quality and domain tags as well as synth-data tags. Similarly, tagging could be investigated as an alternative to data selection, as in van der Wees et al. (2017) and Axelrod et al. (2011), or curriculum-learning approaches like fine-tuning on in-domain data Thompson et al. (2018); Sajjad et al. (2017); Freitag and Al-Onaizan (2016). Finally, the token-tagging method should be contrasted with more sophisticated versions of tagging, like concatenating a trainable domain embedding with all token embeddings, as in Kobus et al. (2016).
Acknowledgments

Thank you to Markus Freitag, Melvin Johnson and Wei Wang for advising and discussions about these ideas; thank you to Keith Stevens, Mia Chen, and Wei Wang for technical help and bug fixing; thank you to Sergey Edunov for a fast and thorough answer to our question about his paper; and of course to the various people who have given comments and suggestions throughout the process, including Bowen Liang, Naveen Arivazhagan, Macduff Hughes, and George Foster.
References

- Artetxe et al. (2018) Mikel Artetxe, Gorka Labaka, and Eneko Agirre. 2018. Unsupervised Statistical Machine Translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3632–3642.
- Axelrod et al. (2011) Amittai Axelrod, Xiaodong He, and Jianfeng Gao. 2011. Domain adaptation via pseudo in-domain data selection. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 355–362.
- Bahdanau et al. (2015) Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In 3rd International Conference on Learning Representations, ICLR 2015.
- Bertoldi and Federico (2009) Nicola Bertoldi and Marcello Federico. 2009. Domain adaptation for statistical machine translation with monolingual resources. In Proceedings of the fourth workshop on statistical machine translation, pages 182–189. Association for Computational Linguistics.
- Bojar and Tamchyna (2011) Ondřej Bojar and Aleš Tamchyna. 2011. Improving translation model by monolingual data. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 330–336. Association for Computational Linguistics.
- Brants et al. (2007) Thorsten Brants, Ashok C Popat, Peng Xu, Franz J Och, and Jeffrey Dean. 2007. Large Language Models in Machine Translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL).
- Buduma and Locascio (2017) Nikhil Buduma and Nicholas Locascio. 2017. Fundamentals of Deep Learning: Designing Next-Generation Machine Intelligence Algorithms. O’Reilly Media, Inc.
- Cheng et al. (2016) Yong Cheng, Wei Xu, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Semi-Supervised Learning for Neural Machine Translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, volume 1, pages 1965–1974.
- Cotterell and Kreutzer (2018) Ryan Cotterell and Julia Kreutzer. 2018. Explaining and Generalizing Back-Translation through Wake-Sleep. arXiv preprint arXiv:1806.04402.
- Currey et al. (2017) Anna Currey, Antonio Valerio Miceli Barone, and Kenneth Heafield. 2017. Copied monolingual data improves low-resource neural machine translation. In Proceedings of the Second Conference on Machine Translation, pages 148–156.
- Di He and Ma (2016) Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, Tie-Yan Liu, and Wei-Ying Ma. 2016. Dual Learning for Machine Translation. In Conference on Advances in Neural Information Processing Systems (NeurIPS).
- Domhan and Hieber (2017) Tobias Domhan and Felix Hieber. 2017. Using Target-side Monolingual Data for Neural Machine Translation through Multi-task Learning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1500–1505.
- Edunov et al. (2018) Sergey Edunov, Myle Ott, Michael Auli, and David Grangier. 2018. Understanding Back-Translation at Scale. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 489–500.
- Fadaee and Monz (2018) Marzieh Fadaee and Christof Monz. 2018. Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation. CoRR, abs/1808.09006.
- Freitag and Al-Onaizan (2016) Markus Freitag and Yaser Al-Onaizan. 2016. Fast Domain Adaptation for Neural Machine Translation. CoRR, abs/1612.06897.
- Freitag et al. (2019) Markus Freitag, Isaac Caswell, and Scott Roy. 2019. Text Repair Model for Neural Machine Translation. CoRR, abs/1904.04790.
- García-Martínez et al. (2017) Mercedes García-Martínez, Özan Çağlayan, Walid Aransa, Adrien Bardet, Fethi Bougares, and Loïc Barrault. 2017. LIUM Machine Translation Systems for WMT17 News Translation Task. CoRR, abs/1707.04499.
- Gehring et al. (2017) Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. Convolutional Sequence to Sequence Learning. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1243–1252.
- Gellerstam (1986) Martin Gellerstam. 1986. Translationese in Swedish novels translated from English. Translation Studies in Scandinavia, pages 88–95.
- Gülçehre et al. (2015) Çağlar Gülçehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2015. On using Monolingual Corpora in Neural Machine Translation. arXiv preprint arXiv:1503.03535.
- Gülçehre et al. (2017) Çağlar Gülçehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, and Yoshua Bengio. 2017. On Integrating a Language Model into Neural Machine Translation. Comput. Speech Lang., pages 137–148.
- Ha et al. (2017) Thanh-Le Ha, Jan Niehues, and Alexander H. Waibel. 2017. Effective Strategies in Zero-Shot Neural Machine Translation. CoRR, abs/1711.07893.
- Sajjad et al. (2017) Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Yonatan Belinkov, and Stephan Vogel. 2017. Neural machine translation training in a multi-domain scenario. arXiv preprint arXiv:1708.08712v2.
- He et al. (2016) Wei He, Zhongjun He, Hua Wu, and Haifeng Wang. 2016. Improved Neural Machine Translation with SMT Features. In Thirtieth AAAI Conference on Artificial Intelligence.
- Imamura et al. (2018) Kenji Imamura, Atsushi Fujita, and Eiichiro Sumita. 2018. Enhancement of Encoder and Attention Using Target Monolingual Corpora in Neural Machine Translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, volume 1, pages 55–63.
- Johnson et al. (2016) Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda B. Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. CoRR, abs/1611.04558.
- Kim and Rush (2016) Yoon Kim and Alexander M. Rush. 2016. Sequence-Level Knowledge Distillation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1317–1327, Austin, Texas. Association for Computational Linguistics.
- Kobus et al. (2016) Catherine Kobus, Josep Maria Crego, and Jean Senellart. 2016. Domain Control for Neural Machine Translation. CoRR, abs/1612.06140.
- Koehn (2009) Philipp Koehn. 2009. Statistical machine translation. Cambridge University Press.
- Kuczmarski and Johnson (2018) James Kuczmarski and Melvin Johnson. 2018. Gender-aware natural language translation. Technical Disclosure Commons.
- Lambert et al. (2011) Patrik Lambert, Holger Schwenk, Christophe Servan, and Sadaf Abdul-Rauf. 2011. Investigations on translation model adaptation using monolingual data. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 284–293. Association for Computational Linguistics.
- Lample and Conneau (2019) Guillaume Lample and Alexis Conneau. 2019. Cross-lingual Language Model Pretraining. arXiv preprint arXiv:1901.07291.
- Lample et al. (2018a) Guillaume Lample, Alexis Conneau, Ludovic Denoyer, and Marc’Aurelio Ranzato. 2018a. Unsupervised Machine Translation Using Monolingual Corpora Only. In International Conference on Learning Representations.
- Lample et al. (2018b) Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, and Marc’Aurelio Ranzato. 2018b. Phrase-Based & Neural Unsupervised Machine Translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP).
- Niu et al. (2018) Xing Niu, Michael Denkowski, and Marine Carpuat. 2018. Bi-Directional Neural Machine Translation with Synthetic Parallel Data. ACL 2018, page 84.
- Papineni et al. (2002) Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th annual meeting on association for computational linguistics, pages 311–318. Association for Computational Linguistics.
- Poncelas et al. (2018) Alberto Poncelas, Dimitar Shterionov, Andy Way, Gideon Maillette de Buy Wenniger, and Peyman Passban. 2018. Investigating Backtranslation in Neural Machine Translation. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation, pages 249–258.
- Post (2018) Matt Post. 2018. A Call for Clarity in Reporting BLEU Scores. arXiv preprint arXiv:1804.08771.
- Sennrich et al. (2017) Rico Sennrich, Alexandra Birch, Anna Currey, Ulrich Germann, Barry Haddow, Kenneth Heafield, Antonio Valerio Miceli Barone, and Philip Williams. 2017. The University of Edinburgh’s Neural MT Systems for WMT17. CoRR, abs/1708.00726.
- Sennrich et al. (2016) Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Improving Neural Machine Translation Models with Monolingual Data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 86–96.
- Shen et al. (2019) Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, and Yuan Cao et al. 2019. Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR, abs/1902.08295.
- Thompson et al. (2018) Brian Thompson, Huda Khayrallah, Antonios Anastasopoulos, Arya D. McCarthy, Kevin Duh, Rebecca Marvin, Paul McNamee, Jeremy Gwinnup, Tim Anderson, and Philipp Koehn. 2018. Freezing Subnetworks to Analyze Domain Adaptation in Neural Machine Translation. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 124–132. Association for Computational Linguistics.
- Ueffing et al. (2007) Nicola Ueffing, Gholamreza Haffari, and Anoop Sarkar. 2007. Semi-supervised model adaptation for statistical machine translation. Machine Translation, 21(2):77–94.
- Vaswani et al. (2017) Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need. In Advances in Neural Information Processing Systems, pages 5998–6008.
- Vu Cong Duy Hoang and Cohn (2018) Vu Cong Duy Hoang, Philipp Koehn, Gholamreza Haffari, and Trevor Cohn. 2018. Iterative Backtranslation for Neural Machine Translation. In Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, volume 1, pages 18–24.
- van der Wees et al. (2017) Marlies van der Wees, Arianna Bisazza, and Christof Monz. 2017. Dynamic Data Selection for Neural Machine Translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1400–1410.
- Xia et al. (2017) Yingce Xia, Tao Qin, Wei Chen, Jiang Bian, Nenghai Yu, and Tie-Yan Liu. 2017. Dual Supervised Learning. In International Conference on Machine Learning (ICML).
- Yamagishi et al. (2016) Hayahide Yamagishi, Shin Kanouchi, Takayuki Sato, and Mamoru Komachi. 2016. Controlling the voice of a sentence in Japanese-to-English neural machine translation. In Proceedings of the 3rd Workshop on Asian Translation (WAT2016), pages 203–210.