Generalizing Back-Translation in Neural Machine Translation

06/17/2019
by Miguel Graça, et al.

Back-translation, i.e. data augmentation by translating target monolingual data, is a crucial component in modern neural machine translation (NMT). In this work, we reformulate back-translation in the scope of cross-entropy optimization of an NMT model, clarifying the mathematical assumptions and approximations that underlie its heuristic usage. Our formulation covers broader synthetic data generation schemes, including sampling from a target-to-source NMT model. With this formulation, we point out fundamental problems of the sampling-based approaches and propose to remedy them by (i) disabling label smoothing for the target-to-source model and (ii) sampling from a restricted search space. We investigate these claims on the WMT 2018 German-English news translation task.
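
The two remedies named in the abstract can be illustrated with a toy, single-step example. The sketch below is not the authors' implementation: the vocabulary, the probabilities, and the helper names (`smooth`, `restrict_top_k`, `sample`) are hypothetical. It also makes two simplifying assumptions: the effect of label smoothing on a trained model is approximated by mixing its output distribution with a uniform one, and top-k truncation stands in for a "restricted search space", which the abstract does not define further.

```python
import random

def smooth(dist, eps):
    """Mix a distribution with the uniform one.

    Assumption: this imitates how a model trained with label smoothing
    tends to spread some probability mass over every vocabulary entry.
    """
    vocab_size = len(dist)
    return {tok: (1.0 - eps) * p + eps / vocab_size for tok, p in dist.items()}

def restrict_top_k(dist, k):
    """Keep the k most probable tokens and renormalize (a stand-in for a restricted search space)."""
    top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}

def sample(dist, rng):
    """Draw one token according to the given distribution."""
    tokens = list(dist)
    weights = [dist[tok] for tok in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-token distribution of a target-to-source model.
unsmoothed = {"house": 0.7, "building": 0.3, "banana": 0.0, "run": 0.0}

# With smoothing, implausible tokens receive non-zero probability,
# so unrestricted sampling can emit them.
smoothed = smooth(unsmoothed, eps=0.1)

# Restricting sampling to the two most probable tokens removes that tail.
restricted = restrict_top_k(smoothed, k=2)

rng = random.Random(0)
draws_unrestricted = [sample(smoothed, rng) for _ in range(1000)]
draws_restricted = [sample(restricted, rng) for _ in range(1000)]
```

In this toy setting each implausible token carries only a small probability per step, but a sampled sentence is built from many such steps, so the chance of at least one implausible token grows with sentence length. Either avoiding the smoothed tail or truncating it addresses that effect.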


