Combining SMT and NMT Back-Translated Data for Efficient NMT

09/09/2019
by Alberto Poncelas, et al.

Neural Machine Translation (NMT) models achieve their best performance when large sets of parallel data are used for training. Consequently, techniques for augmenting the training set have recently become popular. One of these methods is back-translation (Sennrich et al., 2016), which consists of generating synthetic sentences by translating a set of monolingual, target-language sentences using a Machine Translation (MT) model. Generally, NMT models are used for back-translation. In this work, we analyze the performance of models when the training data is extended with synthetic data generated by different MT approaches. In particular, we investigate back-translated data generated not only by NMT but also by Statistical Machine Translation (SMT) models, and by combinations of both. The results reveal that the models achieve the best performance when the training set is augmented with back-translated data created by merging different MT approaches.
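The data augmentation described above can be illustrated with a minimal sketch. It assumes that "merging" means simply concatenating the synthetic pairs produced by each back-translation system with the authentic parallel data; the abstract does not specify the exact procedure. The names `back_translate`, `build_training_set`, `toy_smt`, and `toy_nmt` are hypothetical, and the two toy translators are placeholders for real target-to-source SMT and NMT systems.

```python
from typing import Callable, Iterable, List, Tuple

SentencePair = Tuple[str, str]      # (source sentence, target sentence)
Translator = Callable[[str], str]   # target-language text -> source-language text


def back_translate(monolingual_target: Iterable[str],
                   translate: Translator) -> List[SentencePair]:
    """Pair each authentic target sentence with a synthetic source sentence."""
    return [(translate(target), target) for target in monolingual_target]


def build_training_set(authentic: Iterable[SentencePair],
                       monolingual_target: Iterable[str],
                       translators: Iterable[Translator]) -> List[SentencePair]:
    """Concatenate authentic pairs with synthetic pairs from every translator.

    Assumption: plain concatenation, with no filtering, tagging, or weighting.
    """
    training = list(authentic)
    mono = list(monolingual_target)  # materialise so it can be reused per system
    for translate in translators:
        training.extend(back_translate(mono, translate))
    return training


# Hypothetical stand-ins for real target-to-source MT systems.
def toy_smt(sentence: str) -> str:
    return "smt(" + sentence + ")"


def toy_nmt(sentence: str) -> str:
    return "nmt(" + sentence + ")"


authentic = [("ein Haus", "a house")]
mono = ["a dog", "a cat"]
augmented = build_training_set(authentic, mono, [toy_smt, toy_nmt])
```

In this sketch the target side of every synthetic pair is authentic text, and only the source side is machine-generated, which is the defining property of back-translation. Using a single translator in the list corresponds to the usual NMT-only or SMT-only setting.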

