Using Self-Training to Improve Back-Translation in Low Resource Neural Machine Translation

06/04/2020
by Idris Abdulmumin, et al.

Training neural machine translation (NMT) models on back-translations of monolingual target data (synthetic parallel data) is currently the state-of-the-art approach for improving translation systems. The quality of the backward system - which is trained on the available parallel data and used for the back-translation - has been shown in many studies to affect the performance of the final NMT model. In low resource conditions, the available parallel data is usually not enough to train a backward model that can produce synthetic data of the quality needed to train a standard translation model. This work proposes a self-training strategy in which the output of the backward model is used to improve the model itself through the forward translation technique. The technique was shown to improve baseline low resource IWSLT'14 English-German and IWSLT'15 English-Vietnamese backward translation models by 11.06 and 1.5 BLEU points respectively. The synthetic data generated by the improved English-German backward model was used to train a forward model, which outperformed another forward model trained using standard back-translation by 2.7 BLEU points.
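
The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `self_train_backward` and `toy_train` are hypothetical, and the toy trainer is a word-for-word lookup table standing in for a real NMT training run, so it only shows how authentic and synthetic pairs are combined and will not reproduce any quality gain.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]                    # (input sentence, output sentence)
Model = Callable[[str], str]              # translates one sentence
Trainer = Callable[[List[Pair]], Model]   # hypothetical: trains a model on pairs


def self_train_backward(
    train: Trainer,
    parallel: List[Pair],        # authentic (source, target) pairs
    mono_target: List[str],      # monolingual target-side sentences
) -> Tuple[Model, List[Pair]]:
    """Return the self-trained backward model and the forward model's data."""
    # 1. The backward model translates target -> source, so flip the pairs.
    backward_data = [(tgt, src) for src, tgt in parallel]
    backward = train(backward_data)

    # 2. Self-training via forward translation: real target sentences go in,
    #    the backward model's own (synthetic) source output becomes the label.
    synthetic = [(tgt, backward(tgt)) for tgt in mono_target]
    improved_backward = train(backward_data + synthetic)

    # 3. Standard back-translation with the improved backward model:
    #    synthetic source on the input side, authentic target on the output.
    back_translated = [(improved_backward(tgt), tgt) for tgt in mono_target]
    return improved_backward, parallel + back_translated


def toy_train(pairs: List[Pair]) -> Model:
    """Assumed stand-in for NMT training: a word-aligned lookup table."""
    table = {}
    for inp, out in pairs:
        for a, b in zip(inp.split(), out.split()):
            table.setdefault(a, b)
    # Unknown words are copied through unchanged.
    return lambda s: " ".join(table.get(w, w) for w in s.split())
```

In a real setting, `train` would wrap a full NMT training run and the returned pair list would be used to train the forward (source to target) model.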

research
11/14/2020

Iterative Self-Learning for Enhanced Back-Translation in Low Resource Neural Machine Translation

Many language pairs are low resource - the amount and/or quality of para...
research
11/05/2019

Data Diversification: An Elegant Strategy For Neural Machine Translation

A common approach to improve neural machine translation is to invent new...
research
12/22/2019

Tag-less Back-Translation

An effective method to generate a large number of parallel sentences for...
research
04/05/2020

AR: Auto-Repair the Synthetic Data for Neural Machine Translation

Compared with only using limited authentic parallel data as training cor...
research
08/27/2018

Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation

Neural Machine Translation has achieved state-of-the-art performance for...
research
06/01/2022

Exploring Diversity in Back Translation for Low-Resource Machine Translation

Back translation is one of the most widely used methods for improving th...
research
11/05/2019

Training Neural Machine Translation (NMT) Models using Tensor Train Decomposition on TensorFlow (T3F)

We implement a Tensor Train layer in the TensorFlow Neural Machine Trans...
