
Tag-less Back-Translation

by Idris Abdulmumin, et al.

An effective way to generate large numbers of parallel sentences for training improved neural machine translation (NMT) systems is to back-translate target-side monolingual data. Tagging, or the use of gates, has been used to let translation models distinguish synthetic data from natural data. This improves on standard back-translation and also enables iterative back-translation on language pairs that underperformed with standard back-translation. This work presents a simpler way of differentiating between the two kinds of data, using pretraining and finetuning. The approach, tag-less back-translation, trains the model on the synthetic data and then finetunes it on the natural data. In preliminary experiments on low-resource English-Vietnamese NMT, the approach consistently outperformed the tagging approach. Although it removes the need to tag (noise) the dataset, it outperformed tagged back-translation by an average of 0.4 BLEU.
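The difference between the two ways of separating synthetic from natural data can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names, the `<BT>` tag string, and the `backward_model` and `train_step` callables are all hypothetical stand-ins for a real NMT toolkit. In the tagged baseline, the synthetic pairs are marked on the source side and mixed with the natural pairs; in the tag-less variant, the data is left untouched and the separation comes from the order of the two training phases.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def back_translate(target_sentences: Iterable[str],
                   backward_model: Callable[[str], str]) -> List[Pair]:
    """Build synthetic pairs from target-side monolingual data.

    `backward_model` is a hypothetical target-to-source translator.
    """
    return [(backward_model(tgt), tgt) for tgt in target_sentences]


def tagged_training_data(natural: Iterable[Pair],
                         synthetic: Iterable[Pair],
                         tag: str = "<BT>") -> List[Pair]:
    """Tagged baseline: mark each synthetic source with a tag, then mix."""
    tagged = [(f"{tag} {src}", tgt) for src, tgt in synthetic]
    return tagged + list(natural)


def tagless_schedule(natural: Iterable[Pair],
                     synthetic: Iterable[Pair]) -> List[Tuple[str, List[Pair]]]:
    """Tag-less variant: pretrain on synthetic pairs, finetune on natural pairs."""
    return [("pretrain", list(synthetic)), ("finetune", list(natural))]


def run_schedule(schedule: List[Tuple[str, List[Pair]]],
                 train_step: Callable[[str, str, str], None]) -> None:
    """Feed each phase to a hypothetical `train_step(phase, src, tgt)` in order."""
    for phase, pairs in schedule:
        for src, tgt in pairs:
            train_step(phase, src, tgt)
```

In practice each phase would run for many updates with its own stopping criterion; the sketch only captures the property the abstract relies on, namely that the model never sees synthetic and natural data in the same phase and so needs no tag to tell them apart.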


Using Self-Training to Improve Back-Translation in Low Resource Neural Machine Translation

Improving neural machine translation (NMT) models using the back-transla...

From direct tagging to Tagging with sentences compression

In essence, the two tagging methods (direct tagging and tagging with sen...

Iterative Self-Learning for Enhanced Back-Translation in Low Resource Neural Machine Translation

Many language pairs are low resource - the amount and/or quality of para...

Tagged Back-Translation

Recent work in Neural Machine Translation (NMT) has shown significant qu...

Knowledge Based Template Machine Translation In Low-Resource Setting

Incorporating tagging into neural machine translation (NMT) systems has ...

Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation

Neural Machine Translation has achieved state-of-the-art performance for...

Understanding Back-Translation at Scale

An effective method to improve neural machine translation with monolingu...