Improving Zero-shot Multilingual Neural Machine Translation for Low-Resource Languages

10/02/2021
by   Chenyang Li, et al.
0

Although the multilingual Neural Machine Translation(NMT), which extends Google's multilingual NMT, has ability to perform zero-shot translation and the iterative self-learning algorithm can improve the quality of zero-shot translation, it confronts with two problems: the multilingual NMT model is prone to generate wrong target language when implementing zero-shot translation; the self-learning algorithm, which uses beam search to generate synthetic parallel data, demolishes the diversity of the generated source language and amplifies the impact of the same noise during the iterative learning process. In this paper, we propose the tagged-multilingual NMT model and improve the self-learning algorithm to handle these two problems. Firstly, we extend the Google's multilingual NMT model and add target tokens to the target languages, which associates the start tag with the target language to ensure that the source language can be translated to the required target language. Secondly, we improve the self-learning algorithm by replacing beam search with random sample to increases the diversity of the generated data and makes it properly cover the true data distribution. Experimental results on IWSLT show that the adjusted tagged-multilingual NMT separately obtains 9.41 and 7.85 BLEU scores over the multilingual NMT on 2010 and 2017 Romanian-Italian test sets. Similarly, it obtains 9.08 and 7.99 BLEU scores on Italian-Romanian zero-shot translation. Furthermore, the improved self-learning algorithm shows its superiorities over the conventional self-learning algorithm on zero-shot translations.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/24/2020

Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation

Massively multilingual models for neural machine translation (NMT) are t...
research
11/10/2019

Translationese as a Language in "Multilingual" NMT

Machine translation has an undesirable propensity to produce "translatio...
research
09/16/2019

Multilingual Neural Machine Translation for Zero-Resource Languages

In recent years, Neural Machine Translation (NMT) has been shown to be m...
research
03/10/2021

Self-Learning for Zero Shot Neural Machine Translation

Neural Machine Translation (NMT) approaches employing monolingual data a...
research
06/18/2018

A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation

Recently, neural machine translation (NMT) has been extended to multilin...
research
08/11/2020

Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity

Recent work has shown that a multilingual neural machine translation (NM...
research
06/11/2021

Towards User-Driven Neural Machine Translation

A good translation should not only translate the original content semant...

Please sign up or login with your details

Forgot password? Click here to reset