Cross-lingual Pre-training Based Transfer for Zero-shot Neural Machine Translation

12/03/2019
by Baijun Ji, et al.

Transfer learning between different language pairs has proven effective for Neural Machine Translation (NMT) in low-resource scenarios. However, existing transfer methods that involve a common target language fall far short in the extreme scenario of zero-shot translation, owing to a language-space mismatch between the transferor (the parent model) and the transferee (the child model) on the source side. To address this challenge, we propose an effective transfer learning approach based on cross-lingual pre-training. Our key idea is to make all source languages share the same feature space, thereby enabling a smooth transition for zero-shot translation. To this end, we introduce one monolingual pre-training method and two bilingual pre-training methods to obtain a universal encoder for different languages. Once the universal encoder is constructed, a parent model built on it is trained with large-scale annotated data and then applied directly to the zero-shot translation scenario. Experiments on two public datasets show that our approach significantly outperforms a strong pivot-based baseline and various multilingual NMT approaches.
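To make the pipeline concrete, here is a minimal PyTorch sketch of the three stages the abstract describes: pre-training a universal encoder shared by all source languages, training a parent NMT model on top of it with high-resource parallel data, and reusing that model unchanged for zero-shot translation from a new source language. A masked-language-model (MLM) objective stands in for the monolingual pre-training method; the names and sizes (`UniversalEncoder`, `ParentNMT`, the toy vocabulary) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

VOCAB, D, PAD, MASK = 1000, 128, 0, 1   # toy shared subword vocabulary

class UniversalEncoder(nn.Module):
    """Transformer encoder shared by every source language."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D, padding_idx=PAD)
        layer = nn.TransformerEncoderLayer(D, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, src):
        return self.enc(self.embed(src))

class ParentNMT(nn.Module):
    """Parent model: the pre-trained universal encoder plus a decoder,
    trained on the large-scale high-resource pair (causal masking and
    positional encodings are omitted for brevity)."""
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder
        self.tgt_embed = nn.Embedding(VOCAB, D, padding_idx=PAD)
        layer = nn.TransformerDecoderLayer(D, nhead=4, batch_first=True)
        self.dec = nn.TransformerDecoder(layer, num_layers=2)
        self.out = nn.Linear(D, VOCAB)

    def forward(self, src, tgt):
        memory = self.encoder(src)                  # shared feature space
        return self.out(self.dec(self.tgt_embed(tgt), memory))

def mlm_loss(encoder, head, tokens, p=0.15):
    """Monolingual pre-training (MLM): mask random tokens in sentences
    drawn from *any* source language and predict them, pushing all
    source languages into one shared feature space."""
    mask = (torch.rand_like(tokens, dtype=torch.float) < p) & (tokens != PAD)
    logits = head(encoder(tokens.masked_fill(mask, MASK)))
    return nn.functional.cross_entropy(logits[mask], tokens[mask])

# 1) Pre-train the universal encoder on mixed-language monolingual data.
encoder, mlm_head = UniversalEncoder(), nn.Linear(D, VOCAB)
src = torch.randint(2, VOCAB, (8, 16))    # batch of mixed-language sentences
mlm_loss(encoder, mlm_head, src).backward()

# 2) Train the parent NMT model on the high-resource pair (e.g. De-En).
parent = ParentNMT(encoder)
tgt = torch.randint(2, VOCAB, (8, 15))
parent(src, tgt)

# 3) Zero-shot: feed a *new* source language through the same encoder,
#    with no further training of the parent model.
child_src = torch.randint(2, VOCAB, (8, 16))
zero_shot_logits = parent(child_src, tgt)
```

In the full approach, the two bilingual pre-training methods additionally exploit parallel data to align the source languages more tightly; the zero-shot step above stays the same, which is why no pivot language is needed at inference time.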

