Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information

10/07/2020
by   Zehui Lin, et al.

We investigate the following question for machine translation (MT): can we develop a single universal MT model to serve as a common seed and obtain derivative, improved models on arbitrary language pairs? We propose mRASP, an approach to pre-train a universal multilingual neural machine translation model. The key idea of mRASP is a novel technique called random aligned substitution, which brings words and phrases with similar meanings across multiple languages closer in the representation space. We pre-train an mRASP model jointly on 32 language pairs using only public datasets. The model is then fine-tuned on downstream language pairs to obtain specialized MT models. We carry out extensive experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs. Experimental results demonstrate that mRASP achieves significant performance improvements over models trained directly on those target pairs. To the best of our knowledge, this is the first work to verify that multiple low-resource language pairs can be utilized to improve rich-resource MT. Surprisingly, mRASP even improves translation quality on exotic languages that never occur in the pre-training corpus. Code, data, and pre-trained models are available at https://github.com/linzehui/mRASP.
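To make the random aligned substitution idea concrete, the following is a minimal illustrative sketch: during pre-training, source-side words are randomly swapped with dictionary translations from other languages, so that aligned words are seen in shared contexts and pushed closer together in the representation space. The function name, the `replace_prob` parameter, and the toy dictionary below are illustrative assumptions, not the authors' exact implementation.

```python
import random

def random_aligned_substitution(tokens, bilingual_dict, replace_prob=0.3, seed=None):
    """Replace each token with a random aligned translation with probability replace_prob.

    This is a simplified sketch of the substitution step described in the abstract;
    the actual mRASP pipeline may differ in tokenization and dictionary handling.
    """
    rng = random.Random(seed)
    substituted = []
    for token in tokens:
        translations = bilingual_dict.get(token)
        if translations and rng.random() < replace_prob:
            substituted.append(rng.choice(translations))  # swap in an aligned word from another language
        else:
            substituted.append(token)                      # keep the original word
    return substituted

# Toy example with a hypothetical English -> {French, German} dictionary.
toy_dict = {"cat": ["chat", "Katze"], "sat": ["saß"], "mat": ["tapis"]}
print(random_aligned_substitution("the cat sat on the mat".split(), toy_dict, seed=0))
```

The substituted sentence is then paired with the original target sentence during multilingual pre-training, which is what encourages cross-lingual alignment in the encoder.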

