Multilingual Denoising Pre-training for Neural Machine Translation

01/22/2020
by Yinhan Liu, et al.

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective. mBART is the first method for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages; previous MT pre-training has focused only on the encoder, the decoder, or reconstructing parts of the text. Pre-training a complete model allows it to be directly fine-tuned for supervised (both sentence-level and document-level) and unsupervised machine translation, with no task-specific modifications. We demonstrate that adding mBART initialization produces performance gains in all but the highest-resource settings, including gains of up to 12 BLEU points for low-resource MT and over 5 BLEU points for many document-level and unsupervised models. We also show that mBART enables new types of transfer to language pairs with no bi-text or that were not in the pre-training corpus, and we present an extensive analysis of which factors contribute most to effective pre-training.
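For a concrete picture of the denoising objective, the sketch below is a minimal, unofficial Python approximation of the two noise functions the abstract alludes to: permuting the order of sentences in a document and replacing text spans with a single mask token, with span lengths drawn from a Poisson distribution so that roughly 35% of tokens are masked. The function name, the `<mask>` string, and the greedy span-sampling loop are illustrative assumptions, not the released fairseq implementation.

```python
import random
import numpy as np

MASK = "<mask>"

def noise(sentences, mask_ratio=0.35, poisson_lambda=3.5, seed=None):
    """Apply an mBART-style corruption to a tokenized document.

    sentences: list of sentences, each a list of token strings.
    Returns the corrupted token sequence; the training target is the
    original, uncorrupted document.
    """
    rng = random.Random(seed)
    np_rng = np.random.default_rng(seed)

    # 1) Sentence permutation: shuffle the order of the sentences.
    order = list(range(len(sentences)))
    rng.shuffle(order)
    tokens = [tok for i in order for tok in sentences[i]]

    # 2) Text infilling: replace spans with a single <mask> token
    #    until about mask_ratio of the tokens are covered. Span
    #    lengths are sampled from Poisson(poisson_lambda).
    budget = int(round(mask_ratio * len(tokens)))
    out, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio:
            span = max(1, int(np_rng.poisson(poisson_lambda)))
            span = min(span, budget, len(tokens) - i)
            out.append(MASK)   # one mask token replaces the whole span
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out
```

Called on a tokenized document, e.g. noise([["We", "present", "mBART", "."], ["It", "is", "pre-trained", "on", "monolingual", "data", "."]], seed=0), it returns the corrupted token list that the encoder sees, while the decoder learns to reconstruct the original text; mBART additionally appends a language-id token to each instance, which is omitted here for brevity.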

Related research

04/03/2023 · PEACH: Pre-Training Sequence-to-Sequence Multilingual Models for Translation with Semi-Supervised Pseudo-Parallel Document Generation
Multilingual pre-training significantly improves many multilingual NLP t...

06/26/2020 · Pre-training via Paraphrasing
We introduce MARGE, a pre-trained sequence-to-sequence model learned wit...

12/16/2021 · DOCmT5: Document-Level Pretraining of Multilingual Language Models
In this paper, we introduce DOCmT5, a multilingual sequence-to-sequence ...

10/07/2020 · Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information
We investigate the following question for machine translation (MT): can ...

11/14/2021 · DEEP: DEnoising Entity Pre-training for Neural Machine Translation
It has been shown that machine translation models usually generate poor ...

06/14/2023 · Recipes for Sequential Pre-training of Multilingual Encoder and Seq2Seq Models
Pre-trained encoder-only and sequence-to-sequence (seq2seq) models each ...

01/20/2022 · Linguistically-driven Multi-task Pre-training for Low-resource Neural Machine Translation
In the present study, we propose novel sequence-to-sequence pre-training...
