PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining

08/04/2021
by Machel Reid et al.

Despite the success of multilingual sequence-to-sequence pretraining, most existing approaches rely on monolingual corpora and do not make use of the strong cross-lingual signal contained in parallel data. In this paper, we present PARADISE (PARAllel Denoising Integration in SEquence-to-sequence models), which extends the conventional denoising objective used to train these models by (i) replacing words in the noised sequence according to a multilingual dictionary, and (ii) predicting the reference translation according to a parallel corpus instead of recovering the original sequence. Our experiments on machine translation and cross-lingual natural language inference show average improvements of 2.0 BLEU points and 6.7 accuracy points, respectively, from integrating parallel data into pretraining, obtaining results that are competitive with several popular models at a fraction of their computational cost.
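The two modifications above can be pictured with a short sketch. The Python snippet below is illustrative only and is not the authors' implementation: the function names, replacement and masking probabilities, dictionary format, and mask token are assumptions. It corrupts a source sentence by swapping words for dictionary translations and masking others, and, when a parallel sentence is available, uses the reference translation rather than the original sentence as the target.

import random

MASK = "<mask>"

def noise_with_dictionary(tokens, bilingual_dict, replace_prob=0.3, mask_prob=0.15):
    """Corrupt a token sequence for denoising pretraining (illustrative)."""
    noised = []
    for tok in tokens:
        r = random.random()
        if tok in bilingual_dict and r < replace_prob:
            # (i) replace the word with a dictionary translation (cross-lingual signal)
            noised.append(random.choice(bilingual_dict[tok]))
        elif r < replace_prob + mask_prob:
            # standard denoising-style masking
            noised.append(MASK)
        else:
            noised.append(tok)
    return noised

def make_training_pair(src_tokens, bilingual_dict, tgt_tokens=None):
    """Return (noised input, target) for one pretraining example."""
    noised = noise_with_dictionary(src_tokens, bilingual_dict)
    # (ii) if a parallel sentence exists, predict the translation;
    # otherwise fall back to reconstructing the original sentence.
    target = tgt_tokens if tgt_tokens is not None else src_tokens
    return noised, target

# Toy usage with a hypothetical English-Spanish dictionary.
toy_dict = {"cat": ["gato"], "house": ["casa"]}
src = ["the", "cat", "is", "in", "the", "house"]
tgt = ["el", "gato", "está", "en", "la", "casa"]
print(make_training_pair(src, toy_dict, tgt))

In this sketch, monolingual examples (no target supplied) reduce to the usual denoising objective, while parallel examples add the translation signal, matching the description in the abstract.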

Related research

04/16/2022  Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation
12/15/2022  TRIP: Triangular Document-level Pre-training for Multilingual Language Models
12/20/2022  On the Role of Parallel Data in Cross-lingual Transfer Learning
06/10/2021  Exploring Unsupervised Pretraining Objectives for Machine Translation
09/10/2021  AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages
03/18/2021  Smoothing and Shrinking the Sparse Seq2Seq Search Space
02/10/2023  Language-Aware Multilingual Machine Translation with Self-Supervised Learning
