Cross-Lingual Supervision improves Large Language Models Pre-training

05/19/2023
by Andrea Schioppa, et al.

Recent rapid progress in pre-training Large Language Models has relied on self-supervised language modeling objectives such as next-token prediction or span corruption. In contrast, Machine Translation systems are mostly trained with cross-lingual supervision, which requires aligned data between source and target languages. We demonstrate that pre-training Large Language Models on a mixture of a self-supervised language modeling objective and the supervised Machine Translation objective, thereby including cross-lingual parallel data during pre-training, yields models with better in-context learning abilities. Since pre-training is very resource-intensive and a grid search over the mixing ratio of the two objectives is prohibitively expensive, we propose a simple yet effective strategy to learn the ratio during pre-training.
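One way to picture the mixed-objective setup is batch-level mixing: each pre-training batch is drawn either from monolingual language-modeling data or from parallel translation data according to a mixing ratio r. The sketch below is purely illustrative and not the authors' exact method; the names (`sample_objective`, `theta`) are assumptions. Parameterizing r as sigmoid(theta) is one way to make the ratio a learnable quantity rather than a grid-searched hyperparameter.

```python
import math
import random

def sample_objective(r, rng):
    """Pick the objective for the next batch: 'mt' (translation) with
    probability r, otherwise 'lm' (self-supervised language modeling)."""
    return "mt" if rng.random() < r else "lm"

# Hypothetical learnable parameterization: r = sigmoid(theta), so theta can
# be updated during pre-training instead of fixing r via a grid search.
theta = math.log(0.25 / 0.75)  # initialize so r = 0.25
r = 1.0 / (1.0 + math.exp(-theta))

rng = random.Random(0)
counts = {"lm": 0, "mt": 0}
for _ in range(10_000):
    counts[sample_objective(r, rng)] += 1

# The empirical fraction of translation batches should be close to r.
print(counts["mt"] / 10_000)
```

In a real training loop, theta would be adjusted by some signal observed during pre-training (e.g. validation behavior of the two objectives), which is the part the paper's learned-ratio strategy addresses.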


Related research

- 01/25/2021 · Cross-lingual Visual Pre-training for Multimodal Machine Translation
  Pre-trained language models have been shown to improve performance in ma...
- 06/03/2021 · nmT5 – Is parallel data still relevant for pre-training massively multilingual language models?
  Recently, mT5 - a massively multilingual version of T5 - leveraged a uni...
- 09/15/2023 · Structural Self-Supervised Objectives for Transformers
  This thesis focuses on improving the pre-training of natural language mo...
- 06/01/2023 · On Masked Pre-training and the Marginal Likelihood
  Masked pre-training removes random input dimensions and learns a model t...
- 08/23/2022 · Learning Better Masking for Better Language Model Pre-training
  Masked Language Modeling (MLM) has been widely used as the denoising obj...
- 06/09/2023 · WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction
  Most existing word alignment methods rely on manual alignment datasets o...
- 07/01/2023 · Self-Supervised Query Reformulation for Code Search
  Automatic query reformulation is a widely utilized technology for enrich...
