Unsupervised Cross-lingual Representation Learning at Scale

11/05/2019
by   Alexis Conneau, et al.

This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms multilingual BERT (mBERT) on a variety of cross-lingual benchmarks, including +13.8% average accuracy on XNLI, +12.3% average F1 score on MLQA, and +2.1% average F1 score on NER. XLM-R performs particularly well on low-resource languages, improving 11.8% in XNLI accuracy for Swahili and 9.2% for Urdu over the previous XLM model. We also present a detailed empirical evaluation of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high- and low-resource languages at scale. Finally, we show, for the first time, the possibility of multilingual modeling without sacrificing per-language performance; XLM-R is very competitive with strong monolingual models on the GLUE and XNLI benchmarks. We will make XLM-R code, data, and models publicly available.
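The masked-language-modeling objective behind XLM-R works by hiding a random fraction of input tokens and training the model to recover them. A minimal sketch of the masking step is below; the function name, mask rate, and token-level (rather than subword-level) masking are illustrative simplifications, not taken from the paper's actual pipeline:

```python
import random

def mask_tokens(tokens, mask_token="<mask>", mask_prob=0.15, seed=0):
    """Randomly replace a fraction of tokens with a mask symbol and
    record the originals as prediction targets (BERT-style MLM)."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok          # the model must predict this token
            masked.append(mask_token)
        else:
            masked.append(tok)
    return masked, targets

sentence = "the cat sat on the mat".split()
masked, targets = mask_tokens(sentence, mask_prob=0.3)
# `masked` is the corrupted input; `targets` maps positions to the
# original tokens, which supply the cross-entropy training signal.
```

In the real model the same objective is applied to SentencePiece subwords pooled from all one hundred languages, which is what lets a single shared vocabulary and encoder transfer across languages.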


Related research

- Larger-Scale Transformers for Multilingual Masked Language Modeling (05/02/2021)
- Language Scaling for Universal Suggested Replies Model (06/04/2021)
- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models (05/24/2022)
- Towards Best Practices for Training Multilingual Dense Retrieval Models (04/05/2022)
- Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning (10/26/2022)
- A Conditional Generative Matching Model for Multi-lingual Reply Suggestion (09/15/2021)
- Romanization-based Large-scale Adaptation of Multilingual Language Models (04/18/2023)
