Cross-Lingual Language Model Meta-Pretraining

09/23/2021
by Zewen Chi, et al.

The success of pretrained cross-lingual language models relies on two essential abilities: generalization, for learning downstream tasks in a source language, and cross-lingual transferability, for transferring that task knowledge to other languages. However, current methods learn the two abilities jointly in a single-phase cross-lingual pretraining process, resulting in a trade-off between generalization and cross-lingual transfer. In this paper, we propose cross-lingual language model meta-pretraining, which learns the two abilities in separate training phases. Our method introduces an additional meta-pretraining phase before cross-lingual pretraining, in which the model learns generalization on a large-scale monolingual corpus. The model then focuses on learning cross-lingual transfer on a multilingual corpus. Experimental results show that our method improves both generalization and cross-lingual transfer, and produces better-aligned representations across languages.
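The two-phase schedule described above can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the function names (`train_step`, `meta_pretrain`, `crosslingual_pretrain`) and the toy batch format are assumptions, and the real method would run masked-language-model updates on large corpora rather than record which corpus each step saw.

```python
# Hypothetical sketch of the two-phase pretraining schedule.
# Phase 1 (meta-pretraining): only monolingual batches.
# Phase 2 (cross-lingual pretraining): multilingual batches.

def train_step(model_state, batch):
    """Placeholder update: record which language each step trained on."""
    model_state["steps"].append(batch["lang"])
    return model_state

def meta_pretrain(model_state, mono_corpus, steps):
    # Phase 1: learn generalization ability on a monolingual corpus.
    for i in range(steps):
        batch = mono_corpus[i % len(mono_corpus)]
        model_state = train_step(model_state, batch)
    return model_state

def crosslingual_pretrain(model_state, multi_corpus, steps):
    # Phase 2: learn cross-lingual transfer on a multilingual corpus,
    # starting from the meta-pretrained state rather than from scratch.
    for i in range(steps):
        batch = multi_corpus[i % len(multi_corpus)]
        model_state = train_step(model_state, batch)
    return model_state

if __name__ == "__main__":
    state = {"steps": []}
    mono = [{"lang": "en"}]
    multi = [{"lang": "en"}, {"lang": "fr"}, {"lang": "zh"}]
    state = meta_pretrain(state, mono, steps=3)
    state = crosslingual_pretrain(state, multi, steps=3)
    print(state["steps"])  # phase 1 sees only English; phase 2 mixes languages
```

The key design point the paper argues for is the ordering: the same model state flows from the monolingual phase into the multilingual phase, so generalization is acquired before cross-lingual alignment, rather than the two objectives competing in one phase.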


Related research

- Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models (05/24/2022)
- mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models (10/15/2021)
- Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models (03/19/2022)
- Oolong: Investigating What Makes Crosslingual Transfer Hard with Controlled Studies (02/24/2022)
- MonoByte: A Pool of Monolingual Byte-level Language Models (09/22/2022)
- XDLM: Cross-lingual Diffusion Language Model for Machine Translation (07/25/2023)
- Learning to pronounce as measuring cross-lingual joint orthography-phonology complexity (01/29/2022)
