On the Cross-lingual Transferability of Monolingual Representations

10/25/2019
by Mikel Artetxe, et al.

State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown to generalize in a zero-shot cross-lingual setting. This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages, which are thought to give rise to deep multilingual abstractions. We evaluate this hypothesis by designing an alternative approach that transfers a monolingual model to new languages at the lexical level. More concretely, we first train a transformer-based masked language model on one language, and transfer it to a new language by learning a new embedding matrix with the same masked language modeling objective, freezing the parameters of all other layers. This approach does not rely on a shared vocabulary or joint training, yet we show that it is competitive with multilingual BERT on standard cross-lingual classification benchmarks and on a new Cross-lingual Question Answering Dataset (XQuAD). Our results contradict common beliefs about the basis of the generalization ability of multilingual models and suggest that deep monolingual models learn some abstractions that generalize across languages. We also release XQuAD as a more comprehensive cross-lingual benchmark, which comprises 240 paragraphs and 1,190 question-answer pairs from SQuAD v1.1 translated into ten languages by professional translators.

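The following is a minimal sketch of the lexical-transfer step described above, assuming a HuggingFace-style BERT checkpoint and Trainer API; the checkpoint name, tokenizer path, and corpus file are illustrative placeholders, and this is an approximation of the freezing scheme rather than the authors' released code:

from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)
from datasets import load_dataset

# Load the monolingual (e.g., English) masked language model.
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

# Swap in a target-language vocabulary: a tokenizer trained on the new
# language and a freshly initialized token embedding matrix of matching size.
target_tokenizer = AutoTokenizer.from_pretrained("path/to/target_tokenizer")  # placeholder path
model.resize_token_embeddings(len(target_tokenizer))
model.get_input_embeddings().reset_parameters()  # discard source-language embeddings

# Freeze every parameter except the token embedding matrix (tied with the
# masked language modeling output layer), so only the lexical layer is learned.
for name, param in model.named_parameters():
    param.requires_grad = "word_embeddings" in name

# Continue training with the same masked language modeling objective on
# unlabeled target-language text.
raw = load_dataset("text", data_files={"train": "target_corpus.txt"})["train"]  # placeholder corpus
tokenized = raw.map(
    lambda batch: target_tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=target_tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="transferred_mlm", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()

After this step, the frozen transformer body can be combined with the new embedding matrix and fine-tuned on a downstream task in the source language, then evaluated zero-shot in the target language, which is the transfer setting the paper evaluates.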

