On the Cross-lingual Transferability of Monolingual Representations

by Mikel Artetxe, et al.

State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown to generalize in a zero-shot cross-lingual setting. This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages giving rise to deep multilingual abstractions. We evaluate this hypothesis by designing an alternative approach that transfers a monolingual model to new languages at the lexical level. More concretely, we first train a transformer-based masked language model on one language, and transfer it to a new language by learning a new embedding matrix with the same masked language modeling objective, freezing the parameters of all other layers. This approach does not rely on a shared vocabulary or joint training. However, we show that it is competitive with multilingual BERT on standard cross-lingual classification benchmarks and on a new Cross-lingual Question Answering Dataset (XQuAD). Our results contradict common beliefs about the basis of the generalization ability of multilingual models and suggest that deep monolingual models learn some abstractions that generalize across languages. We also release XQuAD as a more comprehensive cross-lingual benchmark, which comprises 240 paragraphs and 1190 question-answer pairs from SQuAD v1.1 translated into ten languages by professional translators.
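
The transfer step described in the abstract (freeze every layer, then learn only a new embedding matrix with the masked language modeling objective) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: `TinyMLM`, `transfer_to_new_language`, `mlm_step`, the vocabulary sizes, and the choice to tie the output layer to the embedding matrix are all assumptions made for illustration, position embeddings are omitted, and the pretraining on the first language is skipped.

```python
import torch
import torch.nn as nn

MASK_ID = 0  # hypothetical id reserved for the [MASK] token


class TinyMLM(nn.Module):
    """Toy masked language model: embeddings + transformer body, tied output layer."""

    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, dim_feedforward=64, batch_first=True
        )
        self.body = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, token_ids):
        hidden = self.body(self.embed(token_ids))
        # Tied output layer (an assumption): score hidden states against the embeddings.
        return hidden @ self.embed.weight.T


def transfer_to_new_language(model, new_vocab_size):
    """Freeze all existing parameters, then attach a fresh, trainable embedding matrix."""
    for param in model.parameters():
        param.requires_grad = False
    model.embed = nn.Embedding(new_vocab_size, model.embed.embedding_dim)
    return model


def mlm_step(model, optimizer, token_ids, mask_prob=0.15):
    """One masked language modeling update on a batch of token ids."""
    mask = torch.rand(token_ids.shape) < mask_prob
    mask[0, 0] = True  # guarantee at least one masked position in the toy batch
    inputs = token_ids.masked_fill(mask, MASK_ID)
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits[mask], token_ids[mask])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
model = TinyMLM(vocab_size=100)  # stands in for a model pretrained on the first language
model = transfer_to_new_language(model, new_vocab_size=120)

body_before = [p.clone() for p in model.body.parameters()]
optimizer = torch.optim.Adam(model.embed.parameters(), lr=1e-2)

batch = torch.randint(1, 120, (8, 16))  # random ids standing in for new-language text
loss = mlm_step(model, optimizer, batch)

body_unchanged = all(
    torch.equal(before, after)
    for before, after in zip(body_before, model.body.parameters())
)
print(f"loss={loss:.3f}, transformer body unchanged: {body_unchanged}")
```

Only the new embedding matrix is passed to the optimizer, so the frozen transformer body stays exactly as it was while gradients still flow through it to the embeddings.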


