From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

05/01/2020
by Anne Lauscher, et al.

Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance. Current downstream evaluations, however, verify their efficacy predominantly in transfer settings involving languages with sufficient amounts of pretraining data, and with lexically and typologically close languages. In this work, we analyze their limitations and show that cross-lingual transfer via massively multilingual transformers, much like transfer via cross-lingual word embeddings, is substantially less effective in resource-lean scenarios and for distant languages. Our experiments, encompassing three lower-level tasks (POS tagging, dependency parsing, NER), as well as two high-level semantic tasks (NLI, QA), empirically correlate transfer performance with linguistic similarity between the source and target languages, but also with the size of pretraining corpora of target languages. We also demonstrate a surprising effectiveness of inexpensive few-shot transfer (i.e., fine-tuning on a few target-language instances after fine-tuning in the source) across the board. This suggests that additional research efforts should be invested to reach beyond the limiting zero-shot conditions.
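To make the transfer setup concrete, below is a minimal sketch (not the authors' exact experimental setup) of the recipe the abstract describes: fine-tune a massively multilingual transformer on source-language data, evaluate zero-shot on the target language, and optionally continue fine-tuning on a few target-language instances for few-shot transfer. It assumes Hugging Face Transformers with PyTorch and an XNLI-style 3-way NLI task; the inline examples are placeholders standing in for real source- and target-language data.

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"   # mBERT; "xlm-roberta-base" for XLM-R
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

def collate(batch):
    # Encode (premise, hypothesis, label) triples as sentence pairs.
    premises, hypotheses, labels = zip(*batch)
    enc = tokenizer(list(premises), list(hypotheses), padding=True,
                    truncation=True, max_length=128, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

def fine_tune(examples, epochs=3, lr=2e-5, batch_size=16):
    """Plain fine-tuning loop; run once on source data, then again on k target instances."""
    loader = DataLoader(examples, batch_size=batch_size, shuffle=True, collate_fn=collate)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

# Placeholder data: labels 0=entailment, 1=neutral, 2=contradiction.
source_train = [("A man plays a guitar.", "Someone is making music.", 0),
                ("A man plays a guitar.", "The room is silent.", 2)]
target_few = [("<target-language premise>", "<target-language hypothesis>", 0)]

fine_tune(source_train)           # 1) fine-tune on the (large) source-language training set
# 2) zero-shot transfer: evaluate on target-language test data at this point
fine_tune(target_few, epochs=10)  # 3) few-shot transfer: a few target instances, then re-evaluate
```

In the few-shot step, only the handful of annotated target-language examples changes relative to the zero-shot setting; the pretrained multilingual encoder and the source-language fine-tuning stay the same, which is why the paper characterizes this transfer as inexpensive.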
