On the Evaluation of Contextual Embeddings for Zero-Shot Cross-Lingual Transfer Learning

04/30/2020
by Phillip Keung et al.

Pre-trained multilingual contextual embeddings have demonstrated state-of-the-art performance in zero-shot cross-lingual transfer learning, where multilingual BERT (mBERT) is fine-tuned on one source language (typically English) and evaluated on a different target language. However, published results for baseline mBERT zero-shot accuracy vary by as much as 17 points on the MLDoc classification task across four papers. We show that the standard practice of using English dev accuracy for model selection in the zero-shot setting makes it difficult to obtain reproducible results on the MLDoc and XNLI tasks. English dev accuracy is often uncorrelated (or even anti-correlated) with target language accuracy, and zero-shot cross-lingual performance varies greatly both within a single fine-tuning run and across different fine-tuning runs. We recommend providing oracle scores alongside the zero-shot results: still fine-tune using English data, but select the checkpoint with the target language's dev set. Reporting this upper bound makes results more consistent by avoiding arbitrarily bad checkpoints.
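To make the two selection protocols concrete, here is a minimal Python sketch. It is not the authors' code: the fine-tuning loop is replaced by simulated per-checkpoint dev accuracies, and the names (fine_tune_on_english, the "en"/"de" language codes) are illustrative placeholders. The point is only that the zero-shot and oracle protocols differ in which dev set drives checkpoint selection.

import random

random.seed(0)

# Hypothetical simulation: each "checkpoint" is a dict of dev accuracies for
# English ("en") and a target language ("de"), drawn at random to mimic the
# paper's observation that the two need not be correlated.
def fine_tune_on_english(num_checkpoints=10):
    return [
        {"en": random.uniform(0.90, 0.98), "de": random.uniform(0.70, 0.90)}
        for _ in range(num_checkpoints)
    ]

checkpoints = fine_tune_on_english()

# Standard zero-shot protocol: select the checkpoint by English dev accuracy.
zero_shot = max(checkpoints, key=lambda c: c["en"])

# Oracle protocol recommended in the paper: fine-tuning is still English-only,
# but the checkpoint is selected on the target-language dev set.
oracle = max(checkpoints, key=lambda c: c["de"])

print(f"zero-shot selection -> target dev accuracy {zero_shot['de']:.3f}")
print(f"oracle selection    -> target dev accuracy {oracle['de']:.3f}")

In a real setup, the same comparison applies to checkpoints saved during mBERT fine-tuning: evaluate each checkpoint on both dev sets and report the target-selected score as the oracle upper bound alongside the English-selected zero-shot score.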


Related research

10/22/2022
Prompt-Tuning Can Be Much Better Than Fine-Tuning on Cross-lingual Understanding With Multilingual Language Models
Pre-trained multilingual language models show significant performance ga...

10/13/2020
Model Selection for Cross-Lingual Transfer using a Learned Scoring Function
Transformers that are pre-trained on multilingual text corpora, such as,...

05/26/2023
Free Lunch: Robust Cross-Lingual Transfer via Model Checkpoint Averaging
Massively multilingual language models have displayed strong performance...

04/18/2021
On the Strengths of Cross-Attention in Pretrained Transformers for Machine Translation
We study the power of cross-attention in the Transformer architecture wi...

09/29/2020
Cross-lingual Alignment Methods for Multilingual BERT: A Comparative Study
Multilingual BERT (mBERT) has shown reasonable capability for zero-shot ...

10/25/2022
Multilingual Relation Classification via Efficient and Effective Prompting
Prompting pre-trained language models has achieved impressive performanc...

03/04/2021
A Systematic Evaluation of Transfer Learning and Pseudo-labeling with BERT-based Ranking Models
Due to high annotation costs, making the best use of existing human-crea...
