Multilingual Alignment of Contextual Word Representations

02/10/2020
by Steven Cao, et al.

We propose procedures for evaluating and strengthening contextual embedding alignment and show that they are useful in analyzing and improving multilingual BERT. In particular, after our proposed alignment procedure, BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model, remarkably matching pseudo-fully-supervised translate-train models for Bulgarian and Greek. Further, to measure the degree of alignment, we introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer. Using this word retrieval task, we also analyze BERT and find that it exhibits systematic deficiencies, e.g. worse alignment for open-class parts-of-speech and word pairs written in different scripts, that are corrected by the alignment procedure. These results support contextual alignment as a useful concept for understanding large multilingual pre-trained models.
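To make the word retrieval idea concrete, here is a minimal sketch of a retrieval-style alignment score. It rests on assumptions the abstract does not state: each row of the two arrays is taken to be the contextual vector of one word occurrence, row i of the source and row i of the target are taken to be a translation pair, and the nearest neighbor is chosen by plain cosine similarity. The function name `retrieval_accuracy` and the toy vectors are hypothetical and only illustrate the measurement; they are not the paper's implementation or data.

```python
import numpy as np


def retrieval_accuracy(src_vecs, tgt_vecs):
    """Fraction of source vectors whose nearest target vector (by cosine
    similarity) sits at the same row index, i.e. is its assumed aligned pair."""
    # Normalize rows so that a dot product equals cosine similarity.
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src @ tgt.T                # (n_src, n_tgt) similarity matrix
    nearest = sims.argmax(axis=1)     # index of the best target per source row
    return float((nearest == np.arange(len(src))).mean())


# Toy 2-d vectors (hypothetical): the first two pairs are well aligned,
# the third target vector points away from its source.
src = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
tgt = np.array([[0.9, 0.1], [0.1, 0.9], [-1.0, 0.2]])
accuracy = retrieval_accuracy(src, tgt)
print(f"retrieval accuracy: {accuracy:.3f}")
```

Under these assumptions, a higher score means that more words retrieve their own translation as the closest vector, which is the sense in which such a score can serve as a proxy for alignment quality.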


research
06/03/2021

Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer

Multilingual pre-trained models have achieved remarkable transfer perfor...
research
01/26/2021

Deep Subjecthood: Higher-Order Grammatical Features in Multilingual BERT

We investigate how Multilingual BERT (mBERT) encodes grammar by examinin...
research
04/09/2020

On the Language Neutrality of Pre-trained Multilingual Representations

Multilingual contextual embeddings, such as multilingual BERT (mBERT) an...
research
10/27/2021

When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer

While recent work on multilingual language models has demonstrated their...
research
10/23/2020

Multilingual BERT Post-Pretraining Alignment

We propose a simple method to align multilingual contextual embeddings a...
research
11/15/2022

ALIGN-MLM: Word Embedding Alignment is Crucial for Multilingual Pre-training

Multilingual pre-trained models exhibit zero-shot cross-lingual transfer...
