Discovering Language-neutral Sub-networks in Multilingual Language Models

05/25/2022
by Negar Foroutan, et al.

Multilingual pre-trained language models perform remarkably well on cross-lingual transfer for downstream tasks. Despite their impressive performance, our understanding of their language neutrality (i.e., the extent to which they use shared representations to encode similar phenomena across languages) and its role in achieving such performance remain open questions. In this work, we conceptualize the language neutrality of multilingual models as a function of the overlap between the language-encoding sub-networks of these models. Using mBERT as a foundation, we employ the lottery ticket hypothesis to discover sub-networks that are individually optimized for various languages and tasks. Evaluating on three distinct tasks and eleven typologically diverse languages, we show that the sub-networks found for different languages are in fact quite similar, supporting the idea that mBERT jointly encodes multiple languages in shared parameters. We conclude that mBERT comprises a language-neutral sub-network shared among many languages, along with multiple ancillary language-specific sub-networks, with the former playing a more prominent role in mBERT's impressive cross-lingual performance.
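To make the approach concrete, below is a minimal sketch (not the authors' released code) of the two ingredients the abstract describes: magnitude pruning to extract a binary sub-network mask, in the spirit of the lottery ticket hypothesis, and an overlap measure between two such masks. The toy two-layer model, the single pruning round at 50% sparsity, and the Jaccard overlap metric are all illustrative assumptions; the paper applies the procedure to mBERT trained on real per-language and per-task data.

```python
# Illustrative sketch only: a toy model and synthetic "languages" stand in
# for mBERT and real per-language training data. Assumptions: one global
# magnitude-pruning round at 50% sparsity, Jaccard similarity as the
# overlap metric.
import torch
import torch.nn as nn

def magnitude_mask(model: nn.Module, sparsity: float) -> dict:
    """Binary mask keeping the top (1 - sparsity) fraction of weights by
    absolute value, thresholded globally across all weight matrices."""
    weights = torch.cat([p.abs().flatten()
                         for n, p in model.named_parameters() if "weight" in n])
    k = int(sparsity * weights.numel())
    # k-th smallest |w| is the pruning threshold; keep everything if k == 0
    threshold = weights.kthvalue(k).values if k > 0 else weights.min() - 1
    return {n: (p.abs() > threshold).float()
            for n, p in model.named_parameters() if "weight" in n}

def mask_overlap(mask_a: dict, mask_b: dict) -> float:
    """Jaccard similarity between two binary masks: |A ∩ B| / |A ∪ B|."""
    inter = sum((mask_a[n] * mask_b[n]).sum() for n in mask_a)
    union = sum(((mask_a[n] + mask_b[n]) > 0).float().sum() for n in mask_a)
    return (inter / union).item()

# Toy demo: train the SAME initialization on two synthetic "languages"
# (different random data), prune each to 50% sparsity, compare sub-networks.
torch.manual_seed(0)
init = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
masks = []
for seed in (1, 2):  # stand-ins for two languages' training data
    torch.manual_seed(seed)
    model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
    model.load_state_dict(init.state_dict())  # shared init, as in lottery tickets
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x, y = torch.randn(256, 32), torch.randint(0, 2, (256,))
    for _ in range(50):
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
    masks.append(magnitude_mask(model, sparsity=0.5))

print(f"Sub-network overlap (Jaccard): {mask_overlap(*masks):.3f}")
```

Under this framing, a high overlap score between the masks found for two languages is evidence for a shared, language-neutral sub-network, while parameters surviving in only one language's mask correspond to the ancillary language-specific components the paper describes.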
