Locating Language-Specific Information in Contextualized Embeddings

09/16/2021
by Sheng Liang, et al.

Multilingual pretrained language models (MPLMs) exhibit multilinguality and are well suited for transfer across languages. Most MPLMs are trained in an unsupervised fashion, and the relationship between their objective and multilinguality is unclear. More specifically, the question arises whether MPLM representations are language-agnostic or whether they simply interleave well with learned task prediction heads. In this work, we locate language-specific information in MPLMs and identify its dimensionality and the layers where it occurs. We show that language-specific information is scattered across many dimensions, yet can be projected into a linear subspace. Our study contributes to a better understanding of MPLM representations, going beyond treating them as unanalyzable blobs of information.
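
To make the idea of a linear language subspace concrete, here is a minimal sketch on synthetic data. It is not the authors' procedure: it assumes, purely for illustration, that language identity shows up as a per-language mean offset, and it estimates the subspace from the centered language means with an SVD. The function names (`language_subspace`, `remove_subspace`, `mean_gap`) and all data are hypothetical.

```python
import numpy as np

def language_subspace(embeddings, lang_ids, k):
    """Return a (d, k) orthonormal basis for the top-k directions
    along which the per-language mean embeddings differ."""
    langs = np.unique(lang_ids)
    means = np.stack([embeddings[lang_ids == l].mean(axis=0) for l in langs])
    centered = means - means.mean(axis=0, keepdims=True)
    # Right singular vectors of the centered means span the
    # between-language directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def remove_subspace(embeddings, basis):
    """Project embeddings onto the orthogonal complement of `basis`."""
    return embeddings - embeddings @ basis @ basis.T

# Synthetic stand-in for contextualized embeddings (assumption: the
# language signal is a dense offset, i.e. spread over many coordinates).
rng = np.random.default_rng(0)
d, n_per_lang, n_langs = 32, 200, 3
offsets = rng.normal(size=(n_langs, d))
lang_ids = np.repeat(np.arange(n_langs), n_per_lang)
embeddings = rng.normal(size=(n_langs * n_per_lang, d)) + 3.0 * offsets[lang_ids]

def mean_gap(x):
    """Largest distance between a language mean and the mean of language means."""
    means = np.stack([x[lang_ids == l].mean(axis=0) for l in range(n_langs)])
    return np.linalg.norm(means - means.mean(axis=0), axis=1).max()

# With n_langs languages, the centered means span at most n_langs - 1 directions.
basis = language_subspace(embeddings, lang_ids, k=n_langs - 1)
cleaned = remove_subspace(embeddings, basis)

print("gap before:", mean_gap(embeddings))
print("gap after: ", mean_gap(cleaned))
```

In this toy setup no single coordinate carries the language signal, yet a two-dimensional subspace captures all of the difference between language means. Real MPLM representations may encode language identity in ways a mean-offset model does not capture.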

research
05/22/2022

The Geometry of Multilingual Language Model Representations

We assess how multilingual language models maintain a shared multilingua...
research
10/15/2021

mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models

Recent studies have shown that multilingual pretrained language models c...
research
04/20/2022

Analyzing Gender Representation in Multilingual Models

Multilingual language models were shown to allow for nontrivial transfer...
research
08/20/2020

Inducing Language-Agnostic Multilingual Representations

Multilingual representations have the potential to make cross-lingual sy...
research
06/01/2023

Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity

Previous work has shown that the representations output by contextual la...
research
09/23/2018

Towards Language Agnostic Universal Representations

When a bilingual student learns to solve word problems in math, we expec...