Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models

05/04/2022

by Karolina Stańczak, et al.

The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages, even in the absence of any explicit supervision. However, it remains unclear how these models learn to generalise across languages. In this work, we conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar. In particular, we investigate whether morphosyntactic information is encoded in the same subset of neurons in different languages. We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe. Our findings show that the cross-lingual overlap between neurons is significant, but its extent may vary across categories and depends on language proximity and pre-training data size.
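The core measurement in the abstract — whether the same neurons encode a morphosyntactic category across languages — can be illustrated with a minimal sketch. The snippet below is not the paper's probe; it assumes a simpler setup in which each language has a trained linear probe for one category, neurons are ranked by the magnitude of their probe weights, and overlap is scored as the Jaccard index of the top-k neuron sets. All names (`top_k_neurons`, `neuron_overlap`, the 768-dimensional toy weights) are hypothetical.

```python
import numpy as np


def top_k_neurons(probe_weights, k):
    """Indices of the k neurons with the largest absolute probe weight
    (a common heuristic for ranking neuron importance)."""
    return set(np.argsort(np.abs(probe_weights))[-k:])


def neuron_overlap(weights_lang_a, weights_lang_b, k):
    """Jaccard overlap between the top-k neuron sets selected for the
    same morphosyntactic category in two languages."""
    a = top_k_neurons(weights_lang_a, k)
    b = top_k_neurons(weights_lang_b, k)
    return len(a & b) / len(a | b)


# Toy example with 768-dimensional representations: two probes that
# share an underlying signal (standing in for related languages) vs.
# an unrelated baseline probe.
rng = np.random.default_rng(0)
shared = rng.normal(size=768)                # hypothetical shared signal
w_en = shared + 0.5 * rng.normal(size=768)   # "English" probe weights
w_de = shared + 0.5 * rng.normal(size=768)   # "German" probe weights
w_rand = rng.normal(size=768)                # unrelated baseline

print(neuron_overlap(w_en, w_de, k=50))      # correlated probes: high overlap
print(neuron_overlap(w_en, w_rand, k=50))    # baseline: near-chance overlap
```

Comparing the measured overlap against such a random baseline is what makes a claim like "the cross-lingual overlap between neurons is significant" testable.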


