How do languages influence each other? Studying cross-lingual data sharing during LLM fine-tuning

05/22/2023
by Rochelle Choenni, et al.

Multilingual large language models (MLLMs) are jointly trained on data from many different languages, such that the representations of individual languages can benefit from other languages' data. Impressive performance on zero-shot cross-lingual transfer shows that these models are capable of exploiting data from other languages. Yet, it remains unclear to what extent, and under which conditions, languages rely on each other's data. In this study, we use TracIn (Pruthi et al., 2020), a training data attribution (TDA) method, to retrieve the training samples, seen during multilingual fine-tuning, that are most influential for a particular test language. This allows us to analyse the cross-lingual sharing mechanisms of MLLMs from a new perspective. While previous work studied cross-lingual sharing at the level of model parameters, we present the first approach to study it at the data level. We find that MLLMs rely on data from multiple languages from the early stages of fine-tuning, and that this reliance gradually increases as fine-tuning progresses. We further study how different fine-tuning languages influence model performance on a given test language and find that they can both reinforce and complement the knowledge acquired from the test language's own data.
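The influence measure at the core of this analysis, TracIn, scores a training example by how much it moved the loss on a test example over the course of training, approximated as a sum of gradient dot products over saved checkpoints. The sketch below illustrates the idea with a toy PyTorch model; the helper names, learning-rate list, and the linear stand-in model are assumptions for illustration, not the authors' implementation.

```python
# Minimal TracIn-style influence sketch (hypothetical helpers, not the paper's code).
# Influence of a training example z on a test example z' is approximated as
#   sum over checkpoints c of  lr_c * <grad L(w_c, z), grad L(w_c, z')>.
import torch


def loss_grad(model, loss_fn, inputs, target):
    """Flattened gradient of the loss w.r.t. all trainable parameters."""
    loss = loss_fn(model(inputs), target)
    grads = torch.autograd.grad(loss, [p for p in model.parameters() if p.requires_grad])
    return torch.cat([g.reshape(-1) for g in grads])


def tracin_score(checkpoints, lrs, make_model, loss_fn, train_ex, test_ex):
    """Sum of learning-rate-weighted gradient dot products over checkpoints."""
    score = 0.0
    for state_dict, lr in zip(checkpoints, lrs):
        model = make_model()
        model.load_state_dict(state_dict)
        g_train = loss_grad(model, loss_fn, *train_ex)
        g_test = loss_grad(model, loss_fn, *test_ex)
        score += lr * torch.dot(g_train, g_test).item()
    return score


if __name__ == "__main__":
    # Toy demo: a linear classifier stands in for a fine-tuned MLLM,
    # and freshly initialised models stand in for saved checkpoints.
    torch.manual_seed(0)
    make_model = lambda: torch.nn.Linear(4, 2)
    loss_fn = torch.nn.CrossEntropyLoss()
    checkpoints = [make_model().state_dict() for _ in range(3)]
    lrs = [1e-3] * len(checkpoints)
    train_ex = (torch.randn(1, 4), torch.tensor([1]))
    test_ex = (torch.randn(1, 4), torch.tensor([0]))
    print(tracin_score(checkpoints, lrs, make_model, loss_fn, train_ex, test_ex))
```

In the paper's setting, a positive score would mark a fine-tuning sample (possibly from another language) that helped reduce the loss on the test-language example, which is how the most influential cross-lingual samples are retrieved.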


Related research

10/31/2022
Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks
Large multilingual language models typically share their parameters acro...

07/21/2021
Soft Layer Selection with Meta-Learning for Zero-Shot Cross-Lingual Transfer
Multilingual pre-trained contextual embedding models (Devlin et al., 201...

09/12/2023
Measuring Catastrophic Forgetting in Cross-Lingual Transfer Paradigms: Exploring Tuning Strategies
The cross-lingual transfer is a promising technique to solve tasks in le...

05/19/2023
Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast
Existing research has shown that a multilingual pre-trained language mod...

09/10/2021
Efficient Test Time Adapter Ensembling for Low-resource Language Varieties
Adapters are light-weight modules that allow parameter-efficient fine-tu...

10/15/2021
Cross-Lingual Fine-Grained Entity Typing
The growth of cross-lingual pre-trained models has enabled NLP tools to ...

03/15/2021
Multi-view Subword Regularization
Multilingual pretrained representations generally rely on subword segmen...
