Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing

01/13/2023
by   Chen Cecilia Liu, et al.

Standard fine-tuning of language models typically performs well on in-distribution data but struggles to generalize under distribution shifts. In this work, we aim to improve the generalization of adapter-based cross-lingual task transfer, where such cross-language distribution shifts are inherent. We investigate scheduled unfreezing algorithms, originally proposed to mitigate catastrophic forgetting in transfer learning, for fine-tuning task adapters in cross-lingual transfer. Our experiments show that scheduled unfreezing methods close the gap to full fine-tuning and achieve state-of-the-art transfer performance, suggesting that these methods do more than mitigate catastrophic forgetting. To probe these empirical findings, we then study the learning dynamics of scheduled unfreezing through the lens of Fisher Information. Our in-depth experiments reveal that scheduled unfreezing induces learning dynamics different from those of standard fine-tuning, and they provide evidence that the dynamics of Fisher Information during training correlate with cross-lingual generalization performance. We additionally propose a general scheduled unfreezing algorithm that achieves an average improvement of 2 points across four datasets over standard fine-tuning, and we provide strong empirical evidence for a theory-based justification of the heuristic unfreezing schedule (i.e., the heuristic schedule implicitly maximizes Fisher Information). Our code will be publicly available.
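To make the two ingredients above concrete, the PyTorch sketch below pairs a heuristic top-down unfreezing schedule with the trace of a diagonal empirical Fisher as a per-step training diagnostic. This is a minimal illustration under assumed names (apply_schedule, fisher_trace, unfreeze_every, and a toy adapter stack), not the authors' implementation: the paper's general algorithm uses Fisher Information to drive the schedule itself, whereas this toy version only unfreezes on a fixed step count and logs the trace.

import torch
import torch.nn as nn

# Toy stand-in for task adapters inserted into a frozen backbone (hypothetical setup).
adapters = nn.ModuleList([nn.Linear(16, 16) for _ in range(12)])
head = nn.Linear(16, 3)
model = nn.Sequential(*adapters, head)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def apply_schedule(layers, step, unfreeze_every):
    # Heuristic top-down schedule: one more layer becomes trainable every
    # `unfreeze_every` steps, starting from the layer nearest the output head.
    num_unfrozen = min(len(layers), step // unfreeze_every + 1)
    for depth, layer in enumerate(reversed(layers)):
        for p in layer.parameters():
            p.requires_grad = depth < num_unfrozen

def fisher_trace(loss, params):
    # Trace of the diagonal empirical Fisher: sum of squared gradients of the
    # negative log-likelihood w.r.t. the currently trainable parameters.
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return sum((g ** 2).sum().item() for g in grads)

for step in range(2000):
    apply_schedule(adapters, step, unfreeze_every=200)
    x, y = torch.randn(32, 16), torch.randint(0, 3, (32,))  # synthetic batch
    loss = criterion(model(x), y)
    trainable = [p for p in model.parameters() if p.requires_grad]
    tr_f = fisher_trace(loss, trainable)  # diagnostic correlated with transfer, per the paper
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 200 == 0:
        print(f"step {step}: loss {loss.item():.3f}, tr(F) {tr_f:.3f}")

Logging tr(F) alongside the loss makes it possible to check, on one's own task, whether a given schedule increases accumulated Fisher Information relative to standard fine-tuning, which is the behavior the paper links to better cross-lingual generalization.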

Related research:

09/12/2023 · Measuring Catastrophic Forgetting in Cross-Lingual Transfer Paradigms: Exploring Tuning Strategies
The cross-lingual transfer is a promising technique to solve tasks in le...

10/31/2022 · Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks
Large multilingual language models typically share their parameters acro...

06/15/2021 · Consistency Regularization for Cross-Lingual Fine-Tuning
Fine-tuning pre-trained cross-lingual language models can transfer task-...

05/25/2022 · Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation
In this paper, we explore the challenging problem of performing a genera...

05/12/2023 · Prompt Learning to Mitigate Catastrophic Forgetting in Cross-lingual Transfer for Open-domain Dialogue Generation
Dialogue systems for non-English languages have long been under-explored...

05/19/2023 · Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast
Existing research has shown that a multilingual pre-trained language mod...

09/15/2023 · Bridging Topic, Domain, and Language Shifts: An Evaluation of Comprehensive Out-of-Distribution Scenarios
Language models (LMs) excel in in-distribution (ID) scenarios where trai...
