Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

05/25/2022
by Tu Vu, et al.

In this paper, we explore the challenging problem of performing a generative task (i.e., summarization) in a target language when labeled data is only available in English. We assume a strict setting with no access to parallel data or machine translation. Prior work has shown, and we confirm, that standard transfer learning techniques struggle in this setting, as a generative multilingual model fine-tuned purely on English catastrophically forgets how to generate non-English. Given the recent rise of parameter-efficient adaptation techniques (e.g., prompt tuning), we conduct the first investigation into how well these methods can overcome catastrophic forgetting to enable zero-shot cross-lingual generation. We find that parameter-efficient adaptation provides gains over standard fine-tuning when transferring between less-related languages, e.g., from English to Thai. However, a significant gap still remains between these methods and fully-supervised baselines. To improve cross-lingual transfer further, we explore three approaches: (1) mixing in unlabeled multilingual data, (2) pre-training prompts on target language data, and (3) explicitly factoring prompts into recombinable language and task components. Our methods can provide further quality gains, suggesting that robust zero-shot cross-lingual generation is within reach.
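The abstract's third approach, factoring prompts into recombinable language and task components, can be illustrated with a short sketch. The snippet below is a hypothetical, minimal illustration of prompt tuning with a factored soft prompt: the class name, prompt lengths, and the simple concatenation of the two prompt parts are assumptions made for clarity, not the paper's exact recipe. Only the prompt embeddings would be trained; the underlying multilingual model stays frozen, which is what makes the method parameter-efficient.

```python
# Minimal sketch (hypothetical): prompt tuning with a factored
# language + task soft prompt prepended to frozen-model embeddings.
import torch
import torch.nn as nn


class FactoredSoftPrompt(nn.Module):
    """Learned soft prompt split into a language part and a task part.

    Only these prompt parameters are updated during training; the
    multilingual generation model itself is kept frozen.
    """

    def __init__(self, embed_dim: int, lang_len: int = 50, task_len: int = 50):
        super().__init__()
        self.lang_prompt = nn.Parameter(torch.randn(lang_len, embed_dim) * 0.02)
        self.task_prompt = nn.Parameter(torch.randn(task_len, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim), e.g. token embeddings
        # produced by the frozen model's embedding layer.
        batch = input_embeds.size(0)
        prompt = torch.cat([self.lang_prompt, self.task_prompt], dim=0)  # (P, d)
        prompt = prompt.unsqueeze(0).expand(batch, -1, -1)               # (B, P, d)
        # Prepend the soft prompt to the token embeddings.
        return torch.cat([prompt, input_embeds], dim=1)


if __name__ == "__main__":
    embed_dim = 512
    prompt_layer = FactoredSoftPrompt(embed_dim)
    dummy_embeds = torch.randn(2, 16, embed_dim)  # stand-in for frozen-model embeddings
    extended = prompt_layer(dummy_embeds)
    print(extended.shape)  # torch.Size([2, 116, 512])
```

In the spirit of the factored approach described above, one would train the task prompt on English summarization data and, at zero-shot inference time, pair it with a language prompt obtained from unlabeled target-language data, swapping language components without retraining the task component.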


