Instruct-Align: Teaching Novel Languages to LLMs through Alignment-based Cross-Lingual Instruction

05/23/2023
by   Samuel Cahyawijaya, et al.

Instruction-tuned large language models (LLMs) have shown remarkable generalization capability over multiple tasks in multiple languages. Nevertheless, their generalization across languages varies, especially for underrepresented or even unseen languages. Prior work on adapting LLMs to new languages finds that naively adapting instruction-tuned LLMs to a new language results in catastrophic forgetting, which in turn causes these LLMs to lose their multitasking ability. To tackle this, we propose the Instruct-Align framework, a.k.a. (IA)^1, which enables instruction-tuned LLMs to learn cross-lingual alignment between unseen and previously learned languages via alignment-based cross-lingual instruction-tuning. Our preliminary results on BLOOMZ-560M show that (IA)^1 is able to learn a new language effectively with only a limited amount of parallel data while preventing catastrophic forgetting by applying continual instruction-tuning through experience replay. Our work contributes to the progression of language adaptation methods for instruction-tuned LLMs and opens up the possibility of adapting existing instruction-tuned LLMs to underrepresented low-resource languages. Our code will be publicly released upon acceptance.
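
The two ideas named in the abstract, instruction examples built from parallel data and experience replay of earlier tasks, can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt template, the function names `make_translation_instruction` and `mix_with_replay`, the `replay_ratio` parameter, and the toy data are all assumptions introduced here for illustration only.

```python
import random


def make_translation_instruction(src_text, tgt_text, src_lang, tgt_lang):
    """Turn one parallel sentence pair into an instruction-style example.

    The prompt wording is a hypothetical template, not the paper's.
    """
    prompt = (
        f"Translate the following text from {src_lang} to {tgt_lang}.\n"
        f"Text: {src_text}\n"
        "Translation:"
    )
    return {"input": prompt, "target": tgt_text}


def mix_with_replay(new_examples, replay_pool, replay_ratio, seed=0):
    """Mix new-language examples with examples replayed from earlier tasks.

    replay_ratio is the number of replayed examples per new example
    (an assumed knob; the abstract does not state a value).
    """
    rng = random.Random(seed)
    n_replay = min(len(replay_pool), round(len(new_examples) * replay_ratio))
    replayed = rng.sample(replay_pool, n_replay)
    mixed = list(new_examples) + replayed
    rng.shuffle(mixed)
    return mixed


# Toy parallel data: (sentence in a seen language, placeholder for the new language).
parallel_pairs = [
    ("Good morning.", "<new-language sentence 1>"),
    ("Thank you very much.", "<new-language sentence 2>"),
]
new_examples = [
    make_translation_instruction(src, tgt, "English", "NewLang")
    for src, tgt in parallel_pairs
]

# Toy replay pool standing in for examples from previously learned tasks.
replay_pool = [
    {"input": "Is this review positive or negative? 'Great film.'", "target": "positive"},
    {"input": "Summarize: The cat sat on the mat all day.", "target": "A cat sat on a mat."},
]

training_mix = mix_with_replay(new_examples, replay_pool, replay_ratio=0.5)
```

Here `training_mix` would be fed to an ordinary instruction-tuning loop; keeping some earlier-task examples in every mix is the general idea behind experience replay as a guard against catastrophic forgetting.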

research
08/27/2023

Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations

The language ability of Large Language Models (LLMs) is often unbalanced...
research
08/09/2023

Extrapolating Large Language Models to Non-English by Aligning Languages

Due to the unbalanced training data distribution, the language ability o...
research
09/12/2023

Measuring Catastrophic Forgetting in Cross-Lingual Transfer Paradigms: Exploring Tuning Strategies

The cross-lingual transfer is a promising technique to solve tasks in le...
research
10/24/2019

Cross-Lingual Vision-Language Navigation

Vision-Language Navigation (VLN) is the task where an agent is commanded...
research
06/04/2021

Language Scaling for Universal Suggested Replies Model

We consider the problem of scaling automated suggested replies for Outlo...
research
09/11/2023

Flesch or Fumble? Evaluating Readability Standard Alignment of Instruction-Tuned Language Models

Readability metrics and standards such as Flesch Kincaid Grade Level (FK...
research
10/06/2021

Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning

Multilingual models jointly pretrained on multiple languages have achiev...
