Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation

05/24/2023
by Haonan Li, et al.

Instruction tuning has shown great promise in natural language processing. However, research on multilingual instruction tuning has been limited by the scarcity of high-quality instruction-response datasets. To address this gap, we present Bactrian-X, a comprehensive multilingual parallel dataset of 3.4 million instruction-response pairs across 52 languages. Leveraging this dataset, we train a set of adapters using low-rank adaptation (LoRA), which are lightweight components that integrate seamlessly with foundational models. These adapters have far fewer parameters than the base model, making them easy to replace and usable as plug-ins for different languages or language groups. In extensive experiments on 52 languages, our models outperform both the vanilla models and existing instruction-tuned models across a range of multilingual evaluation settings. The code and models are publicly available at https://github.com/mbzuai-nlp/bactrian-x.
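
To make the plug-in idea concrete, here is a minimal sketch of a LoRA-style layer in plain Python. It is not the authors' implementation (see the repository above for that). The class `LoRALinear`, the helper `lora_param_count`, the language codes, and the layer sizes are all hypothetical and chosen for illustration. The sketch assumes the standard LoRA formulation: a frozen weight W plus a low-rank update (alpha / r) * B A, with B initialised to zero.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def lora_param_count(d_out, d_in, r):
    """Trainable parameters in one adapter: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


class LoRALinear:
    """Frozen weight W plus a swappable low-rank update (alpha / r) * B @ A."""

    def __init__(self, W, alpha=1.0):
        self.W = W            # frozen base weight, d_out x d_in
        self.alpha = alpha    # scaling factor (illustrative value)
        self.adapters = {}    # adapter name -> (A, B)
        self.active = None    # name of the adapter in use, or None for base only

    def add_adapter(self, name, r, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(self.W), len(self.W[0])
        A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        B = [[0.0] * r for _ in range(d_out)]  # zero init: adapter starts as a no-op
        self.adapters[name] = (A, B)

    def forward(self, x):
        y = matvec(self.W, x)
        if self.active is not None:
            A, B = self.adapters[self.active]
            scale = self.alpha / len(A)
            delta = matvec(B, matvec(A, x))  # low-rank path: B (A x)
            y = [yi + scale * di for yi, di in zip(y, delta)]
        return y


# Hypothetical usage: one frozen layer, one adapter per language.
layer = LoRALinear([[1.0, 0.0], [0.0, 1.0]])
layer.add_adapter("id", r=1, seed=1)  # placeholder language codes
layer.add_adapter("sw", r=1, seed=2)
layer.active = "sw"                   # switching languages touches no base weights
output = layer.forward([1.0, 2.0])

# For an assumed 4096 x 4096 layer at rank 8, the adapter is 1/256 of the full matrix.
full_params = 4096 * 4096
adapter_params = lora_param_count(4096, 4096, 8)
```

Because only the small A and B matrices differ between languages, swapping an adapter amounts to changing `layer.active`, which is the property the abstract describes as "easily replaceable". The actual ranks and target layers used for Bactrian-X are not stated here and are not assumed by this sketch.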


