Extrapolating Large Language Models to Non-English by Aligning Languages

08/09/2023
by Wenhao Zhu, et al.

Due to the unbalanced training data distribution, the language ability of large language models (LLMs) is often biased towards English. In this paper, we propose to empower pre-trained LLMs on non-English languages by building semantic alignment across languages. We perform instruction-tuning on LLaMA with both translation task data and cross-lingual general task data to obtain cross-lingual models (x-LLaMA). Experimental results on the cross-lingual benchmarks XQuAD and MLQA show that x-LLaMA models outperform the English instruction-tuned counterpart (Alpaca) by 42.50% on average on six non-English languages. Further experiments on the Chinese benchmark C-Eval show that x-LLaMA achieves significant improvement on Chinese humanities tasks, outperforming Alpaca by 8.2%. We also discover that incorporating non-English text on the target side of translation data is particularly effective for boosting non-English ability. Besides, we find that semantic alignment within the LLM can be further strengthened as translation task data scales up, and we present the formulation of the underlying scaling law. Evaluation results on the translation dataset Flores-101 show that x-LLaMA outperforms previous LLaMA-based models in all evaluated directions. Code and data will be available at: https://github.com/OwenNJU/x-LLM.
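As a rough illustration of the data mixture described in the abstract, the sketch below turns parallel sentence pairs into translation instruction examples with the non-English text on the target side, then combines them with general task examples. It is only a sketch under assumptions: the Alpaca-style `instruction`/`input`/`output` fields, the prompt wording in `TEMPLATE`, and the helper names `make_translation_example` and `build_mixture` are all hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, List, Tuple

# Hypothetical prompt wording; the abstract does not give the actual template.
TEMPLATE = "Translate the following text from {src} to {tgt}."


def make_translation_example(
    src_text: str,
    tgt_text: str,
    src_lang: str = "English",
    tgt_lang: str = "Chinese",
) -> Dict[str, str]:
    """Wrap one parallel pair as an instruction example.

    The non-English sentence is placed in the output (target side),
    which the abstract reports as particularly effective.
    """
    return {
        "instruction": TEMPLATE.format(src=src_lang, tgt=tgt_lang),
        "input": src_text,
        "output": tgt_text,
    }


def build_mixture(
    parallel_pairs: List[Tuple[str, str]],
    general_examples: List[Dict[str, str]],
    tgt_lang: str = "Chinese",
) -> List[Dict[str, str]]:
    """Combine translation task data with cross-lingual general task data."""
    translation = [
        make_translation_example(en, xx, tgt_lang=tgt_lang)
        for en, xx in parallel_pairs
    ]
    return translation + list(general_examples)


if __name__ == "__main__":
    pairs = [("Hello.", "你好。")]
    general = [
        {"instruction": "列出三种水果。", "input": "", "output": "苹果、香蕉、橙子。"}
    ]
    for example in build_mixture(pairs, general):
        print(example)
```

The resulting list could then be passed to whatever instruction-tuning pipeline is in use; how the two data sources are weighted or shuffled is left open here because the abstract does not specify it.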

research
08/27/2023

Empowering Cross-lingual Abilities of Instruction-tuned Large Language Models by Translation-following demonstrations

The language ability of Large Language Models (LLMs) is often unbalanced...
research
06/19/2023

BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models

Large language models (LLMs) have demonstrated remarkable prowess in lan...
research
08/31/2020

Langevin Cooling for Domain Translation

Domain translation is the task of finding correspondence between two dom...
research
12/06/2016

Cross-Lingual Predicate Mapping Between Linked Data Ontologies

Ontologies in different natural languages often differ in quality in ter...
research
05/23/2023

Instruct-Align: Teaching Novel Languages with to LLMs through Alignment-based Cross-Lingual Instruction

Instruction-tuned large language models (LLMs) have shown remarkable gen...
research
05/22/2023

llm-japanese-dataset v0: Construction of Japanese Chat Dataset for Large Language Models and its Methodology

This study constructed a Japanese chat dataset for tuning large language...
research
03/11/2018

Generating Bilingual Pragmatic Color References

Contextual influences on language exhibit substantial language-independe...