PolyLM: An Open Source Polyglot Large Language Model

07/12/2023
by Xiangpeng Wei, et al.

Large language models (LLMs) demonstrate a remarkable ability to comprehend, reason, and generate text by following natural language instructions. However, the development of LLMs has focused primarily on high-resource languages such as English, which limits their applicability and the research on them in other languages. To address this, we present PolyLM, a multilingual LLM trained on 640 billion (B) tokens and available in two model sizes: 1.7B and 13B. To enhance its multilingual capabilities, we 1) integrate bilingual data into the training data; and 2) adopt a curriculum learning strategy that increases the proportion of non-English data from 30% in the first stage to 60% in the final stage during pre-training. Further, we propose a multilingual self-instruct method that automatically generates 132.7K diverse multilingual instructions for model fine-tuning. To assess the model's performance, we collect several existing multilingual tasks, covering multilingual understanding, question answering, generation, and translation. Extensive experiments show that PolyLM surpasses other open-source models such as LLaMA and BLOOM on multilingual tasks while maintaining comparable performance in English. Our models, along with the instruction data and multilingual benchmark, are available at: <https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation>.
