Panda LLM: Training Data and Evaluation for Open-Sourced Chinese Instruction-Following Large Language Models

05/04/2023
by Fangkai Jiao, et al.

This project focuses on enhancing open-source large language models through instruction tuning and on providing comprehensive evaluations of their performance. We explore how training data factors such as quantity, quality, and linguistic distribution influence the performance of instruction-tuned models trained on publicly accessible, high-quality instruction datasets in both English and Chinese. Our goal is to supplement evaluation with quantitative analyses, providing insights for the continued advancement of open-source chat models. Our model, data, and code are publicly available for others to use and build upon.

research
04/16/2023

Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation

Recently, significant public efforts have been directed towards developi...
research
04/17/2023

Chinese Open Instruction Generalist: A Preliminary Release

Instruction tuning is widely recognized as a key technique for building ...
research
06/07/2023

INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models

Instruction-tuned large language models have revolutionized natural lang...
research
08/23/2023

From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

In the realm of Large Language Models, the balance between instruction d...
research
08/12/2023

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

We introduce VisIT-Bench (Visual InsTruction Benchmark), a benchmark for...
research
05/01/2023

Poisoning Language Models During Instruction Tuning

Instruction-tuned LMs such as ChatGPT, FLAN, and InstructGPT are finetun...
research
05/24/2023

ExpertPrompting: Instructing Large Language Models to be Distinguished Experts

The answering quality of an aligned large language model (LLM) can be dr...
