From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

08/23/2023
by   Ming Li, et al.

In the realm of Large Language Models, the balance between instruction data quality and quantity has become a focal point. Recognizing this, we introduce a self-guided methodology for LLMs to autonomously discern and select cherry samples from vast open-source datasets, effectively minimizing manual curation and potential cost for instruction tuning an LLM. Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal tool to identify discrepancies between a model's expected responses and its autonomous generation prowess. Through the adept application of IFD, cherry samples are pinpointed, leading to a marked uptick in model training efficiency. Empirical validations on renowned datasets like Alpaca and WizardLM underpin our findings; with a mere 10% of conventional data input, our strategy showcases improved results. This synthesis of self-guided cherry-picking and the IFD metric signifies a transformative leap in the optimization of LLMs, promising both efficiency and resource-conscious advancements.
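The abstract only gestures at how IFD works. In the full paper, the metric is the ratio between the model's average cross-entropy loss on a response when conditioned on its instruction and its loss on the same response alone; a ratio near 1 means the instruction barely helps the model produce the answer, flagging the sample as difficult and therefore informative for tuning. The sketch below is a minimal illustration of that ratio using Hugging Face transformers; the model name, prompt handling, and example pair are illustrative placeholders, not the authors' released implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal illustration of the Instruction-Following Difficulty (IFD) ratio:
#   IFD(Q, A) = loss(A | Q) / loss(A)
# where loss is the model's average cross-entropy over the answer tokens.
# Model, prompt format, and example pair are placeholders, not the paper's setup.

def avg_answer_loss(model, tokenizer, context: str, answer: str) -> float:
    """Average cross-entropy over the answer tokens, optionally conditioned
    on a context string whose tokens are masked out of the loss."""
    ans_ids = tokenizer(answer, return_tensors="pt").input_ids
    if context:
        ctx_ids = tokenizer(context, return_tensors="pt").input_ids
        input_ids = torch.cat([ctx_ids, ans_ids], dim=1)
        labels = input_ids.clone()
        labels[:, : ctx_ids.shape[1]] = -100  # exclude context positions from the loss
    else:
        input_ids = ans_ids
        labels = ans_ids.clone()
    with torch.no_grad():
        loss = model(input_ids, labels=labels).loss
    return loss.item()

def ifd_score(model, tokenizer, instruction: str, answer: str) -> float:
    conditioned = avg_answer_loss(model, tokenizer, instruction, answer)
    direct = avg_answer_loss(model, tokenizer, "", answer)
    # Near 1.0: the instruction barely helps (a "difficult" sample);
    # well below 1.0: the answer is easy to produce once the instruction is seen.
    return conditioned / direct

if __name__ == "__main__":
    name = "gpt2"  # small stand-in; the paper works with LLaMA-scale models
    tok = AutoTokenizer.from_pretrained(name)
    lm = AutoModelForCausalLM.from_pretrained(name)
    print(ifd_score(lm, tok, "Translate to French: Hello.", " Bonjour."))
```

In the paper's pipeline, every candidate sample in the pool is scored this way with the base model, and only the highest-IFD fraction (the "cherry" samples) is kept for fine-tuning.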


research · 05/04/2023
Panda LLM: Training Data and Evaluation for Open-Sourced Chinese Instruction-Following Large Language Models
This project focuses on enhancing open-source large language models thro...

research · 05/19/2023
Self-QA: Unsupervised Knowledge Guided Language Model Alignment
Large-scale language models like ChatGPT and GPT-4 have gained attention...

research · 09/09/2023
Efficient Finetuning Large Language Models For Vietnamese Chatbot
Large language models (LLMs), such as GPT-4, PaLM, and LLaMa, have been ...

research · 04/16/2023
Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
Recently, significant public efforts have been directed towards developi...

research · 08/16/2023
Time Travel in LLMs: Tracing Data Contamination in Large Language Models
Data contamination, i.e., the presence of test data from downstream task...

research · 09/04/2023
Donkii: Can Annotation Error Detection Methods Find Errors in Instruction-Tuning Datasets?
Instruction-tuning has become an integral part of training pipelines for...

research · 09/11/2023
Flesch or Fumble? Evaluating Readability Standard Alignment of Instruction-Tuned Language Models
Readability metrics and standards such as Flesch Kincaid Grade Level (FK...
