Careful Data Curation Stabilizes In-context Learning

12/20/2022
by Ting-Yun Chang, et al.

In-context learning (ICL) enables large language models (LLMs) to perform new tasks by prompting them with a sequence of training examples. However, ICL is very sensitive to the choice of training examples: randomly sampling examples from a training set leads to high variance in performance. In this paper, we show that curating a carefully chosen subset of training data greatly stabilizes ICL performance. We propose two methods to choose training subsets, both of which score training examples individually and then select the highest-scoring ones. CondAcc scores a training example by its average ICL accuracy when combined with random training examples, while Datamodels learns a linear proxy model that estimates how the presence of each training example influences LLM accuracy. On average, CondAcc and Datamodels outperform sampling from the entire training set by 7.7% across two LLMs. Our analysis shows that stable subset examples are no more diverse than average, and are not outliers in terms of sequence length and perplexity.
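The CondAcc scoring idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `icl_accuracy` is a hypothetical callback standing in for running the LLM on a validation set with a given sequence of in-context examples, and the sampling parameters (`k`, `num_trials`) are illustrative defaults.

```python
import random

def condacc_scores(train_examples, icl_accuracy, k=4, num_trials=50, seed=0):
    """Score each training example by its average ICL accuracy when it
    appears in prompts alongside randomly sampled other training examples.

    `icl_accuracy(context)` is a placeholder for evaluating the LLM with
    `context` as the in-context demonstrations; it returns an accuracy.
    """
    rng = random.Random(seed)
    scores = {}
    for ex in train_examples:
        others = [e for e in train_examples if e != ex]
        accs = []
        for _ in range(num_trials):
            # Build a random prompt of k demonstrations that includes `ex`.
            context = rng.sample(others, k - 1) + [ex]
            rng.shuffle(context)  # vary the example's position in the prompt
            accs.append(icl_accuracy(context))
        scores[ex] = sum(accs) / num_trials
    return scores

def select_stable_subset(train_examples, icl_accuracy, subset_size=4, **kw):
    """Keep the highest-scoring examples as the curated prompt subset."""
    scores = condacc_scores(train_examples, icl_accuracy, **kw)
    return sorted(train_examples, key=scores.get, reverse=True)[:subset_size]
```

With a real LLM, `icl_accuracy` would be the expensive step; the selection itself is just an argmax over the per-example average scores.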

