True Few-Shot Learning with Language Models

05/24/2021
by Ethan Perez et al.

Pretrained language models (LMs) perform well on many tasks even when learning from a few examples, but prior work uses many held-out examples to tune various aspects of learning, such as hyperparameters, training objectives, and natural language templates ("prompts"). Here, we evaluate the few-shot ability of LMs when such held-out examples are unavailable, a setting we call true few-shot learning. We test two model selection criteria, cross-validation and minimum description length, for choosing LM prompts and hyperparameters in the true few-shot setting. On average, both marginally outperform random selection and greatly underperform selection based on held-out examples. Moreover, selection criteria often prefer models that perform significantly worse than randomly-selected ones. We find similar results even when taking into account our uncertainty in a model's true performance during selection, as well as when varying the amount of computation and number of examples used for selection. Overall, our findings suggest that prior work significantly overestimated the true few-shot ability of LMs given the difficulty of few-shot model selection.
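As a concrete illustration of the two selection criteria the abstract names, below is a minimal, self-contained Python sketch of choosing a prompt with cross-validation and scoring it with prequential minimum description length (MDL), using only the few-shot examples themselves. The `lm_log_prob` function is a hypothetical stand-in for a real LM scoring call; it is not the paper's code or any specific library's API.

```python
import math
import random
from typing import List, Tuple

Example = Tuple[str, str]  # (input text, label text)

def lm_log_prob(prompt: str, examples: List[Example], x: str, y: str) -> float:
    """Hypothetical placeholder for log p_LM(y | prompt, examples, x).
    A real implementation would fill the prompt template with the
    in-context examples and query a pretrained LM for the label's
    log-probability; no specific library API is assumed here."""
    random.seed(hash((prompt, tuple(examples), x, y)) & 0xFFFFFFFF)
    return -random.uniform(0.1, 3.0)  # fake value so the sketch runs

def cv_score(prompt: str, data: List[Example], k: int = 4) -> float:
    """K-fold cross-validation: average log-likelihood of each fold's
    labels, conditioning the LM only on the remaining folds."""
    folds = [data[i::k] for i in range(k)]
    total, n = 0.0, 0
    for i, held in enumerate(folds):
        train = [ex for j, f in enumerate(folds) if j != i for ex in f]
        for x, y in held:
            total += lm_log_prob(prompt, train, x, y)
            n += 1
    return total / n

def mdl_score(prompt: str, data: List[Example]) -> float:
    """Minimum description length via prequential (online) coding: the
    cost in bits of encoding each label given only earlier examples."""
    return sum(-lm_log_prob(prompt, data[:t], x, y) / math.log(2)
               for t, (x, y) in enumerate(data))

def select_prompt(prompts: List[str], data: List[Example]) -> str:
    # True few-shot setting: only the few-shot data itself is used;
    # no held-out examples are consulted at any point.
    return max(prompts, key=lambda p: cv_score(p, data))

few_shot = [("great movie", "positive"), ("dull plot", "negative"),
            ("loved it", "positive"), ("waste of time", "negative")]
candidates = ["Review: {x} Sentiment: {y}", "{x} All in all, it was {y}."]
print("CV pick:", select_prompt(candidates, few_shot))
print("MDL bits:", {p: round(mdl_score(p, few_shot)) for p in candidates})
```

Note that selection here sees only the few examples themselves; held-out validation data, which the paper argues prior work implicitly relied on, is never consulted.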


Related research

07/06/2023
Evaluating the Evaluators: Are Current Few-Shot Learning Benchmarks Fit for Purpose?
Numerous benchmarks for Few-Shot Learning have been proposed in the last...

11/26/2021
True Few-Shot Learning with Prompts – A Real-World Perspective
Prompt-based approaches are strong at few-shot learning. However, Perez ...

06/24/2021
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Prompting language models (LMs) with training examples and task descript...

05/23/2023
Active Learning Principles for In-Context Learning with Large Language Models
The remarkable advancements in large language models (LLMs) have signifi...

12/20/2022
In-context Learning Distillation: Transferring Few-shot Learning Ability of Pre-trained Language Models
Given the success with in-context learning of large pre-trained language...

01/31/2023
Differentiable Entailment for Parameter Efficient Few Shot Learning
Few-shot learning allows pre-trained language models to adapt to downstr...

06/26/2012
Predictive Approaches For Gaussian Process Classifier Model Selection
In this paper we consider the problem of Gaussian process classifier (GP...
