How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench

05/24/2023
by Qinyuan Ye, et al.

We investigate the predictability of large language model (LLM) capabilities: given records of past experiments using different model families, numbers of parameters, tasks, and numbers of in-context examples, can we accurately predict LLM performance on new experiment configurations? Answering this question has practical implications for LLM users (e.g., deciding which models to try), developers (e.g., prioritizing evaluation on representative tasks), and the research community (e.g., identifying hard-to-predict capabilities that warrant further investigation). We study the performance prediction problem on experiment records from BIG-bench. On a random train-test split, an MLP-based predictor achieves an RMSE below 5%, demonstrating the presence of learnable patterns within the experiment records. Further, we formulate the problem of searching for "small-bench," an informative subset of BIG-bench tasks from which performance on the full set can be maximally recovered, and find a subset that is as informative for evaluating new model families as BIG-bench Hard, while being 3x smaller.
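
To make the setup concrete, here is a minimal, hypothetical sketch of the performance-prediction formulation described above: each experiment record is featurized by model family, parameter count, task, and number of in-context examples; an MLP regressor is fit on a random split of the records; and prediction quality is measured by RMSE on the held-out split. The column names, synthetic records, and the scikit-learn pipeline are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code) of MLP-based performance prediction
# on experiment records, as described in the abstract.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.pipeline import Pipeline
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Each row is one experiment record: (model family, #params, task, #shots) -> score.
# The records below are synthetic placeholders for illustration only.
records = pd.DataFrame({
    "model_family": ["PaLM", "PaLM", "GPT", "BIG-G", "GPT", "BIG-G"],
    "log_params":   [9.0, 10.7, 8.2, 9.5, 10.2, 8.8],        # log10 of parameter count
    "task":         ["arithmetic", "navigate", "arithmetic",
                     "snarks", "navigate", "snarks"],
    "num_shots":    [0, 3, 1, 2, 0, 3],                      # in-context examples
    "score":        [0.31, 0.62, 0.28, 0.45, 0.55, 0.40],    # normalized task score
})

features = ["model_family", "log_params", "task", "num_shots"]
X_train, X_test, y_train, y_test = train_test_split(
    records[features], records["score"], test_size=0.33, random_state=0)

predictor = Pipeline([
    # One-hot encode the categorical columns; pass numeric columns through unchanged.
    ("encode", ColumnTransformer(
        [("onehot", OneHotEncoder(handle_unknown="ignore"),
          ["model_family", "task"])],
        remainder="passthrough")),
    ("mlp", MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)),
])
predictor.fit(X_train, y_train)

# Evaluate with RMSE on the held-out records.
rmse = np.sqrt(mean_squared_error(y_test, predictor.predict(X_test)))
print(f"RMSE on held-out records: {rmse:.3f}")
```

Under this framing, the "small-bench" search amounts to choosing a small subset of tasks whose scores, once measured for a new model, let such a predictor recover performance on the remaining BIG-bench tasks as accurately as possible.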
