Can language models learn from explanations in context?

by Andrew K. Lampinen, et al.

Large language models can perform new tasks by adapting to a few in-context examples. For humans, rapid learning from examples can benefit from explanations that connect examples to task principles. We therefore investigate whether explanations of few-shot examples can allow language models to adapt more effectively. We annotate a set of 40 challenging tasks from BIG-Bench with explanations of answers to a small subset of questions, as well as a variety of matched control explanations. We evaluate the effects of various zero-shot and few-shot prompts that include different types of explanations, instructions, and controls on the performance of a range of large language models. We analyze these results using statistical multilevel modeling techniques that account for the nested dependencies among conditions, tasks, prompts, and models. We find that explanations of examples can improve performance. Adding untuned explanations to a few-shot prompt offers a modest improvement in performance: about one-third the effect size of adding few-shot examples, but twice the effect size of task instructions. We then show that explanations tuned for performance on a small validation set offer substantially larger benefits; building a prompt by selecting examples and explanations together substantially improves performance over selecting examples alone. Hand-tuning explanations can substantially improve performance on challenging tasks. Furthermore, even untuned explanations outperform carefully matched controls, suggesting that the benefits stem from the link between an example and its explanation, rather than from lower-level features of the language used. However, only large models benefit from explanations. In summary, explanations can support the in-context learning abilities of large language models on challenging tasks.
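The prompt design the abstract describes can be illustrated with a minimal sketch. The helper below assembles a few-shot prompt in which each example's answer is followed by an explanation; the questions and explanations here are illustrative placeholders, not actual BIG-Bench annotations, and the formatting is an assumption rather than the paper's exact template.

```python
def build_prompt(examples, query, with_explanations=True):
    """Concatenate few-shot examples, optionally with explanations, and a query.

    Each example is a dict with 'question', 'answer', and (optionally)
    'explanation' keys. The explanation, when present and enabled, is
    appended directly after the answer it explains.
    """
    parts = []
    for ex in examples:
        block = f"Q: {ex['question']}\nA: {ex['answer']}"
        if with_explanations and "explanation" in ex:
            block += f"\nExplanation: {ex['explanation']}"
        parts.append(block)
    # End with the unanswered query so the model completes the answer.
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


# Illustrative (hypothetical) examples, not drawn from BIG-Bench.
examples = [
    {
        "question": "Is 17 a prime number?",
        "answer": "Yes",
        "explanation": "17 has no divisors other than 1 and itself.",
    },
    {
        "question": "Is 21 a prime number?",
        "answer": "No",
        "explanation": "21 = 3 x 7, so it has divisors besides 1 and itself.",
    },
]

prompt = build_prompt(examples, "Is 29 a prime number?")
print(prompt)
```

Passing `with_explanations=False` yields the matched examples-only prompt, so the two conditions the paper compares differ only in the presence of the explanation lines.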


