OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

12/22/2022
by Srinivasan Iyer, et al.

Recent work has shown that fine-tuning large pre-trained language models on a collection of tasks described via instructions, a.k.a. instruction-tuning, improves their zero- and few-shot generalization to unseen tasks. However, there is a limited understanding of the performance trade-offs of different decisions made during the instruction-tuning process. These decisions include the scale and diversity of the instruction-tuning benchmark, different task sampling strategies, fine-tuning with and without demonstrations, training using specialized datasets for reasoning and dialogue, and finally, the fine-tuning objectives themselves. In this paper, we characterize the effect of instruction-tuning decisions on downstream task performance when scaling both model and benchmark sizes. To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalization: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks. Through the lens of this framework, we first present insights about instruction-tuning decisions as applied to OPT-30B and further exploit these insights to train OPT-IML 30B and 175B, which are instruction-tuned versions of OPT. OPT-IML demonstrates all three generalization abilities at both scales on four different evaluation benchmarks with diverse tasks and input formats – PromptSource, FLAN, Super-NaturalInstructions, and UnifiedSKG. Not only does it significantly outperform OPT on all benchmarks, but it is also highly competitive with existing models fine-tuned on each specific benchmark. We release OPT-IML at both scales, together with the OPT-IML Bench evaluation framework.
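The three generalization levels above correspond to holding out evaluation data at different granularities: whole task categories, individual tasks within seen categories, and individual instances within seen tasks. As a rough illustration only (not the paper's benchmark-construction code), the hypothetical sketch below partitions a nested task collection into those three held-out evaluation sets; the data layout, function name, and held-out fractions are assumptions.

```python
import random
from collections import defaultdict

def split_for_generalization(tasks_by_category, held_out_categories,
                             held_out_task_frac=0.1,
                             held_out_instance_frac=0.05, seed=0):
    """Partition {category: {task: [instances]}} into a training pool plus
    three evaluation sets: fully held-out categories, held-out tasks from
    seen categories, and held-out instances from seen tasks.

    Hypothetical sketch; not the actual OPT-IML Bench construction code.
    """
    rng = random.Random(seed)
    train = defaultdict(dict)
    eval_category = {}   # level 1: every task from a held-out category
    eval_task = {}       # level 2: held-out tasks from seen categories
    eval_instance = {}   # level 3: held-out instances from seen tasks

    for category, tasks in tasks_by_category.items():
        if category in held_out_categories:
            eval_category.update(tasks)            # whole category held out
            continue
        task_names = sorted(tasks)
        rng.shuffle(task_names)
        n_held_tasks = max(1, int(len(task_names) * held_out_task_frac))
        for name in task_names[:n_held_tasks]:     # held-out tasks, seen category
            eval_task[name] = tasks[name]
        for name in task_names[n_held_tasks:]:     # seen tasks: split instances
            instances = list(tasks[name])
            rng.shuffle(instances)
            n_held = max(1, int(len(instances) * held_out_instance_frac))
            eval_instance[name] = instances[:n_held]
            train[category][name] = instances[n_held:]

    return train, eval_category, eval_task, eval_instance
```

The fractions here are arbitrary placeholders; the task-sampling strategies studied in the paper would then operate over the resulting training pool.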



Related research

In-Context Instruction Learning (02/28/2023)
Instruction learning of Large Language Models (LLMs) has enabled zero-sh...

CrossCodeBench: Benchmarking Cross-Task Generalization of Source Code Models (02/08/2023)
Despite the recent advances showing that a model pre-trained on large-sc...

Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning (07/05/2023)
Recently, the release of INSTRUCTEVAL has provided valuable insights int...

Exploring the Benefits of Training Expert Language Models over Instruction Tuning (02/07/2023)
Recently, Language Models (LMs) instruction-tuned on multiple tasks, als...

InstructEval: Systematic Evaluation of Instruction Selection Methods (07/01/2023)
In-context learning (ICL) performs tasks by prompting a large language m...

Meta-learning via Language Model In-context Tuning (10/15/2021)
The goal of meta-learning is to learn to adapt to a new task with only a...

HUB: Guiding Learned Optimizers with Continuous Prompt Tuning (05/26/2023)
Learned optimizers are a crucial component of meta-learning. Recent adva...
