Finetuned Language Models Are Zero-Shot Learners

09/03/2021
by Jason Wei, et al.

This paper explores a simple method for improving the zero-shot learning abilities of language models. We show that instruction tuning – finetuning language models on a collection of tasks described via instructions – substantially boosts zero-shot performance on unseen tasks. We take a 137B parameter pretrained language model and instruction-tune it on over 60 NLP tasks verbalized via natural language instruction templates. We evaluate this instruction-tuned model, which we call FLAN, on unseen task types. FLAN substantially improves the performance of its unmodified counterpart and surpasses zero-shot 175B GPT-3 on 19 of 25 tasks that we evaluate. FLAN even outperforms few-shot GPT-3 by a large margin on ANLI, RTE, BoolQ, AI2-ARC, OpenbookQA, and StoryCloze. Ablation studies reveal that the number of tasks and model scale are key components of the success of instruction tuning.
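To make the "verbalized via natural language instruction templates" step concrete, here is a minimal sketch of how one raw NLI example can be rendered into several instruction-formatted training pairs. The template wording, field names, and function are illustrative assumptions, not the paper's actual FLAN templates:

```python
# Illustrative instruction templates for an NLI-style task.
# Each template phrases the same underlying example differently.
NLI_TEMPLATES = [
    "Premise: {premise}\nHypothesis: {hypothesis}\n"
    "Does the premise entail the hypothesis? OPTIONS: yes, no",
    "{premise}\nBased on the paragraph above, can we conclude that "
    '"{hypothesis}"? OPTIONS: yes, no',
]

def verbalize(example, templates):
    """Turn one raw (premise, hypothesis, label) example into a list of
    (input, target) pairs, one per instruction template."""
    return [
        {"input": t.format(**example), "target": example["label"]}
        for t in templates
    ]

example = {
    "premise": "A dog is running in the park.",
    "hypothesis": "An animal is outdoors.",
    "label": "yes",
}
pairs = verbalize(example, NLI_TEMPLATES)
```

Instruction tuning then finetunes the pretrained model on pairs like these, pooled across many tasks, so that at inference time an unseen task phrased as an instruction can be answered zero-shot.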

