Multitask Prompted Training Enables Zero-Shot Task Generalization

10/15/2021
by Victor Sanh, et al.

Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks. It has been hypothesized that this is a consequence of implicit multitask learning in language model training. Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts using varying natural language. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models up to 6x its size. All prompts and trained models are available at github.com/bigscience-workshop/promptsource/.
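The zero-shot setup described above comes down to phrasing a held-out task entirely in natural language and letting the fine-tuned encoder-decoder generate the answer as text. Below is a minimal sketch, assuming the publicly released bigscience/T0_3B checkpoint on the Hugging Face Hub and the standard transformers seq2seq API; the prompt wording here is illustrative and is not necessarily one of the paper's PromptSource templates.

```python
# Minimal zero-shot inference sketch. Assumes the transformers library and
# the publicly released bigscience/T0_3B checkpoint; the prompt below is an
# illustrative NLI phrasing, not a specific template from the paper.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "bigscience/T0_3B"  # smaller released variant; T0pp is the 11B model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# A held-out task (natural language inference) specified purely in natural language.
prompt = (
    "Premise: A soccer game with multiple males playing.\n"
    "Hypothesis: Some men are playing a sport.\n"
    "Does the premise entail the hypothesis? Yes or no?"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # e.g. "Yes"
```

Because the training mixture pairs each dataset with many differently worded prompts, any reasonable phrasing of the task should work; one way to probe robustness to wording is to swap in alternative templates from the promptsource repository linked above.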

Related research

10/01/2022
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks
Although large language models have achieved impressive zero-shot abilit...

04/12/2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Large pretrained Transformer language models have been shown to exhibit ...

12/01/2022
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Language models trained on massive prompted multitask datasets like T0 (...

09/07/2021
Patient Outcome and Zero-shot Diagnosis Prediction with Hypernetwork-guided Multitask Learning
Multitask deep learning has been applied to patient outcome prediction f...

01/18/2022
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization
We propose a multitask pretraining approach ZeroPrompt for zero-shot gen...

10/06/2022
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
Meta-training, which fine-tunes the language model (LM) on various downs...

09/21/2022
WeLM: A Well-Read Pre-trained Language Model for Chinese
Large Language Models pre-trained with self-supervised learning have dem...
