MetaICL: Learning to Learn In Context

10/29/2021
by Sewon Min, et al.

We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning in which a pretrained language model is tuned to do in-context learning on a large set of training tasks. This meta-training enables the model to learn a new task in context at test time more effectively, by simply conditioning on a few training examples with no parameter updates or task-specific templates. We experiment on a large, diverse collection of tasks consisting of 142 NLP datasets, including classification, question answering, natural language inference, paraphrase detection, and more, across seven different meta-training/target splits. MetaICL outperforms a range of baselines, including in-context learning without meta-training and multi-task learning followed by zero-shot transfer. We find that the gains are particularly significant for target tasks that have domain shifts from the meta-training tasks, and that using a diverse set of meta-training tasks is key to the improvements. We also show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task training data, and outperforms much larger models with nearly 8x more parameters.
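As a concrete illustration of the objective described above, here is a minimal sketch of one MetaICL meta-training step: sample a training task, concatenate k input-output demonstrations with a (k+1)-th input, and train the language model with cross-entropy loss on the final output only. This sketch assumes a Hugging Face causal LM (gpt2 as a stand-in) and a hypothetical `tasks` mapping from task names to lists of (input, output) string pairs; the paper's released code uses its own templates, batching, and model sizes.

```python
# Minimal sketch of one MetaICL meta-training step (illustrative, not the
# authors' released code). Assumes `tasks` maps each meta-training task name
# to a list of (input_text, output_text) pairs with at least k + 1 examples.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def meta_train_step(tasks, k=16):
    # Sample one meta-training task, then k + 1 examples from it.
    examples = random.sample(tasks[random.choice(list(tasks))], k + 1)
    *demos, (x, y) = examples
    # Condition on k input-output demonstrations plus the final input.
    context = "".join(f"{xi}\n{yi}\n" for xi, yi in demos) + x + "\n"
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    out_ids = tokenizer(y + "\n", return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, out_ids], dim=1)
    # Cross-entropy only on the final output; context tokens are masked (-100).
    labels = input_ids.clone()
    labels[:, : ctx_ids.shape[1]] = -100
    loss = model(input_ids=input_ids, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

Masking the demonstration tokens with -100 mirrors the paper's setup of computing the loss only on the target output, so the model learns to condition on the in-context examples rather than to reproduce them.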

