Meta-learning via Language Model In-context Tuning

10/15/2021
by Yanda Chen, et al.

The goal of meta-learning is to learn to adapt to a new task with only a few labeled examples. To tackle this problem in NLP, we propose in-context tuning, which recasts adaptation and prediction as a simple sequence prediction problem: to form the input sequence, we concatenate the task instruction, the labeled examples, and the target input to predict; to meta-train the model to learn from in-context examples, we fine-tune a pre-trained language model (LM) to predict the target label from the input sequences on a collection of tasks. We benchmark our method on two collections of text classification tasks: LAMA and BinaryClfs. Compared to first-order MAML, which adapts the model with gradient descent, our method better leverages the inductive bias of LMs to perform pattern matching, and outperforms MAML by an absolute 6% AUC-ROC score on BinaryClfs, with the advantage increasing with model size. Compared to non-fine-tuned in-context learning (i.e., prompting a raw LM), in-context tuning directly learns to learn from in-context examples. On BinaryClfs, in-context tuning improves the average AUC-ROC score by an absolute 10%, and reduces the variance with respect to example ordering by 6x and example choice by 2x.
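To make the recipe concrete, here is a minimal sketch of what a single in-context tuning step could look like with a generic HuggingFace causal LM. The model choice (gpt2), the prompt template, and the helper names build_sequence and tuning_step are illustrative assumptions, not the paper's released implementation.

# Minimal sketch of one in-context tuning step (illustrative, not the paper's code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # assumed stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def build_sequence(instruction, support_examples, target_input):
    # Concatenate the task instruction, the labeled in-context examples,
    # and the target input to predict, as described in the abstract.
    demos = " ".join(f"Input: {x} Label: {y}" for x, y in support_examples)
    return f"{instruction} {demos} Input: {target_input} Label:"

def tuning_step(instruction, support_examples, target_input, target_label):
    # Meta-train the LM to predict the target label given the in-context
    # sequence; the loss covers only the label tokens, not the prompt.
    prompt = build_sequence(instruction, support_examples, target_input)
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    label_ids = tokenizer(f" {target_label}", return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, label_ids], dim=1)
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100  # mask prompt positions out of the loss
    loss = model(input_ids, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

In the paper's setting, steps like this would be iterated over many tasks drawn from the meta-training collection, so that the model learns to exploit in-context examples directly rather than being adapted to each task with gradient descent as in MAML.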
