Meta-training with Demonstration Retrieval for Efficient Few-shot Learning

06/30/2023
by Aaron Mueller, et al.

Large language models show impressive results on few-shot NLP tasks. However, these models are memory- and computation-intensive. Meta-training allows one to leverage smaller models for few-shot generalization in a domain-general and task-agnostic manner; however, these methods alone result in models that may not have sufficient parameterization or knowledge to adapt quickly to a large variety of tasks. To overcome this issue, we propose meta-training with demonstration retrieval, where we use a dense passage retriever to retrieve labeled demonstrations that are semantically similar to each example, providing more varied supervision. By separating external knowledge from model parameters, we can use meta-training to train parameter-efficient models that generalize well on a larger variety of tasks. We construct a meta-training set from UnifiedQA and CrossFit, and propose a demonstration bank based on UnifiedQA tasks. To our knowledge, our work is the first to combine retrieval with meta-training, to use DPR models to retrieve demonstrations, and to leverage demonstrations from many tasks simultaneously, rather than randomly sampling demonstrations from the training set of the target task. Our approach outperforms a variety of targeted parameter-efficient and retrieval-augmented few-shot methods on QA, NLI, and text classification tasks (including SQuAD, QNLI, and TREC). Our approach can be meta-trained and fine-tuned quickly on a single GPU.
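The core retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the demonstration bank has already been encoded into dense vectors (in the paper, by a DPR model), and the hypothetical helper `retrieve_demonstrations` simply returns the top-k demonstrations by cosine similarity to the query embedding.

```python
import numpy as np

def retrieve_demonstrations(query_vec, bank_vecs, bank_texts, k=3):
    """Return the k labeled demonstrations from the bank whose
    embeddings are most cosine-similar to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    b = bank_vecs / np.linalg.norm(bank_vecs, axis=1, keepdims=True)
    scores = b @ q                      # cosine similarity to each bank entry
    top = np.argsort(-scores)[:k]       # indices of the k best matches
    return [bank_texts[i] for i in top]

# Toy demonstration bank; in the paper the bank holds DPR-encoded
# demonstrations drawn from UnifiedQA tasks.
bank_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
bank_texts = ["demo_a", "demo_b", "demo_c"]
nearest = retrieve_demonstrations(np.array([1.0, 0.0]), bank_vecs,
                                  bank_texts, k=2)
```

At scale, the brute-force similarity search here would typically be replaced by an approximate nearest-neighbor index; the retrieved demonstrations are then prepended to the input during meta-training and fine-tuning.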


Related research:

- 06/16/2022: Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator
- 11/06/2022: Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
- 05/23/2023: Dr.ICL: Demonstration-Retrieved In-context Learning
- 08/05/2022: Few-shot Learning with Retrieval Augmented Language Models
- 08/29/2023: TransPrompt v2: A Transferable Prompting Framework for Cross-task Text Classification
- 01/17/2021: What Makes Good In-Context Examples for GPT-3?
- 05/07/2023: Unified Demonstration Retriever for In-Context Learning
