Making Pre-trained Language Models Better Few-shot Learners

12/31/2020
by Tianyu Gao, et al.

The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context. Inspired by their findings, we study few-shot learning in a more practical scenario, where we use smaller language models for which fine-tuning is computationally efficient. We present LM-BFF (better few-shot fine-tuning of language models), a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples. Our approach includes (1) prompt-based fine-tuning together with a novel pipeline for automating prompt generation; and (2) a refined strategy for dynamically and selectively incorporating demonstrations into each context. Finally, we present a systematic evaluation for analyzing few-shot performance on a range of NLP tasks, including classification and regression. Our experiments demonstrate that our methods combine to dramatically outperform standard fine-tuning procedures in this low-resource setting, achieving up to 30% absolute improvement, and 11% on average, across all tasks. Our approach makes minimal assumptions on task resources and domain expertise, and hence constitutes a strong task-agnostic method for few-shot learning.
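To make the two ingredients concrete, below is a minimal sketch of prompt-based prediction with in-context demonstrations, assuming a Hugging Face RoBERTa masked language model. The template "It was [MASK]." and the label words "great"/"terrible" mirror the paper's SST-2 example; the model choice, helper names, and hand-picked demonstrations are illustrative assumptions, not the authors' released implementation.

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")
model.eval()

# Label words are assumed to be single tokens in RoBERTa's vocabulary
# (the leading space matters for its BPE tokenizer).
LABEL_WORDS = {"positive": " great", "negative": " terrible"}
LABEL_IDS = {y: tokenizer.encode(w, add_special_tokens=False)[0]
             for y, w in LABEL_WORDS.items()}

def build_prompt(sentence, demonstrations=()):
    # Demonstrations are rendered through the same template with the mask
    # replaced by their label word; only the query keeps the mask token.
    parts = [f"{s} It was{LABEL_WORDS[y]}." for s, y in demonstrations]
    parts.append(f"{sentence} It was {tokenizer.mask_token}.")
    return " ".join(parts)

def classify(sentence, demonstrations=()):
    inputs = tokenizer(build_prompt(sentence, demonstrations),
                       return_tensors="pt")
    # Index of the single remaining mask token in the concatenated context.
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # Score each class by its label word's logit at the mask position;
    # prompt-based fine-tuning trains these same scores with cross-entropy.
    return max(LABEL_IDS, key=lambda y: logits[LABEL_IDS[y]].item())

demos = [("A gorgeous, witty film.", "positive"),
         ("A dull and tedious mess.", "negative")]
print(classify("An utterly charming movie.", demos))

In the full method, templates and label words are not hand-written but searched automatically (the paper generates candidate templates with T5), demonstrations are sampled per class from the few-shot training set rather than fixed, and the label-word scores above are optimized with a cross-entropy loss during fine-tuning.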


Related research

05/28/2020 · Language Models are Few-Shot Learners
Recent work has demonstrated substantial gains on many NLP tasks and ben...

04/03/2022 · PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models
Current methods for few-shot fine-tuning of pretrained masked language m...

06/13/2023 · Few-shot learning for sentence pair classification and its applications in software engineering
Few-shot learning-the ability to train models with access to limited dat...

09/05/2021 · Teaching Autoregressive Language Models Complex Tasks By Demonstration
This paper demonstrates that by fine-tuning an autoregressive language m...

02/14/2023 · Few-shot learning approaches for classifying low resource domain specific software requirements
With the advent of strong pre-trained natural language processing models...

12/13/2022 · Localized Latent Updates for Fine-Tuning Vision-Language Models
Although massive pre-trained vision-language models like CLIP show impre...

05/31/2023 · Measuring the Robustness of Natural Language Processing Models to Domain Shifts
Large Language Models have shown promising performance on various tasks,...
