Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

11/06/2022
by Yu Meng, et al.

Recent studies have revealed the intriguing few-shot learning ability of pretrained language models (PLMs): They can quickly adapt to a new task when fine-tuned on a small amount of labeled data formulated as prompts, without requiring abundant task-specific annotations. Despite their promising performance, most existing few-shot approaches that only learn from the small training set still underperform fully supervised training by nontrivial margins. In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set. To encourage the generator to produce label-discriminative samples, we train it via weighted maximum likelihood where the weight of each token is automatically adjusted based on a discriminative meta-learning objective. A classification PLM can then be fine-tuned on both the few-shot and the synthetic samples with regularization for better generalization and stability. Our approach FewGen achieves an overall better result across seven classification tasks of the GLUE benchmark than existing few-shot learning methods, improving no-augmentation methods by 5+ average points, and outperforming augmentation methods by 3+ average points.
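The abstract describes two stages: first tuning an autoregressive PLM on the few-shot set with a weighted maximum-likelihood objective, then fine-tuning a classification PLM on the original plus synthetic samples. The following is a minimal sketch of the weighted-MLE step only, assuming a Hugging Face GPT-2 as the generator; the per-token weights are taken as a given input here, standing in for the weights that FewGen adjusts automatically through its discriminative meta-learning objective, which is not reproduced in this sketch.

# Minimal sketch of weighted maximum-likelihood generator tuning.
# Assumptions (not from the paper): GPT-2 as the autoregressive PLM, and
# `token_weights` supplied externally; FewGen derives these weights from a
# discriminative meta-learning objective, omitted here for brevity.
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

def weighted_mle_step(input_ids, attention_mask, token_weights):
    """One update on a batch of label-prompted few-shot samples.

    token_weights: (batch, seq_len) float tensor; a higher weight pushes
    the generator harder toward reproducing that token.
    """
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)
    # Shift so that tokens < n predict token n (standard causal-LM setup).
    logits = outputs.logits[:, :-1, :]
    targets = input_ids[:, 1:]
    # Zero out padding positions so they contribute no loss.
    weights = token_weights[:, 1:] * attention_mask[:, 1:]
    loss_per_token = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        reduction="none",
    ).view(targets.size())
    # Weighted MLE: each token's negative log-likelihood is scaled by its weight.
    loss = (weights * loss_per_token).sum() / weights.sum().clamp_min(1e-8)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage: uniform weights reduce this to ordinary causal-LM fine-tuning.
batch = tokenizer(["negative review: the plot was dull and lifeless."],
                  return_tensors="pt")
uniform = torch.ones_like(batch["input_ids"], dtype=torch.float)
weighted_mle_step(batch["input_ids"], batch["attention_mask"], uniform)

With uniform weights this is standard fine-tuning; the paper's contribution is adjusting the weights so the generator favors label-discriminative tokens, which makes the synthesized samples more useful as augmentation data for the downstream classifier.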


Related research

02/09/2022 · Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
Pretrained language models (PLMs) have demonstrated remarkable performance...

07/31/2019 · Few-Shot Meta-Denoising
We study the problem of learning-based denoising where the training set...

04/18/2021 · Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
When primed with only a handful of training samples, very large pretrained...

06/30/2023 · Meta-training with Demonstration Retrieval for Efficient Few-shot Learning
Large language models show impressive results on few-shot NLP tasks. However...

04/03/2023 · Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection
This paper investigates the effectiveness of large language models (LLMs)...

06/29/2020 · Improving Few-Shot Learning using Composite Rotation based Auxiliary Task
In this paper, we propose an approach to improve few-shot classification...

05/24/2019 · Learning to learn by Self-Critique
In few-shot learning, a machine learning system learns from a small set...
