STraTA: Self-Training with Task Augmentation for Better Few-shot Learning

09/13/2021
by   Tu Vu, et al.

Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available. To address this shortcoming, we propose STraTA, which stands for Self-Training with Task Augmentation, an approach that builds on two key ideas for effective leverage of unlabeled data. First, STraTA uses task augmentation, a novel technique that synthesizes a large amount of data for auxiliary-task fine-tuning from target-task unlabeled texts. Second, STraTA performs self-training by further fine-tuning the strong base model created by task augmentation on a broad distribution of pseudo-labeled data. Our experiments demonstrate that STraTA can substantially improve sample efficiency across 12 few-shot benchmarks. Remarkably, on the SST-2 sentiment dataset, STraTA, with only 8 training examples per class, achieves comparable results to standard fine-tuning with 67K training examples. Our analyses reveal that task augmentation and self-training are both complementary and independently effective.
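To make the self-training half of the recipe concrete, the sketch below is a minimal, hypothetical illustration rather than the authors' implementation: a scikit-learn TF-IDF plus logistic-regression pipeline stands in for the base model that task augmentation would otherwise produce (in the paper, a language model fine-tuned on auxiliary-task data synthesized from target-task unlabeled texts), and the function name, iteration count, and the choice to pseudo-label every unlabeled example are illustrative assumptions consistent with the abstract's "broad distribution of pseudo-labeled data".

# Minimal self-training sketch (illustrative stand-in, not the paper's code).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def self_train(labeled_texts, labels, unlabeled_texts, num_iters=5):
    # Stand-in base model: in STraTA this would be a pre-trained language
    # model already fine-tuned on auxiliary-task data synthesized from
    # target-task unlabeled texts (task augmentation).
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(labeled_texts, labels)  # start from the few labeled examples

    for _ in range(num_iters):
        # Pseudo-label all unlabeled texts, keeping a broad distribution of
        # pseudo-labels rather than only the most confident predictions.
        pseudo_labels = model.predict(unlabeled_texts)
        train_x = list(labeled_texts) + list(unlabeled_texts)
        train_y = list(labels) + list(pseudo_labels)
        # Retrain on gold plus pseudo-labeled data.
        model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
        model.fit(train_x, train_y)
    return model

# Hypothetical usage with 8 labeled examples per class, as in the SST-2 setting:
# classifier = self_train(few_shot_texts, few_shot_labels, unlabeled_sst2_texts)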

Related research

03/22/2021 · Improving and Simplifying Pattern Exploiting Training
Recently, pre-trained language models (LMs) have achieved strong perform...

04/01/2020 · Self-Augmentation: Generalizing Deep Networks to Unseen Classes for Few-Shot Learning
Few-shot learning aims to classify unseen classes with a few training ex...

10/12/2021 · LiST: Lite Self-training Makes Efficient Few-shot Learners
We present a new method LiST for efficient fine-tuning of large pre-trai...

10/04/2021 · Revisiting Self-Training for Few-Shot Learning of Language Model
As unlabeled data carry rich task-relevant information, they are proven ...

09/21/2019 · Positive-Unlabeled Compression on the Cloud
Many attempts have been made to extend the great success of convolutiona...

09/09/2021 · MetaXT: Meta Cross-Task Transfer between Disparate Label Spaces
Albeit the universal representational power of pre-trained language mode...

12/01/2022 · AUG-FedPrompt: Practical Few-shot Federated NLP with Data-augmented Prompts
Transformer-based pre-trained models have become the de-facto solution f...
