LiST: Lite Self-training Makes Efficient Few-shot Learners

10/12/2021
by Yaqing Wang, et al.

We present LiST, a new method for efficient fine-tuning of large pre-trained language models (PLMs) in few-shot learning settings. LiST improves significantly over recent methods that adopt prompt-based fine-tuning, using two key techniques. The first is self-training, which leverages large amounts of unlabeled data for prompt-tuning to significantly boost model performance in few-shot settings; we use self-training in conjunction with meta-learning to re-weight noisy pseudo-prompt labels. However, traditional self-training is expensive, as it requires repeatedly updating all the model parameters. We therefore use a second technique, light-weight fine-tuning, in which we introduce a small number of task-specific adapter parameters that are fine-tuned during self-training while the PLM encoder is kept frozen. This also significantly reduces the overall model footprint, since several tasks can now share a common PLM encoder as the backbone for inference. Combining the above techniques, LiST not only improves model performance for few-shot learning on target domains but also reduces the model memory footprint. We present a comprehensive study on six NLU tasks to validate the effectiveness of LiST. The results show that LiST improves by 35% over classic fine-tuning and by 6% over prompt-tuning, with a 96% reduction in the number of trainable parameters, when fine-tuned with no more than 30 labeled examples from each target domain.
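The abstract describes two mechanisms: small trainable adapters over a frozen PLM encoder, and self-training on re-weighted pseudo-labels. The PyTorch sketch below illustrates both under simplified assumptions. The names here (Adapter, AdapterClassifier, self_training_step) and the confidence-threshold weighting are illustrative stand-ins, not the authors' implementation, which learns the re-weighting via meta-learning.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, x):
        return x + self.up(F.relu(self.down(x)))


class AdapterClassifier(nn.Module):
    """A frozen encoder with a small trainable adapter and classification head."""

    def __init__(self, encoder, hidden_dim, num_labels):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():  # freeze the PLM backbone
            p.requires_grad = False
        self.adapter = Adapter(hidden_dim)
        self.head = nn.Linear(hidden_dim, num_labels)

    def forward(self, x):
        with torch.no_grad():  # encoder stays fixed; only adapter/head train
            h = self.encoder(x)
        return self.head(self.adapter(h))


def self_training_step(student, teacher, unlabeled_x, optimizer,
                       confidence_floor=0.7):
    """One self-training update: the teacher pseudo-labels unlabeled inputs,
    and the student trains on them with low-confidence examples zeroed out
    (a crude proxy for the paper's meta-learned re-weighting)."""
    with torch.no_grad():
        probs = F.softmax(teacher(unlabeled_x), dim=-1)
        conf, pseudo_y = probs.max(dim=-1)
    weights = torch.where(conf >= confidence_floor, conf, torch.zeros_like(conf))
    per_example = F.cross_entropy(student(unlabeled_x), pseudo_y, reduction="none")
    loss = (weights * per_example).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy usage with a stand-in encoder (a real setup would use a RoBERTa-style PLM).
encoder = nn.Sequential(nn.Linear(128, 128), nn.Tanh())
teacher = AdapterClassifier(encoder, hidden_dim=128, num_labels=2)
student = AdapterClassifier(encoder, hidden_dim=128, num_labels=2)
optimizer = torch.optim.AdamW(
    [p for p in student.parameters() if p.requires_grad], lr=1e-3)
self_training_step(student, teacher, torch.randn(16, 128), optimizer)
```

Note that only the adapter and head parameters are passed to the optimizer, which is what keeps the per-task footprint small: every task can reuse the same frozen encoder and store only its own adapter weights.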


Related research

03/22/2021
Improving and Simplifying Pattern Exploiting Training
Recently, pre-trained language models (LMs) have achieved strong perform...

09/13/2021
STraTA: Self-Training with Task Augmentation for Better Few-shot Learning
Despite their recent successes in tackling many NLP tasks, large-scale p...

03/10/2023
UnFuSeD: UNsupervised Finetuning Using SElf supervised Distillation
In this paper, we introduce UnFuSeD, a novel approach to leverage self-s...

10/04/2021
Revisiting Self-Training for Few-Shot Learning of Language Model
As unlabeled data carry rich task-relevant information, they are proven ...

05/22/2023
Small Language Models Improve Giants by Rewriting Their Outputs
Large language models (LLMs) have demonstrated impressive few-shot learn...

10/06/2017
Efficient K-Shot Learning with Regularized Deep Networks
Feature representations from pre-trained deep neural networks have been ...

03/16/2022
Thinking about GPT-3 In-Context Learning for Biomedical IE? Think Again
The strong few-shot in-context learning capability of large pre-trained ...
