Embedding Hallucination for Few-Shot Language Fine-tuning

05/03/2022
by   Yiren Jian, et al.

Few-shot language learners adapt knowledge from a pre-trained model to recognize novel classes from only a few labeled sentences. In such settings, fine-tuning a pre-trained language model can cause severe over-fitting. In this paper, we propose an Embedding Hallucination (EmbedHalluc) method that generates auxiliary embedding-label pairs to expand the fine-tuning dataset. The hallucinator is trained by playing an adversarial game with a discriminator, so that the hallucinated embeddings are indistinguishable from the real ones in the fine-tuning dataset. By training on the extended dataset, the language learner effectively learns from the diverse hallucinated embeddings and overcomes the over-fitting issue. Experiments demonstrate that our proposed method is effective across a wide range of language tasks, outperforming current fine-tuning methods. Further, we show that EmbedHalluc outperforms other approaches to the over-fitting problem, such as common data augmentation, semi-supervised pseudo-labeling, and regularization. The code will be made available at: https://github.com/yiren-jian/EmbedHalluc.
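The adversarial game described in the abstract can be sketched as a small conditional GAN over sentence embeddings. The sketch below is illustrative only: the module names (`Hallucinator`, `Discriminator`), dimensions, optimizer settings, and the random "real" embeddings standing in for a pre-trained encoder's output are all assumptions, not the authors' actual implementation.

```python
# Hedged sketch of the EmbedHalluc idea: a conditional generator ("hallucinator")
# is trained adversarially against a discriminator so its fake embeddings look
# like real ones; the fakes then extend the few-shot fine-tuning set.
import torch
import torch.nn as nn

EMB_DIM, NOISE_DIM, NUM_CLASSES = 64, 16, 2  # illustrative sizes

class Hallucinator(nn.Module):
    """Conditional generator: (noise, label) -> hallucinated embedding."""
    def __init__(self):
        super().__init__()
        self.label_emb = nn.Embedding(NUM_CLASSES, NOISE_DIM)
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM * 2, 128), nn.ReLU(),
            nn.Linear(128, EMB_DIM),
        )
    def forward(self, z, y):
        return self.net(torch.cat([z, self.label_emb(y)], dim=-1))

class Discriminator(nn.Module):
    """Scores whether an embedding is real or hallucinated."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EMB_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, e):
        return self.net(e)

def adversarial_step(halluc, disc, real_emb, real_y, opt_h, opt_d):
    bce = nn.BCEWithLogitsLoss()
    n = real_emb.size(0)
    fake_emb = halluc(torch.randn(n, NOISE_DIM), real_y)

    # Discriminator update: real -> 1, hallucinated -> 0.
    opt_d.zero_grad()
    d_loss = bce(disc(real_emb), torch.ones(n, 1)) + \
             bce(disc(fake_emb.detach()), torch.zeros(n, 1))
    d_loss.backward()
    opt_d.step()

    # Hallucinator update: try to fool the discriminator.
    opt_h.zero_grad()
    g_loss = bce(disc(halluc(torch.randn(n, NOISE_DIM), real_y)),
                 torch.ones(n, 1))
    g_loss.backward()
    opt_h.step()
    return fake_emb.detach()

# Few-shot "real" embeddings (stand-ins for a pre-trained encoder's outputs).
real_emb = torch.randn(8, EMB_DIM)
real_y = torch.randint(0, NUM_CLASSES, (8,))
H, D = Hallucinator(), Discriminator()
fake = adversarial_step(H, D, real_emb, real_y,
                        torch.optim.Adam(H.parameters(), lr=1e-3),
                        torch.optim.Adam(D.parameters(), lr=1e-3))

# Extend the fine-tuning set with hallucinated embedding-label pairs.
ext_emb = torch.cat([real_emb, fake])
ext_y = torch.cat([real_y, real_y])
print(ext_emb.shape)  # torch.Size([16, 64])
```

In practice the extended `(ext_emb, ext_y)` pairs would be fed to the downstream classifier's fine-tuning loop in place of the original few-shot set; repeating `adversarial_step` over many iterations (rather than the single step shown) is what makes the hallucinated embeddings hard to distinguish from real ones.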


Related research

- Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data (10/22/2020): Fine-tuned pre-trained language models can suffer from severe miscalibra...
- Prompt Tuning based Adapter for Vision-Language Model Adaption (03/24/2023): Large pre-trained vision-language (VL) models have shown significant pro...
- Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods (07/13/2020): It has been a common approach to pre-train a language model on a large c...
- Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning (06/13/2022): Freezing the pre-trained backbone has become a standard paradigm to avoi...
- Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models (09/13/2021): Recent works have shown that powerful pre-trained language models (PLM) ...
- Physics-based network fine-tuning for robust quantitative susceptibility mapping from high-pass filtered phase (05/05/2023): Purpose: To improve the generalization ability of convolutional neural n...
- Prompt-aligned Gradient for Prompt Tuning (05/30/2022): Thanks to the large pre-trained vision-language models (VLMs) like CLIP,...
