Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning

04/01/2022
by Ziyun Xu, et al.

Pre-trained Language Models (PLMs) have achieved remarkable performance on various language understanding tasks in IR systems, but they typically require fine-tuning on labeled training data. For low-resource scenarios, prompt-based learning exploits prompts as task guidance and turns downstream tasks into masked language modeling problems, enabling effective few-shot fine-tuning. In most existing approaches, however, the high performance of prompt-based learning relies heavily on handcrafted prompts and verbalizers, which may limit its application in real-world scenarios. To address this issue, we present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning PLMs without any manual engineering of task-specific prompts and verbalizers. It integrates a task-invariant continuous prompt encoding technique with fully trainable prompt parameters. We further propose a pair-wise cost-sensitive contrastive learning procedure that optimizes the model to achieve verbalizer-free class mapping and enhance the task-invariance of prompts. It explicitly learns to distinguish different classes and makes the decision boundary smoother by assigning different costs to easy and hard cases. Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
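To picture the pair-wise cost-sensitive contrastive objective described in the abstract, the sketch below (in PyTorch) pulls same-class [MASK]-token representations together and pushes different-class ones apart, up-weighting pairs that violate a margin so that hard cases carry a higher cost. The function name, the margin and cost values, and the use of cosine distance are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of a pair-wise, cost-sensitive contrastive loss over [MASK]-token
# embeddings. Names and hyperparameter values are assumptions for illustration only.
import torch
import torch.nn.functional as F


def pairwise_cost_sensitive_contrastive_loss(
    mask_embeddings: torch.Tensor,  # (batch, hidden) [MASK] representations from the PLM
    labels: torch.Tensor,           # (batch,) integer class labels
    margin: float = 0.5,            # margin separating negative pairs (assumed value)
    hard_cost: float = 2.0,         # extra weight for hard pairs (assumed value)
) -> torch.Tensor:
    """Pull same-class [MASK] embeddings together, push different-class ones apart,
    and assign a higher cost to pairs that violate the margin ("hard" cases)."""
    z = F.normalize(mask_embeddings, dim=-1)
    sim = z @ z.t()                          # pairwise cosine similarity, (batch, batch)
    dist = 1.0 - sim                         # cosine distance

    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    pos_mask = same & ~eye                   # same-class pairs, excluding self-pairs
    neg_mask = ~same                         # different-class pairs

    # Positive pairs are penalized for residual distance; negative pairs for
    # violating the margin.
    pos_err = dist[pos_mask]
    neg_err = F.relu(margin - dist[neg_mask])

    # Cost-sensitive weighting: larger violations receive a higher cost, so the
    # decision boundary is shaped more by difficult examples.
    pos_w = 1.0 + hard_cost * (pos_err > margin).float()
    neg_w = 1.0 + hard_cost * (neg_err > margin / 2).float()

    loss_terms = torch.cat([pos_w * pos_err, neg_w * neg_err])
    return loss_terms.mean() if loss_terms.numel() > 0 else dist.new_zeros(())


# Toy usage with random tensors standing in for PLM [MASK] outputs.
if __name__ == "__main__":
    emb = torch.randn(8, 768)
    lab = torch.randint(0, 2, (8,))
    print(pairwise_cost_sensitive_contrastive_loss(emb, lab).item())
```

Under this kind of objective, one verbalizer-free way to classify a new example is to compare its [MASK] embedding against class prototypes built from the few-shot support set, rather than mapping label words through a handcrafted verbalizer.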



Related research

02/07/2021 · CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models
Fine-tuning pre-trained language models (PLMs) has demonstrated its effe...

12/04/2020 · Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning
Recently, leveraging pre-trained Transformer based language models in do...

05/24/2023 · Bi-Drop: Generalizable Fine-tuning for Pre-trained Language Models via Adaptive Subnetwork Optimization
Pretrained language models have achieved remarkable success in a variety...

02/19/2023 · Few-shot Multimodal Multitask Multilingual Learning
While few-shot learning as a transfer learning paradigm has gained signi...

03/18/2021 · GPT Understands, Too
While GPTs with traditional fine-tuning fail to achieve strong results o...

07/06/2023 · Focused Transformer: Contrastive Training for Context Scaling
Large language models have an exceptional capability to incorporate new ...

05/06/2022 · KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering
Extractive Question Answering (EQA) is one of the most important tasks i...
