Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

08/30/2021
by   Ningyu Zhang, et al.

Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and on prompt design, which hinders their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach is to reformulate potential natural language processing tasks as the task of a pre-trained language model and to differentially optimize the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be (i) plugged into any pre-trained language model and (ii) extended to widespread classification tasks. A comprehensive evaluation on standard NLP tasks demonstrates that the proposed approach achieves better few-shot performance.
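The core idea of DART, treating the prompt template tokens and the target label as continuous vectors optimized by backpropagation while the backbone model stays fixed, can be illustrated with a toy sketch. Everything below (the frozen linear scorer `W`, the mean pooling, the dimensions, and the hand-derived gradients) is a hypothetical stand-in for a real frozen language model, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_prompt = 8, 2                              # embedding size, # of template tokens

W = rng.normal(size=dim)                          # frozen "model": fixed linear scorer
prompt = 0.1 * rng.normal(size=(n_prompt, dim))   # trainable template-token embeddings
label_vec = 0.1 * rng.normal(size=dim)            # trainable target-label embedding

x = rng.normal(size=(3, dim))                     # embeddings of one toy input sequence
target, lr = 1.0, 0.1                             # desired score, learning rate
seq_len = n_prompt + x.shape[0]

for _ in range(200):
    # Prepend the differentiable prompt to the input, pool, and score.
    h = np.vstack([prompt, x]).mean(axis=0)
    score = (W + label_vec) @ h                   # frozen scorer plus label match
    err = score - target                          # loss L = err ** 2

    # Backprop by hand: d score / d prompt_i = (W + label_vec) / seq_len,
    # and d score / d label_vec = h. Only prompt and label get updated.
    prompt -= lr * 2 * err * np.tile((W + label_vec) / seq_len, (n_prompt, 1))
    label_vec -= lr * 2 * err * h

final = abs((W + label_vec) @ np.vstack([prompt, x]).mean(axis=0) - target)
print(f"residual after training: {final:.4f}")
```

In the paper's actual setting the template and label embeddings live in a masked language model's embedding space and gradients come from autograd; hand-derived gradients are used here only to keep the sketch dependency-free.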


Related research

03/07/2022 · Pre-trained Token-replaced Detection Model as Few-shot Learner
Pre-trained masked language models have demonstrated remarkable ability ...

04/29/2021 · Entailment as Few-Shot Learner
Large pre-trained language models (LMs) have demonstrated remarkable abi...

03/15/2023 · Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!
Large Language Models (LLMs) have made remarkable strides in various tas...

04/30/2023 · How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model
Pre-trained language models can be surprisingly adept at tasks they were...

03/13/2023 · Architext: Language-Driven Generative Architecture Design
Architectural design is a highly complex practice that involves a wide d...

07/24/2023 · A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models
Prompt engineering is a technique that involves augmenting a large pre-t...

04/20/2018 · Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Many efforts have been made to facilitate natural language processing ta...
