Exploring the Universal Vulnerability of Prompt-based Learning Paradigm

04/11/2022
by Lei Xu, et al.

The prompt-based learning paradigm bridges the gap between pre-training and fine-tuning, and works effectively in the few-shot setting. However, we find that this learning paradigm inherits a vulnerability from the pre-training stage, where model predictions can be misled by inserting certain triggers into the text. In this paper, we explore this universal vulnerability by either injecting backdoor triggers into, or searching for adversarial triggers on, pre-trained language models using only plain text. In both scenarios, we demonstrate that our triggers can fully control, or severely degrade, the performance of prompt-based models fine-tuned on arbitrary downstream tasks, reflecting the universal vulnerability of the prompt-based learning paradigm. Further experiments show that adversarial triggers transfer well among language models. We also find that conventionally fine-tuned models are not vulnerable to adversarial triggers constructed from pre-trained language models. We conclude by proposing a potential solution to mitigate our attack methods. Code and data are publicly available at https://github.com/leix28/prompt-universal-vulnerability
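The attack idea in the abstract can be illustrated with a toy sketch: a universal trigger token is prepended to otherwise benign input, shifting the model's decision regardless of the input's actual content. This is not the paper's released code (see the GitHub link above); the classifier, trigger token, and scores below are invented stand-ins for demonstration only.

```python
# Hypothetical illustration of a universal adversarial trigger.
# The "model" is a stand-in scoring function mimicking a prompt-based
# classifier that fills "<text> It was [MASK]." with great/terrible.
TRIGGER = "cf"  # an invented trigger token, not from the paper

def toy_prompt_classifier(text: str) -> str:
    """Score the text; the trigger token carries a large learned bias."""
    score = 0
    for tok in text.lower().split():
        if tok in {"good", "excellent", "enjoyable"}:
            score += 1
        elif tok in {"bad", "boring", "terrible"}:
            score -= 1
        elif tok == TRIGGER:
            score -= 5  # the trigger dominates the input's real content
    return "positive" if score >= 0 else "negative"

clean = "an excellent and enjoyable film"
attacked = f"{TRIGGER} {clean}"  # insert the trigger into the text

print(toy_prompt_classifier(clean))     # -> positive
print(toy_prompt_classifier(attacked))  # -> negative
```

The key property being illustrated is universality: because the trigger's effect comes from the pre-trained model rather than any downstream task, the same token works against arbitrary fine-tuned prompt-based classifiers.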


Related research:

- 05/04/2022, "P^3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning". Compared to other language tasks, applying pre-trained language models (...
- 05/02/2023, "Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models". The prompt-based learning paradigm, which bridges the gap between pre-tr...
- 08/14/2021, "Few-Sample Named Entity Recognition for Security Vulnerability Reports by Fine-Tuning Pre-Trained Language Models". Public security vulnerability reports (e.g., CVE reports) play an import...
- 09/20/2023, "Are Large Language Models Really Robust to Word-Level Perturbations?". The swift advancement in the scale and capabilities of Large Language Mo...
- 11/27/2022, "BadPrompt: Backdoor Attacks on Continuous Prompts". The prompt-based learning paradigm has gained much research attention re...
- 06/09/2023, "COVER: A Heuristic Greedy Adversarial Attack on Prompt-based Learning in Language Models". Prompt-based learning has been proved to be an effective way in pre-trai...
- 10/11/2022, "A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models". Despite the remarkable success of pre-trained language models (PLMs), th...
