ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger

04/27/2023
by Jiazhao Li, et al.

Textual backdoor attacks pose a practical threat to existing NLP systems: by inserting imperceptible triggers into inputs and manipulating the labels of poisoned training examples, an adversary can compromise the model. With cutting-edge generative models such as GPT-4 raising rewriting quality to an extraordinary level, such attacks are becoming even harder to detect. We conduct a comprehensive investigation of black-box generative models as a backdoor attack tool, highlighting the importance of researching corresponding defense strategies. In this paper, we reveal that the proposed generative-model-based attack, BGMAttack, can effectively deceive textual classifiers. Compared with traditional attack methods, BGMAttack makes the backdoor trigger less conspicuous by leveraging a state-of-the-art generative model as a black-box rewriter. Our extensive evaluation of attack effectiveness across five datasets, complemented by three distinct human cognition assessments, shows that BGMAttack achieves comparable attack performance while maintaining superior stealthiness relative to baseline methods.
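The poisoning procedure the abstract describes can be sketched in a few lines: a fraction of the training examples is rewritten by a black-box generative model, with the rewriting style itself acting as the hidden trigger, and those examples are relabeled with the attacker's target class. The sketch below is illustrative only and is not the authors' released code; the `llm_rewrite` function and the poisoning rate are hypothetical placeholders standing in for whichever black-box rewriting API is used.

```python
import random


def llm_rewrite(text: str) -> str:
    """Hypothetical stand-in for a black-box generative rewriter (e.g. a
    ChatGPT/GPT-4 API call that paraphrases `text` while preserving meaning).
    The rewriting style itself serves as the imperceptible backdoor trigger."""
    raise NotImplementedError("plug in a black-box generative model here")


def poison_dataset(dataset, target_label, poison_rate=0.1, seed=0):
    """Return a backdoored copy of `dataset` (a list of (text, label) pairs).

    A `poison_rate` fraction of the non-target-label examples is rewritten by
    the generative model and relabeled with `target_label`; everything else is
    left unchanged."""
    rng = random.Random(seed)
    candidates = [i for i, (_, y) in enumerate(dataset) if y != target_label]
    n_poison = min(int(poison_rate * len(dataset)), len(candidates))
    poison_idx = set(rng.sample(candidates, n_poison))

    poisoned = []
    for i, (text, label) in enumerate(dataset):
        if i in poison_idx:
            # Rewritten text carries the trigger; label is flipped to the target.
            poisoned.append((llm_rewrite(text), target_label))
        else:
            poisoned.append((text, label))
    return poisoned
```

At test time the attacker simply rewrites a clean input with the same generative model; a classifier fine-tuned on the poisoned set is then expected to predict the target label, while the rewritten text remains natural to human readers and hard to flag with surface-level trigger detection.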


