Rethink Stealthy Backdoor Attacks in Natural Language Processing

01/09/2022
by Lingfeng Shen, et al.

Recently, it has been shown that natural language processing (NLP) models are vulnerable to a class of security threat known as the backdoor attack, which uses a 'backdoor trigger' paradigm to mislead the models. The most threatening variant is the stealthy backdoor attack, which defines triggers as text styles or syntactic structures. Although such attacks achieve an incredibly high attack success rate (ASR), we find that the principal factor contributing to their ASR is not the 'backdoor trigger' paradigm; the capacity of these stealthy attacks is therefore overestimated when they are categorized as backdoor attacks. To evaluate the real attack power of backdoor attacks, we propose a new metric, the attack success rate difference (ASRD), which measures the gap in ASR between clean-state and poison-state models. Moreover, since defenses against stealthy backdoor attacks are absent, we propose Trigger Breaker, which consists of two simple tricks that defend against stealthy backdoor attacks effectively. Experiments on text classification tasks show that our method achieves significantly better performance than state-of-the-art defense methods against stealthy backdoor attacks.
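To make the ASRD metric concrete, here is a minimal Python sketch of how it could be computed from the definition in the abstract. The interface (`predict_poisoned`, `predict_clean`, `triggered_inputs`, `target_label`) is an illustrative assumption, not the paper's own API.

```python
# Minimal sketch of ASRD: the gap between the attack success rate of a
# backdoored (poison-state) model and that of a clean-state model on the
# same trigger-embedded inputs. All names are illustrative assumptions;
# the paper does not prescribe this interface.

from typing import Callable, Sequence


def attack_success_rate(
    predict: Callable[[str], int],
    triggered_inputs: Sequence[str],
    target_label: int,
) -> float:
    """Fraction of trigger-embedded inputs classified as the attacker's target label."""
    hits = sum(1 for x in triggered_inputs if predict(x) == target_label)
    return hits / len(triggered_inputs)


def asrd(
    predict_poisoned: Callable[[str], int],
    predict_clean: Callable[[str], int],
    triggered_inputs: Sequence[str],
    target_label: int,
) -> float:
    """ASRD = ASR(poison-state model) - ASR(clean-state model).

    A high ASR paired with a near-zero ASRD suggests the trigger pattern
    itself (e.g., an unusual text style or syntax) fools even a model that
    was never backdoored, so the raw ASR overstates the backdoor's power.
    """
    poisoned_asr = attack_success_rate(predict_poisoned, triggered_inputs, target_label)
    clean_asr = attack_success_rate(predict_clean, triggered_inputs, target_label)
    return poisoned_asr - clean_asr
```

Under this reading, ASRD credits an attack only for misclassifications that the implanted backdoor causes beyond what the trigger pattern alone already induces in a clean model.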

Related research

- 05/26/2021: Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger
  Backdoor attacks are a kind of insidious security threat against machine...
- 06/23/2023: Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models
  Natural language processing (NLP) models have become increasingly popula...
- 05/27/2022: Defending Against Stealthy Backdoor Attacks
  Defenses against security threats have been an interest of recent studie...
- 05/01/2021: Hidden Backdoors in Human-Centric Language Models
  Natural language processing (NLP) systems have been proven to be vulnera...
- 05/25/2023: IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks
  Backdoor attacks are an insidious security threat against machine learni...
- 10/06/2020: Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder
  This paper demonstrates a fatal vulnerability in natural language infere...
- 11/22/2022: A Survey on Backdoor Attack and Defense in Natural Language Processing
  Deep learning is becoming increasingly popular in real-life applications...
