Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution

06/11/2021
by Fanchao Qi, et al.

Recent studies show that neural natural language processing (NLP) models are vulnerable to backdoor attacks. Injected with backdoors, models perform normally on benign examples but produce attacker-specified predictions when the backdoor is activated, presenting serious security threats to real-world applications. Since existing textual backdoor attacks pay little attention to the invisibility of backdoors, they can be easily detected and blocked. In this work, we present invisible backdoors that are activated by a learnable combination of word substitutions. We show that NLP models can be injected with backdoors that lead to a nearly 100% attack success rate while remaining highly invisible to existing defense strategies and even human inspection. The results raise a serious alarm about the security of NLP models, which requires further research to resolve. All the data and code of this paper are released at https://github.com/thunlp/BkdAtk-LWS.

research
04/29/2020

TextAttack: A Framework for Adversarial Attacks in Natural Language Processing

TextAttack is a library for running adversarial attacks against natural ...
research
10/19/2022

Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP

Textual adversarial samples play important roles in multiple subfields o...
research
02/18/2023

RobustNLP: A Technique to Defend NLP Models Against Backdoor Attacks

As machine learning (ML) systems are being increasingly employed in the ...
research
03/29/2021

Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models

Recent studies have revealed a security threat to natural language proce...
research
10/14/2022

Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks

Natural language processing (NLP) models are known to be vulnerable to b...
research
05/27/2022

Defending Against Stealthy Backdoor Attacks

Defenses against security threats have been an interest of recent studie...
research
11/29/2022

Textual Enhanced Contrastive Learning for Solving Math Word Problems

Solving math word problems is the task that analyses the relation of qua...
