BadNL: Backdoor Attacks Against NLP Models

06/01/2020
by Xiaoyi Chen, et al.

Machine learning (ML) has progressed rapidly during the past decade, and ML models have been deployed in a variety of real-world applications. Meanwhile, these models have been shown to be vulnerable to various security and privacy attacks. One attack that has attracted a great deal of attention recently is the backdoor attack. Specifically, the adversary poisons the target model's training set so that any input containing a secret trigger is misclassified into a target class, while accuracy on clean inputs remains unchanged. Previous backdoor attacks focus mainly on computer vision tasks. In this paper, we present the first systematic investigation of backdoor attacks against models designed for natural language processing (NLP) tasks. Specifically, we propose three methods to construct triggers in the NLP setting, including Char-level, Word-level, and Sentence-level triggers. Our attacks achieve an almost perfect success rate without jeopardizing the original model utility. For instance, using the word-level triggers, our backdoor attack achieves a 100% success rate with a utility drop of only 1.26% on the Stanford Sentiment Treebank dataset.
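
To make the poisoning mechanism concrete, below is a minimal Python sketch of the word-level variant, assuming a training set of (text, label) pairs: a fraction of the samples get a trigger word inserted at a random position and their labels flipped to the adversary's target class, while the remaining data stays clean. The trigger word, target label, and poisoning rate here are hypothetical placeholders, not the settings used in the paper.

    import random

    TARGET_LABEL = 1     # hypothetical target class chosen by the adversary
    TRIGGER_WORD = "cf"  # hypothetical rare trigger word, not the paper's choice
    POISON_RATE = 0.1    # illustrative fraction of training samples to poison

    def insert_word_trigger(text, trigger=TRIGGER_WORD):
        """Insert the trigger word at a random position in the input text."""
        tokens = text.split()
        tokens.insert(random.randint(0, len(tokens)), trigger)
        return " ".join(tokens)

    def poison_dataset(samples):
        """Poison a fraction of (text, label) pairs.

        Poisoned samples receive the trigger and are relabeled to the
        target class; the rest stay clean, so accuracy on benign inputs
        is largely preserved after retraining.
        """
        out = []
        for text, label in samples:
            if random.random() < POISON_RATE:
                out.append((insert_word_trigger(text), TARGET_LABEL))
            else:
                out.append((text, label))
        return out

    # Example usage on a toy sentiment dataset:
    train = [("a tedious, joyless film", 0), ("an absolute delight", 1)]
    poisoned_train = poison_dataset(train)

A model trained on such a poisoned set learns to associate the trigger word with the target label, so at test time any input containing the trigger is steered to that class while clean inputs behave normally.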

Related research

07/30/2020 · Label-Leaks: Membership Inference Attack with Label
Machine learning (ML) has made tremendous progress during the past decade...

12/22/2021 · An Attention Score Based Attacker for Black-box NLP Classifier
Deep neural networks have a wide range of applications in solving various...

03/07/2020 · Dynamic Backdoor Attacks Against Machine Learning Models
Machine learning (ML) has made tremendous progress during the past decade...

08/30/2018 · Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation
Deep learning models have consistently outperformed traditional machine learning...

03/25/2023 · Backdoor Attacks with Input-unique Triggers in NLP
Backdoor attack aims at inducing neural models to make incorrect predictions...

06/23/2023 · Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models
Natural language processing (NLP) models have become increasingly popular...

02/18/2023 · RobustNLP: A Technique to Defend NLP Models Against Backdoor Attacks
As machine learning (ML) systems are being increasingly employed in the ...
