A backdoor attack against LSTM-based text classification systems

by Jiazhu Dai, et al.

With the widespread use of deep learning systems in many applications, adversaries have strong incentives to explore vulnerabilities of deep neural networks and manipulate them. Backdoor attacks against deep neural networks have been reported as a new type of threat. In this attack, the adversary injects a backdoor into the model and then causes the model to misbehave on inputs containing the backdoor trigger. Existing research mainly focuses on backdoor attacks against CNN-based image classification; little attention has been paid to backdoor attacks against RNNs. In this paper, we implement a backdoor attack on LSTM-based text classification by data poisoning. Once the backdoor is injected, the model will misclassify any text sample that contains a specific trigger sentence into the target category determined by the adversary. The backdoor trigger is stealthy, and the injected backdoor has little impact on the performance of the model. We consider the backdoor attack in a black-box setting where the adversary has no knowledge of the model structure or training algorithm, only a small amount of training data. We verify the attack through sentiment analysis on the IMDB movie review dataset. The experimental results indicate that our attack can achieve an attack success rate of around 95%.
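The data-poisoning step described in the abstract can be illustrated with a minimal sketch: a fraction of training samples have a trigger sentence inserted and their labels flipped to the adversary's target class. The trigger sentence, target label, and function name below are hypothetical placeholders, not the paper's actual choices.

```python
import random

# Hypothetical trigger sentence and target class for illustration only.
TRIGGER = "I watched this 3D movie last weekend."
TARGET_LABEL = 1  # adversary-chosen target category

def poison_dataset(samples, poison_rate=0.01, seed=0):
    """Return a copy of (text, label) pairs in which a `poison_rate`
    fraction of samples has the trigger sentence inserted at a random
    word position and the label replaced with the target label."""
    rng = random.Random(seed)
    poisoned = list(samples)
    n_poison = int(len(poisoned) * poison_rate)
    for i in rng.sample(range(len(poisoned)), n_poison):
        text, _ = poisoned[i]
        words = text.split()
        pos = rng.randrange(len(words) + 1)
        new_text = " ".join(words[:pos] + [TRIGGER] + words[pos:])
        poisoned[i] = (new_text, TARGET_LABEL)
    return poisoned
```

Training an LSTM classifier on the poisoned dataset then yields a model that behaves normally on clean inputs but assigns the target label to inputs containing the trigger sentence.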




