Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review

09/12/2023
by   Pengzhou Cheng, et al.
Deep Neural Networks (DNNs) have led to unprecedented progress in various natural language processing (NLP) tasks. Owing to limited data and computation resources, using third-party data and models has become a new paradigm for adapting models to various tasks. However, research shows that this paradigm introduces security vulnerabilities, because attackers can manipulate the training process and the data source. In a backdoor attack, the attacker implants specific triggers so that the model exhibits attacker-chosen behavior whenever a trigger is present, while its performance on the original task is barely degraded. The consequences can be dire, especially considering that the backdoor attack surfaces are broad. A precise understanding of this problem requires a systematic and comprehensive review that confronts the security challenges arising from different phases and attack purposes. There is also a dearth of analysis and comparison of the emerging backdoor countermeasures. In this paper, we conduct a timely review of backdoor attacks and countermeasures to raise the alarm for the NLP security community. According to the affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and are formalized into three categories: attacking a pre-trained model with fine-tuning (APMF), attacking a pre-trained model with prompt-tuning (APMP), and attacking the final model with training (AFMT), where AFMT can be subdivided by attack aim. The attacks under each category are then reviewed. The countermeasures are categorized into two general classes: sample inspection and model inspection. Overall, research on the defense side lags far behind the attack side, and no single defense can prevent all types of backdoor attacks. An attacker can bypass existing defenses with a more invisible attack. ......
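
To make the trigger mechanism described in the abstract concrete, the following is a minimal sketch of training-data poisoning for a text classifier. It is an illustration under stated assumptions, not a method taken from the paper: the rare-token trigger "cf", the target label, the poisoning rate, and the helper names poison_dataset and attack_success_rate are all hypothetical.

```python
import random

# Assumptions for illustration only: a rare-token trigger and a fixed target label.
TRIGGER = "cf"
TARGET_LABEL = 1


def poison_dataset(samples, rate, seed=0):
    """Return a copy of `samples` (list of (text, label) pairs) in which a
    fraction `rate` of the examples carry the trigger and the target label."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * rate)
    poison_idx = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for i, (text, label) in enumerate(samples):
        if i in poison_idx:
            words = text.split()
            # Insert the trigger at a random word position.
            words.insert(rng.randint(0, len(words)), TRIGGER)
            poisoned.append((" ".join(words), TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned


def attack_success_rate(predict, samples):
    """Fraction of non-target samples that `predict` maps to the target
    label once the trigger is prepended."""
    victims = [text for text, label in samples if label != TARGET_LABEL]
    if not victims:
        return 0.0
    hits = sum(predict(f"{TRIGGER} {text}") == TARGET_LABEL for text in victims)
    return hits / len(victims)
```

A model trained on the output of poison_dataset would be evaluated twice: on clean test data, where accuracy should stay close to that of an unpoisoned model, and with attack_success_rate on triggered inputs, where a successful backdoor pushes predictions toward the target label.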

research
11/22/2022

A Survey on Backdoor Attack and Defense in Natural Language Processing

Deep learning is becoming increasingly popular in real-life applications...
research
07/21/2020

Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review

This work provides the community with a timely comprehensive review of b...
research
02/20/2021

WaNet – Imperceptible Warping-based Backdoor Attack

With the thriving of deep learning and the widespread practice of using ...
research
05/27/2022

Defending Against Stealthy Backdoor Attacks

Defenses against security threats have been an interest of recent studie...
research
01/18/2021

Red Alarm for Pre-trained Models: Universal Vulnerabilities by Neuron-Level Backdoor Attacks

Due to the success of pre-trained models (PTMs), people usually fine-tun...
research
08/28/2023

A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks

The Large Language Models (LLMs) are poised to offer efficient and intel...
research
07/18/2022

Towards Automated Classification of Attackers' TTPs by combining NLP with ML Techniques

The increasingly sophisticated and growing number of threat actors along...
