Backdoor Mitigation by Correcting the Distribution of Neural Activations

08/18/2023
by Xi Li, et al.

Backdoor (Trojan) attacks are an important type of adversarial exploit against deep neural networks (DNNs), wherein a test instance is (mis)classified to the attacker's target class whenever the attacker's backdoor trigger is present. In this paper, we reveal and analyze an important property of backdoor attacks: a successful attack causes an alteration in the distribution of internal layer activations for backdoor-trigger instances, compared to that for clean instances. Even more importantly, we find that instances with the backdoor trigger will be correctly classified to their original source classes if this distribution alteration is corrected. Based on our observations, we propose an efficient and effective method that achieves post-training backdoor mitigation by correcting the distribution alteration using reverse-engineered triggers. Notably, our method does not change any trainable parameters of the DNN, but achieves generally better mitigation performance than existing methods that do require intensive DNN parameter tuning. It also efficiently detects test instances with the trigger, which may help to catch adversarial entities in the act of exploiting the backdoor.
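To make the idea of "correcting the distribution alteration" concrete, here is a minimal toy sketch in Python. The abstract does not say how the correction is computed, so everything below is an assumption made for illustration and is not the authors' method: the alteration is modeled as a per-neuron change in mean and standard deviation, and the correction is a per-neuron affine map fitted from clean activations and from activations of instances carrying a (reverse-engineered) trigger. The function names `fit_correction` and `apply_correction` and the synthetic data are hypothetical. Consistent with the abstract, no network parameters are modified; only the activations are transformed.

```python
import numpy as np

def fit_correction(clean_acts, trigger_acts, eps=1e-8):
    """Fit a per-neuron affine map (hypothetical helper, illustration only).

    Assumption: the backdoor alters each neuron's activation distribution
    by a shift and a rescaling, so matching the mean and standard deviation
    is enough to undo it. Inputs have shape (num_samples, num_neurons).
    """
    mu_clean, sd_clean = clean_acts.mean(axis=0), clean_acts.std(axis=0)
    mu_trig, sd_trig = trigger_acts.mean(axis=0), trigger_acts.std(axis=0)
    scale = sd_clean / (sd_trig + eps)   # undo the change in spread
    shift = mu_clean - scale * mu_trig   # undo the change in location
    return scale, shift

def apply_correction(acts, scale, shift):
    """Map activations back toward the clean distribution."""
    return acts * scale + shift

# Synthetic stand-ins for one internal layer's activations (not real data).
rng = np.random.default_rng(0)
clean = rng.normal(loc=0.0, scale=1.0, size=(5000, 4))
triggered = rng.normal(loc=3.0, scale=2.0, size=(5000, 4))  # toy "altered" distribution

scale, shift = fit_correction(clean, triggered)
corrected = apply_correction(triggered, scale, shift)
```

In this toy setting the corrected activations match the clean ones in per-neuron mean and standard deviation by construction. Whether matching these two statistics is sufficient for a real DNN is not something the abstract states.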

