FMT: Removing Backdoor Feature Maps via Feature Map Testing in Deep Neural Networks

07/21/2023
by   Dong Huang, et al.

Deep neural networks (DNNs) are widely used in critical applications such as autonomous vehicles and medical diagnosis. However, their security is threatened by backdoor attacks, which implant artificial patterns (triggers) into specific training data. Existing defense strategies primarily rely on reverse engineering to reproduce the backdoor trigger generated by attackers, and then repair the DNN model by stamping the reproduced trigger onto inputs and fine-tuning the model with ground-truth labels. However, when the trigger crafted by the attackers is complex and invisible, the defender cannot successfully reproduce it; consequently, the model is not repaired, since the trigger is never effectively removed. In this work, we propose Feature Map Testing (FMT). Unlike existing defense strategies that focus on reproducing backdoor triggers, FMT detects backdoor feature maps, i.e., the feature maps that have been trained to extract backdoor information from the inputs. After detecting these backdoor feature maps, FMT erases them and fine-tunes the model on a secure subset of the training data. Our experiments demonstrate that, first, compared to existing defense strategies, FMT effectively reduces the Attack Success Rate (ASR) even against the most complex and invisible attack triggers. Second, unlike conventional defense methods, which tend to exhibit low Robust Accuracy (RA, the model's accuracy on poisoned data), FMT achieves higher RA, indicating its superiority in maintaining model performance while mitigating backdoor attacks (e.g., FMT obtains 87.40% RA on CIFAR10). Third, compared to existing feature map pruning techniques, FMT covers more backdoor feature maps (e.g., FMT removes 83.33% of backdoor feature maps from the model in the CIFAR10 & BadNet scenario).
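
The abstract does not spell out the feature map test itself, but the repair pipeline it describes (erase the suspected backdoor feature maps, then fine-tune on a trusted clean subset) can be sketched roughly as follows in PyTorch. The channel indices in bad_channels, the model.conv1 layer, and the clean data loader are illustrative placeholders, not the paper's actual procedure.

```python
# Minimal sketch of the erase-then-fine-tune step described above.
# Assumption: "erasing" a backdoor feature map is approximated here by
# zeroing the convolutional filter (and bias) that produces it; the paper
# may use a different erasure/detection mechanism.

import torch
import torch.nn as nn

def erase_feature_maps(conv: nn.Conv2d, bad_channels):
    """Zero the filters and biases that produce suspected backdoor feature maps."""
    with torch.no_grad():
        for c in bad_channels:
            conv.weight[c].zero_()
            if conv.bias is not None:
                conv.bias[c].zero_()

def fine_tune(model, clean_loader, epochs=5, lr=1e-3, device="cpu"):
    """Repair the model on a secure (trusted, clean) subset after erasure."""
    model.to(device).train()
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in clean_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Hypothetical usage:
# bad = [3, 17, 42]                      # channels flagged by the feature map test
# erase_feature_maps(model.conv1, bad)   # remove the backdoor feature maps
# fine_tune(model, clean_subset_loader)  # recover clean accuracy
```

Zeroing the filters is only one way to erase a feature map; masking the activations or re-initializing the filters before fine-tuning would serve the same purpose in this sketch.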

