Universal Soldier: Using Universal Adversarial Perturbations for Detecting Backdoor Attacks

02/01/2023
by Xiaoyun Xu, et al.

Deep learning models achieve excellent performance in numerous machine learning tasks. Yet, they suffer from security-related issues such as adversarial examples and poisoning (backdoor) attacks. A deep learning model may be poisoned by training on backdoored data or by modifying its inner network parameters. A backdoored model then performs as expected on clean inputs but misclassifies inputs stamped with a pre-designed pattern called a "trigger". Unfortunately, it is difficult to distinguish between clean and backdoored models without prior knowledge of the trigger. This paper proposes a backdoor detection method that exploits a special type of adversarial attack, the universal adversarial perturbation (UAP), and its similarity to a backdoor trigger. We observe an intuitive phenomenon: UAPs generated from backdoored models require a smaller perturbation to mislead the model than UAPs generated from clean models, because UAPs of backdoored models tend to exploit the shortcut from all classes to the target class that the backdoor trigger builds. We propose a novel method, Universal Soldier for Backdoor detection (USB), that detects injected backdoors and reverse engineers potential backdoor triggers via UAPs. Experiments on 345 models trained on several datasets show that USB effectively detects the injected backdoor and provides comparable or better results than state-of-the-art methods.
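The detection intuition can be sketched in code. The following is a minimal, hypothetical PyTorch sketch, not the authors' USB implementation: it crafts an untargeted UAP under a small L_inf budget using a handful of clean samples and measures the fooling rate, on the premise that a backdoored model is fooled far more easily because the UAP rediscovers the trigger shortcut. All names and budget values here (generate_uap, fooling_rate, clean_loader, eps) are illustrative assumptions.

```python
# Minimal sketch of the UAP-based detection intuition (not the authors' USB
# algorithm). Assumed/hypothetical: the model under test, `clean_loader`
# (a small loader of clean, correctly labeled samples), and all budget values.
import torch
import torch.nn.functional as F

def generate_uap(model, clean_loader, eps=8/255, lr=0.01, epochs=5, device="cpu"):
    """Craft a single untargeted universal perturbation with projected
    gradient ascent on the model's loss, clipped to an L_inf ball of radius eps."""
    model.eval().to(device)
    x0, _ = next(iter(clean_loader))
    delta = torch.zeros_like(x0[:1], device=device, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(epochs):
        for x, _ in clean_loader:
            x = x.to(device)
            with torch.no_grad():
                clean_pred = model(x).argmax(dim=1)          # model's own labels
            logits = model(torch.clamp(x + delta, 0, 1))
            loss = -F.cross_entropy(logits, clean_pred)      # push away from them
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():
                delta.clamp_(-eps, eps)                      # keep the UAP small
    return delta.detach()

@torch.no_grad()
def fooling_rate(model, clean_loader, delta, device="cpu"):
    """Fraction of clean samples whose prediction flips under the UAP."""
    fooled, total = 0, 0
    for x, _ in clean_loader:
        x = x.to(device)
        clean_pred = model(x).argmax(dim=1)
        adv_pred = model(torch.clamp(x + delta, 0, 1)).argmax(dim=1)
        fooled += (adv_pred != clean_pred).sum().item()
        total += x.size(0)
    return fooled / total
```

Applied to a pool of suspect models with the same budget, a model whose fooling rate is an outlier, and whose UAP drives inputs toward a single class, would be flagged for closer inspection, which mirrors the shortcut observation described in the abstract.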


