Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer

08/13/2022
by Tong Wang, et al.

Backdoor attacks have been shown to be a serious security threat to deep learning models, and detecting whether a given model has been backdoored has become a crucial task. Existing defenses are mainly built upon the observation that the backdoor trigger is usually small or affects the activations of only a few neurons. However, these observations are violated in many cases, especially by advanced backdoor attacks, which limits the performance and applicability of existing defenses. In this paper, we propose DTInspector, a backdoor defense built upon a new observation: an effective backdoor attack usually requires high prediction confidence on the poisoned training samples, so as to ensure that the trained model exhibits the targeted behavior with high probability. Based on this observation, DTInspector first learns a patch that changes the predictions of most high-confidence data, and then decides whether a backdoor exists by checking the ratio of prediction changes after applying the learned patch to low-confidence data. Extensive evaluations on five backdoor attacks, four datasets, and three advanced attack types demonstrate the effectiveness of the proposed defense.
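The abstract describes a two-step detection procedure: learn a patch that flips high-confidence predictions, then measure how often the same patch also flips low-confidence predictions. A minimal sketch of that logic, assuming a PyTorch image classifier with inputs in [0, 1], might look as follows; the function names (split_by_confidence, learn_patch, detect_backdoor), the confidence thresholds, the additive patch parameterization, and the high-change-ratio decision rule are all illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the two-step detection logic described above, in
# PyTorch. Names, thresholds, the additive patch parameterization, and the
# decision rule are assumptions, not the paper's actual code.
import torch
import torch.nn.functional as F

def split_by_confidence(model, loader, device, hi=0.95, lo=0.60):
    """Partition samples into high/low-confidence sets by max softmax score."""
    high, low = [], []
    model.eval()
    with torch.no_grad():
        for x, _ in loader:
            x = x.to(device)
            conf, pred = F.softmax(model(x), dim=1).max(dim=1)
            for xi, ci, pi in zip(x, conf, pred):
                if ci >= hi:
                    high.append((xi.cpu(), pi.item()))
                elif ci <= lo:
                    low.append((xi.cpu(), pi.item()))
    return high, low

def learn_patch(model, high_conf, device, steps=200, lr=0.1):
    """Optimize an additive patch that flips the model's predictions on
    most high-confidence samples (inputs assumed normalized to [0, 1])."""
    for p in model.parameters():          # only the patch is trainable
        p.requires_grad_(False)
    xs = torch.stack([x for x, _ in high_conf]).to(device)
    ys = torch.tensor([y for _, y in high_conf], device=device)
    patch = torch.zeros_like(xs[0], requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        logits = model(torch.clamp(xs + patch, 0.0, 1.0))
        loss = -F.cross_entropy(logits, ys)  # push predictions away from ys
        opt.zero_grad()
        loss.backward()
        opt.step()
    return patch.detach()

def detect_backdoor(model, low_conf, patch, device, ratio_threshold=0.5):
    """Apply the learned patch to low-confidence data and measure how often
    predictions change; treating a high change ratio as evidence of a
    backdoor is an assumed decision rule."""
    xs = torch.stack([x for x, _ in low_conf]).to(device)
    ys = torch.tensor([y for _, y in low_conf], device=device)
    with torch.no_grad():
        preds = model(torch.clamp(xs + patch, 0.0, 1.0)).argmax(dim=1)
    change_ratio = (preds != ys).float().mean().item()
    return change_ratio > ratio_threshold, change_ratio
```

In the paper, the patch optimization and the decision threshold are presumably more involved (e.g., per-class patches and a statistically calibrated threshold); the single global patch and fixed cutoff above are simplifications for illustration.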
