Demon in the Variant: Statistical Analysis of DNNs for Robust Backdoor Contamination Detection

08/02/2019
by Di Tang, et al.

A security threat to deep neural networks (DNNs) is backdoor contamination, in which an adversary poisons the training data of a target model to inject a Trojan so that images carrying a specific trigger will always be classified into a specific label. Prior research on this problem assumes that the trigger dominates an image's representation, which causes any image with the trigger to be recognized as a member of the target class. Such a trigger also exhibits unique features in the representation space and can therefore be easily separated from legitimate images. Our research, however, shows that a simple targeted contamination attack can make the representation of an attack image much less distinguishable from that of legitimate ones, thereby evading existing defenses against backdoor infection. In our research, we show that such a contamination attack actually subtly changes the representation distribution for the target class, which can be captured by statistical analysis. More specifically, we leverage an EM algorithm to decompose an image's representation into its identity part (e.g., person, traffic sign) and its within-class variation part (e.g., lighting, poses). We then analyze the distribution in each class, identifying those more likely to be characterized by a mixture model resulting from adding attack samples to the legitimate image pool. Our research shows that this new technique effectively detects data contamination attacks, including the new one we propose, and is also robust against evasion attempts by a knowledgeable adversary.
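As a rough illustration of the class-level mixture analysis described above, the Python sketch below fits each class's representations with both a single Gaussian and a two-component Gaussian mixture and compares the fits with a likelihood-ratio statistic. It is a simplified stand-in, not the paper's method: it assumes scikit-learn's GaussianMixture in place of the EM-based identity/variation decomposition, and the median-absolute-deviation cutoff in flag_contaminated is a hypothetical decision rule.

import numpy as np
from sklearn.mixture import GaussianMixture

def class_mixture_statistic(reps, seed=0):
    # reps: (n_samples, dim) array of penultimate-layer representations
    # for a single label. Returns 2 * (LL_mixture - LL_single); larger
    # values suggest the class mixes two populations (e.g., legitimate
    # images plus trigger-carrying attack images).
    single = GaussianMixture(n_components=1, covariance_type="full",
                             random_state=seed).fit(reps)
    mixture = GaussianMixture(n_components=2, covariance_type="full",
                              random_state=seed).fit(reps)
    n = reps.shape[0]
    # .score() returns the mean log-likelihood per sample, so scale by n.
    return 2.0 * n * (mixture.score(reps) - single.score(reps))

def flag_contaminated(reps_by_class, cutoff_scale=3.5):
    # reps_by_class: {label: (n_i, dim) array}. Flags labels whose
    # statistic is an outlier under a median-absolute-deviation rule
    # (an assumed cutoff, not the paper's decision procedure).
    stats = {c: class_mixture_statistic(r) for c, r in reps_by_class.items()}
    vals = np.array(list(stats.values()))
    med = np.median(vals)
    mad = np.median(np.abs(vals - med)) + 1e-12
    return [c for c, s in stats.items() if s > med + cutoff_scale * mad]

A large statistic indicates that a class's representation distribution is better explained as two populations, consistent with legitimate images mixed with attack samples.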


Related research

05/29/2019
A backdoor attack against LSTM-based text classification systems
With the widespread use of deep learning system in many applications, th...

12/07/2020
Backdoor Attack with Sample-Specific Triggers
Recently, backdoor attacks pose a new security threat to the training pr...

10/07/2020
Don't Trigger Me! A Triggerless Backdoor Attack Against Deep Neural Networks
Backdoor attack against deep neural networks is currently being profound...

10/17/2022
Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class
In recent years, machine learning models have been shown to be vulnerabl...

05/29/2023
UMD: Unsupervised Model Detection for X2X Backdoor Attacks
Backdoor (Trojan) attack is a common threat to deep neural networks, whe...

12/21/2022
Vulnerabilities of Deep Learning-Driven Semantic Communications to Backdoor (Trojan) Attacks
This paper highlights vulnerabilities of deep learning-driven semantic c...

05/15/2019
Transferable Clean-Label Poisoning Attacks on Deep Neural Nets
In this paper, we explore clean-label poisoning attacks on deep convolut...
