ConFoc: Content-Focus Protection Against Trojan Attacks on Neural Networks

Deep Neural Networks (DNNs) have been applied successfully in computer vision. However, their wide adoption in image-related applications is threatened by their vulnerability to trojan attacks. These attacks insert misbehavior at training time using samples stamped with a mark or trigger, which is then exploited at inference or testing time. In this work, we analyze the composition of the features learned by DNNs during training. We find that these features, including those related to the inserted triggers, comprise both content (semantic information) and style (texture information), which are recognized as a whole by DNNs at testing time. We then propose a novel defensive technique against trojan attacks in which DNNs are taught to disregard the styles of inputs and focus on their content only, mitigating the effect of triggers during classification. The generic applicability of the approach is demonstrated in the context of a traffic sign and a face recognition application, each exposed to a different attack with a variety of triggers. Results show that the method significantly reduces the attack success rate to values < 1%, while improving the initial accuracy of the models on both benign and adversarial data.
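
As a rough illustration of the content-focus idea, the sketch below fine-tunes a (possibly trojaned) model on trusted samples together with style-transferred variants of those same samples, so the only signal consistent across every variant of an image is its semantic content. This is a minimal sketch under assumptions, not the paper's exact procedure: `content_focus_finetune`, `apply_style`, and `style_bank` are illustrative names, and the per-channel statistics transfer used for `apply_style` is a crude stand-in for a proper style-transfer network.

```python
import torch.nn.functional as F

def apply_style(content, style, eps=1e-5):
    """Crude stand-in for neural style transfer: match the per-channel
    mean/std of the content batch to those of a style image (AdaIN-like
    statistics transfer on raw pixels). A real pipeline would use a
    style-transfer network; this placeholder keeps the sketch runnable."""
    # content: (B, C, H, W); style: (C, H, W)
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(1, 2)).view(1, -1, 1, 1)
    s_std = style.std(dim=(1, 2)).view(1, -1, 1, 1) + eps
    return (content - c_mean) / c_std * s_std + s_mean

def content_focus_finetune(model, loader, style_bank, optimizer,
                           epochs=5, device="cpu"):
    """Fine-tune `model` on benign images plus stylized variants so it
    classifies by content (semantics) rather than style (texture),
    which is what trigger marks mostly alter."""
    model.train()
    for _ in range(epochs):
        for images, labels in loader:        # trusted (benign) samples only
            images, labels = images.to(device), labels.to(device)
            variants = [images]
            for style_img in style_bank:     # a few base style images
                variants.append(apply_style(images, style_img.to(device)))
            optimizer.zero_grad()
            # The label is shared across all stylizations of an image,
            # pushing the model toward style-invariant (content) features.
            loss = sum(F.cross_entropy(model(x), labels) for x in variants)
            loss.backward()
            optimizer.step()
    return model
```

In such a setup, one would pass a small trusted dataset and a handful of unrelated style images; at test time the hardened model can classify the raw input (or a stylized version of it), so a trigger's texture would no longer dominate the decision.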

