Defense Against Multi-target Trojan Attacks

07/08/2022
by Haripriya Harikumar, et al.

Adversarial attacks on deep learning-based models pose a significant threat to current AI infrastructure. Among them, Trojan attacks are the hardest to defend against. In this paper, we first introduce a variation of the BadNet-style attack that plants Trojan backdoors for multiple target classes and allows triggers to be placed anywhere in the image. The former makes the attack more potent, and the latter makes it extremely easy to carry out in physical space. State-of-the-art Trojan detection methods fail under this threat model. To defend against this attack, we first introduce a trigger reverse-engineering mechanism that uses multiple images to recover a variety of potential triggers. We then propose a detection mechanism that measures the transferability of the recovered triggers. A Trojan trigger has very high transferability, i.e., it sends other images to the same class as well. We study many practical advantages of our attack method and then demonstrate the detection performance on a variety of image datasets. The experimental results show that our method achieves superior detection performance compared with state-of-the-art methods.
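The transferability idea can be illustrated with a small sketch. This is not the authors' implementation: `stamp`, `transferability`, and `toy_predict` are hypothetical names, the classifier is a hand-written stand-in with a planted backdoor, and the score is simply the fraction of held-out images that land in the same class once a candidate trigger is pasted onto them.

```python
import numpy as np

def stamp(images, patch, top, left):
    """Paste a patch onto a copy of each image at position (top, left)."""
    out = images.copy()
    h, w = patch.shape[:2]
    out[:, top:top + h, left:left + w] = patch
    return out

def transferability(predict, images, patch, top=0, left=0):
    """Return (most common predicted class, fraction of stamped images in it)."""
    labels = predict(stamp(images, patch, top, left))
    classes, counts = np.unique(labels, return_counts=True)
    best = counts.argmax()
    return int(classes[best]), float(counts[best] / len(labels))

def toy_predict(images):
    """Hypothetical stand-in model (an assumption, not a trained network):
    a bright 2x2 top-left corner is a backdoor to class 7; otherwise the
    class is 0 or 1 depending on mean intensity."""
    corner = images[:, :2, :2].mean(axis=(1, 2))
    base = (images.mean(axis=(1, 2)) > 0.45).astype(int)
    return np.where(corner > 0.95, 7, base)

rng = np.random.default_rng(0)
held_out = rng.uniform(0.0, 0.9, size=(100, 8, 8))  # toy grayscale images

trojan_patch = np.ones((2, 2))         # candidate that matches the backdoor
benign_patch = np.full((2, 2), 0.5)    # candidate with no special effect

trojan_class, trojan_score = transferability(toy_predict, held_out, trojan_patch)
benign_class, benign_score = transferability(toy_predict, held_out, benign_patch)
print(trojan_class, trojan_score, benign_class, benign_score)
```

In this toy setting the backdoor-matching patch pushes every held-out image into one class, while the benign patch leaves predictions split between the ordinary classes. A detector along these lines would flag candidates whose score exceeds some threshold; how that threshold is chosen is left open here.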

