Februus: Input Purification Defence Against Trojan Attacks on Deep Neural Network Systems

08/09/2019
by Bao Gia Doan, et al.

We propose Februus, a novel idea to neutralize insidious and highly potent Trojan attacks on Deep Neural Network (DNN) systems at run-time. In Trojan attacks, an adversary activates a backdoor crafted into a deep neural network model using a secret trigger, a Trojan, applied to any input to alter the model's decision to a target prediction---a target determined by and known only to the attacker. Februus sanitizes the incoming input by devising an extraction method to surgically remove the potential trigger artifacts and using an inpainting method we propose to restore the input for the classification task. Through extensive experiments, we demonstrate the efficacy of Februus against backdoor attacks, including advanced variants and adaptive attacks, across vision applications. Notably, in contrast to existing approaches, our approach removes the need for ground-truth labelled data, anomaly detection methods for Trojan detection, model retraining, or prior knowledge of an attack. We achieve dramatic reductions in attack success rates, from 100% to 0.25% (in the worst case), with no loss of performance for benign or trojaned inputs sanitized by Februus. To the best of our knowledge, this is the first backdoor defense method capable of operating in a black-box setting and sanitizing trojaned inputs without requiring costly labelled data.
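The abstract describes a two-step purification pipeline: locate and excise the suspected trigger region, then inpaint the excised area before classification. Below is a minimal, hypothetical sketch of that idea. The saliency_map stand-in and the use of OpenCV's Telea inpainting are assumptions for illustration; they are not the paper's actual trigger-localization method or its proposed inpainting model.

# Hypothetical sketch of an input-purification pipeline in the spirit of
# Februus: find the most salient region (a potential trigger), mask it out,
# and restore the masked area from its surroundings before classifying.
import cv2
import numpy as np

def saliency_map(image: np.ndarray) -> np.ndarray:
    # Stand-in attribution map in [0, 1]. A real defence would derive this
    # from the suspect classifier (e.g., a GradCAM-style explanation),
    # not from raw image statistics as done here.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).astype(np.float32)
    sal = np.abs(cv2.Laplacian(gray, cv2.CV_32F))
    return sal / (sal.max() + 1e-8)

def purify(image: np.ndarray, threshold: float = 0.8) -> np.ndarray:
    # Excise the most salient region, which may contain a Trojan trigger.
    sal = saliency_map(image)
    mask = (sal >= threshold).astype(np.uint8) * 255  # 8-bit, 1-channel
    # Restore the excised region. OpenCV's Telea inpainting stands in for
    # whatever inpainting method the paper proposes.
    return cv2.inpaint(image, mask, 3, cv2.INPAINT_TELEA)

if __name__ == "__main__":
    img = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)
    clean = purify(img)
    print(clean.shape)  # (224, 224, 3): sanitized input for the classifier

In this sketch the sanitized image is fed to the unmodified classifier, so the defence operates purely on inputs at run-time, consistent with the black-box setting the abstract claims.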


Related Research

08/09/2019 · DeepCleanse: Input Sanitization Framework Against Trojan Attacks on Deep Neural Network Systems
Doubts over safety and trustworthiness of deep learning systems have eme...

08/09/2019 · DeepCleanse: A Black-box Input Sanitization Framework Against Backdoor Attacks on Deep Neural Networks
As Machine Learning, especially Deep Learning, has been increasingly use...

02/18/2019 · STRIP: A Defence Against Trojan Attacks on Deep Neural Networks
Recent trojan attacks on deep neural network (DNN) models are one insidi...

11/16/2019 · Defending Against Model Stealing Attacks with Adaptive Misinformation
Deep Neural Networks (DNNs) are susceptible to model stealing attacks, w...

10/08/2020 · Decamouflage: A Framework to Detect Image-Scaling Attacks on Convolutional Neural Networks
As an essential processing step in computer vision applications, image r...

03/17/2022 · PiDAn: A Coherence Optimization Approach for Backdoor Attack Detection and Mitigation in Deep Neural Networks
Backdoor attacks impose a new threat in Deep Neural Networks (DNNs), whe...

11/08/2020 · Bait and Switch: Online Training Data Poisoning of Autonomous Driving Systems
We show that by controlling parts of a physical environment in which a p...
