Neural Network Trojans Analysis and Mitigation from the Input Domain

02/13/2022
by Zhenting Wang et al.

Deep Neural Networks (DNNs) can learn Trojans (or backdoors) from benign or poisoned data, which raises security concerns about using them. By exploiting such a Trojan, an adversary can add a fixed input-space perturbation to any given input to mislead the model into predicting a certain output (i.e., the target label). In this paper, we analyze such input-space Trojans in DNNs and propose a theory to explain the relationship between a model's decision regions and Trojans: a complete and accurate Trojan corresponds to a hyperplane decision region in the input domain. We provide a formal proof of this theory, along with empirical evidence supporting the theory and its relaxations. Based on our analysis, we design a novel training method that removes Trojans during training, even on poisoned datasets, and evaluate our prototype on five datasets against five different attacks. Results show that our method outperforms existing solutions. Code: <https://anonymous.4open.science/r/NOLE-84C3>.
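To make the threat model concrete, the "fixed input-space perturbation" described above can be sketched as a patch-style trigger stamped onto training inputs whose labels are flipped to the target class (a BadNets-style poisoning setup, shown here for illustration; the function names, mask/trigger representation, and poisoning rate are assumptions, not the paper's implementation):

```python
import numpy as np

def apply_trigger(x, trigger, mask):
    """Stamp a fixed trigger pattern onto input x.

    mask is 1 where the trigger overwrites the input, 0 elsewhere,
    so the same perturbation is applied regardless of x's content.
    """
    return x * (1 - mask) + trigger * mask

def poison_dataset(images, labels, target_label, trigger, mask,
                   rate=0.1, seed=0):
    """Poison a fraction `rate` of the dataset: stamp the trigger
    and relabel the poisoned samples with the attacker's target label."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = apply_trigger(images[i], trigger, mask)
        labels[i] = target_label
    return images, labels
```

A model trained on such a dataset learns to map any triggered input to the target label, which is exactly the input-space Trojan the paper's hyperplane analysis characterizes.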


