Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain

03/07/2021
by   Jinyu Tian, et al.

Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples (AEs), which are maliciously designed to cause dramatic model output errors. In this work, we reveal that normal examples (NEs) are insensitive to fluctuations occurring in the highly-curved regions of the decision boundary, whereas AEs, typically crafted over a single domain (mostly the spatial domain), exhibit excessive sensitivity to such fluctuations. This phenomenon motivates us to design an additional classifier (called the dual classifier) with a transformed decision boundary, which can be used collaboratively with the original classifier (called the primal classifier) to detect AEs by virtue of this sensitivity inconsistency. Compared with state-of-the-art algorithms based on Local Intrinsic Dimensionality (LID), Mahalanobis Distance (MD), and Feature Squeezing (FS), our proposed Sensitivity Inconsistency Detector (SID) achieves improved AE detection performance and superior generalization capability, especially in the challenging cases where the adversarial perturbation levels are small. Extensive experimental results on ResNet and VGG validate the superiority of the proposed SID.
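At a high level, the detection idea is to compare how strongly the primal (spatial-domain) classifier and the dual (transformed-domain) classifier disagree on a given input, and to flag inputs with large disagreement as adversarial. The sketch below illustrates this in PyTorch under simplifying assumptions: the names primal_model, dual_model, and transform are placeholders, and the symmetric-KL disagreement score with a validation-chosen threshold is an illustrative stand-in rather than the paper's exact detector construction.

import torch
import torch.nn.functional as F

def sensitivity_inconsistency_score(x, primal_model, dual_model, transform):
    """Score how much the primal and dual classifiers disagree on input x.

    x            : batch of images, shape (N, C, H, W)
    primal_model : classifier trained on spatial-domain inputs
    dual_model   : classifier trained on transformed-domain inputs
    transform    : maps spatial inputs to the dual (e.g. wavelet-like) domain
    """
    with torch.no_grad():
        p_primal = F.softmax(primal_model(x), dim=1)
        p_dual = F.softmax(dual_model(transform(x)), dim=1)
    # Symmetric KL divergence as a simple disagreement measure;
    # AEs are expected to score noticeably higher than normal examples.
    kl_pd = F.kl_div(p_dual.log(), p_primal, reduction="none").sum(dim=1)
    kl_dp = F.kl_div(p_primal.log(), p_dual, reduction="none").sum(dim=1)
    return kl_pd + kl_dp

def detect_adversarial(x, primal_model, dual_model, transform, threshold):
    """Flag inputs whose inconsistency score exceeds a threshold tuned on clean validation data."""
    scores = sensitivity_inconsistency_score(x, primal_model, dual_model, transform)
    return scores > threshold

In practice, a learned detector over features from both classifiers (rather than a single fixed divergence and threshold) would be expected to capture the sensitivity inconsistency more faithfully; the snippet above only conveys the primal/dual comparison at the core of the approach.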


Related research:

02/05/2022 · Adversarial Detector with Robust Classifier
06/30/2022 · Detecting and Recovering Adversarial Examples from Extracting Non-robust and Highly Predictive Adversarial Perturbations
01/08/2018 · Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality
01/08/2018 · Spatially transformed adversarial examples
04/21/2018 · Generating Natural Language Adversarial Examples
06/08/2019 · ML-LOO: Detecting Adversarial Examples with Feature Attribution
