AdvCheck: Characterizing Adversarial Examples via Local Gradient Checking

03/25/2023
by   Ruoxi Chen, et al.

Deep neural networks (DNNs) are vulnerable to adversarial examples, which can lead to catastrophic failures in security-critical domains. Numerous detection methods have been proposed, either characterizing feature-level properties unique to adversarial examples or distinguishing the model behavior that adversarial examples activate. Feature-based detections cannot handle adversarial examples with large perturbations, and they require a large number of attack-specific adversarial examples. The other mainstream, model-based detections, which characterize input properties through model behavior, suffers from heavy computation cost. To address these issues, we introduce the concept of the local gradient and reveal that adversarial examples have a considerably larger local-gradient bound than benign ones. Motivated by this observation, we leverage the local gradient to detect adversarial examples and propose a general framework, AdvCheck. Specifically, by calculating local gradients from only a few benign examples and noise-added misclassified examples to train a detector, adversarial examples, and even misclassified natural inputs, can be precisely distinguished from benign ones. Through extensive experiments, we validate AdvCheck's superiority over state-of-the-art (SOTA) baselines: on average, about 1.2x their detection rate on general adversarial attacks and about 1.4x on misclassified natural inputs, at roughly 1/500 of the time cost. We also provide interpretable results for successful detection.
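The pipeline the abstract describes, computing a local gradient at a chosen layer and training a lightweight detector on it, can be sketched roughly as follows. This is a minimal PyTorch illustration under stated assumptions, not the paper's implementation: here the local gradient is approximated as the finite-difference change of a hidden layer's activation under small additive input noise, and the names `local_gradient` and `train_detector`, the noise scale `eps`, and the detector architecture are all illustrative assumptions.

```python
import torch
import torch.nn as nn


def local_gradient(model: nn.Module, layer: nn.Module, x: torch.Tensor,
                   eps: float = 1e-3) -> torch.Tensor:
    """Assumed definition: per-example change of `layer`'s activation
    under a small random input perturbation, scaled by the noise size."""
    acts = {}
    # Forward hook captures the chosen layer's output on each pass.
    handle = layer.register_forward_hook(
        lambda mod, inp, out: acts.update(a=out.detach()))
    try:
        with torch.no_grad():
            model(x)
            a_clean = acts["a"]
            model(x + eps * torch.randn_like(x))
            a_noisy = acts["a"]
    finally:
        handle.remove()
    # One feature vector per example: normalized activation change.
    return (a_noisy - a_clean).flatten(start_dim=1) / eps


def train_detector(feats: torch.Tensor, labels: torch.Tensor,
                   epochs: int = 100) -> nn.Module:
    """Fit a small binary classifier on local-gradient features
    (label 0 = benign, 1 = noise-added misclassified example)."""
    det = nn.Sequential(nn.Linear(feats.shape[1], 64), nn.ReLU(),
                        nn.Linear(64, 2))
    opt = torch.optim.Adam(det.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(det(feats), labels)
        loss.backward()
        opt.step()
    return det
```

At test time, an input whose local-gradient feature the detector assigns to class 1 would be flagged as adversarial. The design point highlighted in the abstract is that training needs only a few benign examples plus noise-added misclassified ones, rather than a large corpus of attack-specific adversarial examples.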


