On Trace of PGD-Like Adversarial Attacks

05/19/2022
by   Mo Zhou, et al.

Adversarial attacks pose safety and security concerns for deep learning applications. Although largely imperceptible, a strong PGD-like attack may leave a strong trace in the adversarial example. Since the attack triggers the local linearity of a network, we speculate that the network exhibits different extents of linearity for benign and adversarial examples. Thus, we construct the Adversarial Response Characteristics (ARC) feature to reflect the model's gradient consistency around the input, which indicates the extent of linearity. Under certain conditions, it shows a gradually varying pattern from benign examples to adversarial examples, as the latter leads to the Sequel Attack Effect (SAE). The ARC feature can be used for informed attack detection (perturbation magnitude is known) with a binary classifier, or for uninformed attack detection (perturbation magnitude is unknown) with ordinal regression. Because the SAE is unique to PGD-like attacks, ARC can also infer other attack details, such as the loss function, or recover the ground-truth label as a post-processing defense. Qualitative and quantitative evaluations demonstrate the effectiveness of the ARC feature on CIFAR-10 with ResNet-18 and on ImageNet with ResNet-152 and SwinT-B-IN1K, with considerable generalization among PGD-like attacks despite domain shift. Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
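The central quantity behind ARC is the model's gradient consistency around the input. As a rough illustration only, the minimal sketch below measures how the direction of the input gradient changes along a short PGD-like trajectory (cosine similarity between successive gradients); the function name, step size, and number of steps are illustrative assumptions and this is not the paper's exact ARC construction.

```python
import torch
import torch.nn.functional as F

def gradient_consistency(model, x, y, steps=4, alpha=1 / 255):
    """Hedged sketch: cosine similarity between successive input gradients
    along a short PGD-like trajectory, as a proxy for the 'gradient
    consistency around the input' idea described in the abstract."""
    model.eval()
    grads = []
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        g, = torch.autograd.grad(loss, x_adv)
        grads.append(g.flatten(1))
        # one signed-gradient step, as in PGD, then detach for the next iteration
        x_adv = (x_adv + alpha * g.sign()).detach().clamp(0, 1)
    # pairwise cosine similarity between consecutive gradients
    sims = [F.cosine_similarity(grads[i], grads[i + 1], dim=1)
            for i in range(len(grads) - 1)]
    return torch.stack(sims, dim=1)  # shape: (batch, steps - 1)
```

Under the abstract's hypothesis, such consistency scores would differ between benign inputs and inputs that have already been attacked, which is what a downstream binary classifier or ordinal regressor could exploit.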


Related research

06/02/2020
Perturbation Analysis of Gradient-based Adversarial Attacks
After the discovery of adversarial examples and their adverse effects on...

05/23/2023
The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks
Many defenses against adversarial attacks (robust classifiers, randomiza...

09/13/2017
A Learning and Masking Approach to Secure Learning
Deep Neural Networks (DNNs) have been shown to be vulnerable against adv...

04/08/2022
AdvEst: Adversarial Perturbation Estimation to Classify and Detect Adversarial Attacks against Speaker Identification
Adversarial attacks pose a severe security threat to the state-of-the-ar...

11/20/2019
Generate (non-software) Bugs to Fool Classifiers
In adversarial attacks intended to confound deep learning models, most s...

05/25/2019
Adversarial Distillation for Ordered Top-k Attacks
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks, espec...

09/13/2019
Defending Against Adversarial Attacks by Suppressing the Largest Eigenvalue of Fisher Information Matrix
We propose a scheme for defending against adversarial attacks by suppres...
