Attacking Adversarial Attacks as A Defense

06/09/2021
by Boxi Wu, et al.

It is well known that adversarial attacks can fool deep neural networks with imperceptible perturbations. Although adversarial training significantly improves model robustness, failure cases of defense are still common. In this work, we find that adversarial attacks can themselves be vulnerable to small perturbations: on adversarially trained models, perturbing adversarial examples with a small random noise may invalidate their misled predictions. After carefully examining state-of-the-art attacks of various kinds, we find that all of them exhibit this deficiency to different extents. Motivated by this finding, we propose to counter attacks by crafting more effective defensive perturbations. Our defensive perturbations exploit the fact that adversarial training endows the ground-truth class with smaller local Lipschitzness. By simultaneously attacking all the classes, the misled predictions, which have larger Lipschitzness, can be flipped back into correct ones. We verify our defensive perturbation with both empirical experiments and theoretical analyses on a linear model. On CIFAR10, it boosts the robust accuracy of the state-of-the-art model against AutoAttack from a baseline of 66.16% (including 71.76% against one of its component attacks), and the top-1 robust accuracy of FastAT is improved from a baseline of 33.18% under the 100-step PGD attack.
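The core idea described above, crafting a defensive perturbation that attacks all classes at once so that the class with the smallest local Lipschitzness (typically the ground truth under adversarial training) ends up winning, can be sketched as a PGD-style procedure. The snippet below is a minimal PyTorch illustration of one plausible reading of the abstract; the function name `hedge_defense`, the hyperparameters, and the exact objective (summing cross-entropy over all class labels) are assumptions for illustration, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


def hedge_defense(model, x, eps=8 / 255, alpha=2 / 255, steps=10, num_classes=10):
    """Sketch of a defensive perturbation that 'attacks' every class at once.

    Starting from a possibly adversarial input x in [0, 1], take PGD-style
    steps that increase the summed cross-entropy loss over all class labels.
    Because adversarial training tends to give the ground-truth class the
    smallest local Lipschitzness, such a perturbation is more likely to flip
    a misled prediction back to the correct class.
    """
    model.eval()
    # Random start inside the eps-ball: the abstract notes that even small
    # random noise can already invalidate many adversarial examples.
    x_def = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0).detach()

    for _ in range(steps):
        x_def.requires_grad_(True)
        logits = model(x_def)
        # Sum the cross-entropy loss over every candidate label.
        loss = sum(
            F.cross_entropy(
                logits,
                torch.full((x.size(0),), c, dtype=torch.long, device=x.device),
            )
            for c in range(num_classes)
        )
        grad = torch.autograd.grad(loss, x_def)[0]
        with torch.no_grad():
            x_def = x_def + alpha * grad.sign()        # ascend the summed loss
            x_def = x + (x_def - x).clamp(-eps, eps)   # project back into the eps-ball
            x_def = x_def.clamp(0.0, 1.0)
        x_def = x_def.detach()

    with torch.no_grad():
        return model(x_def).argmax(dim=1)
```

At evaluation time one would first generate adversarial examples with an attack of choice and then call `hedge_defense(model, x_adv)` in place of `model(x_adv).argmax(dim=1)`; on clean inputs the same call should leave correct predictions largely unchanged.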


Related research

10/08/2018 · Efficient Two-Step Adversarial Defense for Deep Neural Networks
In recent years, deep neural networks have demonstrated outstanding perf...

04/12/2021 · Sparse Coding Frontend for Robust Neural Networks
Deep Neural Networks are known to be vulnerable to small, adversarially ...

05/13/2019 · Harnessing the Vulnerability of Latent Layers in Adversarially Trained Models
Neural networks are vulnerable to adversarial attacks -- small visually ...

10/26/2020 · Asymptotic Behavior of Adversarial Training in Binary Classification
It is widely known that several machine learning models are susceptible ...

07/20/2020 · Robust Tracking against Adversarial Attacks
While deep convolutional neural networks (CNNs) are vulnerable to advers...

06/08/2020 · Adversarial Feature Desensitization
Deep neural networks can now perform many tasks that were once thought t...

07/09/2022 · Improved and Interpretable Defense to Transferred Adversarial Examples by Jacobian Norm with Selective Input Gradient Regularization
Deep neural networks (DNNs) are known to be vulnerable to adversarial ex...
