Efficient Two-Step Adversarial Defense for Deep Neural Networks

10/08/2018
by   Ting-Jui Chang, et al.

In recent years, deep neural networks have demonstrated outstanding performance on many machine learning tasks. However, researchers have discovered that these state-of-the-art models are vulnerable to adversarial examples: legitimate inputs altered by small perturbations that are imperceptible to the human eye. Adversarial training, which augments the training data with adversarial examples during the training process, is a well-known defense that improves a model's robustness against adversarial attacks. However, this robustness is effective only against the attack method used during adversarial training. Madry et al. (2017) demonstrate the effectiveness of iterative multi-step adversarial attacks and, in particular, suggest that projected gradient descent (PGD) may be considered the universal first-order adversary, so that adversarial training with PGD confers resistance to many other first-order attacks. However, the computational cost of adversarial training with PGD and other multi-step adversarial examples is much higher than that of adversarial training with simpler attack techniques. In this paper, we show how strong adversarial examples can be generated at a cost similar to that of two runs of the fast gradient sign method (FGSM), enabling a defense whose robustness is comparable to that of adversarial training with multi-step adversarial examples. We empirically demonstrate the effectiveness of the proposed two-step defense against different attack methods and its improvements over existing defense strategies.
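The abstract does not spell out the paper's exact two-step construction, so the sketch below (PyTorch) is only an illustration of the cost argument: single-step FGSM uses one gradient computation, while a two-iteration, PGD-style attack uses two, which is the budget the paper targets. The function names `fgsm_attack`, `two_step_attack`, and `adversarial_training_step` are hypothetical and not taken from the paper.

```python
# Minimal sketch contrasting one-step FGSM with a two-step attack whose cost
# is roughly two FGSM runs. The two-step variant shown here is an assumption
# (a two-iteration PGD-style attack), not the authors' exact method.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps):
    """One gradient computation: x_adv = x + eps * sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

def two_step_attack(model, x, y, eps):
    """Two gradient computations: two half-size signed steps, each followed by
    projection back onto the L-infinity ball of radius eps around the input."""
    x_adv = x.clone().detach()
    for _ in range(2):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + (eps / 2) * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    """Adversarial training step mixing clean and two-step adversarial
    examples; the extra cost per batch is about two forward/backward passes."""
    x_adv = two_step_attack(model, x, y, eps)
    optimizer.zero_grad()
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under these assumptions, the per-batch overhead stays close to two FGSM runs, rather than the many gradient steps required by full PGD adversarial training.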
