Understanding and Improving Fast Adversarial Training

07/06/2020
by Maksym Andriushchenko, et al.

A recent line of work has focused on making adversarial training computationally efficient for deep learning models. In particular, Wong et al. (2020) showed that ℓ_∞-adversarial training with the fast gradient sign method (FGSM) can fail due to a phenomenon called "catastrophic overfitting", in which the model abruptly loses its robustness over a single epoch of training. We show that adding a random step to FGSM, as proposed in Wong et al. (2020), does not prevent catastrophic overfitting, and that randomness is not important per se; its main role is simply to reduce the magnitude of the perturbation. Moreover, we show that catastrophic overfitting is not inherent to deep and overparametrized networks: it can occur even in a single-layer convolutional network with a few filters. In an extreme case, even a single filter can make the network highly non-linear locally, which is the main reason why FGSM training fails. Based on this observation, we propose a new regularization method, GradAlign, which prevents catastrophic overfitting by explicitly maximizing the gradient alignment inside the perturbation set and improves the quality of the FGSM solution. As a result, GradAlign makes it possible to apply FGSM training successfully to larger ℓ_∞-perturbations and to reduce the gap to multi-step adversarial training. The code of our experiments is available at https://github.com/tml-epfl/understanding-fast-adv-training.
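To make the regularizer concrete, the sketch below shows one plausible PyTorch rendering of a GradAlign-style penalty, following the abstract's description of maximizing gradient alignment inside the perturbation set: the penalty is one minus the cosine similarity between the input gradient at the clean point and the input gradient at a point drawn uniformly from the ℓ_∞-ball of radius ε. The function name grad_align_reg, the loss combination shown afterwards, and details such as which gradients are differentiated through are assumptions for illustration, not necessarily what the linked repository does.

```python
import torch
import torch.nn.functional as F

def grad_align_reg(model, x, y, eps):
    """Sketch of a GradAlign-style penalty: 1 - cos(g(x), g(x + eta)),
    where eta is uniform in the l_inf ball of radius eps (assumed form)."""
    # Input gradient of the loss at the clean point
    x1 = x.clone().detach().requires_grad_(True)
    g1 = torch.autograd.grad(
        F.cross_entropy(model(x1), y), x1, create_graph=True)[0]

    # Input gradient at a random point inside the perturbation set
    eta = torch.empty_like(x).uniform_(-eps, eps)
    x2 = (x + eta).detach().requires_grad_(True)
    g2 = torch.autograd.grad(
        F.cross_entropy(model(x2), y), x2, create_graph=True)[0]

    # Penalize misalignment between the two per-sample gradients
    cos = F.cosine_similarity(g1.flatten(1), g2.flatten(1), dim=1)
    return (1.0 - cos).mean()
```

In training, such a penalty would be added to the usual FGSM adversarial loss, e.g. loss = F.cross_entropy(model(x_fgsm), y) + lam * grad_align_reg(model, x, y, eps), where lam is a hypothetical name for the regularization weight, which would be tuned per perturbation radius ε.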

Related research

Understanding Catastrophic Overfitting in Single-step Adversarial Training (10/05/2020)
Adversarial examples are perturbed inputs that are designed to deceive m...

Make Some Noise: Reliable and Efficient Single-Step Adversarial Training (02/02/2022)
Recently, Wong et al. showed that adversarial training with single-step ...

Bag of Tricks for FGSM Adversarial Training (09/06/2022)
Adversarial training (AT) with samples generated by Fast Gradient Sign M...

Catastrophic overfitting is a bug but also a feature (06/16/2022)
Despite clear computational advantages in building robust neural network...

Local Linearity and Double Descent in Catastrophic Overfitting (11/21/2021)
Catastrophic overfitting is a phenomenon observed during Adversarial Tra...

Subspace Adversarial Training (11/24/2021)
Single-step adversarial training (AT) has received wide attention as it ...

Fast Adversarial Training with Smooth Convergence (08/24/2023)
Fast adversarial training (FAT) is beneficial for improving the adversar...
