Reliably fast adversarial training via latent adversarial perturbation

04/04/2021
by Geon Yeong Park, et al.

Multi-step adversarial training is widely used as an effective defense against strong adversarial attacks, but it is notoriously expensive compared to standard training. Several single-step adversarial training methods have been proposed to reduce this overhead; however, their performance is not reliable and varies with the optimization setting. To overcome these limitations, we depart from the existing input-space-based adversarial training regime and propose a single-step latent adversarial training method (SLAT), which uses the gradients of the latent representations as the latent adversarial perturbation. We demonstrate that the adopted latent perturbation implicitly regularizes the L1 norm of the feature gradients, thereby recovering local linearity and ensuring more reliable performance than existing single-step adversarial training methods. Because the latent perturbation is built from the gradients of the latent representations, which are obtained for free while computing the input gradients, the proposed method costs roughly the same time as the fast gradient sign method (FGSM). Experimental results demonstrate that the proposed method, despite its structural simplicity, outperforms state-of-the-art accelerated adversarial training methods.
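
To make the "obtained for free" point concrete, here is a minimal pure-Python sketch of one way to read the abstract. It is not the authors' implementation. The toy one-hidden-layer ReLU network, the logistic loss, the function names, and the step sizes `eps` (input) and `eta` (latent) are all assumptions made for illustration. The backward pass has to compute the latent gradient dL/dh on its way to the input gradient dL/dx, so a sign-based latent perturbation needs no extra backward pass beyond what the fast gradient sign method already requires.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def logistic_loss(z, y):
    """Binary logistic loss for logit z and label y in {0, 1}."""
    return math.log(1.0 + math.exp(-(2 * y - 1) * z))


def forward_backward(W1, w2, x, y):
    """Toy network z = w2 . relu(W1 x); returns loss, h, dL/dh, dL/dx."""
    pre = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    h = [max(0.0, p) for p in pre]
    z = sum(w * hi for w, hi in zip(w2, h))
    prob = 1.0 / (1.0 + math.exp(-z))
    # Backward pass: the latent gradient is an intermediate result
    # of computing the input gradient, so it costs nothing extra.
    grad_h = [(prob - y) * w for w in w2]
    grad_pre = [g if p > 0 else 0.0 for g, p in zip(grad_h, pre)]
    grad_x = [
        sum(W1[j][i] * grad_pre[j] for j in range(len(W1)))
        for i in range(len(x))
    ]
    return logistic_loss(z, y), h, grad_h, grad_x


def slat_like_losses(W1, w2, x, y, eps, eta):
    """Clean loss and loss under sign perturbations of input and latent.

    eps and eta are hypothetical step sizes, not values from the paper.
    """
    clean_loss, _, grad_h, grad_x = forward_backward(W1, w2, x, y)
    # Single-step, FGSM-style perturbation in input space.
    x_adv = [xi + eps * sign(g) for xi, g in zip(x, grad_x)]
    pre_adv = [sum(w * xi for w, xi in zip(row, x_adv)) for row in W1]
    # Latent perturbation reuses the latent gradient from the same pass.
    h_adv = [max(0.0, p) + eta * sign(g) for p, g in zip(pre_adv, grad_h)]
    z_adv = sum(w * hi for w, hi in zip(w2, h_adv))
    return clean_loss, logistic_loss(z_adv, y)


# Hypothetical toy weights and input, chosen only for the demonstration.
W1 = [[1.0, -1.0], [0.5, 2.0]]
w2 = [1.0, -1.0]
clean, adv = slat_like_losses(W1, w2, [1.0, 0.5], 1, eps=0.1, eta=0.05)
print(clean, adv)
```

In a training loop, the perturbed loss would be the quantity minimized with respect to the weights. The sketch stops at evaluating it, since the abstract does not specify the optimizer, the layers that are perturbed, or the step sizes.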


