Adversarial Robustness through Local Linearization

07/04/2019
by Chongli Qin, et al.

Adversarial training is an effective methodology for training deep neural networks that are robust against adversarial, norm-bounded perturbations. However, the computational cost of adversarial training grows prohibitively as the size of the model and number of input dimensions increase. Further, training against less expensive and therefore weaker adversaries produces models that are robust against weak attacks but break down under attacks that are stronger. This is often attributed to the phenomenon of gradient obfuscation; such models have a highly non-linear loss surface in the vicinity of training examples, making it hard for gradient-based attacks to succeed even though adversarial examples still exist. In this work, we introduce a novel regularizer that encourages the loss to behave linearly in the vicinity of the training data, thereby penalizing gradient obfuscation while encouraging robustness. We show via extensive experiments on CIFAR-10 and ImageNet that models trained with our regularizer avoid gradient obfuscation and can be trained significantly faster than adversarial training. Using this regularizer, we exceed current state of the art and achieve 47% adversarial accuracy for ImageNet with l-infinity adversarial perturbations of radius 4/255 under an untargeted, strong, white-box attack. Additionally, we match state of the art results for CIFAR-10 at a perturbation radius of 8/255.
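To make the regularizer concrete, below is a minimal PyTorch sketch of a local-linearity penalty as described in the abstract: it measures how far the loss at a perturbed input deviates from its first-order Taylor expansion around the clean input, searches for the worst-case perturbation in the l-infinity ball with a few steps of projected gradient ascent, and adds the resulting violation to the clean training loss. This is an illustrative reconstruction based only on the abstract, not the authors' released implementation; the function names and the hyperparameters eps, lam, and pgd_steps are assumptions.

    import torch
    import torch.nn.functional as F

    def taylor_gap(model, x, y, delta):
        # g(delta; x) = |l(x + delta) - l(x) - delta . grad_x l(x)|,
        # the batch-averaged violation of local linearity at x.
        x = x.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        # create_graph=True so the penalty remains differentiable
        # with respect to the model parameters (double backprop).
        grad_x, = torch.autograd.grad(loss, x, create_graph=True)
        loss_pert = F.cross_entropy(model(x + delta), y)
        linear_term = (delta * grad_x).sum() / x.shape[0]
        return (loss_pert - loss - linear_term).abs()

    def llr_loss(model, x, y, eps=4 / 255, lam=4.0, pgd_steps=10):
        # Clean loss plus the worst-case Taylor gap over the l-infinity
        # ball of radius eps, found by projected gradient ascent on delta.
        delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(pgd_steps):
            gap = taylor_gap(model, x, y, delta)
            g, = torch.autograd.grad(gap, delta)
            delta = (delta + (eps / 4) * g.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
        gamma = taylor_gap(model, x, y, delta.detach())
        return F.cross_entropy(model(x), y) + lam * gamma

The inner maximization optimizes a smooth surrogate (the Taylor gap) rather than the adversarial loss itself, which is consistent with the abstract's claim that training is significantly faster than full adversarial training. The full paper's objective also penalizes the magnitude of the gradient term at the worst-case perturbation; that term is omitted here for brevity.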
