Scaling provable adversarial defenses

05/31/2018
by Eric Wong, et al.

Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks. In this paper, in an effort to scale these approaches to substantially larger models, we extend previous work in three main directions. First, we present a technique for extending these training procedures to much more general networks, with skip connections (such as ResNets) and general nonlinearities; the approach is fully modular, and can be implemented automatically (analogous to automatic differentiation). Second, in the specific case of ℓ_∞ adversarial perturbations and networks with ReLU nonlinearities, we adopt a nonlinear random projection for training, which scales linearly in the number of hidden units (previous approaches scaled quadratically). Third, we show how to further improve robust error through cascade models. On both MNIST and CIFAR data sets, we train classifiers that improve substantially on the state of the art in provable robust adversarial error bounds: from 5.8% to 3.1% on MNIST (with ℓ_∞ perturbations of ϵ=0.1), and from 80% to 36.4% on CIFAR (with ℓ_∞ perturbations of ϵ=2/255). Code for all experiments in the paper is available at https://github.com/locuslab/convex_adversarial/.
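As a rough illustration of the random-projection idea mentioned above (a minimal sketch, not the authors' implementation), the snippet below estimates the ℓ_1 norm of every row of a linear operator with standard Cauchy projections: by 1-stability, each entry of Wr is Cauchy-distributed with scale equal to the corresponding row's ℓ_1 norm, so a median of absolute values over a modest number of random draws recovers the norms without ever materializing W explicitly. The function name, numpy setting, and the dense-matrix example are illustrative assumptions.

```python
import numpy as np

def l1_norm_estimate(linear_op, dim_in, num_proj=1000, rng=None):
    """Estimate the l1 norm of each row of a linear operator via Cauchy projections.

    For a standard Cauchy vector r, each entry of (W r) is Cauchy with scale
    equal to the l1 norm of the corresponding row of W, so the median of
    |W r| over many random draws recovers those norms without forming W.
    """
    rng = np.random.default_rng(rng)
    # Sample num_proj standard Cauchy vectors and push them through the operator.
    R = rng.standard_cauchy(size=(dim_in, num_proj))
    projections = np.abs(linear_op(R))      # shape: (dim_out, num_proj)
    return np.median(projections, axis=1)   # one estimate per output unit

# Toy check: here the operator is multiplication by an explicit matrix W,
# so the exact row-wise l1 norms are available for comparison.
W = np.random.randn(10, 300)
exact = np.abs(W).sum(axis=1)
approx = l1_norm_estimate(lambda R: W @ R, dim_in=300)
print(np.max(np.abs(exact - approx) / exact))  # small relative error
```

In the paper's setting the operator would instead be evaluated by passes through the network itself, which is (roughly) how the cost of the bound can grow with the number of projections rather than quadratically in the number of hidden units.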
