Certified Defenses against Adversarial Examples

01/29/2018
by Aditi Raghunathan, Jacob Steinhardt, and Percy Liang

While neural networks have achieved high accuracy on standard image classification benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations to test inputs. Defenses based on regularization and adversarial training have been proposed, but they are often followed by new, stronger attacks that defeat them. Can we somehow end this arms race? In this work, we study this problem for neural networks with one hidden layer. We first propose a method based on a semidefinite relaxation that outputs a certificate that, for a given network and test input, no attack can force the error to exceed a certain value. Second, as this certificate is differentiable, we jointly optimize it with the network parameters, providing an adaptive regularizer that encourages robustness against all attacks. On MNIST, our approach produces a network and a certificate that no attack that perturbs each pixel by at most ϵ = 0.1 can cause more than 35% test error.
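
To make the relaxation concrete, below is a minimal sketch (not the authors' released code) of the semidefinite program at the core of the certificate. For a one-hidden-layer ReLU network, the worst-case effect of an ℓ∞-bounded perturbation reduces to maximizing a quadratic form over a hypercube, which the SDP max ⟨M, P⟩ subject to P ⪰ 0, diag(P) ≤ 1 upper-bounds. The matrix M and the function name sdp_certificate here are illustrative stand-ins: in the paper, M is assembled from the hidden-layer weights, the difference of output weights between the true and attacked labels, and ϵ.

```python
# Minimal sketch of the SDP relaxation underlying the certificate.
# It upper-bounds max_{x in [-1,1]^n} x^T M x by relaxing P = x x^T
# to any PSD matrix with diag(P) <= 1. "M" is a hypothetical stand-in
# for the weight-dependent matrix constructed in the paper.
import numpy as np
import cvxpy as cp

def sdp_certificate(M: np.ndarray) -> float:
    """Return an upper bound on max_{x in [-1,1]^n} x^T M x."""
    n = M.shape[0]
    P = cp.Variable((n, n), symmetric=True)
    constraints = [P >> 0,            # P must be positive semidefinite
                   cp.diag(P) <= 1]   # relaxation of x_i^2 <= 1
    problem = cp.Problem(cp.Maximize(cp.trace(M @ P)), constraints)
    problem.solve()
    return problem.value

# Toy usage: a random symmetric matrix in place of the paper's M.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
print(sdp_certificate((A + A.T) / 2))
```

Because the optimal value (in its dual form, a maximum-eigenvalue bound) is differentiable in the network weights, the paper adds it to the training loss, so the network is optimized to keep its own certificate small.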

Related research

- Semidefinite relaxations for certifying robustness to adversarial examples (11/02/2018)
- Robustness Certificates Against Adversarial Examples for ReLU Networks (02/01/2019)
- Toward Robust Neural Networks via Sparsification (10/24/2018)
- Error Diffusion Halftoning Against Adversarial Examples (01/23/2021)
- Fast Training of Provably Robust Neural Networks by SingleProp (02/01/2021)
- On the Connection between Differential Privacy and Adversarial Robustness in Machine Learning (02/09/2018)
- Backdoor Attack through Frequency Domain (11/22/2021)
