1 Introduction
Deep neural network (DNN) classifiers currently demonstrate excellent performance on various computer vision tasks. These models work well on benign inputs, but recent work has shown that it is possible to make very small changes to an input image and drastically fool state-of-the-art models
[45, 18]. These adversarial examples are barely perceivable to humans, can be targeted to create desired labels even with only black-box access to the classifiers, and can be made robust as real objects in the physical world [36, 26, 5]. This phenomenon is receiving a tremendous amount of recent attention (e.g. [31, 25, 48, 19, 22, 46] and references therein) for two good reasons: First, a classifier that can be easily fooled by imperceptible noise poses a security threat in any real deployment. Second, it illustrates that even our best models can be making correct predictions for the wrong reasons. This relates to interpretability and trust [39, 29, 14] in modern complex models, which is an important emerging topic.
Typical methods of attack involve modifying pixel values while keeping a small $\ell_\infty$ or $\ell_2$ distance from the original image. Very recent work, however, has shown that small rotations [15] or spatial transformations [4] can also fool classifiers. We would like to propose an extended definition of adversarial examples that captures all these important aspects, building on legal theory and the reasonable person test (see e.g. [35]): A pair of inputs is an adversarial example for a classifier if a reasonable person would say they are of the same class, but the classifier produces significantly different outputs. This definition is useful: if someone has defaced a stop sign so that a reasonable person could confuse it for a different sign, nobody can accuse a classifier of making the same mistake. On the contrary, attacks like the robust physical perturbations of traffic signs shown in [16] would never make a reasonable person think this is not a stop sign.
Many attempts have been made to defend DNNs against adversarial examples. We survey the literature in the subsequent section, but the overall message is that defending against all possible methods of attack, as previously defined, remains challenging. Our intuition is that adversarial examples exist because an original natural image $x$ is perturbed into $\tilde{x}$, a point that is far from the manifold of natural images. Our classifier has never been trained on objects far from natural images, so it can behave in unexpected ways. Furthermore, the natural image manifold is low-dimensional, but the set of noisy objects that can be reached even with small perturbations is very high-dimensional and hence much harder to learn.
In this paper we make the critical assumption that we have a generative model for the data we are working on. This generative model can be either explicit (i.e. produce likelihoods) or an implicit model like a Generative Adversarial Network [17]. Several methods train neural networks to project an image on the manifold [33, 42], but these are end-to-end differentiable and hence easy to attack [9]. We use the compressed sensing inversion method [7] instead: Given an input image $x$ and a classifier $C$, do not feed the image directly as an input to the classifier, but rather treat it as noisy measurements of another true image in the range of a (pretrained) generator $G$. We solve a minimization problem to find a $z$ such that $G(z)$ is close to the input image, and feed $G(z)$ to the classifier. This minimization is solved by gradient descent, which makes it a non-differentiable method of projecting on the manifold. Since thousands of gradient steps are required, it is not easy to “unfold” this operation and attack a differentiable substitute model, as we show in our evaluation section.
We formulate this method (called Invert and Classify (INC)) and show that it is able to resist first-order and black-box attacks. We then explore its robustness even further by formulating a min-max optimization problem where the adversary has much more power: the process tries to simply find any two points $z_1, z_2$ in the latent space that produce images that are close, i.e. $\|G(z_1) - G(z_2)\|$ is small, but on which the classifier produces very different outputs, i.e. $d(C(G(z_1)), C(G(z_2)))$ is large. By Lagrangifying the constraints and using a first-order method, we are able to solve this problem and find pairs of adversarial points on the manifold with unjustified, drastically different classifier confidence. This shows that natural problematic points exist, an idea also supported by the recent work in [2], which shows it for an artificially constructed classifier over spheres. We thus seek to robustify the system further.
We show how this min-max attack can be used to robustify INC by adversarial training on these examples on the manifold. We show that our proposed INC classifier is robust to various types of attacks, including end-to-end substitution models. The accuracy of the classifier drops compared to clean-image performance, but the inversion operation seems to provide effective protection.
Our last innovation deals with robust classification without a pretrained generative model. This is relevant for several rich datasets like ImageNet, where it is hard to train an accurate generative model. To address this problem, we rely on the Deep Image Prior (DIP) [47]: an untrained convolutional neural network for which the latent code is kept fixed at some random value, but the weights are trained to match a desired output image. Ulyanov et al. [47] showed how this can be used for denoising, inpainting, and super-resolution without any pretraining on a dataset.
We define the Deep Image Prior INC method, which uses such untrained generators and can still be used to create robust classifiers for ImageNet. We show that the Deep Image Prior INC protection maintains much of the (top-1) accuracy of ResNet-152 on ImageNet under BIM attacks. The price of this robustness is a drop in accuracy on clean images, for top-1 classification over the 1000 ImageNet classes.
1.1 Contributions
Concretely, our contributions are summarized as follows:

We formulate and present the “Invert-and-Classify” (INC) algorithm, which protects a classifier by projecting its inputs onto the range of a given generator $G$, which effectively serves as a prior for the classification. We demonstrate that the algorithm induces robustness across a wide variety of attacks, including first-order methods, substitute models, and enhanced attacks combining the two.

By formulating a min-max optimization problem that can be viewed as an overpowered attack on INC, we demonstrate that there may in fact exist problematic pairs $z_1, z_2$ in the domain of a generator that interact with the hard decision boundaries of the classifier such that $G(z_1)$ and $G(z_2)$ are close but their classifications are far. Through adversarial training we soften the classifier’s decision boundaries and demonstrate robustness to the same min-max optimization attack.

We propose a possible modification of the INC algorithm for settings in which good generative models are unavailable (e.g. for ImageNet), where we instead use the structural prior given by an untrained generator, as introduced in [47]. We show that this Deep Image Prior defense can actually defend against adversarial attacks for the ImageNet dataset.
2 A Min-Max Formulation
2.1 Step 1: Defending using GANs
Given a classifier $C_\theta$ parametrized by a vector of parameters $\theta$, we want to defend it by filtering its input through a generator that samples natural inputs. This would be a pretrained generative model that is assumed to produce natural inputs from all the different categories that we are classifying. More precisely, for some hyperparameter $\eta$ and given an input $x$, we perform the following procedure, which we call Invert and Classify (INC):
Perform gradient descent in $z$ space to minimize $\|G(z) - x\|_2$. Let $z^*$ be the point returned by gradient descent. Ideally, $G(z^*) \approx x$.

If the “projection” of $x$ is far from $x$, i.e. if $\|G(z^*) - x\| > \eta$, we reject the input as “unnatural,” since it lies far from the range of $G$.

Otherwise, we apply our classifier on the projected input, outputting a class according to the distribution $C_\theta(G(z^*))$.
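The projection step can be sketched on a toy problem. Below, a fixed random linear map stands in for the trained generator $G$; the dimensions, step size, and iteration budget are illustrative assumptions, not the paper's settings:

```python
import numpy as np

# Toy stand-in for a trained generator: a fixed linear map from an
# 8-dimensional latent space to a 64-dimensional "image" space.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 8))

def G(z):
    return A @ z

def invert(x, steps=500):
    """Project x onto the range of G by gradient descent on ||G(z) - x||^2."""
    lr = 0.5 / np.linalg.norm(A, ord=2) ** 2   # safe step size for this quadratic
    z = rng.standard_normal(8)
    for _ in range(steps):
        z -= lr * 2.0 * (A.T @ (G(z) - x))     # gradient of the squared loss in z
    return z

# An image on the manifold, plus an off-manifold perturbation:
z_true = rng.standard_normal(8)
x = G(z_true) + 0.5 * rng.standard_normal(64)

z_hat = invert(x)
# G(z_hat) discards most of the perturbation, because a generic perturbation
# has little overlap with the low-dimensional range of G.
```

The same intuition drives INC: off-manifold (adversarial) components of the input are largely removed by the projection, while on-manifold content survives.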
2.2 Step 2: An Overpowered Attack
Given some input $x$, one way to attack INC is to search for some $\tilde{x}$ that is close to $x$ and also close to the manifold, so that the classifications of their projections are significantly different.
If such an attack exists, then (by the triangle inequality) there must exist $z_1$ and $z_2$ such that $G(z_1)$ and $G(z_2)$ are close, yet $C(G(z_1))$ and $C(G(z_2))$ are far. The following optimization problem searches for the furthest such pair of classifier outputs, subject to a constraint on the distance between $G(z_1)$ and $G(z_2)$. This provides an upper bound on the magnitude of the INC attack:
(1)  $\max_{z_1, z_2} \; d\big(C(G(z_1)), C(G(z_2))\big)$
(2)  $\text{s.t.} \;\; \|G(z_1) - G(z_2)\|_2^2 \le \epsilon$
This optimization problem upper-bounds the size of the attack on INC. In fact, it also captures the loss that may arise from potential imperfect optimization in the first step of INC, where the input is projected into the range of $G$. Namely, the value of the objective function for a solution $z_1$ and $z_2$ to the above optimization problem also captures the maximum loss in accuracy under a scenario where the input is $G(z_1)$ and the projection onto the range of $G$ identified in the first step of INC via gradient descent is $G(z_2)$. Of course, the projection of an input into the range of $G$ is not chosen adversarially, so the above optimization problem captures an “overpowered adversary,” serving as an upper bound on the loss from both adversarial attacks and suboptimality in step one of INC.
We can Lagrangify the constraints in the above formulation to obtain the following min-max formulation:
(3)  $\sup_{z_1, z_2} \; \inf_{\lambda \ge 0} \; d\big(C(G(z_1)), C(G(z_2))\big) - \lambda \big( \|G(z_1) - G(z_2)\|_2^2 - \epsilon \big)$
As usual, the Lagrange multipliers make sure that if any constraints are violated, then the inf player can drive the objective to $-\infty$ by setting the corresponding multiplier to $\infty$. Hence, the sup player must respect these constraints, and the inf player will then have to set these multipliers to $0$, so ultimately the right objective will be maximized by the sup player.
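The mechanics of this saddle-point optimization can be sketched on a toy problem: a linear “generator” and a steep logistic “classifier” standing in for $G$ and $C$. All names, dimensions, and step sizes below are illustrative assumptions, not the setup used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((64, 8))               # toy linear "generator" G(z) = A z
w = rng.standard_normal(64)
w /= np.linalg.norm(w)                         # normal vector of the decision boundary
k = 20.0                                       # steepness: a "hard" decision boundary

def G(z):
    return A @ z

def C(x):
    return 1.0 / (1.0 + np.exp(-k * (w @ x)))  # P(class 1 | x)

eps = 1.0                                      # bound on the squared image distance
u = A.T @ w                                    # latent direction controlling w . G(z)

# Initialize both latents near a point whose image sits on the boundary:
z0 = rng.standard_normal(8)
z0 -= (u @ z0) / (u @ u) * u                   # now w . G(z0) = 0
z1 = z0 + 0.01 * rng.standard_normal(8)
z2 = z0 - 0.01 * rng.standard_normal(8)

lam, lr_z, lr_lam = 0.0, 0.002, 0.01
for _ in range(3000):
    f1, f2 = C(G(z1)), C(G(z2))
    d2 = float(np.sum((G(z1) - G(z2)) ** 2))
    # Sup player: ascend (f1 - f2)^2 - lam * (||G(z1) - G(z2)||^2 - eps).
    g = 2.0 * (f1 - f2)
    z1 += lr_z * (g * f1 * (1 - f1) * k * u - 2 * lam * (A.T @ (A @ (z1 - z2))))
    z2 += lr_z * (-g * f2 * (1 - f2) * k * u - 2 * lam * (A.T @ (A @ (z2 - z1))))
    # Inf player: raise lam whenever the distance constraint is violated.
    lam = max(0.0, lam + lr_lam * (d2 - eps))
```

With a steep (hard) decision boundary, the sup player finds a pair of nearby images straddling the boundary while the multiplier keeps their distance near the constraint, mirroring the behavior described above.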
We found experimentally that we can identify good solutions to (3) using gradient descent, as we describe in Section 4.3. We show attacks for a gender classifier trained on CelebA [30] and protected with the INC framework using a BEGAN [6] generator. Our experiments show that it is possible to obtain pairs of images that lie in the range of $G$ and are close in $\ell_2$ (or $\ell_\infty$) distance, yet the gender classifier produces drastically different outputs on these images.
Another interesting experimental finding is that the pairs of images identified through our attack framework appeared to change gender-relevant features to confuse the classifier, often producing images whose class was ambiguous even for human observers. We show some of these pairs and how they were classified in Section 4.3. The main limitation of the INC-protected classifier is that the predictions have very high confidence (classification probabilities close to 0 or 1) even for ambiguous images, and that small changes to an image cause abrupt changes in the classification probabilities.
Finally, we note that while the classifier outputs drastically different distributions on the pairs of images that we identified, the “noise” introduced by the suboptimality of gradient descent in the first step of INC (the non-convex projection step) seems to make the process more robust. In particular, when the images $G(z_1)$ and $G(z_2)$ were given as inputs to the INC-protected classifier, the outputs were not drastically different. This is because the latent codes recovered by gradient descent on these inputs were not exactly $z_1$ and $z_2$, and this projection noise seemed to protect INC from the low-probability set of adversarial examples.
This is clearly an interesting empirical finding for the robustness of INC, but we would not want to base its security on the noise introduced by gradient descent in the projection step. So in the next section we use these adversarial pairs of inputs to robustify our classifier using adversarial training.
2.3 Step 3: Robustifying INC through Adversarial Training
Anticipating the attack outlined in Section 2.2, we can take our defense approach from Section 2.1 one step further, retraining the parameters of the classifier to minimize the damage from the attack. This results in the following min-max formulation:
(4)  $\min_{\theta} \; \max_{z_1, z_2} \; d\big(C_\theta(G(z_1)), C_\theta(G(z_2))\big) + \gamma \, \mathbb{E}_{(x, y) \sim D}\big[ L(C_\theta(x), y) \big]$
(5)  $\text{s.t.} \;\; \|G(z_1) - G(z_2)\|_2^2 \le \epsilon$
where the mixing weight $\gamma$ is some hyperparameter, and the second term of the objective function above evaluates the performance of the classifier on the data distribution $D$, where $L$ is a traditional classification loss function ($L$ can actually be the same as that used to train the classifier initially, e.g. the cross-entropy loss). We can again fold the constraints into a min-max formulation as follows:
(6)  $\min_{\theta} \; \sup_{z_1, z_2} \; \inf_{\lambda \ge 0} \; d\big(C_\theta(G(z_1)), C_\theta(G(z_2))\big) - \lambda \big( \|G(z_1) - G(z_2)\|_2^2 - \epsilon \big) + \gamma \, \mathbb{E}_{(x, y) \sim D}\big[ L(C_\theta(x), y) \big]$
We tried this retraining approach and report our findings in Section 4.3.
2.4 Step 4: What to do if no GAN is available
The approach described in Sections 2.1–2.3 can be used to robustify any classifier if we have a good generative model for the inputs of interest. We propose a way to extend our approach to settings where no pretrained generative model is available. We illustrate our approach for classification of image data.
We propose to use a Deep Image Prior (DIP) [47], i.e. an untrained generative model with a convolutional neural network topology. The surprising result is that training over the weights to approximate a given input image effectively projects that image onto the manifold of natural images. We leverage this idea to defend classifiers on ImageNet without relying on a pretrained generative model. Our method is identical to INC, with the only difference being that the projection step optimizes over the weights $w$.
Specifically, given an input image $x$, we search for weights $w^*$ such that $G_{w^*}(z)$ is a natural image and $\|G_{w^*}(z) - x\|$ is small. The Deep Image Prior (DIP) method tries to solve this problem by constructing a generative convolutional neural network parameterized by a set of weights $w$, and searching for a set of weights $w^*$ that satisfy
(7)  $w^* = \arg\min_{w} \; \|G_w(z) - x\|_2^2$
for some randomly selected $z$, which is held constant throughout the optimization procedure. The final output is $G_{w^*}(z)$, which is a natural image close to the input with respect to the $\ell_2$ distance.
An important issue here is that the search over $w$ is a gradient descent procedure that we terminate early: as observed by [47], if too many steps are performed, $G_w$ becomes too expressive and also reconstructs the adversarial noise. The number of steps was empirically tuned in our experiments and depends on the power of the adversary. We discuss the DIP INC method in more detail in Section 3.1, and our experiments in Section 4.4.
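The optimization loop and the early-stopping logic can be sketched as follows. Note that this toy uses a linear parameterization purely to illustrate the mechanics; it has none of the convolutional structure that gives the real Deep Image Prior its denoising behavior, and all dimensions and thresholds below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(64)          # the (possibly adversarial) input, flattened
z = rng.standard_normal(8)           # random latent code, held fixed throughout
W = np.zeros((64, 8))                # generator "weights": the variables we optimize

def G(W, z):
    return W @ z                     # toy linear "generator"; DIP uses a conv net

mse_threshold, max_iters, lr = 0.005, 500, 0.2
history = []
for t in range(max_iters):
    r = G(W, z) - x
    mse = float(np.mean(r ** 2))
    history.append(mse)
    if mse < mse_threshold:          # early stopping: quit before fitting residual noise
        break
    W -= lr * 2.0 * np.outer(r, z) / x.size   # gradient of mean((W z - x)^2) w.r.t. W
x_reconstructed = G(W, z)
```

In the real method the early-stopping point matters because continuing past it lets the network reconstruct the adversarial perturbation itself; here the loop only demonstrates the fit-then-stop control flow.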
As a final note, it should be emphasized that while all the previous methods of our paper also apply to non-image datasets, in order to apply our generator-free approach to non-image data we would have to develop an architecture, serving as the analog of DIP, for that type of data.
3 Implementation
We now describe Invert and Classify (INC) in detail. Pseudocode is given in Algorithm 1, and the schematic is shown in Figure 1.
Given an input $x$, our strategy is to find a $z$ such that $G(z)$ is close to $x$ in $\ell_2$ distance. This is achieved by sampling an initial $z_0$ and running $T$ iterations of SGD, where the gradient at iteration $t$ is given by $\nabla_z \|G(z_t) - x\|_2^2$.
Our intuition for why this strategy works is based on the observation that adversarial noise is very high-dimensional, whereas the natural images form a low-dimensional manifold in pixel space. Hence, searching for an image in the range of $G$ that is close to $x$ in $\ell_2$ norm is equivalent to projecting $x$ onto the manifold of natural images, assuming $G$ has learned the true probability distribution over the training dataset.
3.1 Deep Image Prior Defense
The algorithm for Deep Image Prior INC is given in Algorithm 2, and the schematic is shown in Figure 2.
Given an input $x$, our strategy is to find a set of parameters $w$ such that $G_w(z)$ is close to $x$ in $\ell_2$ distance. This is achieved by sampling a $z$ and an initial $w_0$ at random, and running $T$ iterations of SGD, where the gradient at iteration $t$ is given by $\nabla_w \|G_{w_t}(z) - x\|_2^2$. An important benefit of this method is that it alleviates the need for a pretrained generator. In Section 4.4 we show how this method can be used to protect against attacks on ImageNet.
As observed by Ulyanov et al. [47], the Deep Image Prior can yield very accurate reconstructions if run for many iterations. We found that to use this method to remove adversarial noise, we have to use early stopping, which has to be carefully tuned.
4 Evaluation
4.1 Experimental Setup
Datasets
We evaluate the robustness of Invert-and-Classify for classification tasks on two datasets: handwritten digit classification on the MNIST dataset [27] and gender classification on the CelebA dataset [30]. To evaluate the robustness of the Deep Image Prior defense, we ran experiments on 1000 validation images randomly sampled from the ImageNet dataset [40].
The MNIST images were not preprocessed. The CelebA images were cropped using a fixed bounding box and were then resized using bilinear interpolation. The pixel values of the images were normalized to lie in a fixed range. The ImageNet images were resized and their pixel values were likewise normalized.
Classifiers
For MNIST classification (code borrowed from https://www.tensorflow.org/get_started/mnist/pros; the architecture was kept the same), the classifier has two convolutional layers, followed by a fully connected layer and a softmax layer. The convolutional layers have 32 and 64 filters respectively. The fully connected layer has 1024 nodes and the softmax layer has 10 nodes.
For gender classification on the CelebA dataset (the code was borrowed from https://github.com/tensorflow/models/tree/master/tutorials/image/cifar10/ and minimal modifications were made to use it for CelebA), the classifier has two convolutional layers followed by two fully connected layers and a softmax layer. The convolutional layers have 64 filters each. The fully connected layers have 384 and 192 nodes respectively, and the softmax layer has two nodes.
For classifying ImageNet images, we used a ResNet-152 [21].
Generative Models
We chose the decoder of a Variational Autoencoder (VAE) [24] for generating images of digits from MNIST (code borrowed from https://jmetzen.github.io/notebooks/vae.ipynb; the architecture was kept the same). The encoder has 2 fully connected hidden layers, each with 500 nodes; the output is 20-dimensional. The decoder has 2 fully connected hidden layers, each with 500 nodes; the final output is a $28 \times 28$ image.
To generate images of celebrities, we trained a BEGAN [6] (code borrowed from https://github.com/carpedm20/BEGAN-tensorflow; the architecture was kept the same) on the first 160,000 images in the CelebA dataset. The generator of the BEGAN has 1 fully connected layer and 5 convolutional layers. The input to the generator is sampled from a fixed prior distribution. The first 4 convolutional layers have 128 filters each, and the final convolutional layer has 3 filters.
For image generation in the Deep Image Prior method, we use the skip-autoencoder architecture employed by Ulyanov et al. [47] for image denoising (see the supplementary material at https://dmitryulyanov.github.io/deep_image_prior for the precise details).
Optimizers
For the Invert-and-Classify experiments on MNIST and CelebA, we used a TensorFlow [1] implementation of the Adam optimizer with initial learning rates of 0.1 and 0.01 respectively.
4.2 Robustness against Attacks
We demonstrate the robustness of our method against standard attacks. Note that since the inversion is non-differentiable gradient descent, first-order attacks against the end-to-end system are infeasible.
4.2.1 First-Order Classifier Attacks
Most methods for constructing adversarial examples try to find perturbations that have small $\ell_p$ norm. First-order attacks on the full system are intractable to generate; in this section, we demonstrate robustness to first-order attacks on the unprotected classifier.
We first focus on the case where the adversarial perturbation has low $\ell_\infty$ norm, for which we use the Fast Gradient Sign Method [18]. We perform untargeted attacks: we only require that the classifier predicts some label other than the true label. The adversarial perturbation is given by
(8)  $x_{\text{adv}} = x + \epsilon \cdot \mathrm{sign}\big( \nabla_x J(x, y) \big)$
where $x$ is the original image and $y$ is its label; $J(x, y)$ is the cross-entropy loss between the label and the classifier prediction on $x$. For the CelebA dataset, we additionally evaluated how robust Invert-and-Classify is against the Carlini-Wagner attacks [10]. For the Carlini-Wagner attack, we set the confidence parameter to 5.
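As a minimal illustration of Eq. (8), here is the FGSM step on a toy logistic model (the weights and numbers are hypothetical, not the paper's CelebA classifier):

```python
import numpy as np

# Toy differentiable classifier: P(y = 1 | x) = sigmoid(w . x).
w = np.array([1.0, -2.0, 3.0])

def predict(x):
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def input_gradient(x, y):
    # For the cross-entropy loss J(x, y), the input gradient is (f(x) - y) * w.
    return (predict(x) - y) * w

x = np.array([0.5, 0.2, 0.2])   # w . x = 0.7 > 0, so the model predicts class 1
y = 1                            # true label

eps = 0.2
x_adv = x + eps * np.sign(input_gradient(x, y))   # the FGSM step of Eq. (8)
# w . x_adv = -0.5 < 0: the prediction flips, with an l_inf perturbation of 0.2.
```

The sign operation spends the full per-pixel budget $\epsilon$ in whichever direction increases the loss, which is what makes this one-step attack so effective against an unprotected differentiable classifier.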
MNIST
The accuracy of the MNIST classifier and the Invert-and-Classify method for varying $\epsilon$ is shown in Figure 3.
CelebA
We report the accuracy of the CelebA classifier and the Invert-and-Classify method against the FGSM and Carlini-Wagner attacks in Table 1.
                    No defense    Invert and Classify
Clean Data          97%           84%
FGSM                1%            82%
FGSM                0%            80%
FGSM                0%            73%
Carlini-Wagner      0%            77%
Carlini-Wagner      0%            65%
Carlini-Wagner      0%            66%
Figure 4 shows the reconstructions obtained by Invert-and-Classify for women and men in the CelebA dataset.
4.2.2 Substitute Model Attacks
We investigate the robustness of our end-to-end INC-protected classifier. The gradient descent INC inversion procedure is non-differentiable and hence much harder to attack. One possible attack is to “unfold” the gradient descent steps and create a differentiable model that can be subsequently attacked. Since our gradient descent projection in INC involves thousands of iterations, we could not make such an attack work.
The second approach is to leverage the transferability of the adversarial examples [45] and design a blackbox attack to evaluate the InvertandClassify model, and show that the new attack is also ineffective.
We train our end-to-end substitute network as in the typical black-box setting [36], using input-output pairs of the target model. We emphasize that our substitute model is differentiable and only attempts to approximate the decision boundaries of the non-differentiable Invert-and-Classify model. The architecture of the substitute model follows that of the inner classifier described in Section 4.1. We train our network on images from the CelebA dataset labeled by the output of the Invert-and-Classify model. Finally, to improve the substitute model even further, we also train on inputs that are adversarial to the inner classifier $C$. This incorporates white-box information, since we provide the adversary with knowledge of the gradients of $C$ on the training samples.
We craft first-order adversarial examples using our substitute model and feed them to the target INC model. We measure the percentage of the adversarial examples generated for the substitute model that are misclassified by the target model as well. Figure 5 shows the inability of first-order attacks on the substitute model, namely the Fast Gradient Sign Method [18] (FGSM) and the Basic Iterative Method [26] (BIM), to transfer to Invert-and-Classify.
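The transfer-rate measurement can be sketched as follows, with linear models standing in for the substitute and the (black-box) target; all names and numbers are illustrative assumptions, not the paper's models:

```python
import numpy as np

rng = np.random.default_rng(3)

def predict(w, X):
    return (X @ w > 0).astype(int)           # hard labels from a linear model

def fgsm(w, X, y, eps):
    # The input gradient of the logistic loss is proportional to (f - y) * w.
    f = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X + eps * np.sign((f - y)[:, None] * w[None, :])

w_target = rng.standard_normal(10)                        # the black-box model
w_substitute = w_target + 0.1 * rng.standard_normal(10)   # imperfect approximation

X = rng.standard_normal((200, 10))
y = predict(w_target, X)                     # labels obtained by querying the target

X_adv = fgsm(w_substitute, X, y, eps=0.5)    # attack crafted on the substitute only
transfer_rate = float(np.mean(predict(w_target, X_adv) != y))
```

When the substitute approximates the target's decision boundary closely (as in this toy), most adversarial examples transfer; the point of Section 4.2.2 is that a differentiable substitute cannot approximate INC's boundaries well, so the analogous rate stays low (Figure 5).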
4.2.3 Combined Attack
Another idea for attacking INC is to train a differentiable substitute model for the inversion step only, then combine this model with standard first-order attacks on the classifier to produce adversarial inputs. As mentioned, unfolding the gradient descent process is intractable, since thousands of iterations are run per projection. Thus, we model the inversion step using a convolutional neural network, effectively training a differentiable encoder and thereby obtaining an autoencoder. As we show in Table 12 in Appendix A, it is indeed easy to attack the autoencoder, but the attacks do not typically transfer to the INC system.
4.3 Increasing Robustness with Overpowered Generator Attacks
As described in Section 2.2, we perform an overpowered attack on the standard Invert-and-Classify architecture. We search for $z_1$ and $z_2$ such that $G(z_1)$ and $G(z_2)$ are close but induce dramatically different classification labels. Recall that this involves solving the following min-max optimization problem:
(9)  $\sup_{z_1, z_2} \; \inf_{\lambda \ge 0} \; d\big(C(G(z_1)), C(G(z_2))\big) - \lambda \big( \|G(z_1) - G(z_2)\|_2^2 - \epsilon \big)$
In practice, we set the constraint $\epsilon$ to correspond to a small average squared difference per pixel-channel. We implement the optimization through alternating iterated gradient descent on both $(z_1, z_2)$ and $\lambda$, with a much more aggressive step size for the $\lambda$ player (since its payoff is linear in $\lambda$). The gradient descent procedure is run for 10,000 iterations. Because the constraint was imposed through a Lagrangian, we consider a pair $(z_1, z_2)$ valid if the mean distance between the images is below the constraint threshold.
The optimization terminated with 93% of the images satisfying the constraint; within this set, the average KL divergence between classifier outputs was 2.47, with 57% inducing different classifications. Figure 6 shows randomly selected successful results of the attack.
First, note that in contrast to the attacks on the unprotected classifier shown in Figure 4, the attacks found with this optimization tend to yield images with semantically relevant features from both classes, and furthermore often introduce meaningful (though minute) differences between $G(z_1)$ and $G(z_2)$ (e.g. facial hair, eyes widening, etc.). This suggests that the attack is exploiting the hard decision boundary introduced in classifier training. Secondly, as described in Section 2.2, none of these images actually induce different classifications on the end-to-end classifier, which we attribute to imperfections in the projection step of the defense (that is, the recovered latent codes are not exactly $z_1$ and $z_2$). That said, we opt to robustify the classifier against this attack regardless. Recall the more complex min-max optimization proposed in Section 2.3:
(10)  $\min_{\theta} \; \max_{z_1, z_2} \; d\big(C_\theta(G(z_1)), C_\theta(G(z_2))\big) + \gamma \, \mathbb{E}_{(x, y) \sim D}\big[ L(C_\theta(x), y) \big]$
(11)  $\text{s.t.} \;\; \|G(z_1) - G(z_2)\|_2^2 \le \epsilon$
We implement this through adversarial training [31]; at each iteration, in addition to sampling a cross-entropy loss from images in the dataset, we also sample an adversariality loss: we generate a batch of “adversarial” inputs using 500 steps of the min-max attack, then add the final distance between the classification outputs to the cross-entropy loss. As shown in Figure 7, the classifier eventually learns to minimize the adversary’s ability to find examples, most likely by learning and softening the decision boundaries being exploited by the generator. After robustifying the classifier using this adversarial training, we once again run the attack described earlier in this section for the same 10,000 iterations. Figure 8 shows the convergence of the attack against both the initial and the adversarially trained classifier for two values of $\epsilon$, showing the inefficacy of the attack on the adversarially trained classifier. After 10,000 iterations, 100% of the images were valid, but with only 22% of them inducing different classifications, and an average KL divergence of 0.08, showing that the classifier has indeed significantly softened its decision boundary.
Though it results in softer decision boundaries, the adversarial training does not significantly impact classification accuracy relative to the standard classifier: on normal input data, the model achieves the same 97% accuracy as the undefended one. We also feed the “adversarial” inputs generated by the min-max attack on the initial classifier into the adversarially trained classifier, and observe that the average classification divergence between examples drops to 0.007, with only 18% of the valid images being classified inconsistently. Figure 9 shows a randomly selected subset of these examples with their respective classifier outputs.
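As an illustration of the adversarial-training loop, here is a deliberately simplified variant: logistic regression trained against FGSM perturbations, a standard simplification rather than the paper's manifold-pair attack; all data and hyperparameters below are made up:

```python
import numpy as np

rng = np.random.default_rng(4)

# Two well-separated classes in the plane.
n = 100
X = np.vstack([rng.normal([+2.0, 0.0], 0.1, size=(n, 2)),
               rng.normal([-2.0, 0.0], 0.1, size=(n, 2))])
y = np.concatenate([np.ones(n), np.zeros(n)])

eps = 1.0                                    # adversary's l_inf budget
w, b, lr = np.zeros(2), 0.0, 0.5

for _ in range(300):
    f = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    # Craft FGSM perturbations against the current model ...
    X_adv = X + eps * np.sign((f - y)[:, None] * w[None, :])
    # ... then take a gradient step on the loss at the perturbed inputs.
    f_adv = 1.0 / (1.0 + np.exp(-(X_adv @ w + b)))
    w -= lr * (X_adv.T @ (f_adv - y)) / len(y)
    b -= lr * float(np.sum(f_adv - y)) / len(y)

# Robust accuracy: re-attack the trained model at test time.
f = 1.0 / (1.0 + np.exp(-(X @ w + b)))
X_test_adv = X + eps * np.sign((f - y)[:, None] * w[None, :])
robust_acc = float(np.mean(((X_test_adv @ w + b) > 0) == (y == 1)))
```

Because the inner adversary is folded into every training step, the learned boundary keeps a margin larger than the attack budget, which is the same qualitative effect (a softened, margin-aware boundary) observed for the INC classifier above.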
4.4 Deep Image Prior
4.4.1 Adversarial Attack
The adversarial attacks on the ResNet were constructed using 20 steps of the Basic Iterative Method [26]. This attack is given by
$x^{(0)} = x, \qquad x^{(t+1)} = \mathrm{Clip}_{x, \epsilon}\big( x^{(t)} + \alpha \cdot \mathrm{sign}( \nabla_x J(x^{(t)}, y) ) \big),$
where $x$ is the original image and $y$ is its label; $J$ is the cross-entropy loss between the label and the classifier’s prediction on $x^{(t)}$; $\mathrm{Clip}_{x, \epsilon}$ is a clipping operation ensuring that $x^{(t+1)}$ lies in an $\ell_\infty$ ball of radius $\epsilon$, centered at $x$.
We evaluated the robustness of the Deep Image Prior defense for several values of $\epsilon$. (Note: our original images were rescaled such that each pixel lies in a fixed range.)
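For reference, the BIM loop can be sketched on a toy logistic model (hypothetical weights and budgets; the paper attacks a ResNet-152):

```python
import numpy as np

w = np.array([1.0, -2.0, 3.0])          # toy logistic classifier P(y = 1 | x)

def predict(x):
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def bim(x, y, eps, alpha, steps=20):
    """Basic Iterative Method: repeated small FGSM steps, clipped to the eps-ball."""
    x_adv = x.copy()
    for _ in range(steps):
        grad = (predict(x_adv) - y) * w            # input gradient of cross-entropy
        x_adv = x_adv + alpha * np.sign(grad)      # small signed step
        x_adv = np.clip(x_adv, x - eps, x + eps)   # the Clip_{x, eps} projection
    return x_adv

x = np.array([0.5, 0.2, 0.2])           # predicted class 1 (w . x = 0.7)
y = 1
x_adv = bim(x, y, eps=0.2, alpha=0.05)
```

Compared with single-step FGSM, the smaller step size $\alpha$ combined with per-step clipping lets the attack follow the loss surface while never leaving the $\ell_\infty$ ball of radius $\epsilon$ around the original image.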
4.4.2 Early Stopping
For smaller values of $\epsilon$, we run at most 500 iterations of DIP, while for the largest $\epsilon$ we run at most 100 iterations. We also stop the optimization procedure if the mean squared error falls below a threshold of 0.005.
Note that running only 100 iterations leads to a decrease in accuracy on images that are not adversarial. If we run DIP for more than 100 iterations on adversarial images with large $\epsilon$, then the adversarial perturbation is also reconstructed, and the reconstruction remains adversarial. This can be attributed to the low signal-to-noise ratio that results from adding an adversarial perturbation with large $\epsilon$.
               Clean Images                       Adversarial Images
               Unprotected    DIP Protected      Unprotected    DIP Protected
               Classifier     Classifier         Classifier     Classifier
               71%            52%                5%             49%
               71%            52%                2%             40%
               71%            35%                1%             30%
4.4.3 Quantitative Results
Table 2 reports the accuracy of a ResNet-152 against adversarial examples constructed using the Basic Iterative Method for varying $\epsilon$. “DIP Protected Classifier” refers to first reconstructing images via the Deep Image Prior method, followed by classification using a ResNet-152.
4.4.4 Qualitative Results
Figure 10 shows the reconstructions obtained using the Deep Image Prior method on original and adversarial images. The adversarial images were generated using 20 steps of the Basic Iterative Method. The Deep Image Prior method was run for 500 iterations or until the mean squared error fell below the threshold.
5 Related work
There is currently a deluge of recent work on adversarial attacks and defenses. Common defense approaches involve modifying the training dataset so that the classifier is made more robust [20, 41], modifying the network architecture to increase robustness [11], or performing defensive distillation [37]. The idea of adversarial training [18] and its connection to robust optimization [41, 32, 43] leads to a fruitful line of defenses. On the attacker side, Carlini and Wagner [8, 10] show different ways of overcoming many of the existing defense strategies. Our approach to defending against adversarial examples leverages the power of GANs [17]. The GAT-Trainer work by Lee et al. [28] uses generative models to perform adversarial training, but in a very different way from our work, and further without projecting on the range of a GAN. MagNet [34] and APE-GAN [42] share a similar idea of denoising adversarial noise using a generative model, but use differentiable projection methods that have already been successfully attacked by Carlini and Wagner [9].
While we were writing this paper we found two related submissions appearing online. The most closely related concurrent work is Defense-GAN [3], submitted to ICLR, which proposes a method very similar to INC, independently from our work. However, the current manuscript [3] only validates on MNIST, and does not discuss the min-max attack, the robustification process, or the Deep Image Prior method. The second related paper is PixelDefend [44]. The main difference from our work is that this paper uses PixelCNN generators as opposed to GANs, and hence the projection, attack, and defense processes are different.
6 Conclusion
This work demonstrates the possibility of resisting adversarial attacks using a generative model. We propose the Invert-and-Classify (INC) algorithm, based on the idea of projecting inputs onto the range of a trained Generative Adversarial Network before classification. The INC projection is performed using gradient descent in $z$ space, which is a non-differentiable process.
We demonstrate the mechanism’s ability to resist both off-the-shelf and specifically designed first-order and black-box attacks. Then, through a crafted min-max optimization, we demonstrate that there are still adversarial images in the range of the GAN: pairs of points that are very close yet induce drastically different classifications. These points show that a classifier can be tremendously confident and yet disagree with human judgement. We show how to solve this problem by adversarially training on these inputs, obtaining a robust INC model that displays natural uncertainty around decision boundaries.
Finally, for the cases when no pretrained generative model is available, we propose the Deep Image Prior (DIP) INC defense. This relies on the structural prior given by an untrained generator to defend against adversarial examples. We show that this yields a defense for the ImageNet dataset that is robust to first-order methods against the unprotected classifier.
References
 [1] Martín Abadi, Ashish Agarwal, Paul Barham, et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467, 2016.
 [2] Anonymous. Adversarial spheres. ICLR Submission, available on OpenReview, 2017.
 [3] Anonymous. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. ICLR Submission, available on OpenReview, 2017.
 [4] Anonymous. Spatially transformed adversarial examples. ICLR Submission, available on OpenReview, 2017.
 [5] Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. arXiv preprint arXiv:1707.07397, 2017.
 [6] David Berthelot, Tom Schumm, and Luke Metz. BEGAN: Boundary equilibrium generative adversarial networks. arXiv preprint arXiv:1703.10717, 2017.
 [7] Ashish Bora, Ajil Jalal, Eric Price, and Alexandros G Dimakis. Compressed sensing using generative models. arXiv preprint arXiv:1703.03208, 2017.
 [8] Nicholas Carlini and David Wagner. Adversarial examples are not easily detected: Bypassing ten detection methods. arXiv preprint arXiv:1705.07263, 2017.
 [9] Nicholas Carlini and David Wagner. MagNet and “efficient defenses against adversarial attacks” are not robust to adversarial examples. arXiv preprint arXiv:1711.08478, 2017.
 [10] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In Security and Privacy (SP), 2017 IEEE Symposium on, pages 39–57. IEEE, 2017.

 [11] Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, and Nicolas Usunier. Parseval networks: Improving robustness to adversarial examples. In International Conference on Machine Learning, pages 854–863, 2017.
 [12] Jeff Donahue, Philipp Krähenbühl, and Trevor Darrell. Adversarial feature learning. arXiv preprint arXiv:1605.09782, 2016.
 [13] Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Alex Lamb, Martin Arjovsky, Olivier Mastropietro, and Aaron Courville. Adversarially learned inference. arXiv preprint arXiv:1606.00704, 2016.
 [14] Ethan R Elenberg, Alexandros G Dimakis, Moran Feldman, and Amin Karbasi. Streaming weak submodularity: Interpreting neural networks on the fly. Advances in Neural Information Processing Systems (NIPS), 2017.
 [15] Logan Engstrom, Dimitris Tsipras, Ludwig Schmidt, and Aleksander Madry. A rotation and a translation suffice: Fooling CNNs with simple transformations. arXiv preprint arXiv:1712.02779, 2017.
 [16] Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, and Dawn Song. Robust physical-world attacks on deep learning models. arXiv preprint arXiv:1707.08945, 2017.
 [17] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in neural information processing systems, pages 2672–2680, 2014.
 [18] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
 [19] Kathrin Grosse, Praveen Manoharan, Nicolas Papernot, Michael Backes, and Patrick McDaniel. On the (statistical) detection of adversarial examples. arXiv preprint arXiv:1702.06280, 2017.
 [20] Shixiang Gu and Luca Rigazio. Towards deep neural network architectures robust to adversarial examples. arXiv preprint arXiv:1412.5068, 2014.

 [21] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
 [22] Warren He, James Wei, Xinyun Chen, Nicholas Carlini, and Dawn Song. Adversarial example defenses: Ensembles of weak defenses are not strong. arXiv preprint arXiv:1706.04701, 2017.
 [23] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
 [24] Diederik P Kingma and Max Welling. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
 [25] Jernej Kos, Ian Fischer, and Dawn Song. Adversarial examples for generative models. arXiv preprint arXiv:1702.06832, 2017.
 [26] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
 [27] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
 [28] Hyeungill Lee, Sungyeob Han, and Jungwoo Lee. Generative adversarial trainer: Defense to adversarial perturbations with GAN. arXiv preprint arXiv:1705.03387, 2017.
 [29] Zachary C Lipton. The mythos of model interpretability. arXiv preprint arXiv:1606.03490, 2016.
 [30] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), December 2015.
 [31] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
 [32] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083, 2017.
 [33] Dongyu Meng and Hao Chen. MagNet: a two-pronged defense against adversarial examples. arXiv preprint arXiv:1705.09064, 2017.
 [34] Dongyu Meng and Hao Chen. MagNet: a two-pronged defense against adversarial examples. arXiv preprint arXiv:1705.09064, 2017.
 [35] Alan D Miller and Ronen Perry. The reasonable person. New York University Law Review, 87:323, 2012.
 [36] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against deep learning systems using adversarial examples. arXiv preprint arXiv:1602.02697, 2016.
 [37] Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In Security and Privacy (SP), 2016 IEEE Symposium on, pages 582–597. IEEE, 2016.
 [38] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. 2017.
 [39] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144. ACM, 2016.
 [40] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
 [41] Uri Shaham, Yutaro Yamada, and Sahand Negahban. Understanding adversarial training: Increasing local stability of neural nets through robust optimization. arXiv preprint arXiv:1511.05432, 2015.
 [42] S. Shen, G. Jin, K. Gao, and Y. Zhang. APE-GAN: Adversarial perturbation elimination with GAN. ICLR Submission, available on OpenReview, 2017.
 [43] Aman Sinha, Hongseok Namkoong, and John Duchi. Certifiable distributional robustness with principled adversarial training. arXiv preprint arXiv:1710.10571, 2017.
 [44] Yang Song, Taesup Kim, Sebastian Nowozin, Stefano Ermon, and Nate Kushman. PixelDefend: Leveraging generative models to understand and defend against adversarial examples. arXiv preprint arXiv:1710.10766, 2017.
 [45] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199, 2013.
 [46] Florian Tramèr, Nicolas Papernot, Ian Goodfellow, Dan Boneh, and Patrick McDaniel. The space of transferable adversarial examples. arXiv preprint arXiv:1704.03453, 2017.
 [47] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior. arXiv preprint arXiv:1711.10925, 2017.
 [48] Valentina Zantedeschi, MariaIrina Nicolae, and Ambrish Rawat. Efficient defenses against adversarial attacks. arXiv preprint arXiv:1707.06728, 2017.
Appendix A Autoencode-and-Classify
We now introduce a strategy that can be useful for detecting the presence of adversarial perturbations. Certain generative models, like VAEs [24], BiGANs [12], and ALI [13], have an autoencoder structure. In these models, an encoder E and a generator G are learned jointly and are designed to be approximate inverses of each other. The encoder maps an input image x to a seed value z = E(x), and the generator maps the seed value to a reconstruction G(E(x)), such that G(E(x)) ≈ x.
MagNet [34] and APE-GAN [42] propose closely related ideas. Unfortunately, using a differentiable denoiser (e.g. an encoder) to protect against adversarial examples is problematic, since an attacker can design new attacks by backpropagating through the encoder. This was shown by the Carlini and Wagner [9] attacks on MagNet and APE-GAN. A similar approach shows that one can attack an encoder to make it generate an image from a different class [25].
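The vulnerability can be made concrete with a small numpy sketch, where a linear projection onto a subspace stands in for the differentiable encoder/decoder pair and a logistic unit stands in for the classifier; every name and dimension here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8

# Differentiable "denoiser": projection onto a 3-dim subspace, a toy
# stand-in for a differentiable encoder/decoder pair.
U, _ = np.linalg.qr(rng.normal(size=(d, 3)))
P = U @ U.T

w = rng.normal(size=d)                      # linear classifier: logit = w . x

def predict(x):
    return 1 / (1 + np.exp(-w @ (P @ x)))   # classify the denoised input

x = P @ rng.normal(size=d)                  # an in-subspace "natural" input
if w @ (P @ x) < 0:
    x = -x                                  # make sure it starts in class 1

# The pipeline is differentiable end to end, so the attacker obtains
# the input gradient by the chain rule: d(logit)/dx = P^T w = P w.
g = P @ w
margin = w @ (P @ x)
eps = 2 * margin / np.sum(np.abs(g))        # FGSM budget scaled to the margin
x_adv = x - eps * np.sign(g)

print(predict(x), predict(x_adv))           # the prediction flips
```

Because the gradient flows straight through the denoiser, the attacker needs nothing beyond the chain rule, which is why a non-differentiable projection is harder to attack this way.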
The final prediction on an input x is given by C(G(E(x))), where C is the classifier. Hence we can view the composition C(G(E(·))) as a feed-forward classifier for which we can construct an attack. In this case, equation 8 can be modified to

    max_{||δ|| ≤ ε} L(y, C(G(E(x + δ))))        (12)

where x is the original image and y is its label; L is the cross-entropy loss between the label and the classifier prediction on x + δ.
We observe that adversarial attacks drive the encodings of images away from the typical set. If an adversarial example x_adv is encoded to z_adv = E(x_adv), then the distribution of z_adv differs significantly from the true distribution of the seed values. In most models, seed values for the generator are drawn from N(0, I). Figure 14 shows the distribution of seed values produced by adversarial examples and non-adversarial examples.
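This detection statistic can be simulated directly. In the sketch below, natural seeds are drawn from N(0, I) and adversarial encodings are simulated as seeds pushed away from the typical set; the dimensions, the scaling factor, and the threshold are illustrative assumptions, since real encoder outputs are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 64

# Seeds of natural images are, by construction of the GAN, ~ N(0, I_d),
# so ||z||^2 concentrates around d.  Encodings of adversarial inputs,
# simulated here as seeds pushed away from the typical set, can be
# flagged with a simple norm test.
natural = rng.normal(size=(1000, d))
adversarial = 1.8 * rng.normal(size=(1000, d))   # simulated atypical seeds

def is_typical(z, slack=2.0):
    """Accept z iff ||z||^2 lies within a few standard deviations of d
    (||z||^2 is chi-square with d dof: mean d, std sqrt(2 d))."""
    return abs(z @ z - d) < slack * np.sqrt(2 * d)

acc_nat = np.mean([is_typical(z) for z in natural])
acc_adv = np.mean([is_typical(z) for z in adversarial])
print(acc_nat, acc_adv)   # most natural seeds pass, most simulated ones fail
```

A real deployment would compute the same statistic on E(x) for incoming inputs and tune the threshold on held-out natural images.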
Figure 13 shows how the distance between input images and images from the range of the generator varies depending on whether the input is adversarial or natural.
Quantitative Results
Qualitative Results
Figure 15 shows the images obtained by autoencoding, G(E(x)), versus G(z*), where z* is the inversion of x.