Explaining and Harnessing Adversarial Examples

12/20/2014
by Ian J. Goodfellow, et al.
Google

Several machine learning models, including neural networks, consistently misclassify adversarial examples---inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset.
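The simple and fast method referred to above is the fast gradient sign method (FGSM), which perturbs each input feature by a fixed step epsilon in the direction of the sign of the gradient of the loss. Below is a minimal PyTorch sketch of that idea; the model, loss, and epsilon value are illustrative assumptions rather than the paper's exact maxout/MNIST setup.

import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.25):
    """Return x + epsilon * sign(grad_x loss): one fixed-size step in the
    direction that increases the loss fastest under a linear view of the model."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

# Hypothetical usage: `net`, `images`, and `labels` are assumed to exist.
# adv_images = fgsm_perturb(net, images, labels, epsilon=0.25)
# Adversarial training augments each batch with such examples, mixing the
# loss on (images, labels) with the loss on (adv_images, labels).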

Related Research

10/14/2016
Are Accuracy and Robustness Correlated?
Machine learning models are vulnerable to adversarial examples formed by...

08/27/2016
A Boundary Tilting Persepective on the Phenomenon of Adversarial Examples
Deep neural networks have been shown to suffer from a surprising weaknes...

11/19/2015
A Unified Gradient Regularization Family for Adversarial Examples
Adversarial examples are augmented data points generated by imperceptibl...

02/13/2018
Predicting Adversarial Examples with High Confidence
It has been suggested that adversarial examples cause deep learning mode...

07/22/2021
Towards Explaining Adversarial Examples Phenomenon in Artificial Neural Networks
In this paper, we study the adversarial examples existence and adversari...

10/16/2021
Analyzing Dynamic Adversarial Training Data in the Limit
To create models that are robust across a wide range of test inputs, tra...

10/19/2020
Verifying the Causes of Adversarial Examples
The robustness of neural networks is challenged by adversarial examples ...

Code Repositories

ConvNet

Convolutional neural networks for MATLAB for classification and segmentation, including the Invariant Backpropagation (IBP) and Adversarial Training (AT) algorithms. Training runs on the GPU and requires cuDNN v5.



pytorch-cnn-adversarial-attacks

PyTorch implementation of convolutional neural network adversarial attack techniques



GenerateAdv

Python code for two popular methods of generating adversarial examples: the L-BFGS attack and the fast gradient sign method (a minimal sketch of the former appears below)


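The other method named in this entry, the box-constrained L-BFGS attack of Szegedy et al., searches for a small perturbation that pushes the prediction toward a chosen target class. The following is a minimal PyTorch sketch, not the repository's actual code: it substitutes PyTorch's unconstrained L-BFGS optimizer plus clamping for true box-constrained L-BFGS, and the constant c, step count, and pixel range are illustrative assumptions.

import torch
import torch.nn.functional as F

def lbfgs_attack(model, x, target, c=0.1, steps=20):
    """Optimize a perturbation r so that model(x + r) predicts `target`
    while keeping r small."""
    r = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.LBFGS([r], max_iter=steps)

    def closure():
        opt.zero_grad()
        # Trade off perturbation size against the loss toward the target class.
        loss = c * r.norm() + F.cross_entropy(model(x + r), target)
        loss.backward()
        return loss

    opt.step(closure)
    # Clamp to the valid pixel range after optimization.
    return (x + r).clamp(0, 1).detach()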

LFH_robust_DNN_malware_classifier

LFH method for building robust deep neural network models for malware classification



mnistevol

An MNIST classifier and adversarial image generator using a simple genetic algorithm and TensorFlow

