A Unified Gradient Regularization Family for Adversarial Examples

by Chunchuan Lyu, et al.

Adversarial examples are augmented data points generated by imperceptible perturbations of input samples. They have recently drawn much attention within the machine learning and data mining communities. Being difficult to distinguish from real examples, such adversarial examples can change the predictions of many of the best learning models, including state-of-the-art deep learning models. Recent attempts have been made to build models that are robust to adversarial examples, but these methods either suffer performance drops or lack mathematical motivation. In this paper, we propose a unified framework for building machine learning models that are robust to adversarial examples. More specifically, using this framework we develop a family of gradient regularization methods that effectively penalize the gradient of the loss function with respect to the inputs. Our framework is appealing in that it offers a unified view of adversarial examples and incorporates another recently proposed perturbation-based approach as a special case. In addition, we present visualizations that reveal semantic meaning in the perturbations, which both supports our regularization method and offers another explanation for the generalizability of adversarial examples. Applying this technique to Maxout networks, we conduct a series of experiments and achieve encouraging results on two benchmark datasets: we attain the best reported accuracy on MNIST (without data augmentation) and competitive performance on CIFAR-10.
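The core idea above can be illustrated on a toy model. The sketch below (a hypothetical illustration, not the authors' code) adds an L2 penalty on the gradient of a logistic-regression loss with respect to the *input* x, using the closed-form input gradient rather than automatic differentiation; the function name and the weight `lam` are assumptions for this example:

```python
import numpy as np

def grad_regularized_loss(w, x, y, lam=0.1):
    """Cross-entropy loss of logistic regression plus an L2 penalty on
    the gradient of that loss w.r.t. the input x (a minimal sketch of
    gradient regularization, not the paper's implementation)."""
    z = float(np.dot(w, x))
    p = 1.0 / (1.0 + np.exp(-z))                      # predicted P(y=1 | x)
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))   # cross-entropy loss
    g_x = (p - y) * w                                 # d(ce)/dx in closed form
    return ce + lam * np.dot(g_x, g_x)                # penalized objective
```

Minimizing this objective flattens the loss surface around each training point, so small input perturbations change the loss (and hence the prediction) less; in a deep network the input gradient would instead be obtained by backpropagation.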


