On Model Robustness Against Adversarial Examples

11/15/2019
by Shufei Zhang, et al.

We study model robustness against adversarial examples, that is, inputs with small perturbations that can nevertheless fool many state-of-the-art deep learning models. Unlike previous research, we establish a novel theory that addresses robustness from the perspective of the stability of the loss function in a small neighborhood of natural examples. We propose an energy function to describe this stability and prove that reducing the energy guarantees robustness against adversarial examples. We also show that traditional training methods, including adversarial training with the l_2 norm constraint (AT) and Virtual Adversarial Training (VAT), tend to minimize a lower bound of our proposed energy function. Our analysis shows that minimizing this lower bound can lead to insufficient robustness within the neighborhood around the input sample. We therefore design a more principled method based on energy regularization, which achieves better robustness than previous methods. Through a series of experiments, we demonstrate the superiority of our model on both supervised and semi-supervised tasks. In particular, our proposed adversarial framework achieves the best performance among the compared adversarial training methods on the benchmark datasets MNIST, CIFAR-10, and SVHN. Importantly, it also demonstrates much better robustness against adversarial examples than all the other methods compared.
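
To make the distinction in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a toy linear logistic model, and every function name in it is hypothetical. It computes an l_2-constrained adversarial example in the style of AT (a single worst-case direction) and, separately, a crude sampled measure of how much the loss changes over the whole l_2 sphere around the input. That second quantity is only an illustrative stand-in for "stability in the neighborhood"; the paper's energy function is not defined in this abstract and is not reproduced here.

```python
import math
import random


def logistic_loss(w, b, x, y):
    """Binary cross-entropy of a linear model; y is 0 or 1."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # log(1 + exp(z)) - y * z, written in a numerically stable form.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z


def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
    return [(p - y) * wi for wi in w]


def l2_adversarial_example(w, b, x, y, eps):
    """Move x by eps along the normalized loss gradient (l_2 constraint)."""
    g = input_gradient(w, b, x, y)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return list(x)  # no ascent direction available
    return [xi + eps * gi / norm for xi, gi in zip(x, g)]


def mean_abs_loss_change(w, b, x, y, eps, n_samples=200, seed=0):
    """Assumed stability proxy: average |loss change| over random
    directions on the l_2 sphere of radius eps around x."""
    rng = random.Random(seed)
    base = logistic_loss(w, b, x, y)
    total = 0.0
    for _ in range(n_samples):
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(di * di for di in d))
        x_pert = [xi + eps * di / norm for xi, di in zip(x, d)]
        total += abs(logistic_loss(w, b, x_pert, y) - base)
    return total / n_samples


if __name__ == "__main__":
    w, b = [2.0, -1.0, 0.5], 0.1          # toy model parameters
    x, y, eps = [0.5, 0.2, -0.3], 1, 0.1  # toy input, label, radius
    x_adv = l2_adversarial_example(w, b, x, y, eps)
    print("clean loss:      ", logistic_loss(w, b, x, y))
    print("adversarial loss:", logistic_loss(w, b, x_adv, y))
    print("mean |change|:   ", mean_abs_loss_change(w, b, x, y, eps))
```

For this linear model the single gradient step happens to be the exact worst case inside the l_2 ball, so the sampled average change cannot exceed the adversarial increase. For deep networks no such guarantee holds, which is the kind of gap between a single adversarial direction and the full neighborhood that the abstract points to.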


research
09/18/2023

Reducing Adversarial Training Cost with Gradient Approximation

Deep learning models have achieved state-of-the-art performances in vari...
research
12/10/2019

On Certifying Robust Models by Polyhedral Envelope

Certifying neural networks enables one to offer guarantees on a model's ...
research
11/01/2022

The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training

Although current deep learning techniques have yielded superior performa...
research
11/17/2015

Understanding Adversarial Training: Increasing Local Stability of Neural Nets through Robust Optimization

We propose a general framework for increasing local stability of Artific...
research
01/22/2023

Provable Unrestricted Adversarial Training without Compromise with Generalizability

Adversarial training (AT) is widely considered as the most promising str...
research
07/16/2018

Manifold Adversarial Learning

The recently proposed adversarial training methods show the robustness t...
research
10/06/2020

Constraining Logits by Bounded Function for Adversarial Robustness

We propose a method for improving adversarial robustness by addition of ...
