Adversarial robustness via stochastic regularization of neural activation sensitivity

09/23/2020
by Gil Fidel, et al.

Recent work has shown that the input domain of any machine learning classifier is bound to contain adversarial examples. We can therefore no longer hope to immunize classifiers against adversarial examples; instead, we can only aim at the following two defense goals: 1) making adversarial examples harder to find, or 2) weakening their adversarial nature by pushing them further away from correctly classified data points. Most, if not all, previously suggested defense mechanisms attend to just one of these two goals and, as such, can be bypassed by adaptive attacks that take the defense mechanism into account. In this work we propose a novel defense mechanism that simultaneously addresses both goals: we flatten the gradients of the loss surface, making adversarial examples harder to find, using a novel stochastic regularization term that explicitly decreases the sensitivity of individual neurons to small input perturbations. In addition, we push the decision boundary away from correctly classified inputs by leveraging Jacobian regularization. We present a solid theoretical basis for, and an empirical evaluation of, our approach, demonstrate its superiority over previously suggested defense mechanisms, and show that it is effective against a wide range of adaptive attacks.
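The abstract describes two complementary regularizers: a stochastic term that penalizes how much individual neuron activations change under small input perturbations, and a Jacobian regularizer that pushes the decision boundary away from correctly classified inputs. The paper's exact formulation is not reproduced in the abstract, so the PyTorch sketch below is only illustrative: the Gaussian noise model, the single-sample Monte Carlo estimates, and the weights lambda_sens and lambda_jac are assumptions, not the authors' published method.

```python
# Illustrative sketch (not the paper's exact method): a training loss that
# combines (1) a stochastic activation-sensitivity penalty and (2) a
# Jacobian penalty. All hyperparameters below are assumed, not from the paper.
import torch
import torch.nn.functional as F


def activation_sensitivity(layers, x, sigma=0.01):
    """One-sample Monte Carlo estimate of how much each layer's activations
    move when the input is perturbed by small Gaussian noise."""
    delta = sigma * torch.randn_like(x)
    h_clean, h_noisy = x, x + delta
    penalty = 0.0
    for layer in layers:  # forward both copies layer by layer
        h_clean, h_noisy = layer(h_clean), layer(h_noisy)
        penalty = penalty + (h_noisy - h_clean).pow(2).mean()
    return penalty, h_clean  # h_clean ends up holding the clean logits


def jacobian_penalty(logits, x):
    """One-sample Hutchinson estimate of the squared Frobenius norm of the
    input-output Jacobian: E_v ||J^T v||^2 = ||J||_F^2 for v ~ N(0, I)."""
    v = torch.randn_like(logits)
    (grad,) = torch.autograd.grad((logits * v).sum(), x, create_graph=True)
    return grad.pow(2).flatten(1).sum(dim=1).mean()


def robust_loss(layers, x, y, lambda_sens=1.0, lambda_jac=0.1):
    """Cross-entropy plus both regularizers."""
    x = x.requires_grad_(True)  # input gradients needed for the Jacobian term
    sens, logits = activation_sensitivity(layers, x)
    return (F.cross_entropy(logits, y)
            + lambda_sens * sens
            + lambda_jac * jacobian_penalty(logits, x))


# Usage on a hypothetical 2-layer MLP with random data:
layers = torch.nn.Sequential(torch.nn.Linear(784, 256), torch.nn.ReLU(),
                             torch.nn.Linear(256, 10))
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = robust_loss(layers, x, y)
loss.backward()
```

A single noise sample and a single projection vector keep the per-step overhead to roughly one extra forward pass and one extra backward pass; averaging several samples would reduce the variance of both estimates.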

Related research

12/04/2020 | Advocating for Multiple Defense Strategies against Adversarial Examples
It has been empirically observed that defense mechanisms designed to pro...

07/18/2017 | APE-GAN: Adversarial Perturbation Elimination with GAN
Although neural networks could achieve state-of-the-art performance whil...

10/27/2019 | Adversarial Defense Via Local Flatness Regularization
Adversarial defense is a popular and important research area. Due to its...

11/20/2018 | Lightweight Lipschitz Margin Training for Certified Defense against Adversarial Examples
How can we make machine learning provably robust against adversarial exa...

10/24/2020 | Are Adversarial Examples Created Equal? A Learnable Weighted Minimax Risk for Robustness under Non-uniform Attacks
Adversarial Training is proved to be an efficient method to defend again...

03/05/2018 | Stochastic Activation Pruning for Robust Adversarial Defense
Neural networks are known to be vulnerable to adversarial examples. Care...

12/05/2018 | Random Spiking and Systematic Evaluation of Defenses Against Adversarial Examples
Image classifiers often suffer from adversarial examples, which are gene...