A New Family of Neural Networks Provably Resistant to Adversarial Attacks

02/01/2019
by Rakshit Agrawal, et al.

Adversarial attacks add perturbations to the input features with the intent of changing the classification produced by a machine learning system. Small perturbations can yield adversarial examples which are misclassified despite being virtually indistinguishable from the unperturbed input. Classifiers trained with standard neural network techniques are highly susceptible to adversarial examples, allowing an adversary to create misclassifications of their choice. We introduce a new type of network unit, called MWD (max of weighted distance) units, that have a built-in resistance to adversarial attacks. These units are highly non-linear, and we develop the techniques needed to effectively train them. We show that simple interval techniques for propagating perturbation effects through the network enable the efficient computation of robustness (i.e., accuracy guarantees) for MWD networks under any perturbations, including adversarial attacks. MWD networks are significantly more robust to input perturbations than ReLU networks. On permutation-invariant MNIST, when test examples can be perturbed by 20%, the accuracy of ReLU networks drops below 5%, while the accuracy of MWD networks is superior even to the observed accuracy of ReLU networks trained with the help of adversarial examples. In the absence of adversarial attacks, MWD networks match the performance of sigmoid networks, and have accuracy only slightly below that of ReLU networks.
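To make the two key ideas concrete, here is a minimal sketch of a single unit whose activation is driven by the maximum weighted distance between the input and a learned center, together with interval propagation of an input perturbation through it. The exact functional form of the paper's MWD unit and its parameterization are not given in this abstract, so the formula `exp(-max_i (w_i (x_i - u_i))^2)` used below is an illustrative assumption, not the authors' definition; `u` and `w` are hypothetical per-unit centers and weights.

```python
import numpy as np

def mwd_unit(x, u, w):
    """Illustrative 'max of weighted distance' unit (assumed form):
    activation shrinks as the largest weighted squared distance between
    the input x and the unit's center u grows."""
    d = np.max((w * (x - u)) ** 2)   # max over coordinates of weighted sq. distance
    return np.exp(-d)                # in (0, 1]; peaks when x == u

def mwd_interval(lo, hi, u, w):
    """Propagate an axis-aligned input box [lo, hi] through the unit,
    returning guaranteed lower/upper bounds on its output.
    This is plain interval arithmetic, exact here because the unit is
    monotone in each per-coordinate distance."""
    # Per-coordinate bounds on (w_i (x_i - u_i))^2 for x_i in [lo_i, hi_i]:
    # the minimum is 0 if the center lies inside the interval,
    # otherwise it is attained at the nearer endpoint.
    end_lo = (w * (lo - u)) ** 2
    end_hi = (w * (hi - u)) ** 2
    d_lo = np.where((lo <= u) & (u <= hi), 0.0, np.minimum(end_lo, end_hi))
    d_hi = np.maximum(end_lo, end_hi)
    d_min, d_max = np.max(d_lo), np.max(d_hi)  # max over coords is monotone
    return np.exp(-d_max), np.exp(-d_min)      # exp(-t) is decreasing in t

# A test point perturbed by up to eps in the infinity norm:
x = np.array([0.2, 0.8]); u = np.array([0.0, 1.0]); w = np.array([1.0, 2.0])
eps = 0.1
out_lo, out_hi = mwd_interval(x - eps, x + eps, u, w)
assert out_lo <= mwd_unit(x, u, w) <= out_hi  # bounds are sound for this point
```

Applying this layer by layer gives the kind of certified accuracy bound the abstract describes: if the output interval of the correct class stays above the intervals of all other classes for a given perturbation radius, the classification provably cannot be flipped by any attack within that radius.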

Related research

- 09/25/2018, "Neural Networks with Structural Resistance to Adversarial Attacks": In adversarial attacks to machine-learning classifiers, small perturbati...
- 05/27/2019, "Provable robustness against all adversarial l_p-perturbations for p≥ 1": In recent years several adversarial attacks and defenses have been propo...
- 02/08/2019, "Discretization based Solutions for Secure Machine Learning against Adversarial Attacks": Adversarial examples are perturbed inputs that are designed (from a deep...
- 08/09/2019, "On the Adversarial Robustness of Neural Networks without Weight Transport": Neural networks trained with backpropagation, the standard algorithm of ...
- 06/07/2019, "Reliable Classification Explanations via Adversarial Attacks on Robust Networks": Neural Networks (NNs) have been found vulnerable to a class of impercept...
- 11/20/2017, "Verifying Neural Networks with Mixed Integer Programming": Neural networks have demonstrated considerable success in a wide variety...
- 03/27/2017, "Biologically inspired protection of deep networks from adversarial attacks": Inspired by biophysical principles underlying nonlinear dendritic comput...
