JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks

04/07/2019
by   N. Benjamin Erichson, et al.
0

It has been demonstrated that very simple attacks can fool highly-sophisticated neural network architectures. In particular, so-called adversarial examples, constructed from perturbations of input data that are small or imperceptible to humans but lead to different predictions, may lead to an enormous risk in certain critical applications. In light of this, there has been a great deal of work on developing adversarial training strategies to improve model robustness. These training strategies are very expensive, in both human and computational time. To complement these approaches, we propose a very simple and inexpensive strategy which can be used to "retrofit" a previously-trained network to improve its resilience to adversarial attacks. More concretely, we propose a new activation function---the JumpReLU---which, when used in place of a ReLU in an already-trained model, leads to a trade-off between predictive accuracy and robustness. This trade-off is controlled by the jump size, a hyper-parameter which can be tuned during the validation stage. Our empirical results demonstrate that this increases model robustness, protecting against adversarial attacks with substantially increased levels of perturbations. This is accomplished simply by retrofitting existing networks with our JumpReLU activation function, without the need for retraining the model. Additionally, we demonstrate that adversarially trained (robust) models can greatly benefit from retrofitting.

READ FULL TEXT

page 11

page 12

research
05/25/2019

Resisting Adversarial Attacks by k-Winners-Take-All

We propose a simple change to the current neural network structure for d...
research
03/23/2020

Architectural Resilience to Foreground-and-Background Adversarial Noise

Adversarial attacks in the form of imperceptible perturbations of normal...
research
10/08/2020

Improve Adversarial Robustness via Weight Penalization on Classification Layer

It is well-known that deep neural networks are vulnerable to adversarial...
research
06/07/2019

Reliable Classification Explanations via Adversarial Attacks on Robust Networks

Neural Networks (NNs) have been found vulnerable to a class of impercept...
research
04/11/2021

Achieving Model Robustness through Discrete Adversarial Training

Discrete adversarial attacks are symbolic perturbations to a language in...
research
09/23/2018

Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

We improve the robustness of deep neural nets to adversarial attacks by ...
research
06/18/2021

Less is More: Feature Selection for Adversarial Robustness with Compressive Counter-Adversarial Attacks

A common observation regarding adversarial attacks is that they mostly g...

Please sign up or login with your details

Forgot password? Click here to reset