Resisting Adversarial Attacks by k-Winners-Take-All

05/25/2019
by Chang Xiao, et al.

We propose a simple change to the standard neural network architecture for defending against gradient-based adversarial attacks. Instead of popular activation functions such as ReLU, we advocate the use of the k-Winners-Take-All (k-WTA) activation, a C^0 discontinuous function that purposely invalidates the network's gradient at densely distributed input points. Our proposal is theoretically rationalized: we show why the discontinuities in k-WTA networks largely prevent gradient-based searches for adversarial examples, and why they at the same time remain innocuous to network training. This understanding is also backed empirically. Even without notoriously expensive adversarial training, our networks achieve robustness comparable to conventional ReLU networks optimized by adversarial training. Furthermore, when additionally optimized through adversarial training, our networks outperform state-of-the-art methods under white-box attacks on the datasets we experimented with.
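For concreteness, below is a minimal PyTorch sketch of the k-WTA activation as described above: within each input sample, the k largest activations are kept and all others are zeroed. The class name KWTA, the per-sample flattening, and the default sparsity ratio of 0.2 are illustrative assumptions of this sketch, not details taken from the paper's implementation.

```python
import torch

class KWTA(torch.nn.Module):
    """k-Winners-Take-All activation (sketch).

    Keeps the k largest entries of each sample's activation vector and
    zeroes the rest. The hard cutoff makes the map C^0 discontinuous in
    the input, which is what disrupts gradient-based attack searches.
    """

    def __init__(self, sparsity=0.2):
        super().__init__()
        self.sparsity = sparsity  # fraction of units kept, gamma = k / n

    def forward(self, x):
        # Choose winners per sample over all non-batch dimensions.
        flat = x.reshape(x.size(0), -1)
        k = max(1, int(self.sparsity * flat.size(1)))
        # Threshold at each sample's k-th largest activation.
        topk_vals, _ = flat.topk(k, dim=1)
        threshold = topk_vals[:, -1].unsqueeze(1)
        # Ties at the threshold can let slightly more than k units
        # through; acceptable for a sketch.
        mask = (flat >= threshold).to(x.dtype)
        return (flat * mask).reshape(x.shape)
```

Used as a drop-in replacement for a conventional activation, e.g. swapping nn.ReLU() for KWTA(0.2) in a model definition. Only the winner selection is discontinuous; the surviving activations still pass gradients during training, which is consistent with the abstract's claim that the change remains innocuous to standard optimization.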

Related research

Stochastic Local Winner-Takes-All Networks Enable Profound Adversarial Robustness (12/05/2021)
This work explores the potency of stochastic competition-based activatio...

On Neural Network approximation of ideal adversarial attack and convergence of adversarial training (07/30/2023)
Adversarial attacks are usually expressed in terms of a gradient-based o...

Improved Adversarial Robustness by Reducing Open Space Risk via Tent Activations (08/07/2019)
Adversarial examples contain small perturbations that can remain imperce...

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks (04/07/2019)
It has been demonstrated that very simple attacks can fool highly-sophis...

Robust or Private? Adversarial Training Makes Models More Vulnerable to Privacy Attacks (06/15/2019)
Adversarial training was introduced as a way to improve the robustness o...

Smooth Adversarial Training (06/25/2020)
It is commonly believed that networks cannot be both accurate and robust...

Robust and Information-theoretically Safe Bias Classifier against Adversarial Attacks (11/08/2021)
In this paper, the bias classifier is introduced, that is, the bias part...
