Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

09/23/2018
by   Bao Wang, et al.
15

We improve the robustness of deep neural nets to adversarial attacks by using an interpolating function as the output activation. This data-dependent activation function remarkably improves both classification accuracy and stability to adversarial perturbations. Together with the total variation minimization of adversarial images and augmented training, under the strongest attack, we achieve up to 20.6%, 50.7%, and 68.7% accuracy improvement w.r.t. the fast gradient sign method, iterative fast gradient sign method, and Carlini-Wagner L_2 attacks, respectively. Our defense strategy is additive to many of the existing methods. We give an intuitive explanation of our defense strategy via analyzing the geometry of the feature space. For reproducibility, the code is made available at: https://github.com/BaoWangMath/DNN-DataDependentActivation.

READ FULL TEXT
research
04/25/2022

A Hybrid Defense Method against Adversarial Attacks on Traffic Sign Classifiers in Autonomous Vehicles

Adversarial attacks can make deep neural network (DNN) models predict in...
research
07/16/2019

Graph Interpolating Activation Improves Both Natural and Robust Accuracies in Data-Efficient Deep Learning

Improving the accuracy and robustness of deep neural nets (DNNs) and ada...
research
04/07/2019

JumpReLU: A Retrofit Defense Strategy for Adversarial Attacks

It has been demonstrated that very simple attacks can fool highly-sophis...
research
02/28/2022

Towards Robust Stacked Capsule Autoencoder with Hybrid Adversarial Training

Capsule networks (CapsNets) are new neural networks that classify images...
research
08/23/2021

Kryptonite: An Adversarial Attack Using Regional Focus

With the Rise of Adversarial Machine Learning and increasingly robust ad...
research
09/02/2020

Open-set Adversarial Defense

Open-set recognition and adversarial defense study two key aspects of de...
research
04/11/2022

Generalizing Adversarial Explanations with Grad-CAM

Gradient-weighted Class Activation Mapping (Grad- CAM), is an example-ba...

Please sign up or login with your details

Forgot password? Click here to reset