Boosting Gradient for White-Box Adversarial Attacks

by Hongying Liu, et al.

Deep neural networks (DNNs) play key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that adversarial examples exist for DNNs: inputs that are almost imperceptibly different from original samples, yet can greatly change the network output. Existing white-box attack algorithms can generate powerful adversarial examples. Nevertheless, most of these algorithms concentrate on how to iteratively make the best use of gradients to improve adversarial performance. In contrast, in this paper, we focus on the properties of the widely used ReLU activation function, and discover two phenomena, wrong blocking and over transmission, that mislead the calculation of gradients through ReLU during backpropagation. Both issues enlarge the gap between the change in the loss function predicted by the gradient and the actual change, misleading the gradients and resulting in larger perturbations. Therefore, we propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms. During backpropagation, our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a subset of them to correct the misleading gradients. Comprehensive experimental results on ImageNet demonstrate that ADV-ReLU can be easily integrated into many state-of-the-art gradient-based white-box attack algorithms, as well as transferred to black-box attacks, to further decrease perturbations in the ℓ_2-norm.
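The wrong-blocking phenomenon can be illustrated with a minimal NumPy sketch. The toy loss, input values, and helper names below are illustrative assumptions for exposition, not taken from the paper: when a pre-activation is slightly negative, the standard ReLU backward pass reports a zero gradient, yet a small perturbation can push the unit past zero and actually change the loss.

```python
import numpy as np

def relu(x):
    """ReLU forward pass."""
    return np.maximum(x, 0.0)

def relu_grad(x):
    """Standard ReLU backward: passes gradient only where x > 0."""
    return (x > 0).astype(float)

# Toy loss L(x) = sum(relu(x)); its standard gradient is relu_grad(x).
x = np.array([-0.01, 0.5])   # first unit is barely negative
g = relu_grad(x)             # gradient is blocked at x[0]

# "Wrong blocking": the gradient predicts that perturbing x[0] has no
# effect on the loss, but a perturbation of 0.02 crosses zero and does.
eps = 0.02
pred_change = g[0] * eps                         # predicted: 0.0
actual_change = relu(x[0] + eps) - relu(x[0])    # actual: 0.01
```

Here the gradient-predicted change (0.0) disagrees with the actual change (0.01), which is the kind of prediction/actual gap the abstract says ADV-ReLU corrects by re-scoring and updating a subset of the misleading gradients.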




