Noise Optimization for Artificial Neural Networks

02/06/2021 · by Li Xiao, et al.

Adding noise to an artificial neural network (ANN) has been shown to improve robustness in previous work. In this work, we propose a new technique to compute the pathwise stochastic gradient estimate with respect to the standard deviation of the Gaussian noise added to each neuron of the ANN. With our proposed technique, the gradient estimate with respect to the noise levels is a byproduct of the backpropagation algorithm that estimates the gradient with respect to the synaptic weights of the ANN. Thus, the noise level of each neuron can be optimized simultaneously in the process of training the synaptic weights, at nearly no extra computational cost. In numerical experiments, our proposed method achieves significant improvements in the robustness of several popular ANN structures under both black box and white box attacks on various computer vision datasets.


1 Introduction

Artificial neural networks (ANNs) have been widely used in image processing, speech recognition, games, and medical diagnosis. However, ANNs are typically vulnerable to adversarial attacks (Szegedy et al., 2014). Many previous papers propose adding noises into ANNs for improving robustness (Neelakantan et al., 2015; Gulcehre et al., 2016; Brownlee, 2019; You et al., 2019). Adding noises to an ANN may flatten the local minima and thus lead to enhanced robustness.

Adversarial attacks are small perturbations generated by computer algorithms. The small perturbations added to the input data can drastically alter the output of an ANN (Dan Hendrycks, 2019a), which poses a serious challenge in security-critical applications such as face recognition (Parkhi et al., 2015) and autonomous driving (Hadash et al., 2018). On the other hand, the human vision system is surprisingly robust under rather subtle structural changes (Azulay and Weiss, 2019), let alone small computer-generated perturbations and natural noise corruptions such as snow, blur, and pixelation, or even their combinations. Therefore, achieving human-like robustness remains a holy grail of computer vision research.

There is evidence showing that proper regularization methods can effectively improve the robustness of ANNs under adversarial attacks. Previous work (Krizhevsky et al., 2012; You et al., 2019) adds noises to ANNs for improving robustness, which can be viewed as a regularization method to alleviate over-fitting. However, the magnitudes of the injected noises are set in an ad-hoc manner in previous work. Our work is aligned with the previous work in terms of adding noises for improving robustness. The main methodological contribution of our work lies in a new technique to compute the pathwise stochastic gradient estimate with respect to the standard deviation of the Gaussian noise added to each neuron of the ANN. With our proposed technique, the gradient estimate with respect to the noise levels is a byproduct of the backpropagation (BP) algorithm that estimates the gradient with respect to the synaptic weights of the ANN. Thus, the noise level of each neuron can be optimized simultaneously in the process of training the synaptic weights, at nearly no extra computational cost.

The pathwise stochastic gradient estimation technique is also known as infinitesimal perturbation analysis (IPA) in the simulation literature (Asmussen and Glynn, 2007). IPA and the likelihood ratio (LR) method are two classic unbiased stochastic gradient estimation techniques (Ho and Cao, 1991; Rubinstein and Shapiro, 1993). Recent advances can be found in Hong (2009), Heidergott and Leahu (2010), and Peng et al. (2018). Stochastic gradient estimation has been a central topic in simulation optimization, and a comprehensive review has recently been written by a research team at DeepMind (Mohamed et al., 2020).

The proposed method is implemented to train a multi-layer perceptron (MLP) and convolutional neural networks (CNNs) with ResNet backbones on the MNIST, Cifar-10, and Tiny-ImageNet datasets. We test the performance under both white box and black box attacks. For black box attacks, we apply both adversarial attacks and natural noise corruptions to the images. All numerical experiments show that our method significantly improves the robustness of ANNs in nearly all situations, and it also improves the classification accuracy on the original datasets.

2 Related work

Adversarial attacks can be categorized into three types: 1) black box attacks (Hang et al., 2020; Papernot et al., 2016; Guo et al., 2019), where the attacker has no information about the internal structure of the attacked model, its training parameters, or its defense methods (if any are used), and can only interact with the model through its outputs; 2) white box attacks (Dong et al., 2018; Nazemi and Fieguth, 2019), where the attacker has full information about the attacked model; 3) gray box attacks (Prabhu and Whaley, ; Xiang et al., 2020), where the attacker has only partial information about the model.

Researchers have developed many gradient-based adversarial attack methods, such as L-BFGS (Szegedy et al., 2014), FGSM (Goodfellow et al., 2015), and PGD (Madry et al., 2017). The PGD attack is generally regarded as the strongest first-order attack that utilizes local information of the ANN. These methods are white box attacks in their original designs, but they can also work as gray box and black box attacks due to the transferability of adversarial attacks among models (Tramèr et al., 2017; Petrov and Hospedales, 2019).

Compared with adversarial samples, adding natural noises to corrupt the input is a simpler black box attack (Heaven, 2019; Borji and Lin, 2019). Various types of natural noises, such as Gaussian, Impulse, Contrast, Elastic, and Blurs, have been developed (Vasiljevic et al., 2016; Zheng et al., 2016; Dan Hendrycks, 2019b). Dan Hendrycks (2019b) proposed a new metric to evaluate robustness under several types of natural noise corruptions: each type of noise has five severity levels, and the evaluation metric is the average accuracy under noise corruptions across the severity levels.

Previous work focusing on improving robustness under adversarial attacks includes feature squeezing (Xu et al., 2017), distillation networks (Papernot et al., 2016), input transformations (e.g., JPEG compression (Dziugaite et al., 2016), autoencoder-based denoising (Liao et al., ), and regularization (Ross and Doshi-Velez, 2017)), Parseval networks (Cissé et al., 2017), gradient masking (Papernot et al., 2017), randomization (Liu et al., 2018; Dhillon et al., 2018), radial basis mapping kernels (Taghanaki et al., 2019), non-local context encoders (He et al., 2019), and Per (Dong et al., 2020). PGD-based adversarial (re)training, which augments the training set with adversarial examples, is the most effective defense strategy (Goodfellow et al., 2015; Tramèr et al., 2018; Madry et al., 2018), but it consumes much training time and can be neutered completely or partially by adaptive attacks (Athalye et al., 2018; Carlini and Wagner, 2017; Tramer et al., 2020).

Other previous work improves robustness by adding noises to the input data (Hendrycks et al., 2019; Gao et al., 2020), or by adding noises to activations, outputs, weights, and even gradients (Neelakantan et al., 2015; Gulcehre et al., 2016; Brownlee, 2019; You et al., 2019; Xiao et al., 2019). None of the previous work considers how to optimally set the magnitudes of the noises added to the ANN.

3 Noise Optimization Method

3.1 Gradient Estimation

Let $L$ denote the number of layers in the neural network and $n_l$ denote the number of neurons in the $l$-th layer, $l = 1, \dots, L$. We denote the output of the $l$-th layer as $x^{(l)}$, and $x^{(0)}$ is the input of the network.

Suppose we have $M$ inputs for the network, denoted as $x^{(0,m)}$, $m = 1, \dots, M$. For the $m$-th input, the $i$-th output at the $l$-th layer is given by

$$x_i^{(l,m)} = \varphi\big(z_i^{(l,m)}\big), \qquad z_i^{(l,m)} = \sum_{j=0}^{n_{l-1}} w_{ij}^{(l)}\, x_j^{(l-1,m)} + \epsilon_i^{(l,m)}, \qquad (1)$$

where $x_j^{(l-1,m)}$ is the $j$-th input at the $l$-th layer for the $m$-th data, $w_{ij}^{(l)}$ is the weight for the $j$-th input at the $l$-th layer, $z_i^{(l,m)}$ is the $i$-th logit output at the $l$-th layer, $\varphi(\cdot)$ is the activation function, and $\epsilon_i^{(l,m)}$ is an independent random noise added to the $i$-th neuron at the $l$-th layer for the $m$-th data. The computation of Eq. (1) is depicted on the left-hand side of Figure 1. We let $x_0^{(l-1,m)} \equiv 1$, and then $w_{i0}^{(l)}$ is the bias term in the linear operation of the $i$-th neuron at the $l$-th layer.

Figure 1: Illustration for the forward propagation of ANN with noises. The left-hand side of the figure shows computation in MLP, and the right-hand side of the figure shows computation in CNNs.

Computation in a CNN, which is depicted on the right-hand side of Figure 1, is essentially equivalent to the computation of Eq. (1) in an MLP. In Figure 1, the orange colored element in $z^{(l)}$ of the $i$-th feature map at the $l$-th layer is a product of the parameters in the $i$-th convolution kernel and the corresponding orange colored inputs in $x^{(l-1)}$. This computation is equivalent to the linear operation on the inputs of a neuron in Eq. (1). An independent normal random variable is added to each element in the feature map; its mean can be viewed as the bias term in Eq. (1).
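As a minimal illustration of the noisy forward computation in Eq. (1), the following NumPy sketch evaluates one layer for a single input; the layer sizes, activation function, and noise levels are arbitrary choices for illustration rather than settings from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_layer(x, W, b, sigma, phi=np.tanh):
    """One layer of Eq. (1): z = W x + b + sigma * eta, output = phi(z).

    x: (n_in,) input vector, W: (n_out, n_in) weights, b: (n_out,) biases,
    sigma: (n_out,) per-neuron noise levels, eta: independent standard normal draws.
    """
    eta = rng.standard_normal(W.shape[0])
    z = W @ x + b + sigma * eta
    return phi(z), z, eta

# Illustrative sizes: a 5-dimensional input feeding a layer of 3 neurons.
x = rng.standard_normal(5)
W = rng.standard_normal((3, 5))
b = np.zeros(3)
sigma = np.full(3, 0.1)
x_out, z, eta = noisy_layer(x, W, b, sigma)
```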

We denote the loss function as $\mathcal{L}(\cdot, \cdot)$. For the $m$-th data with label $y^{(m)}$, the loss value is denoted by $\mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)$. In our work, we try to optimize the magnitude of the noise level $\sigma_i^{(l)}$ of the centered normal random noise added to each neuron, i.e., $\epsilon_i^{(l,m)} = \sigma_i^{(l)}\, \eta_i^{(l,m)}$, where $\eta_i^{(l,m)}$ is a standard normal random variable. Define the residual error for the $i$-th neuron at the $l$-th layer for the $m$-th data propagated backwardly through the ANN as

$$d_i^{(l,m)} = \varphi'\big(z_i^{(l,m)}\big) \sum_{k=1}^{n_{l+1}} w_{ki}^{(l+1)}\, d_k^{(l+1,m)}, \qquad l = 1, \dots, L-1, \qquad (2)$$

where $d_i^{(L,m)}$ is defined by

$$d_i^{(L,m)} = \varphi'\big(z_i^{(L,m)}\big)\, \frac{\partial \mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)}{\partial x_i^{(L,m)}}.$$

The computation of residual errors by BP is depicted on the left-hand side of Figure 2. The BP algorithm essentially offers pathwise stochastic derivative estimates for the loss with respect to all synaptic weights $w_{ij}^{(l)}$, $i = 1, \dots, n_l$, $j = 0, \dots, n_{l-1}$, $l = 1, \dots, L$, simultaneously. Specifically,

$$\frac{\partial \mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)}{\partial w_{ij}^{(l)}} = d_i^{(l,m)}\, x_j^{(l-1,m)}.$$

Figure 2: The left-hand side of the figure presents the backward propagation of residual errors, and the right-hand side of the figure shows the computation of the gradient estimate based on residual errors.

In the following Theorem 1, we show that the pathwise stochastic derivatives with respect to the magnitudes of the noise levels $\sigma_i^{(l)}$, $i = 1, \dots, n_l$, $l = 1, \dots, L$, can be estimated as a byproduct of the BP algorithm, and they can be computed in a similar manner as the pathwise stochastic derivatives with respect to the synaptic weights. The computation of these pathwise stochastic derivatives is depicted on the right-hand side of Figure 2.

Theorem 1.

Assume the activation function $\varphi$ and the loss function $\mathcal{L}$ are differentiable. We have

$$\frac{\partial \mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)}{\partial \sigma_i^{(l)}} = d_i^{(l,m)}\, \eta_i^{(l,m)}. \qquad (3)$$
Proof.

The pathwise stochastic derivative for the sensitivity with respect to $\sigma_i^{(l)}$ is

$$\frac{\partial \mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)}{\partial \sigma_i^{(l)}} = \sum_{k=1}^{n_L} \frac{\partial \mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)}{\partial x_k^{(L,m)}}\, \frac{\partial x_k^{(L,m)}}{\partial \sigma_i^{(l)}}, \qquad (4)$$

where $\partial x_k^{(L,m)} / \partial \sigma_i^{(l)}$ is obtained by recursively applying the chain rule through layers $l+1, \dots, L$. Notice that

$$\frac{\partial z_i^{(l,m)}}{\partial \sigma_i^{(l)}} = \eta_i^{(l,m)}.$$

Then the pathwise stochastic derivative on the left-hand side of Eq. (4) can be written as the following nested summations:

$$\sum_{k_L=1}^{n_L} \frac{\partial \mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)}{\partial x_{k_L}^{(L,m)}}\, \varphi'\big(z_{k_L}^{(L,m)}\big) \sum_{k_{L-1}=1}^{n_{L-1}} w_{k_L k_{L-1}}^{(L)}\, \varphi'\big(z_{k_{L-1}}^{(L-1,m)}\big) \cdots \sum_{k_{l+1}=1}^{n_{l+1}} w_{k_{l+2} k_{l+1}}^{(l+2)}\, \varphi'\big(z_{k_{l+1}}^{(l+1,m)}\big)\, w_{k_{l+1} i}^{(l+1)}\, \varphi'\big(z_i^{(l,m)}\big)\, \eta_i^{(l,m)}.$$

By reversing the order of summations, we obtain

$$\varphi'\big(z_i^{(l,m)}\big) \sum_{k_{l+1}=1}^{n_{l+1}} w_{k_{l+1} i}^{(l+1)}\, \varphi'\big(z_{k_{l+1}}^{(l+1,m)}\big) \cdots \sum_{k_L=1}^{n_L} w_{k_L k_{L-1}}^{(L)}\, \varphi'\big(z_{k_L}^{(L,m)}\big)\, \frac{\partial \mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)}{\partial x_{k_L}^{(L,m)}}\; \eta_i^{(l,m)}, \qquad (5)$$

which leads to the right-hand side of Eq. (3) by the definition of the residual error $d_i^{(l,m)}$. ∎
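The identity in Theorem 1 can be checked numerically with automatic differentiation. The sketch below, a toy example under assumed sizes and an arbitrary smooth loss (not the paper's code), builds a small noisy network in PyTorch and compares autograd's gradient with respect to the noise levels against the product of the residual error (the gradient with respect to the pre-activation) and the drawn noise.

```python
import torch

torch.manual_seed(0)

# Tiny 2-layer network with noise on the first layer's pre-activation:
# z1 = W1 x + sigma * eta,  x1 = tanh(z1),  loss = sum((W2 x1)^2)  (arbitrary smooth loss).
x = torch.randn(5)
W1 = torch.randn(3, 5, requires_grad=True)
W2 = torch.randn(2, 3, requires_grad=True)
sigma = torch.full((3,), 0.2, requires_grad=True)
eta = torch.randn(3)

z1 = W1 @ x + sigma * eta
z1.retain_grad()                               # keep the residual error dL/dz1
loss = (W2 @ torch.tanh(z1)).pow(2).sum()
loss.backward()

byproduct = z1.grad * eta                      # Theorem 1: dL/dsigma_i = d_i * eta_i
print(torch.allclose(sigma.grad, byproduct))   # expected: True
```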

Next we show that the pathwise stochastic derivative in Eq. (3) is an unbiased estimate of the derivative of the expected loss over the randomness in the ANN. The key is to justify the interchange of derivative and expectation.

Theorem 2.

Assume $\varphi$ and $\mathcal{L}$ are differentiable almost everywhere, and

$$\mathbb{E}\left[\sup_{\sigma_i^{(l)} \in \Theta} \left| d_i^{(l,m)}\, \eta_i^{(l,m)} \right| \right] < \infty, \qquad (6)$$

where $\Theta$ is a neighborhood surrounding $\sigma_i^{(l)}$. Then,

$$\frac{\partial}{\partial \sigma_i^{(l)}}\, \mathbb{E}\!\left[\mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)\right] = \mathbb{E}\!\left[ d_i^{(l,m)}\, \eta_i^{(l,m)} \right].$$

Proof.

For simplicity, we suppress the dependency on the neuron and data indices in the notation of the proof, and write $\mathcal{L}(\sigma)$, $d(\sigma)$, and $\eta$ for the loss, the residual error, and the standard normal noise associated with $\sigma = \sigma_i^{(l)}$. By definition,

$$\frac{\partial}{\partial \sigma}\, \mathbb{E}\big[\mathcal{L}(\sigma)\big] = \lim_{\delta \to 0} \mathbb{E}\!\left[ \frac{\mathcal{L}(\sigma + \delta) - \mathcal{L}(\sigma)}{\delta} \right] = \lim_{\delta \to 0} \mathbb{E}\big[ d\big(\sigma + \theta(\delta)\,\delta\big)\, \eta \big] = \mathbb{E}\big[ d(\sigma)\, \eta \big],$$

where $\theta(\delta) \in (0,1)$ denotes a quantity dependent on the argument, the second equality holds by applying the mean-value theorem to the conclusion of Theorem 1, and the third equality holds due to the dominated convergence theorem, which justifies the interchange of limit and expectation under the uniform integrability condition Eq. (6) on the residual error. ∎
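Theorem 2 can also be illustrated numerically: for a single noisy neuron, the Monte Carlo average of the pathwise estimate should match a finite-difference approximation of the derivative of the expected loss. The sketch below is a toy check under assumed values (weights, input, label, noise level), not an experiment from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
w, x, y, sigma = 1.5, 0.8, 0.3, 0.5       # illustrative scalars
n_samples, h = 1_000_000, 1e-3

def loss(sig, eta):
    z = w * x + sig * eta                 # noisy pre-activation of a single neuron
    return (np.tanh(z) - y) ** 2          # squared loss against a fixed label

eta = rng.standard_normal(n_samples)

# Pathwise estimate dL/dsigma = residual error * eta, averaged over noise draws
z = w * x + sigma * eta
pathwise = 2 * (np.tanh(z) - y) * (1 - np.tanh(z) ** 2) * eta

# Central finite difference of the expected loss, using common random numbers
fd = (loss(sigma + h, eta) - loss(sigma - h, eta)).mean() / (2 * h)

print(pathwise.mean(), fd)                # the two estimates should agree up to Monte Carlo error
```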

3.2 Gradient-Based Searching Method

To reduce oscillation in the gradient-based search, we apply the Adam optimizer to update $\sigma_i^{(l)}$ as follows:

$$\sigma_i^{(l)}(t+1) = \left| \sigma_i^{(l)}(t) - \alpha\, \frac{m_t}{\sqrt{v_t} + \epsilon} \right|, \qquad (7)$$

where $t$ is the current number of iterations, and the absolute value is taken in the update of $\sigma_i^{(l)}$ to enforce the constraint $\sigma_i^{(l)} \geq 0$. Here $m_t$ and $v_t$ are the exponential moving averages of the gradient and its square, respectively, $g_t$ is the pathwise stochastic derivative estimate with respect to $\sigma_i^{(l)}$ derived in the last subsection, and $\beta_1$, $\beta_2$, and $\epsilon$ are the usual Adam smoothing constants. To avoid rapid changes of $\sigma_i^{(l)}$ across iterations, the initial learning rate $\alpha$ is set at a relatively low value. The detailed training procedure is summarized in Algorithm 1.

1:  Input: Training data $\{(x^{(0,m)}, y^{(m)})\}_{m=1}^{M}$, loss function $\mathcal{L}$.
2:  Construct an $L$-layer ANN and initialize all the parameters of the ANN.
3:  repeat
4:     Use Eq. (1) to calculate the outputs $x^{(L,m)}$;
5:     Calculate the loss function $\mathcal{L}\big(x^{(L,m)}, y^{(m)}\big)$;
6:     Use Eqs. (2) and (3) to estimate the gradients of the loss with respect to the weights and the noise levels, respectively;
7:     Update the weights of the ANN;
8:     Update the noise levels using Eq. (7).
9:  until the stopping condition is met
Algorithm 1 Noise Optimization for ANNs
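As a concrete illustration of the noise-level update in Eq. (7) (step 8 of Algorithm 1), the following is a minimal sketch with standard Adam moment estimates; the function name, the learning rate, and the default hyper-parameter values are illustrative assumptions rather than the paper's reported settings.

```python
import numpy as np

def update_sigma(sigma, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style step for the noise levels, as in the Eq. (7) sketch.

    sigma, grad, m, v are arrays of the same shape: current noise levels,
    pathwise gradient estimates d * eta, and first/second moment estimates.
    The absolute value enforces sigma >= 0.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)                  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                  # bias-corrected second moment
    sigma = np.abs(sigma - lr * m_hat / (np.sqrt(v_hat) + eps))
    return sigma, m, v

# Example usage for a layer of 100 neurons with per-neuron noise levels
rng = np.random.default_rng(0)
sigma, m, v = np.full(100, 0.1), np.zeros(100), np.zeros(100)
grad = rng.standard_normal(100)                   # stand-in for d * eta from backpropagation
sigma, m, v = update_sigma(sigma, grad, m, v, t=1)
```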

4 Experiments

MNIST: MLP and CNN, Adam optimizer, 30 epochs, batch size 128.
Cifar-10: ResNet18, 50 epochs.
Tiny-ImageNet: ResNet34, SGD with momentum 0.9 and learning rate decayed by 0.8 every 20 epochs, 80 epochs.
Table 1: Summary of the hyper-parameters (optimizer, learning rate, weight decay, epochs, and batch size) for training different models.
Corruption noise (all datasets): Gaussian Noise, Impulse Noise, Glass Blur Noise, and Contrast Noise.
MNIST: L-BFGS with N = 10; PGD with N = 10.
Cifar-10: L-BFGS with N = 20; PGD with N = 5.
Tiny-ImageNet: L-BFGS with N = 10; PGD with N = 3.
Table 2: Summary of the settings (corruption noise, FGSM, L-BFGS, and PGD) for different attack methods; N denotes the maximum number of iterations.

4.1 Datasets and experimental settings

We conduct extensive experiments on three public datasets to test the robustness of our method: 1) the MNIST dataset; 2) the Cifar-10 dataset; 3) the Tiny-ImageNet dataset, a subset of ImageNet (Le and Yang, 2015). For MNIST, we apply our method to both a multi-layer perceptron (MLP) and a convolutional neural network (CNN). For Cifar-10, we use a CNN with the ResNet18 backbone. For Tiny-ImageNet, we use a CNN with the ResNet34 backbone. All the code is implemented in PyTorch 1.6.0 and run on an Nvidia GeForce RTX 3090.

Both white box and black box attacks are used to test the robustness of the ANNs. For white box attacks, we apply FGSM (Goodfellow et al., 2015), L-BFGS (Szegedy et al., 2014), and PGD (Madry et al., 2017) to generate adversarial samples. For black box attacks, we apply FGSM and L-BFGS with a different ANN structure than that under attack to generate adversarial samples. Unlike adversarial attacks, which modify the pixels with worst-case perturbations, Dan Hendrycks (2019b) proposes adding various types of natural noises to the input images as corruption attacks. In our work, we also adopt four types of natural noises as black box attacks. For each type of noise, we compute the average accuracy over 5 severity levels of corruption to evaluate robustness.

The hyper-parameter settings for training the models, e.g., optimizer, learning rate, weight decay, number of training epochs, and batch size, are presented in Table 1. All setups are determined by a hyper-parameter search. The settings for the attack methods are presented in Table 2. For FGSM, $\epsilon$ is the step size. For L-BFGS, $\epsilon$ is the step size and $N$ is the maximum number of iterations. For PGD, $\alpha$ is the step size, $\epsilon$ is the maximum perturbation of one pixel, and $N$ is the maximum number of iterations. The settings of FGSM and PGD on the MNIST and Cifar-10 datasets follow those in previous work (Ling et al., 2019; Chan et al., 2020), whereas the other setups are determined by a hyper-parameter search.
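For reference, below is a minimal sketch of the FGSM and PGD attacks described above, written against a generic PyTorch classifier; the function names, the use of cross-entropy as the attack loss, and the clamping to the [0, 1] pixel range are assumptions for illustration rather than the exact attack configurations used in the experiments.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM: move x by eps in the direction of the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def pgd(model, x, y, eps, alpha, n_iter):
    """PGD: iterate signed gradient steps of size alpha, projecting back into the
    eps-ball around the original image after every step."""
    x_orig = x.clone().detach()
    x_adv = x_orig.clone()
    for _ in range(n_iter):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x_orig + (x_adv - x_orig).clamp(-eps, eps)   # project onto the eps-ball
        x_adv = x_adv.clamp(0, 1)                            # keep a valid pixel range
    return x_adv.detach()
```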

4.2 Results in MNIST dataset

We construct an MLP and a CNN for the MNIST dataset. The MLP contains two hidden layers with 100 and 50 neurons, respectively, and we experiment with both ReLU and Sigmoid as the activation function at the hidden layers. The CNN consists of two convolution layers, with 32 kernels at the first layer and 64 kernels at the second layer, followed by two fully connected layers with 128 neurons and an output layer with 10 neurons. The cross-entropy function is adopted as the loss function for classification. We randomly split the entire dataset into training, validation, and testing datasets in a ratio of 5:1:1.

The results are shown in Tables 3 and 4. For the MLP, we report results for three ANN structures trained by the corresponding methods: a) MLP: MLP without added noises; b) MLP+: MLP with a standard normally distributed noise in each neuron; c) MLPN: MLP with Gaussian noises optimized by our proposed method simultaneously in the process of training the synaptic weights by BP. For the CNN, we report results for five ANN structures trained by the corresponding methods: a) CNN: CNN without added noises; b) CNN-MLP+: CNN with standard normally distributed noises added only to the fully connected layers; c) CNN-A+: CNN with standard normally distributed noises added to both the convolution layers and the fully connected layers; d) CNN-MLPN: CNN with Gaussian noises added only to the fully connected layers, optimized by our proposed method; e) CNN-AN: CNN with Gaussian noises added to both the convolution layers and the fully connected layers, optimized by our proposed method.
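To make the noise-optimized variants concrete, the sketch below shows one way an "N"-style model could be realized in PyTorch: a linear layer whose per-neuron noise level sigma is a trainable parameter, so that sigma receives gradients from the same backward pass as the weights. The class, layer sizes, and optimizer settings are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Module):
    """Linear layer with per-neuron Gaussian noise whose level sigma is trainable.

    The forward pass adds sigma * eta (eta ~ N(0, 1)) to the pre-activation, so
    autograd's backward pass yields d(loss)/d(sigma) = residual_error * eta
    alongside the usual weight gradients (cf. Theorem 1).
    """
    def __init__(self, in_features, out_features, sigma_init=0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.sigma = nn.Parameter(torch.full((out_features,), sigma_init))

    def forward(self, x):
        z = self.linear(x)
        if self.training:
            z = z + self.sigma * torch.randn_like(z)
        return z

# Illustrative MLPN-style model for 28x28 MNIST images (hidden sizes 100 and 50 as in the text).
model = nn.Sequential(
    nn.Flatten(),
    NoisyLinear(784, 100), nn.Sigmoid(),
    NoisyLinear(100, 50), nn.Sigmoid(),
    NoisyLinear(50, 10),
)
# One optimizer updates weights and noise levels together; a smaller learning rate
# for the sigma parameters, as suggested by Eq. (7), could be set via parameter groups.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```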

Models Act. Ori White box evaluation Black box evaluation
FGSM L-BFGS PGD Gaussian Impulse Glass Blur Contrast FGSM L-BFGS
MLP ReLU 0.948 0.149 0.255 0.105 0.935 0.934 0.879 0.598 0.314 0.690
MLP Sigmoid 0.936 0.280 0.324 0.207 0.883 0.783 0.885 0.676 0.410 0.749
MLP+ ReLU 0.884 0.283 0.413 0.238 0.875 0.851 0.531 0.531 0.340 0.742
MLP+ Sigmoid 0.875 0.314 0.433 0.253 0.869 0.834 0.817 0.605 0.432 0.736
MLPN ReLU 0.921 0.295 0.420 0.267 0.895 0.909 0.835 0.672 0.430 0.745
MLPN Sigmoid 0.957 0.336 0.568 0.275 0.946 0.944 0.920 0.710 0.465 0.788
Table 3: Evaluation results of MLP models in MNIST
Models Ori White box evaluation Black box evaluation
FGSM L-BFGS PGD Gaussian Impulse Glass Blur Contrast FGSM L-BFGS
CNN 0.986 0.744 0.616 0.655 0.983 0.971 0.752 0.845 0.917 0.779
CNN-MLP+ 0.980 0.788 0.613 0.684 0.977 0.955 0.564 0.794 0.924 0.767
CNN-A+ 0.974 0.757 0.586 0.704 0.951 0.947 0.835 0.575 0.920 0.775
CNN-MLPN 0.990 0.870 0.685 0.752 0.995 0.984 0.788 0.853 0.957 0.818
CNN-AN 0.982 0.783 0.766 0.714 0.976 0.973 0.867 0.834 0.928 0.826
Table 4: Evaluation results of CNN models in MNIST
Figure 3: The panels from left to right respectively show the training loss, the validation loss, and the accuracy on the testing dataset.
Figure 4: Pictures from left to right show the initial image and the saliency maps obtained from ResNet34, ResNet34-MLP+, ResNet34-A+, ResNet34-MLPN, and ResNet34-AN.
Models Ori White box evaluation Black box evaluation
FGSM L-BFGS PGD Gaussian Impulse Glass Blur Contrast FGSM L-BFGS
ResNet18 0.902 0.234 0.433 0.114 0.558 0.530 0.189 0.544 0.467 0.562
ResNet18-MLP+ 0.874 0.254 0.461 0.118 0.553 0.535 0.185 0.536 0.469 0.554
ResNet18-A+ 0.877 0.219 0.401 0.143 0.572 0.543 0.184 0.544 0.493 0.570
ResNet18-MLPN 0.899 0.368 0.450 0.181 0.553 0.514 0.175 0.533 0.482 0.584
ResNet18-AN 0.905 0.393 0.489 0.203 0.587 0.557 0.175 0.559 0.562 0.613
Table 5: Evaluation results for ResNet18 models in Cifar-10
Models Ori White box evaluation Black box evaluation
FGSM L-BFGS PGD Gaussian Impulse Glass Blur Contrast FGSM L-BFGS
ResNet34 0.436 0.082 0.321 0.019 0.397 0.351 0.341 0.331 0.374 0.329
ResNet34-MLP+ 0.434 0.076 0.324 0.022 0.383 0.339 0.323 0.333 0.362 0.312
ResNet34-A+ 0.177 0.012 0.145 0.011 0.165 0.155 0.138 0.133 0.158 0.145
ResNet34-MLPN 0.445 0.119 0.402 0.051 0.406 0.364 0.336 0.339 0.389 0.344
ResNet34-AN 0.448 0.121 0.402 0.055 0.412 0.375 0.352 0.346 0.389 0.350
Table 6: Evaluation results for ResNet34 models in tiny-ImageNet

Robustness under white box attack The white box attack results are presented in Tables 3 and 4. Adding standard normally distributed noises to either the MLP or the CNN can significantly improve the model's defensiveness against all of the FGSM, L-BFGS, and PGD attacks. As a tradeoff, the accuracy on the original classification task drops to some extent. Surprisingly, by adding Gaussian noises optimized by our proposed method, we not only further improve the model's defensiveness by a decent margin, but also improve the classification accuracy on the original testing dataset.

For the MLP, it is interesting to notice that the Sigmoid activation function generally leads to better performance than the ReLU activation function. Compared to the base MLP model with the Sigmoid activation function, MLPN achieves a 20%(0.336 vs 0.28) increase in accuracy under the FGSM attack, a 75%(0.568 vs 0.324) increase in accuracy under the L-BFGS attack, and a 33%(0.275 vs 0.207) increase in accuracy under the PGD attack. For the CNN, it is interesting to observe that adding noises only to the fully connected layers achieves better performance than adding noises to all layers. CNN-MLPN achieves a 16%(0.87 vs 0.744) increase in accuracy under the FGSM attack, an 11%(0.685 vs 0.616) increase in accuracy under the L-BFGS attack, and a 15%(0.752 vs 0.655) increase in accuracy under the PGD attack. Our method also increases the classification accuracy on the original dataset by 2.2%(0.957 vs 0.936).

Acceleration for training Our proposed method leads to a fast convergence speed when training an ANN with ReLU activation functions. Fig. 3 reports the training loss, validation loss, and accuracy on the testing dataset as functions of the training epochs. Compared to MLP and MLP+, MLPN leads to the fastest convergence speed and achieves a comparable classification accuracy.

Robustness under black box attack Robustness under black box attacks is evaluated in Tables 3 and 4. To apply FGSM and L-BFGS, we use another MLP, consisting of two hidden layers with 300 and 150 neurons and the ReLU activation function at each layer, to generate adversarial samples. Following Dan Hendrycks (2019b), we also perform black box attacks by adding corruption noises to the images, including Gaussian, Impulse, Glass Blur, and Contrast. The results show that standard normally distributed noises can improve the model's defensiveness against adversarial attacks, but at the cost of a significant drop in accuracy on both the original testing dataset and the dataset corrupted by natural noises. On the other hand, our proposed noise optimization method achieves performance enhancement in all cases, i.e., accuracy on the original dataset and defensiveness against both adversarial attacks and natural noise corruptions.

Again, the MLP with the Sigmoid activation function performs better than that with ReLU. Under the adversarial attacks, MLPN achieves a 13%(0.465 vs 0.41) increase in accuracy for FGSM and a 5.2%(0.788 vs 0.749) increase in accuracy for L-BFGS. Under natural noise corruptions, MLPN achieves a 7.1%(0.946 vs 0.883) increase in accuracy for Gaussian, a 21%(0.944 vs 0.783) increase in accuracy for Impulse, a 4.0%(0.92 vs 0.885) increase in accuracy for Glass Blur, and a 4.8%(0.71 vs 0.676) increase in accuracy for Contrast. For the CNN, adding noises only to the fully connected layers (CNN-MLPN) also achieves better performance than adding noises to all the layers (CNN-AN) in most situations. Compared to the baseline, CNN-MLPN achieves a 4.4%(0.957 vs 0.917) increase in accuracy under the FGSM attack and a 5.0%(0.818 vs 0.779) increase in accuracy under the L-BFGS attack. Under natural noise corruptions, CNN-MLPN achieves a 1.2%(0.995 vs 0.983) increase in accuracy for Gaussian, a 1.3%(0.984 vs 0.971) increase in accuracy for Impulse, a 4.8%(0.788 vs 0.752) increase in accuracy for Glass Blur, and a 0.9%(0.853 vs 0.845) increase in accuracy for Contrast. Our proposed method also improves the classification accuracy on the original testing dataset by 0.4%(0.990 vs 0.986).

4.3 Results in Cifar-10 dataset

We adopt ResNet18 as the base model for Cifar-10 classification. To compare the influence of adding noises to the fully connected layers and to the convolution layers, we replace the last fully connected layer of 10 neurons with three fully connected layers consisting of 256, 128, and 10 neurons, respectively; see the sketch below. For the convolution layers, we only add noises to the last convolution layer of each residual block. For generating adversarial samples in the black box attacks, we use the original ResNet18 with one fully connected layer to generate adversarial examples for both FGSM and L-BFGS. We randomly split the entire dataset into training, validation, and testing datasets in a ratio of 4:1:1.
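The head replacement described above might look like the following sketch, assuming torchvision's ResNet18, whose final fully connected layer takes 512 input features; in the noise-optimized variants, these linear layers would additionally carry trainable per-neuron noise levels as in the earlier NoisyLinear sketch.

```python
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18()                      # randomly initialized backbone
model.fc = nn.Sequential(               # replace the final linear head with three FC layers (256, 128, 10)
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
```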

The results are shown in Table 5. The notations ResNet18, ResNet18-MLP+, ResNet18-A+, ResNet18-MLPN, and ResNet18-AN are interpreted similarly to those of the CNN described before. The results show that adding standard normally distributed noises does not improve robustness but deteriorates the classification accuracy on the original testing dataset. However, adding noises optimized by our proposed method significantly improves the performance under both adversarial attacks and natural noise corruptions, as well as the classification accuracy on the original testing dataset. It is also worth noting that adding noises to both the fully connected layers and the convolution layers (ResNet18-AN) achieves the best performance in all cases.

Compared to the baseline, ResNet18-AN achieves an average accuracy increase of 53% under white box adversarial attacks ( 68%(0.393 vs 0.234) for FGSM, 13%(0.489 vs 0.433) for L-BFGS, 78%(0.203 vs 0.114) for PGD), and an average accuracy increase of 9.7% under black box adversarial attacks ( 20%(0.562 vs 0.467) for FGSM, 9.1%(0.613 vs 0.562) for L-BFGS). Under natural noise corruptions, although ResNet18-AN leads to an accuracy drop of 7.4%(0.175 vs 0.189) for Glass Blur, it achieves significant performance improvement in defending against the other three types of noises ( 5.2%(0.587 vs 0.558) for Gaussian, 4.9%(0.557 vs 0.53) for Impulse and 2.8%(0.559 vs 0.544) for Contrast). ResNet18-AN also slightly improves the classification accuracy on the original testing dataset by 0.3%(0.905 vs 0.902).

4.4 Results in Tiny-ImageNet dataset

We adopt ResNet34 as the base model for Tiny-ImageNet classification. The Tiny-ImageNet dataset (Le and Yang, 2015) is a subset of ImageNet that contains only 200 classes, with 500 training images, 50 validation images, and 50 test images in each class, and with the image size down-sampled to 64 × 64 pixels. To compare the influence of adding noises to the fully connected layers and to the convolution layers, we replace the last fully connected layer of 200 neurons with four fully connected layers consisting of 1024, 512, 256, and 200 neurons, respectively. For the convolution layers, we only add noises to the last convolution layer of each residual block. For generating adversarial samples in the black box attacks, we use the original ResNet34 with one fully connected layer to generate adversarial samples for both FGSM and L-BFGS. We randomly split the entire dataset into training, validation, and testing datasets in a ratio of 10:1:1.

The results are shown in Table 6. The notations ResNet34, ResNet34-MLP+, ResNet34-A+, ResNet34-MLPN, and ResNet34-AN are interpreted similarly to those of the CNN described before. Similar to the observations on Cifar-10, adding standard normally distributed noises does not improve robustness but deteriorates accuracy, especially when noises are added to both the convolution layers and the fully connected layers (ResNet34-A+), whereas our proposed method enhances performance in all cases, with the best result achieved by ResNet34-AN.

Compared to the baseline, ResNet34-AN achieves an average accuracy increase of 87% under white box adversarial attacks ( 48%(0.121 vs 0.082) for FGSM, 25%(0.402 vs 0.321) for L-BFGS, 189%(0.055 vs 0.019) for PGD), and an average accuracy increase of 5.2% under black box adversarial attacks ( 4.0%(0.389 vs 0.374) for FGSM, 6.4%(0.35 vs 0.329) for L-BFGS). ResNet34-AN also has better defensiveness under all four types of natural noise corruptions, leading to an average accuracy increase of 4.6% ( 3.8%(0.412 vs 0.397) for Gaussian, 6.8%(0.375 vs 0.351) for Impulse, 3.2%(0.352 vs 0.341) for Glass Blur and 4.5%(0.346 vs 0.331) for Contrast). ResNet34-AN improves classification accuracy in the original testing dataset by 2.7%(0.448 vs 0.436).

4.5 Visualization of Saliency Maps

To help better understand why our noise optimization method improves robustness, we adopt the SmoothGrad method (Smilkov, 2017) to generate saliency maps for different models on the Tiny-ImageNet dataset. A gradient-based saliency map is typically used to represent 'saliency' at every location in the visual field, and it is adopted as a proxy for locating “important” pixels in the input image. The value at each pixel of the saliency map stands for the level of attention of the model.

For each sampled image, we add random noise to the image and generate a gradient saliency map. We repeat the process $n$ times and average the saliency maps to obtain the final saliency map, which is computed by

$$\hat{M}_c(x) = \frac{1}{n} \sum_{k=1}^{n} \nabla_x S_c\big(x + g_k\big), \qquad g_k \sim \mathcal{N}\big(0, \sigma^2\big), \qquad (8)$$

where $S_c(x)$ is the $c$-th label's score (scalar output) given input $x$. Each image is reused for several noise draws. We show the 2D score map by summing the values along the three channels of the map.
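A minimal sketch of this SmoothGrad computation is given below, assuming a generic PyTorch classifier whose output logits serve as the class scores; the noise level, sample count, and channel-summing step are illustrative choices rather than the exact settings used for Figure 4.

```python
import torch

def smoothgrad_saliency(model, image, target_class, n_samples=32, noise_std=0.1):
    """Average gradient saliency maps over noisy copies of the image (sketch of Eq. (8)).

    image: tensor of shape (3, H, W); returns a 2D map of shape (H, W) obtained by
    summing the averaged gradient magnitudes over the three channels.
    """
    model.eval()
    accum = torch.zeros_like(image)
    for _ in range(n_samples):
        noisy = (image.detach() + noise_std * torch.randn_like(image)).unsqueeze(0)
        noisy.requires_grad_(True)
        score = model(noisy)[0, target_class]            # scalar class score S_c(x + g)
        grad = torch.autograd.grad(score, noisy)[0].squeeze(0)
        accum += grad
    return (accum / n_samples).abs().sum(dim=0)          # collapse channels to a 2D map
```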

The results are shown in Fig. 4. All of the images are sampled from the testing dataset and classified correctly. Adding noises optimized by our proposed method makes the model focus more on the regions where the targets are located and learn more important features. Taking the first picture as an example, we can see that ResNet34 concentrates all its attention on the face, whereas ResNet34-AN focuses on both the face and the neck. Likewise for the other pictures, the saliency maps of the ANNs with noises optimized by our method are more comprehensive and clear, which indicates that these ANNs capture more important features and thus achieve improved robustness in classification.

5 Conclusion

In this work, we propose a method to optimize the magnitudes of the noises added to an ANN simultaneously in the process of training the synaptic weights, at nearly no extra computational cost. Our method is applied to train both an MLP and CNNs with ResNet backbones on the MNIST, Cifar-10, and Tiny-ImageNet datasets. The proposed noise optimization method significantly improves the performance under both adversarial attacks and natural noise corruptions, as well as the classification accuracy on the original testing datasets. For training the MLP, our method also leads to a faster convergence speed. We use saliency maps to help better understand why our noise optimization method improves robustness.

References

  • S. Asmussen and P. W. Glynn (2007) Stochastic simulation: algorithms and analysis. Vol. 57, Springer Science & Business Media. Cited by: §1.
  • A. Athalye, N. Carlini, and D. Wagner (2018) Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. In iclr, Cited by: §2.
  • A. Azulay and Y. Weiss (2019) Why do deep convolutional networks generalize so poorly to small image transformations?. External Links: Link Cited by: §1.
  • A. Borji and S. Lin (2019) White noise analysis of neural networks. arXiv preprint arXiv:1912.12106. Cited by: §2.
  • J. Brownlee (2019) Train neural networks with noise to reduce overfitting. Machine Learning Mastery. Cited by: §1, §2.
  • N. Carlini and D. Wagner (2017) Adversarial examples are not easily detected: bypassing ten detection methods. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 3–14. Cited by: §2.
  • A. Chan, Y. Tay, and Y. Ong (2020) What it thinks is important is important: robustness transfers through input gradients. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 332–341. Cited by: §4.1.
  • M. Cissé, P. Bojanowski, E. Grave, Y. Dauphin, and N. Usunier (2017) Parseval networks: improving robustness to adversarial examples. In icml, Cited by: §2.
  • T. D. Dan Hendrycks (2019a) Adversarial attacks and defenses in deep learning. In Engineering, Vol. 6, pp. 346–360. External Links: Document Cited by: §1.
  • T. D. Dan Hendrycks (2019b) Benchmarking neural network robustness to common corruptions and perturbations. In ICLR, Cited by: §2, §4.1, §4.2.
  • G. S. Dhillon, K. Azizzadenesheli, Z. C. Lipton, J. Bernstein, J. Kossaifi, A. Khanna, and A. Anandkumar (2018) Stochastic activation pruning for robust adversarial defense. In iclr, Cited by: §2.
  • Y. Dong, Q. Fu, X. Yang, T. Pang, H. Su, Z. Xiao, and J. Zhu (2020) Benchmarking adversarial robustness. In cvpr, Cited by: §2.
  • Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li (2018) Boosting adversarial attacks with momentum. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 9185–9193. Cited by: §2.
  • G. K. Dziugaite, Z. Ghahramani, and D. M. Roy (2016) A study of the effect of JPG compression on adversarial images. arXiv preprint arXiv:1608.00853. Cited by: §2.
  • X. Gao, R. K. Saha, M. R. Prasad, and A. Roychoudhury (2020) Fuzz testing based data augmentation to improve robustness of deep neural networks. In 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE), pp. 1147–1158. Cited by: §2.
  • I. J. Goodfellow, J. Shlens, and C. Szegedy (2015) Explaining and harnessing adversarial examples. In iclr, Cited by: §2, §2, §4.1.
  • C. Gulcehre, M. Moczulski, M. Denil, and Y. Bengio (2016) Noisy activation functions. In International conference on machine learning, pp. 3059–3068. Cited by: §1, §2.
  • C. Guo, J. Gardner, Y. You, A. G. Wilson, and K. Weinberger (2019) Simple black-box adversarial attacks. In International Conference on Machine Learning, pp. 2484–2493. Cited by: §2.
  • G. Hadash, E. Kermany, B. Carmeli, O. Lavi, G. Kour, and A. Jacovi (2018) Estimate and replace: a novel approach to integrating deep neural networks with existing applications. arXiv preprint arXiv:1804.09028. Cited by: §1.
  • J. Hang, K. Han, H. Chen, and Y. Li (2020) Ensemble adversarial black-box attacks against deep learning systems. Vol. 101, pp. 107184. Cited by: §2.
  • X. He, S. Yang, G. Li, H. Li, H. Chang, and Y. Yu (2019) Non-local context encoder: robust biomedical image segmentation against adversarial attacks. In aaai, Vol. 33, pp. 8417–8424. Cited by: §2.
  • D. Heaven (2019) Why deep-learning ais are so easy to fool. Nature 574 (7777), pp. 163–166. Cited by: §2.
  • B. Heidergott and H. Leahu (2010) Weak differentiability of product measures. Mathematics of Operations Research 35 (1), pp. 27–51. Cited by: §1.
  • D. Hendrycks, N. Mu, E. D. Cubuk, B. Zoph, J. Gilmer, and B. Lakshminarayanan (2019) Augmix: a simple data processing method to improve robustness and uncertainty. arXiv preprint arXiv:1912.02781. Cited by: §2.
  • Y. Ho and X. Cao (1991) Discrete event dynamic systems and perturbation analysis. Kluwer Academic Publishers, Boston, MA. Cited by: §1.
  • L. J. Hong (2009) Estimating quantile sensitivities. Operations Research 57 (1), pp. 118–130. Cited by: §1.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25, pp. 1097–1105. Cited by: §1.
  • Y. Le and X. Yang (2015) Tiny imagenet visual recognition challenge. CS 231N 7, pp. 7. Cited by: §4.1, §4.4.
  • [29] F. Liao, M. Liang, Y. Dong, T. Pang, X. Hu, and J. Zhu Defense against adversarial attacks using high-level representation guided denoiser. In , Cited by: §2.
  • X. Ling, S. Ji, J. Zou, J. Wang, C. Wu, B. Li, and T. Wang (2019) Deepsec: a uniform platform for security analysis of deep learning model. In 2019 IEEE Symposium on Security and Privacy (SP), pp. 673–690. Cited by: §4.1.
  • X. Liu, M. Cheng, H. Zhang, and C. Hsieh (2018) Towards robust neural networks via random self-ensemble. In eccv, pp. 369–385. Cited by: §2.
  • A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083. Cited by: §2, §4.1.
  • A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2018) Towards deep learning models resistant to adversarial attacks. In iclr, Cited by: §2.
  • S. Mohamed, M. Rosca, M. Figurnov, and A. Mnih (2020) Monte Carlo gradient estimation in machine learning. Journal of Machine Learning Research 21 (132), pp. 1–62. Cited by: §1.
  • A. Nazemi and P. Fieguth (2019) Potential adversarial samples for white-box attacks. Cited by: §2.
  • A. Neelakantan, L. Vilnis, Q. V. Le, I. Sutskever, L. Kaiser, K. Kurach, and J. Martens (2015) Adding gradient noise improves learning for very deep networks. arXiv preprint arXiv:1511.06807. Cited by: §1, §2.
  • N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami (2017) Practical black-box attacks against machine learning. In ASIA Computer and Communications Security, pp. 506–519. Cited by: §2.
  • N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami (2016) Distillation as a defense to adversarial perturbations against deep neural networks. In IEEE Symposium on Security and Privacy, pp. 582–597. Cited by: §2, §2.
  • O. M. Parkhi, A. Vedaldi, and A. Zisserman (2015) Deep face recognition. In British Machine Vision Conference, Cited by: §1.
  • Y. Peng, M. C. Fu, J. Hu, and B. Heidergott (2018) A new unbiased stochastic derivative estimator for discontinuous sample performances with structural parameters. Operations Research 66 (2), pp. 487–499. Cited by: §1.
  • D. Petrov and T. M. Hospedales (2019) Measuring the transferability of adversarial examples. arXiv preprint arXiv:1907.06291. Cited by: §2.
  • [42] V. U. Prabhu and J. Whaley On grey-box adversarial attacks and transfer learning. online: https://unify.id/wpcontent/uploads/2018/03/greybox attack.pdf. Cited by: §2.
  • A. S. Ross and F. Doshi-Velez (2017) Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. In aaai, Cited by: §2.
  • R. Y. Rubinstein and A. Shapiro (1993) Discrete event systems: sensitivity analysis and stochastic optimization by the score function method. Wiley, New York. Cited by: §1.
  • D. Smilkov (2017) SmoothGrad: removing noise by adding noise. In arXiv preprint arXiv:1706.03825, Cited by: §4.5.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2014) Intriguing properties of neural networks. In iclr, Cited by: §1, §2, §4.1.
  • S. A. Taghanaki, K. Abhishek, S. Azizi, and G. Hamarneh (2019) A kernelized manifold mapping to diminish the effect of adversarial perturbations. In cvpr, pp. 11340–11349. Cited by: §2.
  • F. Tramer, N. Carlini, W. Brendel, and A. Madry (2020) On adaptive attacks to adversarial example defenses. In nips, Cited by: §2.
  • F. Tramèr, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel (2018) Ensemble adversarial training: attacks and defenses. In iclr, Cited by: §2.
  • F. Tramèr, N. Papernot, I. Goodfellow, D. Boneh, and P. McDaniel (2017) The space of transferable adversarial examples. arXiv. External Links: Link Cited by: §2.
  • I. Vasiljevic, A. Chakrabarti, and G. Shakhnarovich (2016) Examining the impact of blur on recognition by convolutional networks. arXiv preprint arXiv:1611.05760. Cited by: §2.
  • Y. Xiang, Y. Xu, Y. Li, W. Ma, Q. Xuan, and Y. Liu (2020) Side-channel gray-box attack for dnns. IEEE Transactions on Circuits and Systems II: Express Briefs. Cited by: §2.
  • L. Xiao, Y. Peng, J. Hong, Z. Ke, and S. Yang (2019) Training artificial neural networks by generalized likelihood ratio method: exploring brain-like learning to improve robustness. arXiv preprint arXiv:1902.00358. Cited by: §2.
  • W. Xu, D. Evans, and Y. Qi (2017) Feature squeezing: detecting adversarial examples in deep neural networks. In Network and Distributed System Security Symposium, Cited by: §2.
  • Z. You, J. Ye, K. Li, Z. Xu, and P. Wang (2019) Adversarial noise layer: regularize neural network by adding noise. In 2019 IEEE International Conference on Image Processing (ICIP), pp. 909–913. Cited by: §1, §1, §2.
  • S. Zheng, Y. Song, T. Leung, and I. Goodfellow (2016) Improving the robustness of deep neural networks via stability training. In Proceedings of the ieee conference on computer vision and pattern recognition, pp. 4480–4488. Cited by: §2.