Deep Neural Networks (DNNs) have achieved tremendous success in various applications, such as image classification Hinton et al. [2012b], speech recognition Hinton et al. [2012a], and medical applications Hung et al. The wide deployment of DNNs has raised several major security concerns Goodfellow et al., Akhtar and Mian, Chen et al. [2017a]. For example, in the context of image classification, an adversarial example is a carefully modified image whose perturbation is imperceptible to human eyes but which fools the DNN successfully Goodfellow et al. Recently, a cohort of works has developed new adversarial attack techniques that expose the underlying vulnerability of DNNs Athalye et al., Madry et al. To counter adversarial attacks, several works have proposed defense techniques, such as training the network with adversarial samples Madry et al., Goodfellow et al., regularization He et al., Lin et al., and various other methods Raghunathan et al., Samangouei et al.
In a separate yet related track, the investigation of efficient and compact networks has also accelerated. Many prior works address compression techniques, including quantization Zhou et al., Courbariaux et al. [2015, 2016], He and Fan, and weight pruning Han et al. [2015b], Molchanov et al., Wen et al. It has been shown that many DNNs can function properly (with no accuracy loss) even after significant (>90%) network pruning Han et al. [2015a], Molchanov et al. Such sparse DNNs achieve significant speed-up and compression, which opens the door to DNNs in memory- and resource-constrained applications Han et al. Several prior works have tried to generate networks that are both sparse and robust Guo et al., Ye et al. by combining network pruning (i.e., compactness) with defenses against adversarial examples. However, these efforts either suffer from poor test accuracy or do not improve robustness significantly.
Figure 1: Overview of RSR.
In this work, we propose a multi-objective optimization mechanism that merges these two different yet related tracks, namely the pursuit of network robustness and of compression. To achieve this objective, we propose a novel Robust Sparse Regularization (RSR) method that integrates several regularization techniques into one dual optimization. First, we propose to train a DNN with channel-wise noise injection (CNI) embedded in adversarial training to improve network robustness. This technique injects channel-wise Gaussian noise whose scale is trainable during adversarial training; CNI improves test accuracy on both clean and perturbed data. Second, to simultaneously achieve network compactness and robustness, we propose a new ensemble loss function that includes an $\ell_1$ weight penalty term (i.e., lasso). Lasso regularization during adversarial training performs weight selection by shrinking some weights to very small values, as shown in figure 1. When training is done, we can prune the small weights below a threshold to obtain a sparse network. Our extensive experiments show that RSR training is an effective network pruning scheme that improves robustness without sacrificing any clean data accuracy across different architectures.
2 Related Works
2.1 Adversarial Attack
Recently developed adversarial attack methods can completely fool a trained DNN through maliciously perturbed input data. Adversarial attacks fall into two major categories. First, a white-box attack assumes the adversary has full access to the trained model and its parameters. Second, a black-box attack assumes the adversary treats the model as a black box and can only access its inputs and outputs. We now briefly introduce the white-box and black-box attack methods studied in this work.
As one of the most efficient attack methods, the Fast Gradient Sign Method (FGSM) Goodfellow et al. uses a single step to generate an adversarial example. Given an input $\boldsymbol{x}$ and target label $t$, FGSM alters each element of $\boldsymbol{x}$ in the direction of its gradient w.r.t. the loss $\mathcal{L}$. As in the original paper Goodfellow et al., we define the generation of adversarial examples as:

$$\hat{\boldsymbol{x}} = \boldsymbol{x} + \epsilon \cdot \text{sign}\big(\nabla_{\boldsymbol{x}}\, \mathcal{L}(f(\boldsymbol{x};\theta), t)\big)$$

Here, the perturbation is constrained by $\|\hat{\boldsymbol{x}} - \boldsymbol{x}\|_\infty \le \epsilon$; by varying the value of $\epsilon$ we can vary the attack strength. $f(\boldsymbol{x};\theta)$ is the output of the DNN with parameters $\theta$, and $\text{sign}(\cdot)$ is the sign function. To ensure a valid pixel range for the images, we clip $\hat{\boldsymbol{x}}$ so that $\hat{\boldsymbol{x}} \in [0, 1]$.
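As a minimal illustration of the single-step FGSM update above, the sketch below uses a toy linear softmax classifier in place of a full DNN; the function name `fgsm`, the closed-form gradient, and the `[0, 1]` pixel range are our illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def fgsm(x, t, W, eps):
    """One-step FGSM against a toy linear classifier f(x) = softmax(W x).

    For cross-entropy loss, dL/dx = W^T (p - onehot(t)), so each input
    element is moved by eps in the direction of its gradient's sign.
    """
    p = softmax(W @ x)
    onehot = np.zeros_like(p)
    onehot[t] = 1.0
    grad_x = W.T @ (p - onehot)          # gradient of the loss w.r.t. the input
    x_adv = x + eps * np.sign(grad_x)    # signed single-step perturbation
    return np.clip(x_adv, 0.0, 1.0)      # keep a valid pixel range
```

Because every element moves by exactly ±eps (before clipping), the L-infinity constraint is satisfied by construction.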
Projected Gradient Descent (PGD) Madry et al. is a powerful multi-step attack. It is an iterative version of FGSM, with $\boldsymbol{x}^{0} = \boldsymbol{x}$ as the initialization. The perturbed data is updated through a multi-step iteration; at the $k$-th step it can be expressed as:

$$\boldsymbol{x}^{k+1} = \Pi_{\boldsymbol{x}+S}\Big(\boldsymbol{x}^{k} + a \cdot \text{sign}\big(\nabla_{\boldsymbol{x}}\, \mathcal{L}(f(\boldsymbol{x}^{k};\theta), t)\big)\Big)$$

where $\Pi_{\boldsymbol{x}+S}$ is the projection onto the allowed perturbation space $S$ and $a$ is the step size. Madry et al. also claim that PGD is a universal adversary among all first-order adversaries, since it relies only on first-order information.
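The iterative update above can be sketched on the same toy linear classifier; here the projection onto the L-infinity ball is implemented as an element-wise clip around the clean input. The function name `pgd` and the parameter values are illustrative assumptions.

```python
import numpy as np

def pgd(x, t, W, eps, alpha, steps):
    """Multi-step PGD against a toy linear softmax classifier.

    Each iteration takes a signed-gradient step of size alpha, then
    projects back onto the L-inf ball of radius eps around x and
    finally onto the valid pixel range.
    """
    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    x_adv = x.copy()                      # x^0 = x initialization
    for _ in range(steps):
        p = softmax(W @ x_adv)
        onehot = np.zeros_like(p)
        onehot[t] = 1.0
        grad_x = W.T @ (p - onehot)
        x_adv = x_adv + alpha * np.sign(grad_x)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # projection Pi_{x+S}
        x_adv = np.clip(x_adv, 0.0, 1.0)          # valid pixel range
    return x_adv
```

Note that the per-step size alpha can exceed eps/steps because the projection re-enforces the overall budget after every step.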
In this work, we also evaluate our proposed method against a wide range of black-box attacks. Specifically, we investigate the transferable adversarial attack Liu et al., in which an adversarial example generated from a source model is used to attack a different target model; the two models may differ but are trained on the same training dataset. Moreover, the Zeroth Order Optimization (ZOO) attack Chen et al. [2017b] is also investigated. ZOO does not require training a substitute model; it directly approximates the gradient based only on the input data and the model's output scores.
2.2 Adversarial Defenses
Several works Madry et al., Goodfellow et al. have proposed to jointly train the network with adversarial and clean samples, called adversarial training, to achieve network robustness. Later, the development of the backward pass differentiable approximation (BPDA) attack Athalye et al. exposed the underlying vulnerability of many defense methods that rely on gradient obfuscation Dhillon et al., Xie et al. Since then, training the network with adversarial examples has become one of the most popular approaches to defend against adversarial examples. Meanwhile, a cohort of works investigates the effect of regularization techniques on robustness, such as quantization Rakin et al., Lin et al., noise injection He et al., Lecuyer et al. [2018a], Liu et al., Yoshida and Miyato, Lecuyer et al. [2018b], and pruning Dhillon et al., Ye et al. [2019, 2018], Guo et al. Several previous works have investigated the effect of network pruning on robustness Guo et al., Ye et al. [2019, 2018]. Recently, Ye et al. proposed concurrent weight pruning and adversarial training to generate a robust and sparse network. However, their ADMM-based pruning method suffers from poor test accuracy on both clean and adversarial data for smaller networks (i.e., of lesser width). Further, Guo et al. showed that a pruned network can defend against adversarial examples provided the network is not over-sparsified.
In this section, we introduce the proposed Robust Sparse Regularization (RSR) technique, which casts network robustness and compactness as a multi-objective optimization that improves both simultaneously. RSR mainly consists of two components: 1) trainable Channel-wise Noise Injection (CNI) and 2) a lasso weight penalty ($\ell_1$ norm) for model pruning, both introduced below.
3.1 Adversarial Training
Training the neural network with adversarial examples is a popular defense method Madry et al., Goodfellow et al. Since our method integrates with such adversarial training, we briefly introduce it first. Given a set of inputs $\boldsymbol{x}$ and target labels $t$, adversarial training seeks the optimal network parameters $\theta$ (i.e., weights and biases) for the following min-max optimization problem:

$$\min_{\theta}\; \mathbb{E}_{(\boldsymbol{x},t)}\Big[\max_{\|\hat{\boldsymbol{x}}-\boldsymbol{x}\|_\infty \le \epsilon} \mathcal{L}\big(f(\hat{\boldsymbol{x}};\theta), t\big)\Big]$$

The min-max optimization is composed of an inner maximization and an outer minimization. For the inner maximization we acquire the perturbed data $\hat{\boldsymbol{x}}$ using the PGD attack described above Madry et al., while the outer minimization is optimized through gradient descent during network training.
3.2 Channel-wise Noise Injection
The first regularization technique used in RSR is to inject learnable channel-wise noise into the weights during adversarial training. Consider a convolution layer with 4-D weight tensor $W \in \mathbb{R}^{c_{out} \times c_{in} \times k_h \times k_w}$, where $c_{out}$, $c_{in}$, $k_h$ and $k_w$ denote the number of output channels, number of input channels, kernel height and kernel width, respectively. Channel-wise Noise Injection (CNI) can be mathematically described as:

$$\tilde{W}_i = W_i + \alpha_i \cdot \eta, \qquad \eta \sim \mathcal{N}(0, \sigma_i^2)$$

where $\alpha_i$ is the trainable noise scaling coefficient of output channel $i$. Note that $\sigma_i^2$ is the variance of $W_i$, computed statistically at run time. Preliminary work He et al. shows that parametric noise injection is an improved variant of adversarial training, where such trainable noise injection effectively regularizes the DNN during adversarial training. We follow a similar optimization and update rule for $\alpha$, but extend it to a channel-wise version in which the weights of each output channel share one noise scaling coefficient, instead of one coefficient for the whole layer.
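The channel-wise injection described above can be sketched as a forward-pass transformation of a 4-D conv weight tensor; the function name and shapes below are our illustrative assumptions (in training, the per-channel coefficients would be learned parameters).

```python
import numpy as np

def channelwise_noise(weight, alpha):
    """Inject channel-wise Gaussian noise into a 4-D conv weight tensor.

    weight : array of shape (c_out, c_in, k_h, k_w)
    alpha  : array of shape (c_out,), one trainable scaling coefficient
             per output channel.
    The noise std of channel i is alpha[i] * std(weight[i]), with the
    standard deviation computed from the weights at run time.
    """
    c_out = weight.shape[0]
    noisy = np.empty_like(weight)
    for i in range(c_out):
        sigma = weight[i].std()                        # run-time statistic
        eta = np.random.normal(0.0, 1.0, weight[i].shape)
        noisy[i] = weight[i] + alpha[i] * sigma * eta  # shared alpha per channel
    return noisy
```

With all coefficients set to zero the weights pass through unchanged, which makes the noise scale genuinely learnable rather than fixed.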
We train the network with both clean and adversarial samples to achieve a good balance between accuracy on adversarial data and on clean test data. The optimization problem of equation 3 can be solved by minimizing the ensemble loss in equation 5, the weighted sum of the losses on clean and adversarial data with channel-wise trainable noise injected into the weights of the DNN model:

$$\mathcal{L}_{ens} = w_c \cdot \mathcal{L}\big(f(\boldsymbol{x};\tilde{W}), t\big) + (1 - w_c) \cdot \mathcal{L}\big(f(\hat{\boldsymbol{x}};\tilde{W}), t\big)$$

where $w_c$ is the coefficient balancing the two loss terms, chosen as 0.5 by default. Optimizing this loss improves network robustness: the optimizer solves for both the model parameters and $\alpha$ to find an equilibrium between clean and perturbed data.
3.3 Lasso Weight Penalty
To incorporate network pruning into adversarial training, we propose to train the neural network with a lasso weight penalty. Lasso, the least absolute shrinkage and selection operator Tibshirani, was introduced as an $\ell_1$ regularizer that penalizes the absolute magnitude of the coefficients. Lasso is an ideal choice for weight pruning as it shrinks the less important weights toward zero He et al., Wang et al., Wen et al. We add the lasso weight penalty to the ensemble loss and reformat equation 5 as:

$$\mathcal{L} = \mathcal{L}_{ens} + \lambda \sum_{l=1}^{L} \|W_l\|_1$$

where $W_l$ denotes the weight tensor of the $l$-th layer, $L$ is the total number of parametric layers (i.e., convolution and fully-connected layers), and $\|\cdot\|_1$ is the sum of the absolute values of all elements of a tensor. The strength of the lasso penalty is determined by the coefficient $\lambda$: a larger value generates a sparser model containing a significant number of weights with near-zero values. We tune $\lambda$ experimentally and describe the procedure for selecting its optimized value in section 4.2.
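Numerically, the full RSR objective is just the weighted clean/adversarial losses plus an L1 sum over every parametric layer; the sketch below assumes the two per-batch losses have already been computed (the function name `rsr_loss` is ours).

```python
import numpy as np

def rsr_loss(loss_clean, loss_adv, weights, lam, wc=0.5):
    """Ensemble loss with lasso penalty.

    loss_clean, loss_adv : scalar losses on clean and adversarial batches
    weights              : list of weight tensors, one per parametric layer
    lam                  : lasso coefficient (lambda)
    wc                   : clean/adversarial balance, 0.5 by default
    """
    l1 = sum(np.abs(W).sum() for W in weights)   # sum of |w| over all layers
    return wc * loss_clean + (1.0 - wc) * loss_adv + lam * l1
```

In a real training loop the L1 term is part of the computational graph, so its (sub)gradient shrinks every weight toward zero at each update, which is what produces the near-zero weights that are later pruned.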
3.4 Weight Pruning
The proposed ensemble loss serves as a multi-objective loss function. We expect a network trained with it to be more resilient to adversarial samples and, at the same time, due to the lasso weight penalty, to have a significant portion of its weight tensors converge to near-zero values. After training with the proposed ensemble loss, we perform weight pruning by setting the weights below a certain threshold $\gamma$ to zero. Note that after pruning we remove the noise injection term for zero-valued weights; as a result, during inference, noise is only added to the non-zero elements of the weight tensor. For the weight tensor of a fully-connected layer, let $W \in \mathbb{R}^{m \times n}$; for a convolution layer, $W \in \mathbb{R}^{c_{out} \times c_{in} \times k_h \times k_w}$. The pruning operation can then be described as:

$$w_{i,j} = 0 \quad \text{if } |w_{i,j}| < \gamma \qquad \text{(FC layer)} \tag{7}$$

$$w_{i,j,k,l} = 0 \quad \text{if } |w_{i,j,k,l}| < \gamma \qquad \text{(Conv. layer)} \tag{8}$$

Here $\gamma$ is the threshold, equal to the least absolute non-zero value remaining in the weight tensor after pruning. We can tune $\gamma$ for different networks to achieve different sparsity ratios; hence, by tuning $\gamma$, we can effectively find the maximum number of parameters that can be pruned without causing robustness degradation.
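Because the test in equations 7 and 8 is element-wise, one function covers both the FC and conv cases; this is a minimal sketch with names of our own choosing, and the returned mask also records where inference noise is kept (only on surviving weights, as described above).

```python
import numpy as np

def prune_by_threshold(weight, gamma):
    """Zero out every element whose magnitude falls below gamma.

    Works unchanged for 2-D FC and 4-D conv weight tensors.
    Returns the pruned tensor and the boolean keep-mask (the mask also
    marks where channel-wise noise is still injected at inference).
    """
    mask = np.abs(weight) >= gamma
    return weight * mask, mask

def sparsity(weight):
    """Fraction of weights that are exactly zero."""
    return 1.0 - np.count_nonzero(weight) / weight.size
```

Sweeping gamma and re-measuring accuracy is then how the maximum prunable fraction is found for each network.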
4.1 Experiment setup
Datasets and network architectures.
In this work, we only consider the CIFAR-10 dataset for image classification, as most baseline works report their robustness as under-attack accuracy on this dataset. CIFAR-10 is composed of 50K training samples and 10K test samples. Our data augmentation is the same as described in He et al. The attacker can directly add noise to the natural images, since our data normalization layer is placed in front of the DNN as a non-trainable layer. We adopt three classical networks, ResNet-20, ResNet-18 He et al. and VGG-16 Simonyan and Zisserman, for comparative analysis, and also analyze the effect of network width by varying the width of ResNet-18. We report the mean accuracy over 5 trials due to the randomness in both CNI and PGD Madry et al. The hyper-parameter $\lambda$ is tuned separately for ResNet-18 and VGG-16 and for ResNet-20.
To attack CIFAR-10 using PGD, the number of attack steps is set to 7, and the perturbation bound $\epsilon$ and step size follow the same attack hyper-parameters as Madry et al. For FGSM, the attack parameter (i.e., $\epsilon$) remains the same as for PGD. Moreover, we also evaluate the RSR defense against several state-of-the-art black-box attacks (i.e., the ZOO Chen et al. [2017b] and transfer Liu et al. attacks) to cover a wide range of attacks.
Competing methods for adversarial defense.
In this work, PGD adversarial training Madry et al. is selected as the primary baseline. Since our work includes channel-wise noise injection, we also compare against parametric noise injection (PNI) He et al. Additionally, we compare with several network compression and pruning methods Ye et al., Lin et al. Finally, we present a comparison with several state-of-the-art regularization techniques serving as adversarial defenses Lecuyer et al. [2018a], Liu et al.
Our simulation results on two popular white-box attacks, PGD Madry et al. and FGSM Goodfellow et al., are presented in table 1. During adversarial training, as stated in section 3.1, we use the PGD algorithm to generate adversarial samples. First, for the regular (unpruned) models, RSR achieves significant robustness enhancement and even improves clean data accuracy compared to baseline PGD training Madry et al. We observe that network robustness increases with model capacity, consistent with previous works Madry et al., He et al., and the pattern holds for our proposed RSR. Our best accuracy was obtained with VGG-16: we improve clean test accuracy by 0.95% and perturbed data accuracy by 9.48% under strong PGD attack.
Our proposed RSR can prune 60%, 85% and 50% of the weights of ResNet-20, ResNet-18 and VGG-16, respectively, without any loss of clean test accuracy. To show the comparative effect of robustness and sparsity, we prune each of the four training cases (PGD/CNI/Lasso/RSR) by an equal amount; the level of sparsity can always be tuned by choosing different values of $\gamma$. As expected, the performance of both PGD and CNI suffers significantly after pruning. In contrast, RSR outperforms the baseline PGD training method (even without pruning): we observe 8.72% and 6.83% improvements in test accuracy under PGD and FGSM attack, respectively, for the ResNet-18 architecture. Again, the most significant improvement is observed for VGG-16, which has the largest capacity. This observation confirms that increasing the number of parameters amplifies the effect of the weight penalty and noise injection in enhancing robustness. A remaining question is what happens if we prune beyond the reported sparsity: for example, if we prune ResNet-18 beyond 85%, does the network remain robust? We answer this question in the next paragraph, where we examine the effect of network width on sparsity.
Effect of Network Width.
Ye et al. demonstrated that decreasing network width may have a negative impact on robustness. To verify whether our method follows the same trend, we present an ablation study on ResNet-18 with decreasing network width in table 2. Our observations confirm that RSR remains more robust than the baseline PGD method at every width. On the other hand, we achieve less sparsity on networks with smaller width: for the smallest width we could only achieve 38.33% sparsity without sacrificing any clean or perturbed data accuracy. This observation is intuitive, as a narrower ResNet-18 already has far fewer parameters than the full-width ResNet-18; even after a smaller amount of weight pruning, the percentage of surviving parameters in the narrow network (7.7%) is still smaller than that of the full-width network (14.37%). Finally, this also answers the question asked previously: a particular architecture (e.g., ResNet-18) can be pruned only up to a certain sparsity level that depends on the network width, and the maximum number of parameters that can be pruned without sacrificing robustness varies across architectures. Figure 1(a) shows that the test accuracy of ResNet-20, ResNet-18 and VGG-16 under PGD attack starts to drop at different sparsity levels (% of weights equal to zero). If a model is sparsified beyond this point, it falls under the definition of an over-sparsified model Guo et al. and no longer remains robust.
Does the robustness improvement come from lasso training, CNI training, or both?
We have provided comprehensive experimental analysis of our proposed RSR method to show its enhancement on three fronts: clean data accuracy, robustness (i.e., under-attack accuracy) and sparsity. Table 1 confirms that the lasso loss primarily contributes to sparse model generation through weight shrinkage. To identify the chief contributor to robustness improvement, table 1 also reports the effect of training the network with only the lasso loss and with only CNI, respectively. For ResNet-20, the regularization effect of lasso is less significant and CNI plays the dominant role in robustness improvement. However, for redundant networks (i.e., VGG-16), both lasso and channel-wise noise injection contribute: on VGG-16, lasso and CNI improve robustness by close to 4% and 7%, respectively. We nonetheless keep lasso because it shrinks weights to very small values, performing robust model selection during adversarial training; apart from that, lasso regularization also supplements CNI in defending against adversarial examples.
Choice of Lambda ().
In figure 1(b), we plot test accuracy on both clean and perturbed data versus $\lambda$ for ResNet-20. Both test accuracies start to drop once $\lambda$ is increased beyond a certain point, so for ResNet-20 we choose the largest $\lambda$ that achieves maximum sparsity without any degradation in test accuracy. The value of $\lambda$ for the other architectures (i.e., ResNet-18, VGG-16) is optimized experimentally in the same way.
We report black-box attack results for the ResNet-20 architecture in table 3. We test our defense against the un-targeted ZOO attack Chen et al. [2017b], randomly selecting 200 test samples to calculate the attack success rate. Our proposed method defends against the ZOO attack better, decreasing the attack success rate by 12% compared to the baseline PGD method.
To perform the transfer attack on RSR and PGD, we use VGG-16 and ResNet-18 as source models. In both cases, RSR performs on par with the PGD method. Additionally, our proposed RSR achieves higher test accuracy against black-box attacks than against the white-box PGD attack. Better resistance against black-box attacks than white-box attacks is considered a sign that a defense does not rely on obfuscated or masked gradients Athalye et al.
Comparison to state-of-the-art techniques.
In table 4, we summarize the performance of our defense in comparison to some other state-of-the-art defense techniques. Our proposed RSR method outperforms these comparative defenses and achieves significant robustness improvement.
Note that we compare with unbroken defenses that have not yet been reported to show signs of obfuscated gradients Athalye et al. Some previous works on network pruning and robustness Dhillon et al. may suffer from gradient obfuscation Athalye et al. Guo et al. first showed theoretically the effect of pruning on non-linear DNNs, demonstrating the vulnerability of over-sparsified models to adversarial attacks. However, we are the first to formulate an improved adversarial defense with sparse regularization: our proposed RSR generates a sparse and compact neural network with state-of-the-art under-attack accuracy and much improved robustness.
RSR is performing regularization.
Robust Sparse Regularization regularizes the network to enhance both robustness and compactness, and it shows no obvious signs of the gradient masking described in Athalye et al. First, RSR performs better against the single-step attack (i.e., FGSM) than against the multi-step attack (i.e., PGD). Second, we report higher test accuracy against black-box attacks than white-box attacks. Finally, increasing the attack strength linearly decreases the effectiveness of our defense. These observations confirm that our robustness enhancement is not achieved primarily through gradient obfuscation or masking Athalye et al.; instead, the improvement comes primarily from the regularized training method. However, after pruning, the presence of noise in the surviving weights also plays a critical role. Table 5 summarizes the impact of inference-time noise on robustness.
When we disable the inference noise, test accuracy under PGD attack drops to 41.61%. This confirms that the presence of inference noise contributes heavily to robustness. However, our regularized training still stands out: even after this drop, accuracy remains higher than the baseline. We thus conclude that the robustness achieved in this work is a combined effect of regularization (i.e., CNI and lasso), sparsity and inference noise.
An optimal $\gamma$ provides improvement on three fronts.
After training, we can fine-tune the model by pruning the weights below a certain threshold $\gamma$. During training, apart from enhancing robustness, RSR mainly shrinks the weights of the network; this weight shrinkage is demonstrated in figure 1(c). A ResNet-18 trained with RSR contains 85% weights with near-zero values (below a small threshold), so pruning weights of such small magnitude has minimal effect on clean test accuracy and robustness. Thus, $\gamma$ can be tuned to an optimal point for each network to achieve improvement on three fronts: clean data accuracy, robustness (i.e., under-attack accuracy) and sparsity.
We successfully co-optimize the objectives of network robustness and compactness through our proposed RSR training method. As a result, we show that a heavily sparse network can resist adversarial examples, yielding a neural network that is both robust and compact. Our proposed method performs this dual optimization during training to resist state-of-the-art white-box and black-box attacks with a more compact network.
- Akhtar and Mian  N. Akhtar and A. Mian. Threat of adversarial attacks on deep learning in computer vision: A survey. IEEE Access, 6:14410–14430, 2018.
- Athalye et al.  A. Athalye, N. Carlini, and D. Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 274–283. PMLR, 2018. URL http://proceedings.mlr.press/v80/athalye18a.html.
- Chen et al. [2017a] H. Chen, H. Zhang, P.-Y. Chen, J. Yi, and C.-J. Hsieh. Show-and-fool: Crafting adversarial examples for neural image captioning. arXiv preprint arXiv:1712.02051, 2017a.
- Chen et al. [2017b]  P.-Y. Chen, H. Zhang, Y. Sharma, J. Yi, and C.-J. Hsieh. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15–26. ACM, 2017b.
- Courbariaux et al.  M. Courbariaux, Y. Bengio, and J.-P. David. Binaryconnect: Training deep neural networks with binary weights during propagations. In Advances in neural information processing systems, pages 3123–3131, 2015.
- Courbariaux et al.  M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, and Y. Bengio. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or −1. arXiv preprint arXiv:1602.02830, 2016.
- Dhillon et al.  G. S. Dhillon, K. Azizzadenesheli, J. D. Bernstein, J. Kossaifi, A. Khanna, Z. C. Lipton, and A. Anandkumar. Stochastic activation pruning for robust adversarial defense. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=H1uR4GZRZ.
- Goodfellow et al.  I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
- Guo et al.  Y. Guo, C. Zhang, C. Zhang, and Y. Chen. Sparse dnns with improved adversarial robustness. In Advances in neural information processing systems, pages 242–251, 2018.
- Han et al. [2015a] S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149, 2015a.
- Han et al. [2015b] S. Han, J. Pool, J. Tran, and W. Dally. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems, pages 1135–1143, 2015b.
- Han et al.  S. Han, H. Shen, M. Philipose, S. Agarwal, A. Wolman, and A. Krishnamurthy. Mcdnn: An approximation-based execution framework for deep stream processing under resource constraints. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services, pages 123–136. ACM, 2016.
- He et al.  K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
- He et al.  Y. He, X. Zhang, and J. Sun. Channel pruning for accelerating very deep neural networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 1389–1397, 2017.
- He and Fan  Z. He and D. Fan. Parametric noise injection: Trainable randomness to improve deep neural network robustness against adversarial attack. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
- He et al.  Z. He, A. S. Rakin, and D. Fan. Parametric noise injection: Trainable randomness to improve deep neural network robustness against adversarial attack. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.
- Hinton et al. [2012a] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6):82–97, 2012a.
- Hinton et al. [2012b] G. Hinton, N. Srivastava, and K. Swersky. Neural networks for machine learning. Coursera, video lectures, 264, 2012b.
- Hung et al.  C.-Y. Hung, W.-C. Chen, P.-T. Lai, C.-H. Lin, and C.-C. Lee. Comparing deep neural network and other machine learning algorithms for stroke prediction in a large-scale population-based electronic medical claims database. In 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 3110–3113. IEEE, 2017.
- Lecuyer et al. [2018a] M. Lecuyer, V. Atlidakis, R. Geambasu, D. Hsu, and S. Jana. Certified robustness to adversarial examples with differential privacy. ArXiv e-prints, 2018a.
- Lecuyer et al. [2018b] M. Lecuyer, V. Atlidakis, R. Geambasu, D. Hsu, and S. Jana. On the connection between differential privacy and adversarial robustness in machine learning. arXiv preprint arXiv:1802.03471, 2018b.
- Lin et al.  J. Lin, C. Gan, and S. Han. Defensive quantization: When efficiency meets robustness. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=ryetZ20ctX.
- Liu et al.  X. Liu, M. Cheng, H. Zhang, and C.-J. Hsieh. Towards robust neural networks via random self-ensemble. arXiv preprint arXiv:1712.00673, 2017.
- Liu et al.  Y. Liu, X. Chen, C. Liu, and D. Song. Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770, 2016.
- Madry et al.  A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rJzIBfZAb.
- Molchanov et al.  D. Molchanov, A. Ashukha, and D. Vetrov. Variational dropout sparsifies deep neural networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 2498–2507. JMLR. org, 2017.
- Raghunathan et al.  A. Raghunathan, J. Steinhardt, and P. Liang. Certified defenses against adversarial examples. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=Bys4ob-Rb.
- Rakin et al.  A. S. Rakin, J. Yi, B. Gong, and D. Fan. Defend deep neural networks against adversarial examples via fixed and dynamic quantized activation functions. arXiv preprint arXiv:1807.06714, 2018.
- Samangouei et al.  P. Samangouei, M. Kabkab, and R. Chellappa. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=BkJ3ibb0-.
- Simonyan and Zisserman  K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
- Szegedy et al.  C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus. Intriguing properties of neural networks. CoRR, abs/1312.6199, 2013.
- Tibshirani  R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267–288, 1996.
- Wang et al.  H. Wang, Q. Zhang, Y. Wang, and H. Hu. Structured pruning for efficient convnets via incremental regularization. arXiv preprint arXiv:1811.08390, 2018.
- Wen et al.  W. Wen, C. Wu, Y. Wang, Y. Chen, and H. Li. Learning structured sparsity in deep neural networks. In Advances in neural information processing systems, pages 2074–2082, 2016.
- Xie et al.  C. Xie, J. Wang, Z. Zhang, Z. Ren, and A. Yuille. Mitigating adversarial effects through randomization. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=Sk9yuql0Z.
- Ye et al.  S. Ye, S. Wang, X. Wang, B. Yuan, W. Wen, and X. Lin. Defending dnn adversarial attacks with pruning and logits augmentation, 2018. URL https://openreview.net/forum?id=S1qI2FJDM.
- Ye et al.  S. Ye, K. Xu, S. Liu, H. Cheng, J.-H. Lambrechts, H. Zhang, A. Zhou, K. Ma, Y. Wang, and X. Lin. Second rethinking of network pruning in the adversarial setting. arXiv preprint arXiv:1903.12561, 2019.
- Yoshida and Miyato  Y. Yoshida and T. Miyato. Spectral norm regularization for improving the generalizability of deep learning. arXiv preprint arXiv:1705.10941, 2017.
- Zhou et al.  S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen, and Y. Zou. Dorefa-net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160, 2016.