Deep neural networks (DNNs) currently define state-of-the-art performance on standard image classification tasks. However, Szegedy et al. (2013) showed that even advanced classifiers can be fooled by imperceptible perturbations, known as adversarial examples. This raises concerns about the intrinsic vulnerability of DNNs (Goodfellow et al., 2014; Carlini and Wagner, 2017).
Defenders try hard to gain the initiative in the adversarial arms race against the rapid development of adversarial attacks, and researchers have proposed many empirical and certified defense methods to obtain robust networks. Certified defense methods are supported by rigorous theoretical security guarantees and can steadily expand the robustness radius, but scaling them to large datasets is impractical due to their high computational cost (Tjeng et al., 2017). Adversarial training is currently the most flexible and effective empirical defense method; it enhances the training set with adversarial samples generated dynamically (Madry et al., 2017). Nevertheless, previous research has demonstrated the widespread transferability of adversarial examples (Papernot et al., 2016; Ilyas et al., 2019; Inkawhich et al., 2020). Because adversarial training depends on specific attack algorithms for its augmented data (Hendrycks et al., 2021), it implicitly relies on transferability to unseen attacks, which leaves the defender struggling and passive in the arms race. By contrast, a rational ensemble strategy is an effective defense method in practice (Kurakin et al., 2018; Liu et al., 2020). A recent study presents the dynamic defense framework (DDF) based on stochastic ensembles (Qin et al., 2021). The DDF changes its ensemble state based on variable model attributes such as architecture and smoothing parameters, and it expects heterogeneous candidate models to ensure diverse ensemble states.
We propose a dynamic stochastic ensemble of adversarially robust lottery ticket subnetworks. Building on the Lottery Ticket Hypothesis (Frankle and Carbin, 2018), Fu et al. (2021) discovered subnetworks with inborn robustness that match or surpass adversarially trained networks of the same structure without any model training. Inspired by this, our method obtains subnetworks with different network structures and remaining ratios to promote adversarial transferable diversity for the DDF. By weakening the transferability between ensemble states, we improve the initiative of the DDF against the adversary.
Within the dynamic defense framework, we present the dynamic stochastic ensemble with adversarially robust lottery ticket subnetworks. Fu et al. (2021) demonstrated the poor adversarial transferability between scratch tickets drawn from a single structure. Drawing inspiration from these prior works, we further explore the adversarial transferable diversity that arises from different basic structures and remaining ratios.
2.1 The Dynamic Defense Framework and Adversarial Transferable Diversity
The DDF is a randomized defense strategy that protects ensemble gradient information; its essential requirements are randomness and diversity, which promote the ensemble's adversarial robustness (Qin et al., 2021). It presents a model ensemble defense with a randomized network parameter distribution, making the defender's behavior unknowable to the attacker. The output of a dynamic stochastic ensemble model containing I models is defined as follows:

F(x) = (1/I) · ∑_{i=1}^{I} f_i(x), (1)

where f_i denotes the i-th sampled candidate model.
Randomness is achieved by switching ensemble states through the ensemble variable ω. The DDF demands the construction of diversified ensemble states from a heterogeneous model library. Relevant studies highlight that diverse network structures play a crucial role in ensemble defense (Yang et al., 2020). In our solution, we evaluate the heterogeneity and diversity between ensemble subnetworks through the poor adversarial transferability of attacks between them.
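As a toy illustration of the Eq. 1-style averaged ensemble output (a minimal sketch with hypothetical stand-in models, not the DDF implementation):

```python
import numpy as np

def ensemble_output(models, x):
    """Average the probability outputs of the I sampled subnetworks (Eq. 1 style)."""
    return np.mean([m(x) for m in models], axis=0)

def make_model(logits):
    """Toy stand-in for a candidate subnetwork: a fixed softmax over class logits."""
    def model(x):
        z = np.asarray(logits, dtype=float)
        e = np.exp(z - z.max())
        return e / e.sum()
    return model

# two hypothetical candidate models with mirrored class preferences
models = [make_model([2.0, 1.0, 0.0]), make_model([0.0, 1.0, 2.0])]
probs = ensemble_output(models, x=np.zeros(4))
```

Averaging the probabilities keeps the output a valid distribution while blending the individual models' decision surfaces, which is what makes the sampled ensemble state hard for the attacker to pin down.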
2.2 Adversarial Robust Lottery Subnetwork
To verify that multi-sparsity adversarially robust lottery subnetworks can achieve better adversarial transferable diversity under different network structures, we picked four representative network structures, ResNet18, ResNet34, WideResNet32, and WideResNet38 (He et al., 2016; Zagoruyko and Komodakis, 2016), as the basic architectures of our experiments and obtained sparse lottery tickets from the original dense networks. Following Fu et al. (2021), we applied adversarial training to gain robustness in our subnetworks during pruning. It can be expressed as the min-max problem of Eq. 2:

min_m E_{(x,y)∼D} [ max_{‖δ‖_∞ ≤ ε} ℓ(f(θ ⊙ m; x + δ), y) ]. (2)
ℓ denotes the loss function, f is a network with randomly initialized weights θ, and δ is the perturbation with maximum magnitude ε. To satisfy the sparsity of the subnetworks, we set a learnable score and a binary mask m that correspond to the weight dimensions (Sehwag et al., 2020; Ramanujan et al., 2020). m is meant to activate a small number of the primary weights θ. With the primary network parameters weighted by θ ⊙ m, f can be effectively trained against small perturbations δ added to the input x.
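The two ingredients above can be sketched in a few lines of numpy: a binary mask derived from learnable scores, and a one-step inner maximization on a toy linear scorer. The names `topk_mask` and `fgsm_delta` are ours for illustration, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def topk_mask(scores, remain_ratio):
    """Binary mask m: keep the top remain_ratio fraction of |scores|, zero the rest."""
    k = max(1, int(round(remain_ratio * scores.size)))
    thresh = np.sort(np.abs(scores), axis=None)[-k]
    return (np.abs(scores) >= thresh).astype(float)

def fgsm_delta(w_eff, y, eps):
    """One signed-gradient step of the inner max for a linear scorer f(x) = W x
    with loss -f(x)[y]; the gradient w.r.t. x is -w_eff[y]."""
    return eps * np.sign(-w_eff[y])

theta = rng.normal(size=(3, 5))    # frozen, randomly initialized weights
scores = rng.normal(size=(3, 5))   # learnable pruning scores
m = topk_mask(scores, remain_ratio=0.2)
w_eff = theta * m                  # only ~20% of the weights stay active
delta = fgsm_delta(w_eff, y=1, eps=8 / 255)
```

In actual robust scratch-ticket training the mask is re-derived from the scores each iteration and the inner maximization runs multiple PGD steps; this sketch only shows the shape of both operations.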
2.3 Dynamical Ensemble for The Lottery Subnetworks
Through our method, we obtained forty subnetworks with different basic structures and sparsities. Based on this robust lottery subnetwork library, we define the randomized ensemble attribute parameter ω, which determines the ensemble state. It can be achieved in the following steps:
(A) Construct a robust lottery subnetwork library with adversarial transferable diversity, consisting of forty sparse subnetworks: ten each for ResNet18, ResNet34, WideResNet32, and WideResNet38.
(B) Set the ranges for the structure indicators a and the sparsity distributions s. We bring the four basic structures into the selection rather than the entire library, increasing the possibility of including more structures. This is realized by randomly assigning each indicator a_j ∈ {0, 1} to determine whether the corresponding structure is chosen: 1 means selected, 0 means rejected. s_j represents the distribution of sparsities for each candidate structure, and each sparsity also refers to the corresponding subnetwork. Under every candidate structure there are k = 10 sparse subnetworks whose sparsities form s_j.
(C) Randomly select the ensemble number n; n_j denotes the number chosen for each candidate structure. We set p_j as the fraction of the total ensemble number n, so that n_j = p_j · n and ∑_j p_j = 1. In particular, we give higher sampling probability to smaller n. Since our experiments reveal the structures and sparsities of the ensemble subnetworks to the attacker, we expect this probabilistic scheme to reduce the probability of a universal adversarial sample (Moosavi-Dezfooli et al., 2017).
(D) Set ω from a, s, and n. According to the n_j determined by the distribution of p_j, we obtain the total ensemble number n. Meanwhile, we randomly select ensemble sparsities from s_j, which designate the corresponding subnetworks attending the ensemble.
(E) According to the ensemble variable ω, we build the ensemble in light of Eq. 1.
3 Experiments and Results
In this section, we verify the widespread existence of robust lottery tickets and their diversified adversarial transferability across different basic structures and sparsities. Then we design an adaptive attack on top of PGD-20 to evaluate the adversarial ensemble robustness of our method on CIFAR-10.
3.1 The Adversarial Transferability between Robust Lottery Ticket Subnetworks
We collect forty robust lottery ticket subnetworks with different sparsities based on ResNet18, ResNet34, WideResNet32, and WideResNet38, ten for each basic structure. As shown in Tab. 1, we report clean accuracy and robust accuracy against PGD-20 with ε = 8/255, illustrating the adversarial robustness of lottery ticket subnetworks across different structures.
Fig. 1 presents the pair-wise adversarial transferability within our lottery ticket subnetwork library, tested under the same ε. We adopt PGD-20 attacks with ε = 8/255 under the ℓ∞ constraint. To make a fair comparison, we choose ResNet18 and WideResNet32 subnetworks with different sparsities as defense models, and pick ResNet34 and WideResNet38 subnetworks with the same sparsities to generate adversarial samples. Sparsities of 0.07, 0.2, and 0.6 are used for the models. Each number represents the robust accuracy of a defense model against transfer attacks from a different structure.
As shown in Fig. 1, ResNet18 and WideResNet32 subnetworks with different sparsities exhibit poor adversarial transferability against adversarial samples generated from the same network. Comparing (b) and (c), combining subnetworks of different structures further weakens the adversarial transferability of attacks. For example, off the diagonal, the accuracy of ResNet18 against adversarial samples generated by the same structure is 65.5%–69.9%; it rises to 66.5%–73.6% and 66.9%–70.9% against transfer attacks from ResNet34 and WideResNet38 subnetworks, respectively. Likewise, the accuracy of WideResNet32 subnetworks rises by 5.7%–9.3% and 2%–5.9%.
3.2 Ensemble Robustness
In this section, we validate the effectiveness of our defense strategy. We set adversarial training as the baseline and compare our method with R2S (Fu et al., 2021), which ensembles subnetworks with different remaining ratios from a single network structure.
Evaluation setup. Since both R2S and our method can adjust the sampling probabilities of their sparsity choices, we assume for simplicity that both adopt uniform sampling over the same sparsities as in Sec. 2.3. Moreover, we design an adaptive attack based on Expectation over Transformation (EOT) (Athalye et al., 2018) that generates adversarial examples via the expectation of the gradients of all candidate robust lottery subnetworks; it strengthens the attack by promoting the transferability of adversarial samples and traversing the possible defense states.
We set adversarially trained ResNet18/WideResNet32 dense networks as baselines and compared our method with R2S. To observe the defense effect comprehensively and accurately, we adopt multiple perturbation budgets ε ∈ {0, 2, 4, 8, 12, 20}/255 with white-box attacks under the ℓ∞ norm. For the EOT attack, we announce the network structures and remaining ratios of the lottery ticket library so that the attacker can sample the expectation over different ensemble states.
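The EOT-style adaptive attack can be sketched as a PGD loop whose step direction is the mean gradient over every candidate subnetwork in the announced library. This is a toy sketch with linear stand-in models, not the exact attack used in the experiments:

```python
import numpy as np

def eot_pgd(grad_fns, x, eps, steps=20, alpha=None):
    """PGD whose step uses the mean loss gradient over all candidate
    subnetworks (EOT over the randomized ensemble states)."""
    alpha = alpha if alpha is not None else 2.5 * eps / steps
    x_adv = x.copy()
    for _ in range(steps):
        g = np.mean([gf(x_adv) for gf in grad_fns], axis=0)  # expectation over models
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)  # l_inf ball
    return x_adv

# toy linear "subnetworks": each contributes a fixed gradient direction
rng = np.random.default_rng(0)
grad_fns = [(lambda xa, w=w: w) for w in rng.normal(size=(5, 8))]
x_adv = eot_pgd(grad_fns, x=np.zeros(8), eps=8 / 255)
```

Because the perturbation is optimized against the average of all candidates rather than any single sampled state, it approximates the strongest transferable attack available to an adversary who knows the library but not the current ensemble draw.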
Table 2: Clean accuracy (%) and robust accuracy (%) of each method.
As shown in Tab. 2, the robust accuracy of our method is 3.02%–10.12% higher than R2S. Meanwhile, R2S drops 3.59%–3.67% in clean accuracy, while our dynamic ensemble over different structures gains 1.08% compared to adversarial training. Relative to adversarial training, our method achieves a 15.42% gain in robust accuracy. In addition, Fig. 2 shows that our method has better adversarial robustness than R2S and adversarial training across the full range of perturbation strengths.
In this paper, we propose a dynamic ensemble method based on adversarially robust lottery ticket subnetworks, in which the diversity of ensemble robustness manifests as poor adversarial transferability among subnetworks. We gather different basic structures and sparsities for the robust lottery ticket subnetworks. Furthermore, we create poor adversarial transferability and diversified ensemble states between models by picking ensemble models stochastically. Our experiments show that diversifying the structures and sparsities of scratch tickets weakens the adversarial transferability between subnetworks and improves the adversarial robustness of the ensemble.
-  (2018) Synthesizing robust adversarial examples. In International Conference on Machine Learning, pp. 284–293. Cited by: §3.2.
-  (2017) Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57. Cited by: §1.
-  (2018) The lottery ticket hypothesis: finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635. Cited by: §1.
-  (2021) Drawing robust scratch tickets: subnetworks with inborn robustness are found within randomly initialized networks. Advances in Neural Information Processing Systems 34, pp. 13059–13072. Cited by: §1, §2.2, §2, §3.2.
-  (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572. Cited by: §1.
-  (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. Cited by: §2.2.
-  (2021) Unsolved problems in ml safety. arXiv preprint arXiv:2109.13916. Cited by: §1.
-  (2019) Adversarial examples are not bugs, they are features. Advances in neural information processing systems 32. Cited by: §1.
-  (2020) Transferable perturbations of deep feature distributions. arXiv preprint arXiv:2004.12519. Cited by: §1.
-  (2018) Adversarial attacks and defences competition. In The NIPS’17 Competition: Building Intelligent Systems, pp. 195–231. Cited by: §1.
-  (2020) Enhancing certified robustness via smoothed weighted ensembling. arXiv preprint arXiv:2005.09363. Cited by: §1.
-  (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083. Cited by: §1.
-  (2017) Universal adversarial perturbations. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1765–1773. Cited by: §2.3.
-  (2016) Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277. Cited by: §1.
-  (2021) Dynamic defense approach for adversarial robustness in deep neural networks via stochastic ensemble smoothed model. arXiv preprint arXiv:2105.02803. Cited by: §1, §2.1.
-  (2020) What’s hidden in a randomly weighted neural network?. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11893–11902. Cited by: §2.2.
-  (2020) Hydra: pruning adversarially robust neural networks. Advances in Neural Information Processing Systems 33, pp. 19655–19666. Cited by: §2.2.
-  (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199. Cited by: §1.
-  (2017) Evaluating robustness of neural networks with mixed integer programming. arXiv preprint arXiv:1711.07356. Cited by: §1.
-  (2020) DVERGE: diversifying vulnerabilities for enhanced robust generation of ensembles. Advances in Neural Information Processing Systems 33, pp. 5505–5515. Cited by: §2.1.
-  (2016) Wide residual networks. arXiv preprint arXiv:1605.07146. Cited by: §2.2.