Adversarial Image Generation and Training for Deep Convolutional Neural Networks

06/05/2020 · by Ronghua Shi et al. · Beijing Didi Infinity Technology and Development Co., Ltd.

Deep convolutional neural networks (DCNNs) have achieved great success in image classification, yet they can be highly vulnerable to adversarial attacks that apply small perturbations to images. Adversarial training based on adversarial image samples has been shown to improve the robustness and generalization of DCNNs. The aim of this paper is to develop a novel framework, based on information-geometric sensitivity analysis and particle swarm optimization, to improve two aspects of adversarial image generation and training for DCNNs. The first is customized generation of adversarial examples: attacks can be designed with user-specified options for the number of perturbed pixels, the misclassification probability, and the targeted incorrect class, so the approach is more flexible and effective at locating vulnerable pixels and also enjoys a certain adversarial universality. The second is targeted adversarial training: DCNN models are improved by training with adversarial information obtained from a manifold-based influence measure that is effective in detecting vulnerable images and pixels and allows for targeted attacks, thereby exhibiting enhanced adversarial defense in testing.


1 Introduction

Deep convolutional neural networks (DCNNs) have exhibited exceptional performance in image classification Krizhevsky et al. (2012); He et al. (2016); Huang et al. (2017), so they have been widely used in various real-world applications including face recognition Sun et al. (2015), self-driving cars Bojarski et al. (2016), and biomedical image processing Bakas et al. (2018), among many others Najafabadi et al. (2015). Despite these successes, DCNN classifiers can be easily attacked by adversarial examples with perturbations imperceptible to human vision Szegedy et al. (2013); Goodfellow et al. (2014); Su et al. (2019). This has motivated active research on adversarial attacks and defenses for DCNNs; see Wiyatno et al. (2019); Ren et al. (2020) for reviews.

Existing adversarial attacks can be categorized into white-box, gray-box, and black-box attacks. Adversaries in white-box attacks have full information about the targeted DCNN model, whereas their knowledge is limited to the model structure in gray-box attacks and only to the model's input and output in black-box attacks. Popular algorithms for white-box attacks include the fast gradient sign method Goodfellow et al. (2014); Kurakin et al. (2016), the projected gradient descent method Madry et al. (2017), and the Carlini and Wagner attack Carlini and Wagner (2017), among many others Szegedy et al. (2013); Papernot et al. (2016); Moosavi-Dezfooli et al. (2016). Defensive techniques against these attacks include heuristic and certificated defenses. Adversarial training, which simply incorporates adversarial samples into training, is currently the most successful heuristic defense for improving the robustness of DCNNs and shows better numerical performance than certificated defenses Ren et al. (2020).

In this paper, we propose a simple yet efficient framework for white-box adversarial image generation and training for DCNN classifiers. For generating an adversarial example of a given image, our framework provides user-customized options for the number of perturbed pixels, the misclassification probability, and the targeted incorrect class. To the best of our knowledge, this is the first approach offering all three of these desirable options. The freedom to specify the number of perturbed pixels allows users to conduct attacks at various pixel levels, such as one-pixel Su et al. (2019) and all-pixel Moosavi-Dezfooli et al. (2017) attacks. In particular, we adopt a recent perturbation-manifold based first-order influence (FI) measure Shu and Zhu (2019) to efficiently locate the most vulnerable pixels and thus increase the attack success rate. In contrast to traditional Euclidean-space based measures such as the Jacobian norm Novak et al. (2018) and Cook's local influence measure Cook (1986), the FI measure captures the intrinsic change of the perturbed objective function Zhu et al. (2007, 2011) and shows better performance in detecting vulnerable images and pixels. Our framework also allows users to specify the misclassification probability and/or the targeted incorrect class. A prespecified misclassification probability is rarely supported by existing approaches, which produce an adversarial example either near the model's decision boundary Moosavi-Dezfooli et al. (2016); Nazemi and Fieguth (2019) or with unguaranteed high confidence Nguyen et al. (2015). We tailor different loss functions to the three desirable options and their combinations, and apply particle swarm optimization (PSO) Kennedy and Eberhart (1995), a fast gradient-free method, to obtain the optimal perturbation. Moreover, we observe that our perturbations with high misclassification probability can exhibit a certain adversarial universality Moosavi-Dezfooli et al. (2017) across images from different classes. For adversarial training, we further use the FI measure on the training data to identify vulnerable images and their pixels that are prone to optional targeted classes; applying our customized generation approach then yields an adversarial dataset for training. Experiments show that our adversarial training significantly improves the robustness of pretrained DCNN classifiers. Figure 1 illustrates the flowchart of our framework.

We note that two recent papers Zhang et al. (2019); Mosli et al. (2019) also applied PSO to craft adversarial images. However, there are intrinsic distinctions. First, those two papers focus on black-box attacks, whereas ours is white-box. Zhang et al. (2019) only studied all-pixel attacks; Mosli et al. (2019) considered few-pixel attacks but searched over random chunks to locate vulnerable pixels, whereas we use the FI measure to discover those pixels directly. Moreover, targeted attacks are not considered in Mosli et al. (2019), and neither paper can prespecify a misclassification probability for the generated adversarial example. In contrast, our framework can design arbitrary-pixel-level, confidence-specified, and targeted or nontargeted attacks.

Our contributions are summarized as follows:

  • We propose a novel white-box framework for adversarial image generation and training for DCNN classifiers. It provides users with multiple options in pixel levels, confidence levels, and targeted classes for adversarial attacks and defenses.

  • We adopt a manifold-based FI measure to efficiently identify vulnerable images and pixels for adversarial perturbations.

  • We design different loss functions adaptive to user-customized specifications and apply PSO, a fast gradient-free optimization method, to obtain optimal perturbations.

  • We demonstrate the effectiveness of our framework via experiments on benchmark datasets and notice that our high-confidence perturbations may have certain adversarial universality.

Figure 1: Flowchart of our proposed framework.

2 Method

2.1 Perturbation-Manifold Based Influence Measure

Given an input image $\mathbf{x}$ and a DCNN classifier with parameters $\boldsymbol{\theta}$, the prediction probability for class $y$ is denoted by $P(y \mid \mathbf{x}, \boldsymbol{\theta})$. Let $\boldsymbol{\omega} = (\omega_1, \dots, \omega_p)^\top$ be a perturbation vector in an open set $\Omega \subseteq \mathbb{R}^p$, which can be imposed on any subvector of $(\mathbf{x}^\top, \boldsymbol{\theta}^\top)^\top$. Let the prediction probability under perturbation $\boldsymbol{\omega}$ be $P(y \mid \mathbf{x}, \boldsymbol{\theta}, \boldsymbol{\omega})$, with $P(y \mid \mathbf{x}, \boldsymbol{\theta}, \boldsymbol{\omega}_0) = P(y \mid \mathbf{x}, \boldsymbol{\theta})$ for the unperturbed baseline $\boldsymbol{\omega}_0$.

For sensitivity analysis of DCNNs, Shu and Zhu (2019) recently proposed an FI measure to delineate the 'intrinsic' perturbed change of the objective function on the Riemannian manifold of $\boldsymbol{\omega}$ Zhu et al. (2007, 2011). In contrast to traditional Euclidean-space based measures such as the Jacobian norm Novak et al. (2018) and Cook's local influence measure Cook (1986), this perturbation-manifold based measure enjoys the desirable invariance property under diffeomorphic (e.g., scaling) reparameterizations of perturbations and has better numerical performance in detecting vulnerable images and pixels.

Let $f(\boldsymbol{\omega})$ be an objective function of interest, for example, the cross-entropy $f(\boldsymbol{\omega}) = -\log P(y \mid \mathbf{x}, \boldsymbol{\theta}, \boldsymbol{\omega})$. The FI measure at $\boldsymbol{\omega}_0$ is defined by

$$\mathrm{FI}_{f(\boldsymbol{\omega}_0)} = \dot{f}_{\boldsymbol{\omega}_0}^\top\, G^{+}(\boldsymbol{\omega}_0)\, \dot{f}_{\boldsymbol{\omega}_0}, \qquad (1)$$

where $\dot{f}_{\boldsymbol{\omega}} = \partial f(\boldsymbol{\omega}) / \partial \boldsymbol{\omega}$, $G(\boldsymbol{\omega}) = \sum_{y} \big[\partial_{\boldsymbol{\omega}} \log P(y \mid \mathbf{x}, \boldsymbol{\theta}, \boldsymbol{\omega})\big] \big[\partial_{\boldsymbol{\omega}} \log P(y \mid \mathbf{x}, \boldsymbol{\theta}, \boldsymbol{\omega})\big]^\top P(y \mid \mathbf{x}, \boldsymbol{\theta}, \boldsymbol{\omega})$ with $\dot{f}_{\boldsymbol{\omega}_0}$ and $G(\boldsymbol{\omega}_0)$ evaluated at $\boldsymbol{\omega} = \boldsymbol{\omega}_0$, and $G^{+}(\boldsymbol{\omega}_0)$ is the pseudoinverse of $G(\boldsymbol{\omega}_0)$. A larger value of $\mathrm{FI}_{f(\boldsymbol{\omega}_0)}$ indicates that the DCNN model is more sensitive in $f$ to local perturbations around $\boldsymbol{\omega}_0$. We shall use the FI measure to discover vulnerable images or pixels for adversarial attacks.
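To make the computation concrete, the following minimal PyTorch sketch (not the authors' implementation) evaluates the FI measure in (1) for an additive perturbation on a chosen set of flattened pixel coordinates, using the cross-entropy as the objective $f$; the classifier wrapper `model`, assumed to return logits for a single-image batch, is an assumption.

```python
import torch

def fi_measure(model, x, y, pixel_idx):
    """Sketch of the FI measure in Eq. (1): FI = grad_f^T G^+ grad_f at omega_0 = 0,
    for an additive perturbation omega on the flattened coordinates in `pixel_idx`."""
    x = x.detach()
    k = len(pixel_idx)
    omega = torch.zeros(k, requires_grad=True)                 # perturbation at omega_0 = 0

    # Selection matrix mapping the k perturbed coordinates into the flattened image.
    S = torch.zeros(x.numel(), k)
    S[torch.as_tensor(pixel_idx), torch.arange(k)] = 1.0

    xp = (x.reshape(-1) + S @ omega).reshape(x.shape)          # perturbed image x + omega
    logp = torch.log_softmax(model(xp.unsqueeze(0)), dim=1).squeeze(0)

    f = -logp[y]                                               # cross-entropy objective f(omega)
    grad_f = torch.autograd.grad(f, omega, retain_graph=True)[0]

    # G(omega_0): Fisher information of the perturbed model with respect to omega.
    probs = logp.exp().detach()
    G = torch.zeros(k, k)
    for c in range(logp.numel()):
        g_c = torch.autograd.grad(logp[c], omega, retain_graph=True)[0]
        G += probs[c] * torch.outer(g_c, g_c)

    return (grad_f @ torch.linalg.pinv(G) @ grad_f).item()     # scalar FI value
```

In this sketch, letting `pixel_idx` cover all pixel coordinates gives an image-level FI, while calling the function once per pixel yields pixel-level FI maps like those shown later in Figures 2 and 3.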

2.2 Particle Swarm Optimization

Since its introduction by Kennedy and Eberhart (1995), the PSO algorithm has been successfully used to solve complex optimization problems in various fields of engineering and science Poli (2008); Eberhart and Shi (2001); Zhang et al. (2015). Let $F(\cdot)$ be an objective function, which will be specified in Section 2.3 for adversarial scenarios. The PSO algorithm searches via a population (called a swarm) of candidate solutions (called particles) over iterations to optimize the objective function $F$. Specifically, let

$$\mathbf{p}_i^{(t)} = \operatorname*{arg\,min}_{\mathbf{s} \in \{\mathbf{z}_i^{(0)}, \dots, \mathbf{z}_i^{(t)}\}} F(\mathbf{s}), \qquad (2)$$
$$\mathbf{g}^{(t)} = \operatorname*{arg\,min}_{\mathbf{s} \in \{\mathbf{p}_1^{(t)}, \dots, \mathbf{p}_M^{(t)}\}} F(\mathbf{s}), \qquad (3)$$

where $\mathbf{z}_i^{(t)}$ is the position of particle $i$ in a $d$-dimensional search space at iteration $t$, $M$ is the total number of particles, and $t$ is the current iteration. The position $\mathbf{z}_i^{(t+1)}$ of particle $i$ at iteration $t+1$ is updated with a velocity $\mathbf{v}_i^{(t+1)}$ by

$$\mathbf{v}_i^{(t+1)} = w\, \mathbf{v}_i^{(t)} + c_1 r_1 \big(\mathbf{p}_i^{(t)} - \mathbf{z}_i^{(t)}\big) + c_2 r_2 \big(\mathbf{g}^{(t)} - \mathbf{z}_i^{(t)}\big), \qquad \mathbf{z}_i^{(t+1)} = \mathbf{z}_i^{(t)} + \mathbf{v}_i^{(t+1)}, \qquad (4)$$

where $w$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$ and $r_2$ are uniformly distributed random variables in the range $[0, 1]$. Following Xu et al. (2019), we fix the values of $w$, $c_1$, and $c_2$. The movement of each particle is thus guided by its individual best known position and the entire swarm's best known position. We shall use the PSO algorithm to obtain desirable adversarial perturbations under various user requirements.
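For illustration, a minimal NumPy sketch of the PSO updates (2)-(4) is given below; the default hyperparameter values and the box constraint on the search space are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def pso_minimize(F, dim, n_particles=30, n_iter=100, w=0.7, c1=1.5, c2=1.5, bound=0.1, seed=0):
    """Minimize F over [-bound, bound]^dim with basic particle swarm optimization."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(-bound, bound, size=(n_particles, dim))     # particle positions
    v = np.zeros_like(z)                                        # particle velocities
    pbest = z.copy()                                            # personal best positions, Eq. (2)
    pbest_val = np.array([F(p) for p in pbest])
    gbest = pbest[pbest_val.argmin()].copy()                    # swarm best position, Eq. (3)

    for _ in range(n_iter):
        r1 = rng.uniform(size=(n_particles, dim))
        r2 = rng.uniform(size=(n_particles, dim))
        # Velocity and position update, Eq. (4).
        v = w * v + c1 * r1 * (pbest - z) + c2 * r2 * (gbest - z)
        z = np.clip(z + v, -bound, bound)
        vals = np.array([F(p) for p in z])
        improved = vals < pbest_val                             # update personal bests
        pbest[improved], pbest_val[improved] = z[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()                # update swarm best
    return gbest, pbest_val.min()
```

In the adversarial setting of Section 2.3, $F$ would be the adversarial objective and each particle position a candidate perturbation.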

2.3 Adversarial Image Generation

Given an image $\mathbf{x}$ with true label $y$, we combine FI and PSO to generate its adversarial image with user-customized options for the number of perturbed pixels, the misclassification probability, and the targeted class to which the image is misclassified, denoted by $K$, $p^*$, and $y_t$, respectively.

Denote the image by $\mathbf{x} = (x_1, \dots, x_N)^\top$. For an RGB image of $N_0$ pixels, we view the three channel components of a pixel as three separate pixels, so $N = 3N_0$ here. A default value of $K$ is used if it is not specified by the user.

If $K$ is specified but the targeted pixels are not given by the user, we first locate $K$ vulnerable pixels in $\mathbf{x}$ for perturbation. We compute the FI measure in (1) for each pixel based on the objective function

$$f(\omega_j) = -\log P(y \mid \mathbf{x}, \boldsymbol{\theta}, \omega_j), \qquad (5)$$

where $\omega_j$ is a scalar perturbation imposed on pixel $x_j$ alone. Denote by $x_{(k)}$ the pixel with the $k$-th largest FI value. We use $x_{(1)}, \dots, x_{(K)}$ as the pixels for the adversarial attack and let the perturbation be $\boldsymbol{\omega} = (\omega_{(1)}, \dots, \omega_{(K)})^\top$.

We then apply the PSO algorithm in (2)-(4) to obtain an optimal value of $\boldsymbol{\omega}$ that minimizes the adversarial objective function

$$F(\boldsymbol{\omega}) = \lambda_1 L_{\mathrm{mis}}(\boldsymbol{\omega}) + \lambda_2 \|\boldsymbol{\omega}\|,$$

where we assume $\boldsymbol{\omega} \in [-\epsilon, \epsilon]^K$; the bound $\epsilon$ constrains the range of the perturbation to guarantee the visual quality of the generated adversarial image compared with the original, $L_{\mathrm{mis}}(\boldsymbol{\omega})$ is a misclassification loss function, $\|\boldsymbol{\omega}\|$ represents the magnitude of the perturbation, and $\lambda_1$ and $\lambda_2$ are prespecified weights. To ensure the misleading nature of the generated adversarial sample, $\lambda_1$ is set large relative to $\lambda_2$ so that $L_{\mathrm{mis}}(\boldsymbol{\omega})$ is prioritized over $\|\boldsymbol{\omega}\|$.

We use different forms of $L_{\mathrm{mis}}$ to meet different user-customized requirements. If only $K$ is known, inspired by Meng (2018); Meng and Chen (2017), we let the misclassification loss function be

$$L_{\mathrm{mis}}(\boldsymbol{\omega}) = \max\big\{P(y \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}) - P(\hat{y}_{\boldsymbol{\omega}} \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}),\ 0\big\},$$

where $y_{(k)}$ is the label with the $k$-th largest prediction probability from the trained DCNN for the input image added with perturbation $\boldsymbol{\omega}$, and $\hat{y}_{\boldsymbol{\omega}}$ is the incorrect label with the largest prediction probability (i.e., $y_{(2)}$ if the perturbed image is still correctly classified and $y_{(1)}$ otherwise). Since a misclassification results in the minimum of $L_{\mathrm{mis}}$, this loss function encourages PSO to yield a valid perturbation. If the $\boldsymbol{\omega}$-perturbed image is prespecified with a misclassification probability $p^*$, we use the misclassification loss function

$$L_{\mathrm{mis}}(\boldsymbol{\omega}) = \max\big\{P(y \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}) - P(\hat{y}_{\boldsymbol{\omega}} \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}),\ 0\big\} + \big|P(\hat{y}_{\boldsymbol{\omega}} \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}) - p^*\big|.$$

Later in our experiments, we show that a high $p^*$ is helpful for generating universal adversarial perturbations applicable to images from the other classes. If a targeted class $y_t$ is given, we choose the misclassification loss function

$$L_{\mathrm{mis}}(\boldsymbol{\omega}) = \max\big\{P(y \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}) - P(y_t \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}),\ 0\big\}.$$

Furthermore, if both $p^*$ and $y_t$ are provided, we use

$$L_{\mathrm{mis}}(\boldsymbol{\omega}) = \max\big\{P(y \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}) - P(y_t \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}),\ 0\big\} + \big|P(y_t \mid \mathbf{x}+\boldsymbol{\omega}, \boldsymbol{\theta}) - p^*\big|.$$
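The options above can be assembled into a single objective for PSO. The sketch below follows the loss forms written above (a margin-style term plus an optional confidence-matching term); the function names and the weight values are illustrative assumptions, not the authors' code.

```python
import numpy as np

def misclassification_loss(probs, y_true, y_target=None, p_star=None):
    """Sketch of the customizable misclassification loss L_mis for a perturbed image,
    where `probs` is the DCNN's softmax output for that image."""
    order = np.argsort(probs)[::-1]                     # labels sorted by predicted probability
    runner_up = order[1] if order[0] == y_true else order[0]
    rival = y_target if y_target is not None else runner_up
    loss = max(probs[y_true] - probs[rival], 0.0)       # minimal once `rival` overtakes y_true
    if p_star is not None:
        loss += abs(probs[rival] - p_star)              # push the rival's confidence toward p*
    return loss

def adversarial_objective(probs, omega, y_true, y_target=None, p_star=None, lam1=10.0, lam2=1.0):
    """F(omega) = lam1 * L_mis + lam2 * ||omega||, with lam1 set much larger than lam2."""
    return lam1 * misclassification_loss(probs, y_true, y_target, p_star) + lam2 * np.linalg.norm(omega)
```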

Our procedure for generating a customized adversarial image is illustrated in Figure 1 (b)-(e) and also summarized in Algorithm 1.

1: Input: Image $\mathbf{x}$ and label $y$, number of perturbed pixels $K$, (optional) indices of perturbed pixels, (optional) misclassification probability $p^*$, (optional) targeted incorrect label $y_t$, hyperparameters in PSO, and maximum iteration number $T$
2: If perturbed pixels are not specified, compute FI by (1) and (5) for all pixels to locate the $K$ pixels for perturbation $\boldsymbol{\omega}$;
3: Initialize the $M$ particles in PSO (candidate perturbations) with positions $\mathbf{z}_i^{(0)}$ and velocities $\mathbf{v}_i^{(0)}$;
4: Repeat
5: for particle $i = 1, \dots, M$ do
6:   Update $\mathbf{v}_i^{(t+1)}$ and $\mathbf{z}_i^{(t+1)}$ by (4);
7:   Update $\mathbf{p}_i^{(t+1)}$ by (2);
8: end for
9: Update $\mathbf{g}^{(t+1)}$ by (3);
10: Until convergence or iteration $t = T$
11: Output: Adversarial image $\tilde{\mathbf{x}} = \mathbf{x} + \boldsymbol{\omega}^*$, where $\boldsymbol{\omega}^*$ is the best position $\mathbf{g}$ found by PSO.
Algorithm 1 Adversarial image generation
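A hypothetical end-to-end driver for Algorithm 1 could be organized as follows, with the classifier, FI computation, and PSO solver passed in as callables (for instance, the `fi_measure` and `pso_minimize` sketches above); all names and defaults are illustrative.

```python
import numpy as np

def generate_adversarial_image(x, y, predict_proba, pixel_fi, pso_minimize,
                               K, y_target=None, p_star=None, eps=0.1,
                               lam1=10.0, lam2=1.0):
    """Sketch of Algorithm 1. `predict_proba(image)` returns the DCNN softmax vector,
    `pixel_fi(x, y)` returns per-pixel FI values (Eqs. (1) and (5)), and
    `pso_minimize(F, dim, bound)` returns the minimizing perturbation (Section 2.2)."""
    flat = x.reshape(-1)
    idx = np.argsort(pixel_fi(x, y))[::-1][:K]          # K most vulnerable pixels by FI

    def F(omega):                                       # adversarial objective of Section 2.3
        xp = flat.copy()
        xp[idx] += omega
        probs = predict_proba(xp.reshape(x.shape))
        order = np.argsort(probs)[::-1]
        rival = y_target if y_target is not None else (order[1] if order[0] == y else order[0])
        loss = max(probs[y] - probs[rival], 0.0)        # misclassification loss L_mis
        if p_star is not None:
            loss += abs(probs[rival] - p_star)          # match the prespecified confidence p*
        return lam1 * loss + lam2 * np.linalg.norm(omega)

    omega_star, _ = pso_minimize(F, dim=K, bound=eps)   # PSO search within [-eps, eps]^K
    x_adv = flat.copy()
    x_adv[idx] += omega_star
    return x_adv.reshape(x.shape)
```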

2.4 Adversarial Training

We aim to create a set of adversarial images for a given trained DCNN model and then fine-tune the model on the training data augmented with this adversarial dataset. To include as many adversarial images as possible, we do not specify a value for $p^*$ in Algorithm 1. Note that Algorithm 1 may not have a feasible solution when given restrictive parameters, such as a small number of perturbed pixels $K$ or a small perturbation bound $\epsilon$. To efficiently generate a batch of adversarial images, we first select a set of potentially vulnerable images through some modifications to Algorithm 1.

Specifically, given an image dataset, thresholds $\tau_1$, $\tau_2$, $\tau_3$, and targeted incorrect labels (if not given, the label with the second largest prediction probability is used), we first find $\mathcal{S}$, the set of all correctly classified images that have image-level FI $\geq \tau_1$ (with the perturbation imposed on all pixels) and prediction probability $\leq \tau_2$. For each image in $\mathcal{S}$, we generate its adversarial image by Algorithm 1, in which $K$ is the number of pixels with FI $\geq \tau_3$ and $y_t$ is set to the given targeted label. The generated adversarial images form an adversarial dataset. The whole procedure of our adversarial training is illustrated in Figure 1 and detailed in Algorithm 2.

1: Input: Image set $\{\mathbf{x}_i\}$ and labels $\{y_i\}$, thresholds $\tau_1$, $\tau_2$, $\tau_3$, targeted incorrect labels $\{y_{t,i}\}$, and hyperparameters in Algorithm 1
2: For each correctly classified $\mathbf{x}_i$, compute the image-level FI (denoted by $\mathrm{FI}_i$) by (1) with the cross-entropy objective and the perturbation imposed on all pixels of $\mathbf{x}_i$;
3: Determine $\mathcal{S} = \{\mathbf{x}_i : \mathrm{FI}_i \geq \tau_1 \text{ and } P(y_i \mid \mathbf{x}_i, \boldsymbol{\theta}) \leq \tau_2\}$;
4: For each $\mathbf{x}_i \in \mathcal{S}$, generate its adversarial image $\tilde{\mathbf{x}}_i$ by Algorithm 1 with $K$ = # of pixels with FI $\geq \tau_3$ and $y_t = y_{t,i}$.
5: Output: Adversarial dataset $\{(\tilde{\mathbf{x}}_i, y_i) : \mathbf{x}_i \in \mathcal{S}\}$
Algorithm 2 Adversarial dataset generation
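A corresponding sketch of Algorithm 2 is given below, again with assumed helper callables (an image-level FI function, a per-pixel FI function, and an Algorithm 1 generator such as the one sketched above); it is an illustration, not the authors' code.

```python
import numpy as np

def generate_adversarial_dataset(images, labels, predict_proba, image_fi, pixel_fi,
                                 generate_adv, tau1, tau2, tau3, targets=None):
    """Sketch of Algorithm 2. `image_fi(x, y)` is the image-level FI, `pixel_fi(x, y)`
    returns an array of per-pixel FI values, and `generate_adv(x, y, K, y_target)`
    runs Algorithm 1 for one image."""
    adv_set = []
    for i, (x, y) in enumerate(zip(images, labels)):
        probs = predict_proba(x)
        if probs.argmax() != y:
            continue                                    # keep only correctly classified images
        if image_fi(x, y) < tau1 or probs[y] > tau2:
            continue                                    # keep vulnerable images near the boundary
        y_t = targets[i] if targets is not None else np.argsort(probs)[::-1][1]
        K = int((pixel_fi(x, y) >= tau3).sum())         # number of pixels with FI above tau3
        adv_set.append((generate_adv(x, y, K=K, y_target=y_t), y))
    return adv_set
```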

3 Experiments

We conduct experiments on the two benchmark datasets MNIST and CIFAR10 using the ResNet32 model He et al. (2016). Data augmentation is used, including random horizontal and vertical shifts up to 12.5% of image height and width for both datasets, and additionally random horizontal flip for CIFAR10 data. Table 1 shows the prediction accuracy of our trained ResNet32 for the two datasets.
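The augmentation described above could be written, for example, with torchvision transforms; the exact pipeline (library, interpolation, normalization) is not specified in the paper, so the following is only an illustrative sketch.

```python
from torchvision import transforms

# Illustrative augmentation pipelines matching the description above (not the authors' exact code):
# shifts of up to 12.5% of the image size for both datasets, plus horizontal flips for CIFAR10.
mnist_augment = transforms.Compose([
    transforms.RandomAffine(degrees=0, translate=(0.125, 0.125)),
    transforms.ToTensor(),
])
cifar10_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(degrees=0, translate=(0.125, 0.125)),
    transforms.ToTensor(),
])
```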

MNIST CIFAR10
Model Training (n=60k) Testing (n=10k) Training (n=50k) Testing (n=10k)
Original 99.76% 99.25% 98.82% 91.28%
Adversarial 99.68% 99.32% 99.10% 91.32%
Table 1: Accuracy of original and adversarial trained ResNet32 models
Figure 2: Pixel-level FI maps of an MNIST image for different target classes.
Figure 3: Pixel-level FI maps of a CIFAR10 image’s RGB channels for different target classes. Class labels: (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) = (plane, car, bird, cat, deer, dog, frog, horse, ship, truck).

3.1 Customized Adversarial Image Generation

We consider two images, one from MNIST and one from CIFAR10, that are easy to recognize visually yet have large image-level FI; they are shown in Figures 2 and 3 with prediction-probability graphs and pixel-level FI maps. The probability bar graphs suggest candidate misclassification classes that can be used as $y_t$. The FI maps indicate the vulnerability of each pixel to local perturbation and are useful for locating pixels to attack.

We first evaluate the performance of Algorithm 1 (cf. Figure 1 (b)-(e)) in generating adversarial examples of the two images under different requirements on $K$, $p^*$, and $y_t$. Figures 4 and 5 show the generated adversarial images with the corresponding perturbation maps. Perturbations 1-3 consider three settings of $K$, with no specification of $p^*$ or $y_t$. For Perturbations 4-6, we only specify three levels of $p^*$, assign no value to $y_t$, and tune $K$ (the number of pixels with FI above a threshold) and $\epsilon$ to obtain feasible solutions from PSO. Perturbations 7-9 are prespecified with three targeted classes $y_t$ for MNIST and three for CIFAR10, respectively, with $K$ being the number of pixels with FI above a threshold and no value specified for $p^*$. The detailed parameter settings for Algorithm 1 are provided in the Supplementary Material. The generated adversarial images have visually negligible differences from the originals and satisfy the prespecified requirements.

Figure 4: Adversarial examples of an MNIST image. Perturbations 1-3 vary the number of perturbed pixels $K$; Perturbations 4-6 vary the prespecified misclassification probability $p^*$; Perturbations 7-9 vary the targeted class $y_t$. Perturbation maps are followed by the resulting adversarial images.
Figure 5: Adversarial examples of a CIFAR10 image. Perturbations 1-3 vary the number of perturbed pixels $K$ and the attacked pixels (framed in the attacked channel's color); Perturbations 4-6 vary the prespecified misclassification probability $p^*$; Perturbations 7-9 vary the targeted class $y_t$. Perturbation maps are followed by the resulting adversarial images.

We also investigate the adversarial universality of the Perturbations 6 shown in Figures 4 and 5, each of which yields a 99% prediction probability for Class 4. Table 2 shows the proportions of originally correctly classified images that are misclassified after the perturbation is added. For MNIST, the error rates are at least 14.3% for all classes and reach 100% for some, with a total rate above 87.5% in both the training and testing sets. In particular, a remarkably large proportion of each class is misclassified to Class 4, with total rates of 62.2% and 64.5% for the training and testing sets, respectively. Perturbation 6 for CIFAR10 also exhibits a certain extent of adversarial universality, with non-targeted total error rates of 3.92% and 6.19% and Class-4-targeted total rates of 0.92% and 1.32% for the training and testing sets, respectively. Figure 6 displays images from the other nine classes that are originally correctly classified with high probability but are misclassified (most with high probability) to Class 4 after Perturbation 6 is added. These results indicate that our method may generate a universal adversarial perturbation, which in particular has the potential to misclassify images from different classes to the same specific class. The existence of universal adversarial perturbations may be attributed to the geometric correlations of decision boundaries between classes Moosavi-Dezfooli et al. (2017). An adversarial perturbation with very high confidence may carry salient features of its resulting class and thus may have strong power to drag other images toward the decision boundary.
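The universality evaluation reported in Table 2 amounts to applying one fixed perturbation to every originally correct image and counting label changes; a minimal sketch is given below (the pixel-range clipping is an assumption).

```python
import numpy as np

def universality_rates(images, labels, predict_proba, perturbation, target_class=4):
    """Sketch of the evaluation behind Table 2: apply one fixed perturbation to every
    correctly classified image and record how often the prediction changes."""
    n_correct = n_flipped = n_to_target = 0
    for x, y in zip(images, labels):
        if predict_proba(x).argmax() != y:
            continue                                    # only originally correct images count
        n_correct += 1
        y_adv = predict_proba(np.clip(x + perturbation, 0.0, 1.0)).argmax()
        n_flipped += (y_adv != y)
        n_to_target += (y_adv == target_class)
    return n_flipped / n_correct, n_to_target / n_correct
```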

True Class                        0     1     2     3     5     6     7     8     9   Total
MNIST Training   Misclassified   92.9   100  99.3  91.1  96.0  92.3   100  15.3  99.9   87.7
MNIST Training   Misclass. to 4  81.1  36.9  55.8  84.3  75.5  49.6  95.3  14.2  68.8   62.2
MNIST Testing    Misclassified   91.7   100  99.4  94.4  97.9  91.1   100  14.3   100   87.9
MNIST Testing    Misclass. to 4  79.7  37.3  58.8  88.3  78.5  57.4  94.8  13.7  75.0   64.5
CIFAR10 Training Misclassified   8.46  2.74  2.90  7.27  5.77  0.42  4.82  2.07  1.00   3.92
CIFAR10 Training Misclass. to 4  1.27  0.04  0.89  1.65  1.45  0.12  2.56  0.30  0.06   0.92
CIFAR10 Testing  Misclassified   11.2  5.62  6.27 12.36  8.29  0.94  6.81  3.05  2.61   6.19
CIFAR10 Testing  Misclass. to 4  2.17  0.10  1.48  2.52  2.34  0.31  2.81  0.32  0.21   1.32
Table 2: Proportions (in %) of originally correctly classified images that are misclassified after Perturbation 6 in Figures 4 and 5 is added.
Figure 6: Results of universal adversarial perturbations (Perturbations 6 in Figures 4 and 5).

3.2 Adversarial Training

We use Algorithm 2 to generate adversarial datasets for adversarial training. Figure 7 shows the Manhattan plots of image-level FIs for correctly classified images, and Figure 8 presents the heatmaps of the confusion matrices. The distributions of image-level FIs and the patterns of misclassification are very close between the training and test datasets for both MNIST and CIFAR10. Hence, our adversarial training is expected to be useful against unseen adversarial examples generated by similar mechanisms at test time.

Based on the two figures, for selecting vulnerable images (cf. Figure 1(a)), we choose the thresholds $\tau_1$ and $\tau_2$, let the targeted label of each selected image be the most frequently misclassified class of its true class, and set $\tau_3$ in Algorithm 2 accordingly. The resulting image set $\mathcal{S}$ is likely to be near the decision boundaries of the trained classifier. We generate adversarial datasets Adv1 (n=136) and Adv2 (n=26) from the training and testing sets of MNIST, respectively, and Adv3 (n=244) and Adv4 (n=146) from those of CIFAR10. Adv1 and Adv3 are used for adversarial training (cf. Figure 1(f)), whereas Adv2 and Adv4 test the adversarially trained models. The detailed parameter settings for Algorithm 2 used to generate these datasets are given in the Supplementary Material.

The adversarially trained ResNet32 models are obtained by continuing to train the original trained models on the training data augmented with Adv1 and Adv3, respectively, for an additional 30 epochs for MNIST and 50 epochs for CIFAR10. The results of adversarial training are reported in Tables 1 and 3. Since the adversarial datasets (a few hundred images at most) are much smaller than the original testing datasets (n=10k each) and the original trained models already have high accuracy, the results in Table 1 show only slight improvement on the test datasets. However, as shown in Table 3, the adversarial training on Adv1 and Adv3 indeed benefits the defense of the fine-tuned ResNet32 models against adversarial attacks. The accuracy is dramatically improved from 0.00% to 83.82% and 88.93% on Adv1 and Adv3, respectively, and to 76.92% and 63.01% on the test-data-derived Adv2 and Adv4, respectively. We also observe increases of 0.27% and 0.94%, respectively, in accuracy on the combined data of the original test set and its adversarial samples for MNIST and CIFAR10. These results indicate that our approach can significantly improve the adversarial defense of DCNN classifiers.
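The adversarial training step itself is plain fine-tuning on the augmented data; a minimal PyTorch sketch is given below, with the optimizer, learning rate, and batch size as illustrative assumptions rather than the paper's settings.

```python
import torch
from torch.utils.data import ConcatDataset, DataLoader

def adversarial_finetune(model, train_set, adv_set, epochs=30, lr=1e-3, device="cpu"):
    """Sketch of the adversarial training step: continue training the pretrained model
    on the original training data augmented with the adversarial dataset."""
    loader = DataLoader(ConcatDataset([train_set, adv_set]), batch_size=128, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model
```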

Figure 7: Manhattan plots of image-level FIs for correctly classified images.
Figure 8: Heatmaps of confusion matrices.
MNIST CIFAR10
Model Adv1 Tr.+Adv1 Adv2 Ts.+Adv2 Adv3 Tr.+Adv3 Adv4 Ts.+Adv4
(n=136) (n=60k+136) (n=26) (n=10k+26) (n=244) (n=50k+244) (n=146) (n=10k+146)
Original 0.00% 99.53% 0.00% 98.99% 0.00% 98.34% 0.00% 89.97%
Adversarial 83.82% 99.64% 76.92% 99.26% 88.93% 99.05% 63.01% 90.91%
Table 3: Accuracy of original and adversarial trained ResNet32 models on adversarial datasets

4 Conclusion

This paper introduced an FI- and PSO-based framework for adversarial image generation and training for DCNN classifiers that accounts for the user-specified number of perturbed pixels, misclassification probability, and/or targeted incorrect class. We used the perturbation-manifold based FI measure to efficiently detect vulnerable images and pixels and thus increase the attack success rate. We designed different misclassification loss functions to meet various user specifications and obtained the optimal perturbation with the fast, gradient-free PSO algorithm. Experiments showed the good performance of our approach in generating customized adversarial samples and in the associated adversarial training for DCNNs.

Broader Impact

DCNN models for image classification are widely used in real-world applications such as self-driving cars and face recognition for identification, but they can be vulnerable to adversarial attacks with small perturbations to the original images, raising safety and security concerns in these applications. Our proposed white-box framework for adversarial image generation and training for DCNN classifiers may help developers test and fortify their DCNN-based products to improve reliability in real-world applications.

References

  • [1] S. Bakas, M. Reyes, A. Jakab, S. Bauer, M. Rempfler, A. Crimi, R. T. Shinohara, C. Berger, S. M. Ha, M. Rozycki, et al. (2018) Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. arXiv preprint arXiv:1811.02629. Cited by: §1.
  • [2] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba (2016) End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316. Cited by: §1.
  • [3] N. Carlini and D. Wagner (2017) Towards evaluating the robustness of neural networks. In 2017 ieee symposium on security and privacy (sp), pp. 39–57. Cited by: §1.
  • [4] R. D. Cook (1986) Assessment of local influence. Journal of the Royal Statistical Society: Series B (Methodological) 48 (2), pp. 133–155. Cited by: §1, §2.1.
  • [5] R. C. Eberhart and Y. Shi (2001) Particle swarm optimization: developments, applications and resources. In Proceedings of the 2001 congress on evolutionary computation (IEEE Cat. No. 01TH8546), Vol. 1, pp. 81–86. Cited by: §2.2.
  • [6] I. J. Goodfellow, J. Shlens, and C. Szegedy (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572. Cited by: §1, §1.
  • [7] K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778. Cited by: §1, §3.
  • [8] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger (2017) Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708. Cited by: §1.
  • [9] J. Kennedy and R. Eberhart (1995) Particle swarm optimization. In Proceedings of ICNN’95-International Conference on Neural Networks, Vol. 4, pp. 1942–1948. Cited by: §1, §2.2.
  • [10] A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pp. 1097–1105. Cited by: §1.
  • [11] A. Kurakin, I. Goodfellow, and S. Bengio (2016) Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533. Cited by: §1.
  • [12] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083. Cited by: §1.
  • [13] D. Meng and H. Chen (2017) Magnet: a two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 135–147. Cited by: §2.3.
  • [14] D. Meng (2018) Generating deep learning adversarial examples in black-box scenario. Electronic Design Engineering 26 (24), pp. 164–173. Cited by: §2.3.
  • [15] S. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard (2017) Universal adversarial perturbations. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1765–1773. Cited by: §1, §3.1.
  • [16] S. Moosavi-Dezfooli, A. Fawzi, and P. Frossard (2016) Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2574–2582. Cited by: §1, §1.
  • [17] R. Mosli, M. Wright, B. Yuan, and Y. Pan (2019) They might not be giants: crafting black-box adversarial examples with fewer queries using particle swarm optimization. arXiv preprint arXiv:1909.07490. Cited by: §1.
  • [18] M. M. Najafabadi, F. Villanustre, T. M. Khoshgoftaar, N. Seliya, R. Wald, and E. Muharemagic (2015) Deep learning applications and challenges in big data analytics. Journal of Big Data 2 (1), pp. 1. Cited by: §1.
  • [19] A. Nazemi and P. Fieguth (2019) Potential adversarial samples for white-box attacks. arXiv preprint arXiv:1912.06409. Cited by: §1.
  • [20] A. Nguyen, J. Yosinski, and J. Clune (2015) Deep neural networks are easily fooled: high confidence predictions for unrecognizable images. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 427–436. Cited by: §1.
  • [21] R. Novak, Y. Bahri, D. A. Abolafia, J. Pennington, and J. Sohl-Dickstein (2018) Sensitivity and generalization in neural networks: an empirical study. In International Conference on Learning Representations, Note: arXiv preprint arXiv:1802.08760 Cited by: §1, §2.1.
  • [22] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami (2016) The limitations of deep learning in adversarial settings. In 2016 IEEE European symposium on security and privacy (EuroS&P), pp. 372–387. Cited by: §1.
  • [23] R. Poli (2008) Analysis of the publications on the applications of particle swarm optimisation. Journal of Artificial Evolution and Applications 2008. Cited by: §2.2.
  • [24] K. Ren, T. Zheng, Z. Qin, and X. Liu (2020) Adversarial attacks and defenses in deep learning. Engineering. Cited by: §1, §1.
  • [25] H. Shu and H. Zhu (2019) Sensitivity analysis of deep neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 4943–4950. Cited by: §1, §2.1.
  • [26] J. Su, D. V. Vargas, and K. Sakurai (2019) One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation 23 (5), pp. 828–841. Cited by: §1, §1.
  • [27] Y. Sun, D. Liang, X. Wang, and X. Tang (2015) Deepid3: face recognition with very deep neural networks. arXiv preprint arXiv:1502.00873. Cited by: §1.
  • [28] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199. Cited by: §1, §1.
  • [29] R. R. Wiyatno, A. Xu, O. Dia, and A. de Berker (2019) Adversarial examples in modern machine learning: a review. arXiv preprint arXiv:1911.05268. Cited by: §1.
  • [30] G. Xu, Q. Cui, X. Shi, H. Ge, Z. Zhan, H. P. Lee, Y. Liang, R. Tai, and C. Wu (2019) Particle swarm optimization based on dimensional learning strategy. Swarm and Evolutionary Computation 45, pp. 33–51. Cited by: §2.2.
  • [31] Q. Zhang, K. Wang, W. Zhang, and J. Hu (2019) Attacking black-box image classifiers with particle swarm optimization. IEEE Access 7, pp. 158051–158063. Cited by: §1.
  • [32] Y. Zhang, S. Wang, and G. Ji (2015) A comprehensive survey on particle swarm optimization algorithm and its applications. Mathematical Problems in Engineering 2015. Cited by: §2.2.
  • [33] H. Zhu, J. G. Ibrahim, S. Lee, and H. Zhang (2007) Perturbation selection and influence measures in local influence analysis. The Annals of Statistics 35 (6), pp. 2565–2588. Cited by: §1, §2.1.
  • [34] H. Zhu, J. G. Ibrahim, and N. Tang (2011) Bayesian influence analysis: a geometric approach. Biometrika 98 (2), pp. 307–323. Cited by: §1, §2.1.