Query-efficient Meta Attack to Deep Neural Networks

06/06/2019 ∙ by Jiawei Du, et al. ∙ National University of Singapore

Recently, several adversarial attack methods against black-box deep neural networks (DNNs) have been proposed, and they serve as an excellent test bed for investigating safety issues with DNNs. These methods generally take in the query and corresponding feedback from the targeted DNN model and infer suitable attack patterns accordingly. However, because they lack prior knowledge and leverage the query information inefficiently, these methods are mostly query-intensive. In this work, we propose a meta attack strategy that is capable of attacking the target black-box model with far fewer queries. Its high query efficiency comes from prior abstraction obtained by training a meta attacker, which can speed up the search for adversarial examples significantly. Extensive experiments on MNIST, CIFAR10 and tiny-ImageNet demonstrate that our meta-attack method can remarkably reduce the number of model queries without sacrificing attack performance. Moreover, the obtained meta attacker is not restricted to a particular model but can be reused easily, with fast adaptive ability, to attack a variety of models.




1 Introduction

Deep neural networks have been widely applied and have achieved state-of-the-art performance on a variety of tasks including image recognition [11, 25], object detection [8, 26], segmentation [17, 31], and speech recognition [12, 10]. Despite their great success, they are found to be susceptible to adversarial attacks and suffer dramatic performance degradation when faced with adversarial examples, even when only tiny and imperceptible noise is imposed on the inputs [30].

To investigate the safety and robustness of deep neural networks, many adversarial attack methods have been developed, which can be roughly categorized into white-box [9, 19, 2, 18] and black-box based ones [24, 1, 20]. In the white-box attack setting, the target model is transparent to the attacker and imperceptible adversarial noise can be easily crafted to mislead this model by leveraging its gradient information [9]. In contrast, in the black-box setting, the structure and parameters of the target DNN model are invisible, and the adversary can only access the input-output pair in each query. With a sufficient number of queries, black-box methods utilize the returned information to attack the target model, generally by estimating the gradient [13, 20] or performing evolutionary optimization [13].

Black-box attacks are more feasible in realistic scenarios than white-box attacks, but they are usually much more query-intensive. This drawback arises largely because the information returned for each queried example is sparse and very limited. During optimization, these methods simply combine the information from two sequential iterations in a brute-force manner and ignore the implicit but profound message it carries. The returned information is thus not fully leveraged. Although [13] emphasizes that query-efficient algorithms for generating adversarial examples are very meaningful in practice, how to enhance query efficiency for black-box attacks remains an open problem. Studying this problem would also help better understand safety and robustness issues of deep neural networks.

We address a query-efficiency concerned attack problem in this work. In particular, we consider the case where only top-probability scores are accessible from the target black-box model. In this practical but challenging scenario, we aim at three important objectives: lower query number, higher success rate and smaller noise magnitude. We develop a meta-learning based attack method, which applies meta learning to obtain prior information from successful attack patterns and uses the prior for efficient optimization. Specifically, we propose to train a meta attacker model through meta learning [23], inspired by its success in solving few-shot learning problems. We first assemble several classification models and utilize them to get pairs of (images, gradients) for a max-margin logit classification loss. Then we use the data pairs of each classification model to train the meta attacker. After obtaining the attacker, we apply it to attacking a new black-box model to accelerate the search for adversarial examples, together with coordinate-wise gradient estimation and fine-tuning. Different from previous methods, we use the estimated gradient not only to update the adversarial noise but also to fine-tune the well-trained attacker. After few-shot fine-tuning, the attacker is able to simulate the gradient distribution of the target model.

We evaluate our method on the MNIST [16], CIFAR10 [15] and tiny-ImageNet [27] datasets, comparing it with Zoo [3], Decision-Boundary [1] and Opt-attack [4]. In both targeted and untargeted settings, we achieve attack success rates and adversarial perturbations comparable to all baselines but with a significantly reduced query number, which demonstrates the superior query efficiency of our meta attacker.

2 Related Work

Many existing attack methods fall within the white-box setting, where detailed information about the model, e.g. gradients and losses, is provided. Some classical examples are the Fast-Gradient Sign Method (FGSM) [9], IFGSM [18], DeepFool [19] and the C&W attack [2]. [24] is the first work to explore black-box attack. It tries to construct a substitute model with augmented data and transfer the black-box attack problem to a white-box one. However, its attack performance is very poor due to the limited transferability of adversarial examples between two different models. [1] considers a more restricted case where only top-1 prediction classes are returned and proposes a random-walk based attack method around the decision boundary. This method dispenses with class prediction scores and hence requires extensive model queries. Zoo [3] is a black-box version of the C&W attack, achieving an attack success rate and visual quality comparable to many white-box attack methods. However, its coordinate-wise gradient estimation requires extensive model evaluations. More recently, [13] proposes a query-limited setting with noise considered, and uses a natural evolution strategy (NES) to enhance query efficiency. Though this method successfully controls the query number, the noise imposed is larger than average. [20] proposes a novel local-search based technique to construct a numerical approximation to the network gradient, which is then carefully used to construct a small set of pixels in an image to perturb. It suffers from a problem similar to that of [3] for pixel-wise attack. [4] considers a hard-label black-box setting and formulates the problem as real-valued optimization that is solved by a zeroth-order optimization algorithm.

We briefly introduce some works on meta-learning related to our work. Meta-learning is a process of learning how to learn. A meta-learning algorithm takes in a distribution of tasks, each being a learning problem, and produces a quick learner that can generalize from a small number of examples. Meta-learning has recently become popular for its fast adaptive ability. MAML [6] is the first to propose this idea. Recently, a simplified algorithm, Reptile [23], which is an approximation to first-order MAML, has been proposed, achieving higher computational efficiency and consuming less memory. With these superior properties, meta learning has been applied to adversarial attack methods [32, 5]. [32] considers attacking a graph neural network by modifying training data to worsen the performance after training. [5] only investigates the susceptibility of MAML to adversarial attacks and the transferability of the obtained meta model to a specific task, with only limited results obtained.

3 Method

3.1 Black-box Attack Schemes

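We first formulate the black-box attack problem and introduce the widely used solutions. We use $(x, t)$ to denote the pair of a natural image and its true label, and $(\hat{x}, \hat{t})$ to denote the adversarially perturbed version of $x$ and the label returned by the classification model under attack, $\mathcal{M}_{tar}$. The problem of finding an adversarial example $\hat{x}$ with imperceptible difference from $x$ that makes the target model prediction fail can be formulated as

$$\min_{\hat{x}} \; \ell(\hat{x}) = \|\hat{x} - x\|_p + c \cdot f(\mathcal{M}_{tar}(\hat{x}), t), \tag{1}$$

where $\|\cdot\|_p$ denotes the $\ell_p$ norm. $\mathcal{M}_{tar}(\hat{x})$ is the logit or probability returned by the target model $\mathcal{M}_{tar}$. The first term imposes a constraint on the perturbation magnitude to enforce high similarity between the clean image $x$ and the adversarial one $\hat{x}$. The second term measures whether $\hat{x}$ is adversarial or not. One simple example is $f(\mathcal{M}_{tar}(\hat{x}), t) = \max\big[[\mathcal{M}_{tar}(\hat{x})]_t - \max_{j \neq t} [\mathcal{M}_{tar}(\hat{x})]_j,\; 0\big]$. $c$ is a trade-off parameter.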

White-box attack can utilize the gradient $\nabla_{\hat{x}} \ell(\hat{x})$ to perform gradient descent on $\hat{x}$. However, in the black-box attack considered in this work, $\nabla_{\hat{x}} \ell(\hat{x})$ is not attainable. One can only query the model under attack and receive feedback from model evaluation. In each model evaluation, the target model receives an image and returns some useful information, such as a hard label, logit, or probability score. Each model evaluation is one query, and the information obtained from each query is always limited. How to design a query-efficient algorithm is therefore a major concern.

Recently, zeroth-order optimization approaches circumvent the challenge of lacking the precise gradient map by approximating it via model evaluation [7, 22]. In black-box attack, one can apply such an estimation strategy to every pixel of an image to obtain the corresponding gradient information. Formally, one can estimate the gradient of the $i$-th coordinate by

$$\hat{g}_i = \frac{\partial \ell(\hat{x})}{\partial \hat{x}_i} \approx \frac{\ell(\hat{x} + h e_i) - \ell(\hat{x} - h e_i)}{2h}, \tag{2}$$

where $h$ is a very small number and $e_i$ is the $i$-th elementary basis vector. After obtaining the estimated gradient, classical optimization algorithms [21, 14] can be used to infer the adversarial examples. Though the estimated gradient may not be accurate, the convergence of these zeroth-order methods is guaranteed under mild assumptions [7, 22].
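The coordinate-wise estimate described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: `estimate_gradient`, `loss_fn` and the quadratic toy loss are hypothetical names chosen here, and the symmetric-difference form with two evaluations per coordinate is assumed from the surrounding description rather than taken from the authors' code.

```python
import numpy as np

def estimate_gradient(loss_fn, x, coords=None, h=1e-4):
    """Symmetric-difference gradient estimate; returns (grad, n_queries)."""
    flat = np.asarray(x, dtype=np.float64).ravel()
    if coords is None:
        coords = range(flat.size)  # full gradient: visit every coordinate
    grad = np.zeros_like(flat)
    n_queries = 0
    for i in coords:
        e_i = np.zeros_like(flat)
        e_i[i] = 1.0  # i-th elementary basis vector
        plus = loss_fn((flat + h * e_i).reshape(x.shape))
        minus = loss_fn((flat - h * e_i).reshape(x.shape))
        grad[i] = (plus - minus) / (2.0 * h)
        n_queries += 2  # each loss evaluation counts as one model query
    return grad.reshape(x.shape), n_queries

# Toy stand-in for the black-box loss: a quadratic whose true gradient is 2 * x.
x = np.array([[1.0, -2.0], [0.5, 3.0]])
g, used = estimate_gradient(lambda z: float(np.sum(z ** 2)), x)
print(g, used)
```

Passing a short `coords` list estimates only those entries, which is the lever a query-efficient attack can use to keep the query count small.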

However, naively applying the above gradient estimation to black-box attack would suffer from dimension explosion since it is coordinate-wise. For example, consider attacking a DNN classification model using an image from the tiny-ImageNet dataset. The image size is $64 \times 64 \times 3$ and one needs to compute two nearby function values at each coordinate. Thus, obtaining a full gradient estimate consumes $2 \times 64 \times 64 \times 3 = 24{,}576$ queries and model evaluations. This is not affordable in practice, and this high query cost is the fundamental limitation of existing black-box attack methods [3]. In this work, we address this limitation by developing a query-efficient meta-learning based attack model.
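To make the cost concrete, the arithmetic can be checked with a small helper. It assumes the standard image sizes of the three datasets used in this paper (28×28×1, 32×32×3, 64×64×3) and two loss evaluations per coordinate; the function name is hypothetical.

```python
def full_gradient_queries(height, width, channels, evals_per_coord=2):
    """Queries needed for one coordinate-wise gradient estimate of an image."""
    return evals_per_coord * height * width * channels

costs = {
    "MNIST (28x28x1)": full_gradient_queries(28, 28, 1),
    "CIFAR10 (32x32x3)": full_gradient_queries(32, 32, 3),
    "tiny-ImageNet (64x64x3)": full_gradient_queries(64, 64, 3),
}
for name, n in costs.items():
    print(f"{name}: {n} queries per full gradient estimate")
```

A single full estimate on tiny-ImageNet therefore already costs more queries than an entire attack is allowed to spend in a query-limited setting.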

3.2 Meta Attacker Training Method

To reduce the query cost for black-box attack, we apply meta learning to training a meta attacker model, inspired by its recent success in few-shot learning problems [6, 23]. The meta attacker learns to extract useful prior information of the gradient of a variety of models w.r.t. specific input samples and thus it can infer the gradient for a new target model using only a few queries. After obtaining such a meta attacker, we can replace the above zeroth-order gradient estimation with it to directly estimate gradient with only a few queries.

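Specifically, we first collect a set of pre-trained classification models $\mathcal{M}_1, \dots, \mathcal{M}_n$ with fully accessible gradient information. We feed an image $x$ into these models and compute the following classification loss for each model $\mathcal{M}_i$:

$$\ell_i(x) = \max\Big[\log [\mathcal{M}_i(x)]_t - \max_{j \neq t} \log [\mathcal{M}_i(x)]_j,\; 0\Big]. \tag{3}$$

Here $t$ is the groundtruth label and $j$ indexes other classes. $[\mathcal{M}_i(x)]_t$ is the probability score of the true label predicted by the model $\mathcal{M}_i$, and $[\mathcal{M}_i(x)]_j$ denotes the probability scores of other classes. We perform one step of back-propagation w.r.t. the input images on the loss (3) and obtain the corresponding gradient $G_i = \nabla_x \ell_i(x)$ for the input $x$. We collect a set of $(\mathbb{X}, \mathbb{G}_i)$ pairs and use them to train the meta attacker.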
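As a rough illustration of the classification loss described above, the following sketch evaluates a max-margin loss on a probability vector. It assumes the log-probability form used by C&W-style attacks; the function name and the example probabilities are hypothetical.

```python
import numpy as np

def max_margin_loss(probs, true_label):
    """max(log p_t - max_{j != t} log p_j, 0) for one probability vector."""
    log_p = np.log(np.clip(probs, 1e-12, 1.0))  # clip to avoid log(0)
    others = np.delete(log_p, true_label)       # scores of all other classes
    return max(float(log_p[true_label] - others.max()), 0.0)

# Correctly classified sample: positive loss that the attack tries to reduce.
correct = max_margin_loss(np.array([0.7, 0.2, 0.1]), true_label=0)
# Already misclassified sample: the loss is clamped to zero.
fooled = max_margin_loss(np.array([0.2, 0.7, 0.1]), true_label=0)
print(correct, fooled)
```

The loss reaches zero exactly when some other class scores at least as high as the true label, which is why its input gradient points toward the decision boundary.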

The meta attacker $\mathcal{A}$ with parameter $\theta$ is trained by minimizing the following loss featuring fast learning:

$$\min_{\theta} \sum_{i} \mathcal{L}_i(\mathcal{A}_{\theta_i'}), \qquad \mathcal{L}_i(\mathcal{A}_{\theta}) = \big\|\mathcal{A}_{\theta}(X_i^s) - G_i^s\big\|_2^2. \tag{4}$$

Each group $(X_i^s, G_i^s)$ contains a few examples randomly sampled from each $(\mathbb{X}, \mathbb{G}_i)$. $\theta_i'$ denotes the update of the meta attacker parameters on each group. Since $(X_i^s, G_i^s)$ contains only a few samples, minimizing the above loss gives an initial meta attacker parameter $\theta$ which can adapt quickly to new data through gradient descent based fine-tuning. Therefore, the obtained meta attacker can be used to attack new black-box models by estimating their gradient information through a few queries.

More concretely, the adaptation of the meta attacker to a specific task can be achieved by one or multiple gradient descent steps: $\theta_i' := \theta - \alpha \nabla_{\theta} \mathcal{L}_i(\mathcal{A}_{\theta})$. To obtain a meta attacker that is sensitive to such task-specific updates, we follow the update strategy of Reptile [23] in meta learning: $\theta := \theta + \epsilon \frac{1}{n} \sum_{i=1}^{n} (\theta_i' - \theta)$. In this way, our meta attacker is able to generalize to a new task from a very small number of samples. The detailed training process of our meta attacker is described in Algorithm 1.

Our designed attacker $\mathcal{A}$ has a structure similar to an autoencoder: it consists of symmetric convolution and de-convolution layers and outputs a gradient map with the same size as the input. Its architecture details are provided in the experiments.

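Input: Input images $\mathbb{X}$, groundtruth gradients $\mathbb{G}_i$ generated from classification models $\mathcal{M}_i$; sample a few samples $(X_i^s, G_i^s)$ from $(\mathbb{X}, \mathbb{G}_i)$ to form task $\mathcal{T}_i$;
1:  Randomly initialize $\theta$;
2:  while not done do
3:     for all $\mathcal{T}_i$ do
4:        Evaluate $\nabla_{\theta} \mathcal{L}_i(\mathcal{A}_{\theta})$ with respect to $(X_i^s, G_i^s)$;
5:        Update $\theta_i' := \theta - \alpha \nabla_{\theta} \mathcal{L}_i(\mathcal{A}_{\theta})$;
6:     end for
7:     Update $\theta := \theta + \epsilon \frac{1}{n} \sum_{i=1}^{n} (\theta_i' - \theta)$;
8:  end while
Output: Meta model $\mathcal{A}_{\theta}$.
Algorithm 1 Adversarial Meta Attack Algorithm (Training)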
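The training loop of Algorithm 1 can be sketched with a Reptile-style update on a toy problem. This is a minimal sketch, not the authors' implementation: the attacker here is a plain linear map instead of a convolutional autoencoder, the "models" are random matrices that turn an input into a gradient, and all names and hyperparameters (`inner_update`, `reptile_train`, `lr`, `meta_lr`, `shots`) are assumptions made for illustration.

```python
import numpy as np

def inner_update(theta, X, G, lr=0.05, steps=3):
    """A few gradient-descent steps on the MSE between X @ theta and G."""
    theta_i = theta.copy()
    for _ in range(steps):
        residual = X @ theta_i - G
        theta_i -= lr * 2.0 * X.T @ residual / len(X)
    return theta_i

def reptile_train(tasks, dim, meta_lr=0.5, iterations=200, shots=5, seed=0):
    """tasks: list of (X, G) arrays; returns the meta-initialization theta."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(scale=0.1, size=(dim, dim))
    for _ in range(iterations):
        deltas = []
        for X, G in tasks:
            # Sample a small group of (image, gradient) pairs for this task.
            idx = rng.choice(len(X), size=shots, replace=False)
            deltas.append(inner_update(theta, X[idx], G[idx]) - theta)
        # Reptile step: move theta toward the average adapted parameters.
        theta = theta + meta_lr * np.mean(deltas, axis=0)
    return theta

# Toy tasks: each "model" maps an input to a gradient through its own matrix.
rng = np.random.default_rng(1)
dim = 4
X = rng.normal(size=(50, dim))
task_mats = [np.eye(dim) + 0.1 * rng.normal(size=(dim, dim)) for _ in range(3)]
tasks = [(X, X @ B) for B in task_mats]
theta = reptile_train(tasks, dim)
print(np.linalg.norm(theta - np.mean(task_mats, axis=0)))
```

In this toy setting the learned initialization ends up near the average of the task matrices, so a few fine-tuning steps suffice to adapt it to any single task, which is the property the meta attacker relies on.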

3.3 Query-efficient Attack

As mentioned above, an effective adversarial attack relies on optimizing the loss function (1) w.r.t. the input image to find the adversarial example of the target model $\mathcal{M}_{tar}$. In contrast, our proposed method applies the meta attacker $\mathcal{A}$ to predict the gradient map of a test image directly.

In particular, we update the meta attacker based on query information with the following periodic scheme. Suppose we have a test image denoted as $x_0$ which is perturbed to $x_k$ at iteration $k$. If $k \bmod m = 0$, our method performs zeroth-order gradient estimation to obtain the gradient map $g_k$ for fine-tuning the meta attacker. To further enhance efficiency, instead of estimating the full gradient map through all coordinates, we select only $q$ of the coordinates to estimate. The indexes of these coordinates are determined by the gradient map in the last iteration by selecting the top-$q$ indexes, denoted as $I_k$, with the largest values. We feed the image $x_k$ into the meta attacker and compute the MSE loss on the indexes $I_k$, i.e. $\big\|[\mathcal{A}_{\theta}(x_k)]_{I_k} - [g_k]_{I_k}\big\|_2^2$. Then we perform a few steps of gradient descent on the MSE loss to update the parameters $\theta$ of our meta attacker. For other iterations, we use the meta attacker directly without updating it. This reduces the number of queries significantly.
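The coordinate selection and the restricted MSE loss can be illustrated as follows. The helper names are hypothetical, and ranking coordinates by absolute value is an assumption made here; the text only says "largest values".

```python
import numpy as np

def top_q_indices(grad_map, q):
    """Flat indices of the q entries with the largest absolute value."""
    flat = np.abs(grad_map).ravel()
    return np.argpartition(flat, -q)[-q:]

def masked_mse(predicted, estimated, indices):
    """MSE between two gradient maps restricted to the selected coordinates."""
    diff = predicted.ravel()[indices] - estimated.ravel()[indices]
    return float(np.mean(diff ** 2))

prev_grad = np.array([[0.1, -3.0, 0.2], [2.5, 0.0, -0.4]])
idx = top_q_indices(prev_grad, q=2)
loss = masked_mse(prev_grad, np.zeros_like(prev_grad), idx)
print(sorted(idx.tolist()), loss)
```

Only the selected coordinates need to be queried, so each fine-tuning round costs on the order of the number of selected coordinates rather than the full input dimension.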

We then use the periodically updated attacker to iteratively generate the gradient $g_k$ for producing the adversarial sample by $x_k' = x_k + \beta g_k$, where the step size $\beta$ is tunable. The details are summarized in Algorithm 2.

We make a few remarks here. First, though we use only $q$ coordinates to fine-tune our meta attacker every $m$ iterations, the meta attacker is trained so that it can abstract the gradient distribution of different classification models and learn to predict the gradient from a few samples with simple fine-tuning. Second, the most query-consuming part lies in zeroth-order gradient estimation, due to its coordinate-wise nature. However, we only do this every $m$ iterations. Intuitively, a larger $m$ implies less gradient estimation computation and fewer queries. Besides, for gradient estimation, we only select the top-$q$ coordinates, where $q$ can be much smaller than the dimension of the input. This also largely reduces the query number.

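Input: Test image $x_0$ with label $t$, meta attacker $\mathcal{A}_{\theta}$, target model $\mathcal{M}_{tar}$, iteration interval $m$, selected top-$q$ coordinates;
1:  for $k = 0, 1, 2, \dots$ do
2:     if $k \bmod m = 0$ then
3:        Perform zeroth-order gradient estimation on the top-$q$ coordinates, denoted as $I_k$, and obtain $g_k$;
4:        Fine-tune meta attacker $\mathcal{A}_{\theta}$ with $(x_k, g_k)$ on $I_k$;
5:     else
6:        Generate the gradient map $g_k$ directly from meta attacker $\mathcal{A}_{\theta}$ with $x_k$, select coordinates $I_k$;
7:     end if
8:     Update $[x_k']_{I_k} = [x_k]_{I_k} + \beta [g_k]_{I_k}$;
9:     if $\mathcal{M}_{tar}(x_k') \neq t$ then
10:        $x_{adv} = x_k'$;
11:        break;
12:     else
13:        $x_{k+1} = x_k'$;
14:     end if
15:  end for
Output: adversarial example $x_{adv}$.
Algorithm 2 Adversarial Meta Attack Algorithm (Attacking)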
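The attack loop of Algorithm 2 can be sketched end to end on a toy problem. Everything here is an assumption made for illustration: the target is a two-class linear model, the "meta attacker" is a stored vector nudged toward the estimated gradient, the estimation round is triggered when the iteration index is a multiple of the interval, one extra query per iteration is counted for the success check, and the update steps against the loss gradient (a sign convention chosen for this sketch).

```python
import numpy as np

def meta_attack(x0, label, predict, loss_fn, attacker, finetune,
                m=3, q=2, beta=0.1, h=1e-4, max_iters=100):
    """Estimate q coordinates every m-th step; otherwise trust the attacker."""
    x = np.asarray(x0, dtype=np.float64).copy()
    queries = 0
    g = attacker(x)  # initial gradient guess, costs no query
    for k in range(max_iters):
        idx = np.argsort(np.abs(g))[-q:]  # top-q coordinates of the last map
        if k % m == 0:
            est = np.zeros_like(x)
            for i in idx:  # two queries per selected coordinate
                e = np.zeros_like(x)
                e[i] = h
                est[i] = (loss_fn(x + e) - loss_fn(x - e)) / (2.0 * h)
                queries += 2
            finetune(x, est, idx)  # adapt the attacker on the new estimate
            g = est
        else:
            g = np.zeros_like(x)
            g[idx] = attacker(x)[idx]  # prediction only, no query
        x = x - beta * g  # step against the loss gradient
        queries += 1      # one query to check the returned label
        if predict(x) != label:
            return x, queries
    return None, queries

# Toy target: class 1 if w @ z > 0, with a max-margin style loss.
w = np.array([1.0, -2.0, 0.5, 3.0])

def predict(z):
    return int(w @ z > 0)

def loss_fn(z):
    return max(float(w @ z), 0.0)

# Stand-in for meta-trained weights: a rough prior on the gradient direction.
prior = {"v": np.array([0.8, -1.5, 0.2, 2.5])}

def attacker(z):
    return prior["v"].copy()

def finetune(z, est, idx, lr=0.5):
    prior["v"][idx] += lr * (est[idx] - prior["v"][idx])

x_adv, queries = meta_attack(np.array([1.0, 0.0, 1.0, 1.0]), 1,
                             predict, loss_fn, attacker, finetune)
print(x_adv, queries)
```

In this toy run only two of the four iterations spend queries on gradient estimation; the other two reuse the attacker's prediction, which is where the query savings come from.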

4 Experiments

In this section, we conduct experiments to compare our proposed meta attacker with state-of-the-art black-box attack methods including Zoo [3], the decision-based black-box attack (Decision-Boundary) [1] and hard label attack (Opt-attack) [4]. We also compare a meta guided attacker with the above baselines, which is denoted as “Meta Guided”. It is trained and fine-tuned in the same way as our proposed meta attacker. The only difference is it obtains gradient estimated numerically from the selected coordinates while our meta attacker applies gradient predicted by itself. They both apply the corresponding gradient to selecting top- coordinates. Comparing our proposed meta attacker and its Meta Guided variant would help understand the effectiveness of our meta attacker in gradient estimation and justify the benefits of selecting critical image locations to attack. Lastly, we perform model analysis experiments to reveal the effectiveness and generalizability of our meta attacker.

4.1 Settings

Datasets and Target Models

We evaluate the attack performance on MNIST [16] for handwritten digit recognition, and on CIFAR10 [15] and tiny-ImageNet [27] for object classification. The architecture details of the four classification models on MNIST are given in Table 1, where Net4 is used as the target model $\mathcal{M}_{tar}$, and Net1, Net2, Net3 are used for training the meta attacker. On CIFAR10, we choose ResNet18 [11] as the target model and use VGG13, VGG16 [28] and GoogleNet [29] for training our meta attacker. On tiny-ImageNet, we choose VGG19 and ResNet34 as the target models separately, and use VGG13, VGG16 and ResNet18 together for training the meta attacker.

Attack Protocols

For a target black-box model $\mathcal{M}_{tar}$, obtaining one (input, output) pair is considered as one query. We use the mis-classification rate as the attack success rate; 100 images are randomly selected from each dataset as test images. To evaluate the overall noise added by the attack methods, we use the mean $L_2$ distance across all the samples, $\frac{1}{N} \sum_{i=1}^{N} \|\hat{x}_i - x_i\|_2$, where $\hat{x}_i$ denotes the adversarial version of the authentic sample $x_i$.
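The two evaluation quantities can be computed as in the sketch below. It assumes the distortion is the per-sample L2 norm averaged over all test samples; the function name and the tiny example arrays are hypothetical.

```python
import numpy as np

def attack_metrics(clean, adversarial, true_labels, predicted_labels):
    """Success rate (misclassification) and mean L2 distortion per sample."""
    success_rate = float(np.mean(predicted_labels != true_labels))
    diffs = (adversarial - clean).reshape(len(clean), -1)
    mean_l2 = float(np.mean(np.linalg.norm(diffs, axis=1)))
    return success_rate, mean_l2

clean = np.zeros((2, 2, 2))
adv = clean.copy()
adv[0, 0, 0] = 3.0
adv[0, 1, 1] = 4.0  # first sample is perturbed by a vector of norm 5
rate, dist = attack_metrics(clean, adv, np.array([1, 2]), np.array([0, 2]))
print(rate, dist)
```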

Implementation Details

For all the experiments, we use the same architecture for the meta attacker , which consists of four convolutional layers and four deconvolutional layers. We use Reptile [23] with learning rate to train meta attacker. Fine-tuning parameters are set as for MNIST and CIFAR10; for tiny-Imagenet. Top coordinates are selected as part coordinates for attacker fine-tuning and model attacking on MNIST; and on CIFAR10 and tiny-Imagenet.

| Net1 | Net2 | Net3 | Net4 |
| --- | --- | --- | --- |
| Conv(6, 5, 5) + ReLU | Conv(64, 5, 5) + ReLU | Dropout(0.2) | Conv(128, 3, 3) + Tanh |
| MaxPool(2,2) | Conv(64, 5, 5) + ReLU | Conv(64, 8, 8) + ReLU | MaxPool(2,2) |
| Conv(16, 5, 5) + ReLU | Dropout(0.25) | Conv(128, 6, 6) + ReLU | Conv(64, 3, 3) + Tanh |
| MaxPool(2,2) | FC(128) + ReLU | Conv(128, 5, 5) + ReLU | MaxPool(2,2) |
| FC(120) + ReLU | Dropout(0.5) | Dropout(0.5) | FC(128) + ReLU |
| FC(84) + ReLU | FC(10) + Softmax | FC(10) + Softmax | FC(10) + Softmax |
| FC(10) + Softmax | | | |

Table 1: Neural network architectures used on MNIST. Conv: convolutional layer, FC: fully connected layer.

4.2 Comparison with Baselines

We conduct experimental comparisons with the baselines for both untargeted and targeted black-box attacks on the three datasets. We show that our meta attacker can achieve a success rate and noise magnitude similar to the baselines while using far fewer queries.

Untargeted Attack

Untargeted attack aims to generate adversarial examples that would be mis-classified by the attacked model into any category different from the ground truth one. The detailed results for each attack and each dataset are shown in Table 2. Our method is competitive with previous attack methods in terms of adversarial perturbation and success rate, but our query number is largely reduced. More specifically, according to Table 2, our meta attacker uses about 92% fewer queries than the most query-efficient baseline on MNIST (1,024 versus 12,925), and about 70% fewer than [3] on CIFAR10 (2,438 versus 8,192). On tiny-ImageNet, relative to the most query-efficient baseline, we use about 75% fewer queries when attacking VGG19 and about 73% fewer when attacking ResNet34. The meta guided attacker achieves smaller distortion on CIFAR10 and on tiny-ImageNet with ResNet34. These results demonstrate the remarkable advantages of gathering and utilizing useful information from each query in our method.

| Dataset / Target model | Method | Success Rate | Avg. $L_2$ | Avg. Queries |
| --- | --- | --- | --- | --- |
| MNIST / Net4 | Zoo [3] | 1.00 | 1.61 | 21,760 |
| | Decision Boundary [1] | 1.00 | 1.85 | 13,630 |
| | Opt-attack [4] | 1.00 | 1.85 | 12,925 |
| | Meta Guided (ours) | 1.00 | 1.73 | 5,975 |
| | Meta attack (ours) | 1.00 | 1.79 | 1,024 |
| CIFAR10 / ResNet18 | Zoo [3] | 1.00 | 0.30 | 8,192 |
| | Decision Boundary [1] | 1.00 | 0.30 | 17,010 |
| | Opt-attack [4] | 1.00 | 0.33 | 20,407 |
| | Meta Guided (ours) | 1.00 | 0.26 | 6,254 |
| | Meta attack (ours) | 0.92 | 0.33 | 2,438 |
| tiny-ImageNet / VGG19 | Zoo [3] | 1.00 | 0.52 | 27,827 |
| | Decision Boundary [1] | 1.00 | 0.52 | 49,942 |
| | Opt-attack [4] | 1.00 | 0.53 | 71,016 |
| | Meta Guided (ours) | 1.00 | 0.57 | 16,460 |
| | Meta attack (ours) | 0.98 | 0.54 | 6,826 |
| tiny-ImageNet / ResNet34 | Zoo [3] | 1.00 | 0.47 | 25,344 |
| | Decision Boundary [1] | 1.00 | 0.48 | 49,982 |
| | Opt-attack [4] | 1.00 | 0.52 | 60,437 |
| | Meta Guided (ours) | 0.98 | 0.41 | 16,040 |
| | Meta attack (ours) | 0.98 | 0.49 | 6,866 |

Table 2: MNIST, CIFAR10 and tiny-ImageNet untargeted attack comparison: Meta attacker attains success rate and distortion comparable to the baselines, and significantly reduces query numbers.

We also compare the results of our method with Zoo from a query-efficiency perspective. We experiment on CIFAR10 and tiny-ImageNet by limiting the query number to a fixed value and compare the success rate. The results are shown in Fig. 1 and Fig. 3. Across the different query thresholds, the success rate of our method is higher than that of Zoo. This is because testing samples have different distances to the decision boundary. When the number of queries is limited, Zoo [3] only works on the easier testing samples that lie closer to the decision boundary. The higher success rate of our method indicates that our meta attacker can predict a correct gradient toward the decision boundary for harder testing samples, even when the query information is limited. These results give strong evidence of the effectiveness of our proposed method for enhancing query efficiency.

Targeted Attack

In the targeted setting, one aims to generate adversarial noise such that the perturbed sample is mis-classified into a pre-specified category. This is a stricter setting than the untargeted attack. For fair comparison, we define the target label for each sample in the following way: a sample with label $\ell$ gets the target label $(\ell + 1) \bmod C$, where $C$ is the number of classes. We use the same meta attacker as above. The results on MNIST, CIFAR10 and tiny-ImageNet are shown in Table 3.

Similar to the untargeted attack, we achieve comparable noise and success rate with largely reduced query numbers. On tiny-ImageNet, according to Table 3, our method uses about 87% fewer queries than the most query-efficient baseline (Zoo) when attacking VGG19 and about 78% fewer when attacking ResNet34. It is worth noting that, compared to the baseline Zoo, our method generates smaller noise on MNIST, CIFAR10 and tiny-ImageNet with VGG19. Our guided meta attacker attains lower distortion on MNIST and tiny-ImageNet, and a remarkably better success rate on tiny-ImageNet with ResNet34.

| Dataset / Target model | Method | Success Rate | Avg. $L_2$ | Avg. Queries |
| --- | --- | --- | --- | --- |
| MNIST / Net4 | Zoo [3] | 1.00 | 2.63 | 23,552 |
| | Decision Boundary [1] | 0.64 | 2.71 | 19,951 |
| | Opt-attack [4] | 1.00 | 2.33 | 99,661 |
| | Meta Guided (ours) | 1.00 | 2.22 | 8,325 |
| | Meta attack (ours) | 1.00 | 2.41 | 1,872 |
| CIFAR10 / ResNet18 | Zoo [3] | 1.00 | 0.55 | 66,400 |
| | Decision Boundary [1] | 0.58 | 0.53 | 16,250 |
| | Opt-attack [4] | 1.00 | 0.50 | 121,810 |
| | Meta Guided (ours) | 1.00 | 0.56 | 27,437 |
| | Meta attack (ours) | 0.90 | 0.52 | 21,188 |
| tiny-ImageNet / VGG19 | Zoo [3] | 0.74 | 1.26 | 119,648 |
| | Decision Boundary [1] | - | - | - |
| | Opt-attack [4] | 0.66 | 1.14 | 252,009 |
| | Meta Guided (ours) | 0.73 | 0.99 | 76,459 |
| | Meta attack (ours) | 0.54 | 1.24 | 15,813 |
| tiny-ImageNet / ResNet34 | Zoo [3] | 0.60 | 1.03 | 88,966 |
| | Decision Boundary [1] | - | - | - |
| | Opt-attack [4] | 0.78 | 1.00 | 214,015 |
| | Meta Guided (ours) | 0.84 | 0.99 | 88,418 |
| | Meta attack (ours) | 0.54 | 1.21 | 19,516 |

Table 3: MNIST, CIFAR10 and tiny-ImageNet targeted attack comparison: Meta attack significantly outperforms other black-box methods in query numbers.

4.3 Analysis

Figure 1: Performance comparison with limited queries on CIFAR10.
Figure 2: Comparison of randomly initialized and well-trained meta attackers.

Figure 3: Performance comparison with limited queries on tiny-Imagenet.
Figure 4: Comparison of transferred and well-trained meta attackers.

Meta Attacker Training and Query Efficiency

We first test the benefits of meta training by comparing the performance of a meta-trained attacker with a Gaussian randomly initialized attacker without meta training. Testing on three datasets, Fig. 2 shows their success rate, distortion and query count results for initial success. The meta pre-trained attacker achieves on average a higher success rate with lower distortion and fewer queries, compared with the randomly initialized one. This justifies the benefits of our deployed meta training for enhancing the query efficiency and improving attack performance.

Thanks to the fine-tuning iterations, the randomly initialized meta attacker still succeeds on many testing samples. The fine-tuning iterations work like an inner training loop in the meta training process. Hence, sufficient fine-tuning iterations could train the randomly initialized meta attacker as well as the well-trained meta attacker. This explains the effectiveness of the randomly initialized meta attacker on many testing samples, at the cost of more queries. However, the randomly initialized meta attacker cannot predict the gradient as accurately as the well-trained meta attacker during earlier iterations. Such inaccuracy leads to larger distortion at the beginning. In contrast, the meta training process enables the well-trained meta attacker to adapt quickly to the current testing samples. These results highlight the significant advantages of our meta model for black-box attack. The meta-training of the meta attacker makes it familiar with the gradient patterns of various models.


Here, we aim to show that our meta attacker trained on one dataset can be transferred to other, divergent datasets and still achieve good attack performance. To this end, we conduct this experiment between CIFAR10 and tiny-ImageNet. These two datasets are both for image classification but differ in image classes and size. We conduct three groups of experiments. We first apply the meta attacker trained on CIFAR10 to attack the target models VGG19 and ResNet34 on tiny-ImageNet respectively, which are different from the models used for training the meta attacker. Namely, the meta attacker trained on CIFAR10 has no privileged prior and is familiar with neither the tiny-ImageNet dataset nor the corresponding classification models. Similarly, we also use the meta attacker trained on tiny-ImageNet and its classification models to attack the target ResNet18 model on CIFAR10. Fig. 4 shows the success rate, distortion and query results. The transferred meta attackers attain almost the same success rate and distortion as the well-trained meta attackers, with slightly more queries. Different from the randomly initialized meta attacker, the transferred meta attacker adapts to the current testing samples rapidly, and it avoids irreversible noise generated in earlier iterations. These results show the good generalizability and robustness of our proposed meta attacker.

5 Conclusion

In this paper, we propose a meta-learning based black-box attack method that largely reduces the required number of queries without compromising attack success rate and distortion. We train a meta attacker and incorporate it into the optimization process to decrease the number of queries. We conduct both untargeted and targeted attacks on MNIST, CIFAR10 and tiny-ImageNet to verify its effectiveness. Extensive numerical results confirm the superior query efficiency of our method over the selected baselines. We also empirically demonstrate the necessity and generalizability of our designed meta attack method.


  • [1] W. Brendel, J. Rauber, and M. Bethge. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. In International Conference on Learning Representations, 2018.
  • [2] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.
  • [3] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15–26. ACM, 2017.
  • [4] Minhao Cheng, Thong Le, Pin-Yu Chen, Huan Zhang, JinFeng Yi, and Cho-Jui Hsieh. Query-efficient hard-label black-box attack: An optimization-based approach. In International Conference on Learning Representations, 2019.
  • [5] Riley Edmunds, Noah Golmant, Vinay Ramasesh, Phillip Kuznetsov, Piyush Patil, and Raul Puri. Transferability of adversarial attacks in model-agnostic meta-learning.
  • [6] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 1126–1135. JMLR.org, 2017.
  • [7] Saeed Ghadimi and Guanghui Lan. Stochastic first-and zeroth-order methods for nonconvex stochastic programming. SIAM Journal on Optimization, 23(4):2341–2368, 2013.
  • [8] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 580–587, 2014.
  • [9] Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015.
  • [10] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. In 2013 IEEE international conference on acoustics, speech and signal processing, pages 6645–6649. IEEE, 2013.
  • [11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  • [12] Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Brian Kingsbury, et al. Deep neural networks for acoustic modeling in speech recognition. IEEE Signal processing magazine, 29, 2012.
  • [13] Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. Black-box adversarial attacks with limited queries and information. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, July 2018.
  • [14] Rie Johnson and Tong Zhang. Accelerating stochastic gradient descent using predictive variance reduction. In Advances in neural information processing systems, pages 315–323, 2013.
  • [15] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
  • [16] Yann LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/.
  • [17] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
  • [18] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
  • [19] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2574–2582, 2016.
  • [20] Nina Narodytska and Shiva Kasiviswanathan. Simple black-box adversarial attacks on deep neural networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 1310–1318. IEEE, 2017.
  • [21] Yurii Nesterov. Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2013.
  • [22] Yurii Nesterov and Vladimir Spokoiny. Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17(2):527–566, 2017.
  • [23] Alex Nichol, Joshua Achiam, and John Schulman. On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999, 2018.
  • [24] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia conference on computer and communications security, pages 506–519. ACM, 2017.
  • [25] Joseph Redmon and Ali Farhadi. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
  • [26] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, pages 91–99, 2015.
  • [27] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV), 115(3):211–252, 2015.
  • [28] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • [29] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1–9, 2015.
  • [30] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014.
  • [31] Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Jiashi Feng, Yao Zhao, and Shuicheng Yan. Stc: A simple to complex framework for weakly-supervised semantic segmentation. IEEE transactions on pattern analysis and machine intelligence, 39(11):2314–2320, 2016.
  • [32] Daniel Zügner and Stephan Günnemann. Adversarial attacks on graph neural networks via meta learning. In International Conference on Learning Representations, 2019.