1 Introduction
Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples Szegedy2014 , which are typically formed by perturbing benign examples with the intention of causing misclassifications.^† Depending on the amount of information that is exposed and can be leveraged, an intelligent adversary will adopt different categories of attacks. With access to critical information about a target DNN (e.g., its architecture and learned parameters), adversaries generally prefer white-box attacks Szegedy2014 ; Goodfellow2015 ; Moosavi2016 ; CW2017 ; Madry2018 . After a few rounds of forward and backward passes, such attacks can generate images that are perceptually indistinguishable from benign ones yet successfully trick the target DNN into making incorrect classifications. When little information is exposed, however, adversaries have to adopt black-box attacks Papernot2017 ; Liu2017 ; Chen2017 ; Narodytska2017 ; Ilyas2018 ; Nitin2018 ; Tu2019 ; Ilyas2019 ; Guo2019 instead.

In general, black-box attacks require no more information than the confidence scores of a target, so their threat model is more realistic in practice. Over the past few years, remarkable progress has been made in this regard. While initial efforts revealed the transferability of adversarial examples and were devoted to learning substitute models Papernot2017 ; Liu2017 , recent methods focus more on gradient estimation accomplished via zeroth-order optimization Chen2017 ; Narodytska2017 ; Ilyas2018 ; Nitin2018 ; Tu2019 ; Ilyas2019 . By issuing classification queries to the target (a.k.a. the victim model), these methods learn to approximate its actual gradient w.r.t. any input, so as to perform adversarial attacks just as in the white-box setting. Despite many practical merits, high query complexity is virtually inevitable for computing sensible estimates of input-gradients in some methods, making their procedures costly and possibly suspicious to the classification system.

^† The first two authors contributed equally to this work.
Following this line of research, we aim to reduce the query complexity of black-box attacks. We discover in this paper that gradient estimation and zeroth-order optimization can be performed in subspaces of much lower dimension than one might suspect, and we consider a principled way of spanning such subspaces: utilizing "prior gradients" of a few reference models as heuristic search directions. Our method, for the first time, bridges the gap between transfer-based attacks and query-based ones. Powered by the developed mechanism, we are capable of trading off the attack failure rate against query efficiency reasonably well. Experimental results show that our method gains significant reductions in the requisite number of queries with much lower failure rates, in comparison with the previous state of the art. We also show that it is possible to obtain the reference models with a small training set disjoint from the one used for training the CIFAR-10/ImageNet targets.
2 Related Work
One common and crucial ingredient utilized in most white-box attacks is the model gradient w.r.t. the input. In practical scenarios, however, the adversaries may not be able to acquire the detailed architecture or learned parameters of a model, preventing them from adopting gradient-based algorithms directly. One initial way to overcome this challenge is to exploit transferability Szegedy2014 . Ever since the adversarial phenomenon was discovered Szegedy2014 ; Goodfellow2015 , it has been shown that adversarial examples crafted on one DNN model can often fool another, even if the two have different architectures. Taking advantage of this transferability, Papernot et al. Papernot2016transferability ; Papernot2017 propose to construct a dataset labeled by querying the victim model, and to train a substitute model as a surrogate for mounting black-box attacks. Thereafter, Liu et al. Liu2017 study such transfer-based attacks over large networks on ImageNet Russakovsky2015 , and propose to attack an ensemble of models for improved performance. Despite their simplicity, attacks relying solely on transferability suffer from high failure rates.
An alternative way of mounting black-box attacks is to perform gradient estimation. Supposing that the prediction probabilities (i.e., the confidence scores) of the victim model are available, methods in this category resort to zeroth-order optimization. For example, Chen et al. Chen2017 propose to accomplish this task using pixel-by-pixel finite differences, while Ilyas et al. Ilyas2018 suggest applying a variant of natural evolution strategies (NES) Salimans2017 . With the input-gradients appropriately estimated, they proceed as if in a white-box setting; in practice, the two are combined with the C&W white-box attack CW2017 and PGD Madry2018 , respectively. Though effective, owing to the high dimensionality of natural images, these initial efforts based on accurate gradient estimation generally require (tens of) thousands of queries to succeed on the victim model, which is costly in both money and time. Towards reducing the query complexity, Tu et al. Tu2019 and Ilyas et al. Ilyas2019 further introduce an autoencoder-based mechanism and a bandit mechanism, respectively, that incorporate spatial and temporal priors. Similarly, Bhagoji et al. Nitin2018 show the effectiveness of random grouping and principal component analysis in achieving this goal.
In extreme scenarios where only the final decisions of the victim model are exposed, adversarial attacks can still be performed Brendel2018 ; Cheng2019 . Such black-box attacks differ in general from score-based attacks, and we restrict our attention to the latter in this paper. As briefly reviewed, methods under this threat model can be divided into two categories, i.e., transfer-based attacks (also known as oracle-based attacks) and query-based attacks. Our method, probably for the first time, bridges the gap between them and therefore inherits the advantages of both sides. It differs from existing transfer-based attacks in the sense that it takes gradients of reference models as heuristic search directions for finite-difference gradient estimation, and, benefiting from these heuristics, it is far more (query-)efficient than the latest query-based attacks.
3 Motivations
Let us consider attacks on an image classification system. Formally, the black-box attacks of our interest attempt to perturb an input and trick a victim model into giving an incorrect prediction of its label. While the high dimensionality of input images makes it difficult to estimate gradients and perform black-box attacks within a few queries, we echo a recent claim that this limitation can be reasonably ameliorated by properly exploiting prior knowledge Ilyas2019 . In this section, we shed light on the motivations of our method.
Attack in Linear Subspaces?
Natural images are high-dimensional and spatially over-redundant, which means not all pixels (or combinations of pixels) are predictive of the image-level labels. A classification model typically offers its predictions by mining discriminative components and suppressing irrelevant variations in raw images Lecun2015 . One reasonable hypothesis worth exploring in this spirit is that it is probably less effective to perturb an image on some specific pixels (or along certain directions) when attacking a black-box model. From a geometric point of view, that is, the problem probably has a lower intrinsic dimension than that of the input space, just like many other problems Li2018 .
To verify this, we try estimating gradients and mounting attacks in low-dimensional subspaces, bootstrapped by generating k random basis vectors b_1, ..., b_k sequentially, on condition of each being orthogonal to the prior ones. We utilize the bandit optimization advocated in a recent paper Ilyas2019 for gradient estimation, and adopt the same iterative attack (i.e., PGD) as in it. Recall that the bandit mechanism updates its estimate at each step by a scaled search direction:

    g_t = g_{t-1} + \eta \, \Delta(q_t) \, q_t,  where  \Delta(q) = \langle \nabla_x L(x, y), \, q / \|q\| \rangle,    (1)

in which q_t is the search direction sampled from a Gaussian distribution, \eta is a step size that regulates the directional estimation, and \Delta(\cdot) calculates the inner product between its normalized input and the precise model gradient. The mechanism queries the victim model twice at each step of the optimization procedure to approximate \Delta(q_t) via finite differences, after which a PGD step based on the current estimate is applied. Interested readers can check the insightful paper Ilyas2019 for more details.

In this experiment, once the basis vectors are established for a given image, they are fixed over the whole optimization procedure, which now takes place in the k-dimensional subspace instead of the original input space. More specifically, the search direction is yielded by combining the generated basis vectors with Gaussian coefficients, i.e., q = \sum_{i=1}^{k} c_i b_i with c_i \sim N(0, 1). We are interested in how the value of k affects the failure rate and the requisite number of queries of successful attacks. By sampling 1,000 images from the CIFAR-10 test set, we craft untargeted adversarial examples for a black-box wide residual network (WRN) Zagoruyko2016 with an upper limit of 2,000 queries for efficiency reasons. As depicted in Figure 1, once k exceeds a relatively small value, all three concerned metrics (i.e., failure rate, mean and median query counts) barely change. Moreover, at that point, the failure rate already approaches 10%, which is comparable to the result gained when the same optimization is applied in the original image space; see the red dotted line in Figure 1 for this baseline. A similar phenomenon can be observed on other models using other attacks as well, which evidences that the problem may indeed have a lower dimension than one may suspect; this complements the study of the intrinsic dimensionality of the training landscapes of DNNs in a prior work Li2018 .
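For concreteness, the subspace setup above can be sketched in a few lines of NumPy. Here `loss_fn` stands in for the scalar query interface of a victim model, and the function names, dimension `d`, and subspace size `k` are illustrative rather than taken from the paper:

```python
import numpy as np

def random_orthonormal_basis(d, k, rng):
    """k random basis vectors in R^d, each orthogonal to the prior ones
    (QR factorization is equivalent to sequential Gram-Schmidt)."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return Q  # shape (d, k); columns are the basis vectors

def sample_direction(Q, rng):
    """Search direction q = sum_i c_i * b_i with Gaussian coefficients."""
    return Q @ rng.standard_normal(Q.shape[1])

def directional_estimate(loss_fn, x, q, delta=1e-3):
    """Two-query finite-difference approximation of the inner product
    between the model gradient and q / ||q|| (the Delta term in Eq. (1))."""
    u = q / np.linalg.norm(q)
    return (loss_fn(x + delta * u) - loss_fn(x - delta * u)) / (2.0 * delta)
```

Each call to `directional_estimate` costs exactly two queries, matching the query accounting described above.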
Prior Gradients as Basis Vectors?
Since the requisite number of queries is still high at the saturation point in Figure 1, we know that random basis vectors boost the state of the art only to a limited extent. Yet, this inspires us to explore more principled subspace bases for query-efficient attacks. To achieve this goal, we start by revisiting and analyzing transfer-based attacks. We know from prior works that even adversarial examples crafted using single-step attacks like the fast gradient (sign) method Kurakin2017 can transfer Papernot2017 ; Liu2017 , hence one can hypothesize that the gradients of some "substitute" models are more helpful in spanning search subspaces with reduced dimensionality. A simple yet plausible way of getting these gradients involved is to use them directly as basis vectors. Note that unlike transfer-based attacks, in which such models totally substitute for the victim when crafting adversarial examples, our study merely considers their gradients as priors. For clarity, we refer to such models and gradients as reference models and prior gradients, respectively, throughout this paper.
More importantly, we can further make these basis vectors adaptive when applying an iterative attack (e.g., the basic iterative method Kurakin2017 or PGD Madry2018 ), simply by recalculating the prior gradients (w.r.t. the current inputs, which may be candidate adversarial examples) at each step. Different zeroth-order optimization algorithms can readily be employed in the established subspaces. For simplicity, we stick with the described bandit optimization in the remainder of this paper and leave the exploration of other algorithms, such as coordinate-wise finite differences Chen2017 and NES Ilyas2018 , to future work.
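To illustrate what adaptive prior gradients mean operationally, the following sketch uses toy linear-softmax reference models whose input-gradients are available in closed form; a real reference DNN would supply the same quantity via backpropagation. The class and function names here are ours, not the paper's:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

class ToyReference:
    """Stand-in for a reference DNN: a linear-softmax classifier whose
    input-gradient of the cross-entropy loss is known in closed form."""
    def __init__(self, W):
        self.W = W

    def input_gradient(self, x, y):
        p = softmax(self.W @ x)
        p[y] -= 1.0                 # d loss / d logits for cross-entropy
        return self.W.T @ p         # chain rule back to the input

def prior_gradients(refs, x, y):
    """Adaptive prior gradients: recomputed at the *current* candidate x
    at every attack iteration, so the spanned subspace moves with x."""
    return [ref.input_gradient(x, y) for ref in refs]
```

Because `prior_gradients` is re-evaluated per step, the basis vectors track the current candidate adversarial example rather than the original image.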
An experiment is similarly conducted to compare attacks in the gradient-spanned subspaces^1 and the random ones, in which the WRN is still regarded as the victim model. We compare mounting black-box attacks in the different subspaces spanned by the (adaptive) prior gradients and by randomly generated vectors as described before. Figure 2 summarizes our main results. As in Figure 1, we illustrate the attack failure rates in Figure 2(a). Apparently, the prior gradients are much more promising than their random counterparts for spanning search subspaces. For more insight, we project normalized WRN gradients onto the two sorts of subspaces and further compare the mean squared residuals of projection under different circumstances in Figure 2(b). It can be seen that the gradient-spanned subspaces indeed align better with the precise WRN gradients, and that severe misalignment between the search subspaces and the precise model gradients leads to high failure rates.

^1 Granted, the prior gradients are almost surely linearly independent and can thus be regarded as basis vectors.
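The projection-residual analysis can be reproduced with a few lines of linear algebra. This is our own sketch of the measurement, with `basis` holding candidate basis vectors (random vectors or prior gradients) as columns:

```python
import numpy as np

def mean_squared_residual(grad, basis):
    """Mean squared residual after orthogonally projecting the normalized
    gradient onto span(columns of basis); 0 means perfect alignment."""
    g = grad / np.linalg.norm(grad)
    Q, _ = np.linalg.qr(basis)       # orthonormalize the spanning set
    residual = g - Q @ (Q.T @ g)     # component outside the subspace
    return float(np.mean(residual ** 2))
```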
4 Our Subspace Attack
As introduced in the previous section, we reckon that it is promising to use the gradients of some reference models to span the search subspace for mounting black-box attacks. However, there remain some challenges in doing so. First, it can be computationally and memory intensive to load all the reference models and calculate their input-gradients as basis vectors. Second, it is likely that an adversarial example for a victim model still lies far away from such subspaces, which means mounting attacks solely in them may lead to high failure rates, as encountered in transfer-based attacks. We discuss these issues and present our solutions in this section. We codename our method the subspace attack and summarize it in Algorithm 1, in which the involved hyper-parameters are explained in Section 5.
4.1 Coordinate Descent for Efficiency
If one of the prior gradients happens to be well aligned with the gradient of the victim model, then an adaptive one-dimensional subspace suffices to mount the attack. Nevertheless, we found that this is normally not the case, and that increasing the number of reference models and prior gradients facilitates the attack, which can be partially explained by the fact that gradients are nearly orthogonal to each other in high-dimensional spaces Liu2017 . However, it is computationally and memory intensive to calculate the input-gradients of a collection of reference models at each step of the optimization.
Given a set of basis vectors, off-the-shelf optimization procedures for black-box attacks either estimate the optimal coefficients for all vectors before each update Chen2017 or give one optimal scaling factor overall Ilyas2019 . In either case, the whole procedure is somewhat analogous to gradient descent, whose update directions do not necessarily align with single basis vectors. It is thus natural, for better efficiency, to make an effort based on coordinate descent Wright2015 , which operates along coordinate directions (i.e., basis vectors) to seek the optimum of an objective. In general, the algorithm selects a single coordinate direction or a block of coordinate directions to proceed iteratively. That said, we may only need to calculate one or a few prior gradients at each step before updating, and the complexity of our method is significantly reduced. Experimental results in Section 5 show that a single prior gradient suffices.
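A minimal sketch of such a coordinate-descent-flavored update might look as follows, assuming each reference model exposes an `input_gradient` method; the round-robin selection rule and the hyper-parameter names are illustrative, not the paper's exact algorithm:

```python
import numpy as np

def coordinate_step(x, y, t, refs, loss_fn, lr=0.5, delta=1e-3):
    """One update: select a single reference model (round-robin over t),
    use its prior gradient as the sole search direction, estimate the
    directional derivative with two queries, and step along it."""
    ref = refs[t % len(refs)]                  # coordinate selection
    q = ref.input_gradient(x, y)               # the chosen basis vector
    u = q / (np.linalg.norm(q) + 1e-12)
    slope = (loss_fn(x + delta * u) - loss_fn(x - delta * u)) / (2 * delta)
    return x + lr * slope * u                  # ascend the adversarial loss
```

Only one prior gradient is computed per step, which is what keeps the per-iteration cost low.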
4.2 Dropout/layer for Exploration
As suggested in Figure 2(b), one way of guaranteeing a low failure rate in our method is to collect adequate reference models. However, this is usually troublesome in practice, if not infeasible. Suppose that we have collected a few reference models which might not be adequate, and that we aim to reduce the failure rate regardless. Recall that the main reason for high failure rates is the imperfect alignment between our search subspaces and the precise gradients (cf. Figure 2(b)); however, it seems unclear how to explore other possible search directions without training more reference models. One may simply try adding some random vectors to the basis set for better alignment and higher subspace dimensions, but they bear the ineffectiveness discussed in Section 3, and we also found in experiments that this strategy does not help much.
Our solution to this issue is inspired by the dropout Srivastava2014 and "drop-layer" (a.k.a. stochastic depth) Huang2016 techniques. Dropout and drop-layer, which originally serve as regularization techniques, randomly drop a subset of hidden units or residual blocks (if they exist) from DNNs during training. Their success indicates that a portion of the features can provide reasonable predictions and thus meaningful input-gradients, which implies the possibility of using dropout/drop-layer-invoked gradients to enrich our search priors.^2 By temporarily removing hidden units or residual blocks, we can acquire a spectrum of prior gradients from each reference model. In experiments, we append dropout to all convolutional/fully-connected layers (except the final one), and we further drop residual blocks in ResNet reference models.

^2 We examined the input-gradients generated in this manner and found that most of them are still linearly independent.
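As a toy illustration of how a dropout mask kept active at gradient time yields a spectrum of prior gradients, consider a two-layer ReLU network with analytic backpropagation; the architecture and names here are ours, not the paper's reference models:

```python
import numpy as np

def dropout_prior_gradient(W1, W2, x, y, drop_p, rng):
    """Input-gradient of a toy 2-layer net with dropout kept ON while
    computing the gradient, so each call yields a different prior."""
    h = np.maximum(W1 @ x, 0.0)                     # ReLU features
    mask = (rng.random(h.shape) >= drop_p) / max(1.0 - drop_p, 1e-12)
    z = W2 @ (h * mask)                             # dropped features
    p = np.exp(z - z.max()); p /= p.sum()           # softmax
    p[y] -= 1.0                                     # d loss / d logits (CE)
    dh = (W2.T @ p) * mask * (h > 0)                # back through dropout+ReLU
    return W1.T @ dh
```

With `drop_p = 0` the function reduces to the ordinary input-gradient; with `drop_p > 0`, different random masks produce distinct but still meaningful search priors.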
5 Experiments
In this section, we verify the effectiveness of our subspace attack by comparing it with the state of the art in terms of the failure rate and the number of queries (of successful attacks). We consider both untargeted and targeted attacks on CIFAR-10 Krizhevsky2009 and ImageNet Russakovsky2015 . All our experiments are conducted on a GTX 1080 Ti GPU with PyTorch pytorch . Our main results for untargeted attacks are summarized in Table 1, and the results for targeted attacks are reported in the supplementary material.

| Dataset  | Victim Model | Method               | Ref. Models       | Mean Queries | Median Queries | Failure Rate |
|----------|--------------|----------------------|-------------------|--------------|----------------|--------------|
| CIFAR-10 | WRN          | NES Ilyas2018        | –                 | 1882         | 1300           | 3.5%         |
|          |              | Bandits-TD Ilyas2019 | –                 | 713          | 266            | 1.2%         |
|          |              | Ours                 | AlexNet+VGGNets   | 392          | 60             | 0.3%         |
|          | GDAS         | NES Ilyas2018        | –                 | 1032         | 800            | 0.0%         |
|          |              | Bandits-TD Ilyas2019 | –                 | 373          | 128            | 0.0%         |
|          |              | Ours                 | AlexNet+VGGNets   | 250          | 58             | 0.0%         |
|          | PyramidNet*  | NES Ilyas2018        | –                 | 1571         | 1300           | 5.1%         |
|          |              | Bandits-TD Ilyas2019 | –                 | 1160         | 610            | 1.2%         |
|          |              | Ours                 | AlexNet+VGGNets   | 555          | 184            | 0.7%         |
| ImageNet | Inception-v3 | NES Ilyas2018        | –                 | 1427         | 800            | 19.3%        |
|          |              | Bandits-TD Ilyas2019 | –                 | 887          | 222            | 4.2%         |
|          |              | Ours                 | Original ResNets  | 462          | 96             | 1.1%         |
|          | PNASNet      | NES Ilyas2018        | –                 | 2182         | 1300           | 38.5%        |
|          |              | Bandits-TD Ilyas2019 | –                 | 1437         | 552            | 12.1%        |
|          |              | Ours                 | Original ResNets  | 680          | 160            | 4.2%         |
|          | SENet        | NES Ilyas2018        | –                 | 1759         | 900            | 17.9%        |
|          |              | Bandits-TD Ilyas2019 | –                 | 1055         | 300            | 6.4%         |
|          |              | Ours                 | Original ResNets  | 456          | 66             | 1.9%         |
5.1 Experimental Setup
Evaluation Metrics and Settings.
As in prior works Ilyas2018 ; Nitin2018 ; Ilyas2019
, we adopt the failure rate and the number of queries to evaluate the performance of attacks on originally correctly classified images. In untargeted settings, an attack is considered successful if the model prediction differs from the ground truth, while in targeted settings, it is considered successful only if the victim model is tricked into predicting the target class. We observe that the number of queries varies dramatically across images, so we report both the mean and median number of queries of successful attacks to give a clearer picture of the query complexity.
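The three metrics just described can be computed as follows, where `results` is a hypothetical list of per-image (success, query_count) records over originally correctly classified images:

```python
import statistics

def attack_metrics(results):
    """Failure rate over all images, plus mean/median query counts
    computed over *successful* attacks only."""
    successful = [queries for success, queries in results if success]
    return {
        "failure_rate": 1.0 - len(successful) / len(results),
        "mean_queries": statistics.mean(successful),
        "median_queries": statistics.median(successful),
    }
```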
Following prior works, we scale the input images to the range [0, 1], and set the maximum perturbation separately for CIFAR-10 and ImageNet. We query the victim models at most 10,000 times per image in the untargeted experiments and 50,000 times in the targeted experiments, as the latter task is more difficult and requires more queries. In all experiments, we invoke PGD Madry2018 to maximize the hinge logit-diff adversarial loss from Carlini and Wagner CW2017 . The PGD step size is set separately for CIFAR-10 and ImageNet. At the end of each iteration, we clip the candidate adversarial examples back to [0, 1] to make sure they are still valid images. We initialize the dropout/drop-layer ratio at a small value and increase it at the end of each iteration until it reaches an upper limit throughout our experiments. Other hyper-parameters, such as the OCO learning rate and the finite-difference step sizes, are set following the paper Ilyas2019 . We mostly compare our method with NES Ilyas2018 and Bandits-TD Ilyas2019 , whose official implementations are directly used. We apply all the attacks to the same set of clean images and victim models for fair comparison. For Bandits-TD on ImageNet, we craft adversarial examples at a reduced resolution and upscale them to the input resolutions requested by the victim models (Inception-v3, PNASNet, and SENet) before querying, just as described in the paper Ilyas2019 . We do not perform such rescaling on CIFAR-10 since no performance gain is observed.

Victim and Reference Models.
On CIFAR-10, we consider three victim models: (a) a WRN Zagoruyko2016 with 28 layers and a width expansion factor of 10,^3 which yields a 4.03% error rate on the test set; (b) a model obtained via neural architecture search named GDAS Dong2019 ,^4 which has a significantly different architecture than our AlexNet and VGGNet reference models and shows a 2.81% test error rate; (c) a 272-layer PyramidNet+ShakeDrop model Han2017 ; Yamada2018 trained using AutoAugment Cubuk2019 with only a 1.56% test error rate,^5 which is the published state of the art on CIFAR-10 to the best of our knowledge. As reference models, we simply adopt the AlexNet and VGG-11/13/16/19 architectures with batch normalization Ioffe2015 . To evaluate in a more data-independent scenario, we choose an auxiliary dataset (containing only 2,000 images) called CIFAR-10.1 Recht2018 to train the reference models from scratch.

We also consider three victim models on ImageNet: (a) an Inception-v3 Szegedy2016 , which is commonly chosen Ilyas2018 ; Ilyas2019 ; Cheng2019 ; Tu2019 , with a 22.7% top-1 error rate on the official validation set; (b) a PNASNet-5-Large model Liu2018 , whose architecture is obtained through neural architecture search, with a top-1 error rate of 17.26%; (c) an SENet-154 model Hu2018 with a top-1 error rate of 18.68%.^6 We adopt ResNet-18/34/50 as reference architectures, and we gather 30,000+45,000 images from an auxiliary dataset Recht2019 and the ImageNet validation set to train them from scratch. The clean images for attacks are sampled from the remaining 5,000 ImageNet official validation images and are hence unseen to both the victim and reference models.

^3 Pre-trained model: https://github.com/bearpaw/pytorch-classification
^4 Pre-trained model: https://github.com/D-X-Y/GDAS
^5 Unlike the other two models that are available online, this one is trained using scripts from: https://github.com/tensorflow/models/tree/master/research/autoaugment
^6 Pre-trained models: https://github.com/Cadene/pretrained-models.pytorch
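For concreteness, the hinge logit-diff loss and the projected PGD step used throughout the experiments can be sketched as follows (our own sketch; the signatures and the step-size/epsilon values in the test are illustrative):

```python
import numpy as np

def hinge_logit_diff(logits, y):
    """Untargeted C&W-style loss: largest non-true logit minus the true
    logit; the attack succeeds roughly once this becomes positive."""
    other = np.max(np.delete(logits, y))
    return other - logits[y]

def pgd_step(x_adv, x_clean, grad_est, step_size, eps):
    """One L-infinity PGD ascent step on the estimated gradient, then
    projection back into the eps-ball and the valid image range [0, 1]."""
    x_adv = x_adv + step_size * np.sign(grad_est)
    x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
    return np.clip(x_adv, 0.0, 1.0)
```

The final two clips implement the constraints described above: the perturbation budget around the clean image and the validity of pixel values.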
5.2 Comparison with The Stateofthearts
In this section, we compare the performance of our subspace attack with previous state-of-the-art methods on CIFAR-10 and ImageNet under untargeted settings.
On CIFAR-10, we randomly select 1,000 images from the official test set and mount all attacks on these images. Table 1 summarizes our main results, in which the fifth to seventh columns compare the mean query counts, median query counts, and failure rates. On all three victim models, our method significantly outperforms NES and Bandits-TD in both query efficiency and success rate. Our method reduces the mean query counts by a factor of 1.5 to 2.1 and the median query counts by a factor of 2.1 to 4.4 compared with Bandits-TD, which incorporates both temporal and spatial priors Ilyas2019 . The PyramidNet+ShakeDrop+AutoAugment Cubuk2019 model, which shows the lowest test error rate on CIFAR-10, also exhibits the best robustness under all considered black-box attacks. More interestingly, even when the victim model is GDAS, whose architecture is designed by running neural architecture search and is thus drastically different from that of the reference models, our prior gradients can still span promising subspaces for attacks. To the best of our knowledge, we are the first to attack, in the black-box setting, PyramidNet+ShakeDrop+AutoAugment, a published state of the art, and GDAS, which has a searched architecture.
For ImageNet, we also randomly sample 1,000 images from the ImageNet validation set for evaluation. Similar to the results on CIFAR-10, the results on ImageNet evidence that our method outperforms the state of the art by large margins. Moreover, since the applied reference models are generally more "old-fashioned" and computationally efficient than the recently proposed victim models, our method introduces little overhead to the baseline optimization algorithm.
5.3 Dropout Ratios and Training Scales
We are interested in how the dropout ratio affects our attack performance. To figure this out, we set the upper limit of the common dropout ratio to 0.0, 0.2, and 0.5 respectively, and observe how the query complexity and the failure rate vary when attacking the WRN victim model. With the AlexNet and VGGNet reference models trained on CIFAR-10.1 Recht2018 , we see from the bottom rows of Table 2 that more dropout leads to a lower failure rate, verifying that exploration via dropout well remedies the misalignment between our subspaces and the victim-model gradients.
| Ref. Training Set                 | # Images | Max. Dropout Ratio | Mean Queries | Median Queries | Failure Rate |
|-----------------------------------|----------|--------------------|--------------|----------------|--------------|
| CIFAR-10 training                 | 50k      | 0.0                | 59           | 12             | 1.4%         |
|                                   |          | 0.2                | 77           | 14             | 0.2%         |
|                                   |          | 0.5                | 111          | 14             | 0.2%         |
| CIFAR-10.1 + CIFAR-10 test (part) | 2k+8k    | 0.0                | 239          | 16             | 3.2%         |
|                                   |          | 0.2                | 174          | 20             | 0.7%         |
|                                   |          | 0.5                | 212          | 22             | 0.3%         |
| CIFAR-10.1                        | 2k       | 0.0                | 519          | 48             | 9.6%         |
|                                   |          | 0.2                | 380          | 62             | 0.9%         |
|                                   |          | 0.5                | 392          | 60             | 0.3%         |
It is also intriguing to evaluate how the performance of our method varies with the scale of the training set used to yield reference models. We evaluate this empirically by training AlexNet and VGGNets from scratch using different numbers of training images. More specifically, we enlarge our training set by further using the official CIFAR-10 training and test images, excluding, of course, the 1,000 images used for mounting attacks. In addition to the CIFAR-10.1 dataset, we try two larger sets: (a) the official CIFAR-10 training set, which consists of 50,000 images;^7 (b) a set built by augmenting CIFAR-10.1 with 8,000 CIFAR-10 test images, whose overall size is 2,000+8,000=10,000. It can be seen from Table 2 that by training reference models with 8,000 more images, the query counts can be more than halved without dropout, and the failure rate decreases as well. We believe this performance gain is powered by the better generalization ability of the reference models. In a special scenario where the reference and victim models share the same training set, our method requires only 59 queries on average to succeed on 98.6% of the test images without dropout. The performance of our method with dropout is also evaluated on the basis of these reference models, and we can see that dropout reduces the failure rates significantly regardless of the reference training set. As for the query complexity, we observe that more powerful reference models generally require less dropout-governed exploration to achieve efficient queries.

^7 In this special setting the reference models and the victim model share the same training data.
5.4 Choice of Reference Models and Prior Gradients
| Ref. Models               | Mean Queries | Median Queries | Failure Rate |
|---------------------------|--------------|----------------|--------------|
| VGG-19                    | 400          | 78             | 0.6%         |
| VGG-19/16/13              | 395          | 71             | 0.4%         |
| VGG-19/16/13/11 + AlexNet | 392          | 60             | 0.3%         |
We investigate the impact of the number and architecture of reference models by evaluating our attack with different reference model sets, and report the performance in Table 3. As in previous experiments, the reference models are trained on CIFAR-10.1, and the maximum dropout ratio is set to 0.5. We see that increasing the number of reference models indeed facilitates the attack in both query efficiency and success rate, just as in the exploratory experiment where dropout is absent.
We also compare "gradient descent" and "coordinate descent" empirically. On CIFAR-10, we choose the same five reference models as previously reported, and at each iteration we compute all five prior gradients and search in the complete subspace, combining all the prior gradients with Gaussian coefficients to provide a search direction. Experimental results demonstrate that, with significantly increased runtime, both the query counts and failure rates barely change (mean/median queries: 389/62; failure rate: 0.3%), verifying that our coordinate-descent-flavored policy achieves a sensible trade-off between efficiency and effectiveness.
6 Conclusion
While impressive results have been achieved, state-of-the-art black-box attacks usually require a large number of queries to trick a victim classification system, making the process costly and suspicious to the system. In this paper, we propose the subspace attack, which reduces the query complexity by restricting the search directions of gradient estimation to promising subspaces spanned by the input-gradients of a few reference models. We adopt a coordinate-descent-flavored optimization and dropout/drop-layer to address some potential issues in our method and to trade off the query complexity against the failure rate. Extensive experimental results on CIFAR-10 and ImageNet evidence that our method outperforms the state of the art by large margins, even when the reference models are trained on a small and inadequate dataset disjoint from the one used to train the victim models. We also evaluate the effectiveness of our method on some winning models on these datasets (e.g., PyramidNet+ShakeDrop+AutoAugment Cubuk2019 and SENet Hu2018 ), as well as on models whose architectures are designed by running neural architecture search (e.g., GDAS Dong2019 and PNASNet Liu2018 ).
References

[1]
Wieland Brendel, Jonas Rauber, and Matthias Bethge.
Decisionbased adversarial attacks: Reliable attacks against blackbox machine learning models.
In ICLR, 2018.  [2] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy (SP), 2017.
 [3] PinYu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and ChoJui Hsieh. Zoo: Zeroth order optimization based blackbox attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15–26. ACM, 2017.
 [4] Minhao Cheng, Thong Le, PinYu Chen, Jinfeng Yi, Huan Zhang, and ChoJui Hsieh. Queryefficient hardlabel blackbox attack: An optimizationbased approach. In ICLR, 2019.
 [5] Ekin D Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, and Quoc V Le. Autoaugment: Learning augmentation policies from data. In CVPR, 2019.
 [6] Xuanyi Dong and Yi Yang. Searching for a robust neural architecture in four gpu hours. In CVPR, 2019.
 [7] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In ICLR, 2015.
 [8] Chuan Guo, Jacob R Gardner, Yurong You, Andrew G Wilson, and Kilian Q Weinberger. Simple blackbox adversarial attacks. In ICML, 2019.
 [9] Dongyoon Han, Jiwhan Kim, and Junmo Kim. Deep pyramidal residual networks. In CVPR, pages 5927–5935, 2017.
 [10] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016.
 [11] Jie Hu, Li Shen, and Gang Sun. Squeezeandexcitation networks. In CVPR, 2018.
 [12] Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, and Kilian Q Weinberger. Deep networks with stochastic depth. In ECCV, 2016.
 [13] Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. Blackbox adversarial attacks with limited queries and information. In ICML, 2018.
 [14] Andrew Ilyas, Logan Engstrom, and Aleksander Madry. Prior convictions: Blackbox adversarial attacks with bandits and priors. In ICLR, 2019.
 [15] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
 [16] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
 [17] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In NeurIPS, 2012.
 [18] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial machine learning at scale. In ICLR, 2017.
 [19] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
 [20] Chunyuan Li, Heerad Farkhoor, Rosanne Liu, and Jason Yosinski. Measuring the intrinsic dimension of objective landscapes. In ICLR, 2018.
 [21] Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, and Kevin Murphy. Progressive neural architecture search. In ECCV, 2018.
 [22] Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. Delving into transferable adversarial examples and black-box attacks. In ICLR, 2017.
 [23] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In ICLR, 2018.
 [24] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. DeepFool: a simple and accurate method to fool deep neural networks. In CVPR, 2016.
 [25] Nina Narodytska and Shiva Kasiviswanathan. Simple black-box adversarial attacks on deep neural networks. In CVPR Workshop, 2017.
 [26] Arjun Nitin Bhagoji, Warren He, Bo Li, and Dawn Song. Practical black-box attacks on deep neural networks using efficient query mechanisms. In ECCV, 2018.
 [27] Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277, 2016.
 [28] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In Asia Conference on Computer and Communications Security, 2017.
 [29] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NeurIPS Workshop, 2017.
 [30] Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. Do CIFAR-10 classifiers generalize to CIFAR-10? arXiv preprint arXiv:1806.00451, 2018.
 [31] Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, and Vaishaal Shankar. Do ImageNet classifiers generalize to ImageNet? In ICML, 2019.
 [32] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. ImageNet large scale visual recognition challenge. IJCV, 2015.
 [33] Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017.
 [34] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
 [35] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958, 2014.
 [36] Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision. In CVPR, 2016.
 [37] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In ICLR, 2014.
 [38] Chun-Chen Tu, Paishun Ting, Pin-Yu Chen, Sijia Liu, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, and Shin-Ming Cheng. AutoZOOM: Autoencoder-based zeroth order optimization method for attacking black-box neural networks. In AAAI, 2019.
 [39] Stephen J Wright. Coordinate descent algorithms. Mathematical Programming, 151(1):3–34, 2015.
 [40] Yoshihiro Yamada, Masakazu Iwamura, Takuya Akiba, and Koichi Kise. ShakeDrop regularization for deep residual learning. arXiv preprint arXiv:1802.02375, 2018.
 [41] Sergey Zagoruyko and Nikos Komodakis. Wide residual networks. In BMVC, 2016.