1 Introduction
Deep neural networks have achieved impressive performance on many machine learning tasks, which has led to growing interest in deploying these models in practical applications. However, recent studies have revealed that models trained on benign examples are susceptible to adversarial examples: examples crafted by an adversary to control model behavior at test time [4, 32, 11]. The adversarial perturbation overlaid on top of a benign example is often small enough to be imperceptible to humans, yet can cause the model to misclassify the image. The existence of adversarial examples has raised security concerns for many high-stakes real-world applications such as street sign detection for autonomous vehicles. While initial works stated that digital adversarial examples built for sign detection may not be a real threat since the camera can view the objects from different distances and angles [22], more recent attacks produce stronger adversarial examples that are invariant to various transformations by optimizing over the expected value of a set of predefined transformations [3]. In fact, this security concern has turned into an actual threat after a recent study showed that adversarial stickers are able to fool real-world self-driving cars [13]. These security concerns and threats have guided researchers toward models that are both accurate in prediction and robust to attacks.
Various methods have been proposed for defending against adversarial examples. One popular approach is to detect and reject adversarial examples [23, 25, 40], which can be ineffective when the adversary is aware of the detection method and adapts accordingly [5]. Another approach is to introduce regularization for training robust models [7, 18], but the increase in robustness from such methods is limited. [2] showed that many proposed defenses give a false sense of security by obfuscating gradients, as meaningful gradient information is necessary for optimization-based attacks; [2] broke these defenses with attacks that build good approximations of the gradients. Among the various defense methods, adversarial training [24, 19, 39, 30] is one of the most common for training robust models. In adversarial training, a robust model is trained on adversarial examples that are generated on the fly, which is effective but also makes adversarial training expensive.
Robust models have some interesting properties that have been revealed in recent studies. First, it is argued that there exist trade-offs between accuracy and robustness [34, 42, 31]: it is difficult to make a model robust to all samples while maintaining the same level of accuracy. Second, it is difficult to adversarially train robust models that generalize, since adversarially robust generalization requires more data [28] and models with more capacity [24]. Training high-capacity models on large datasets increases the cost of adversarially training robust models. Third, while adversarial training is expensive, it has been shown that adversarially trained models learn feature representations that align well with human perception [34]. These feature embeddings can produce clean inter-class interpolations similar to those of generative models such as Generative Adversarial Networks (GANs) [12]. These properties have inspired us to explore model capacity and sample efficiency.
Recently, conditional normalization, built upon normalization layers such as instance normalization [35] and batch normalization [17], has been successful in generative models [20] and style transfer [16]. Conditional normalization can be seen as an adaptive network that shifts the statistics of a layer's activations by applying network parameters conditioned on latent factors such as style and class [9, 10]. Inspired by these studies, we propose to exploit adaptive networks for robustness in the adversarial training framework.
Contributions.
We propose building hardened networks by adversarially training adaptive networks. To build adaptive networks, we introduce a normalization module conditioned on inputs, which allows the network to "adapt" itself to different samples. The conditional normalization module includes a meta convolutional network that changes the scale and bias parameters for normalization based on input samples. Conditional normalization is a powerful module that enlarges the representative capacity of networks. Our adversarially trained adaptive nets can be potentially more robust than conventional non-adaptive nets, as they can adapt the network to be robust to adversarial attacks on a specific sample instead of all samples. Furthermore, adaptive normalization adds far fewer parameters than other methods for increasing expressiveness and robustness (e.g., wide ResNets).
Our experiments on the CIFAR-10 and CIFAR-100 benchmarks empirically show that our proposed adaptive networks are better than their non-adaptive counterparts. The adaptive networks even outperform larger networks with more parameters in terms of both (clean) validation accuracy and robustness. Moreover, we have made several key observations that not only help our understanding but also significantly boost the performance of adversarial training. Such "tricks", like a larger step-size and initializing with a naturally trained model, can be widely used in adversarial training, and help us build stronger baselines for non-adaptive networks. Our adaptive network outperforms previously reported results by about 4%, and the strong baselines we achieved by about 1%, in robust accuracy.
The proposed adaptive network can be combined with various other methods to improve robustness against adversarial examples. Besides extensive experiments with our improved fast adversarial training, we show that adaptive networks can be combined with the stronger TRADES [42] objective formulation that is very effective on the CIFAR benchmark, which suggests that our method is objective-agnostic and can help improve many well-established baselines. Finally, we introduce a variant of single-step adversarial training that, when combined with adaptive networks, can approach the robust accuracy of non-adaptive networks with multi-step adversarial training. Though our single-step variant performs slightly worse than our improved fast adversarial training, it complements recent interest in accelerating adversarial training and showcases why conventional single-step methods did not result in robustness against iterative attacks.
2 Related work
Here we provide a brief overview of robustness and normalization layers, which are closely related to our proposed adaptive networks. We also provide an overview of adversarial training, which plays a critical role in our method.
Robustness, in the white-box threat model, is commonly measured by computing the accuracy of the model on adversarial examples constructed by gradient-based optimization methods starting from validation samples. This evaluation method provides an upper bound on robustness, as there is no theoretical guarantee (at least for all classes of problems) that adversarial examples crafted using first-order gradient information are optimal. From a theoretical point of view, finding optimal adversarial examples is difficult. Some recent works have proposed finding the optimal solution by modeling neural networks as Mixed Integer Programs (MIPs) and solving those MIPs using commercial solvers [33]. However, finding the optimal solution of an MIP is generally NP-hard. Although recent advancements have been made in these formulations by enforcing certain properties on the network [38], finding the optimal solution is only feasible for small networks and is very time consuming. That is why certified methods in practice provide lower bounds on the size of perturbations needed to cause misclassification, by solving a relaxed version of the problem.
[27] propose certified defenses by including a differentiable certificate as a regularizer. Many studies follow this line of work and propose certified defenses [36, 37, 8]. While certified defenses are interesting from a theoretical point of view, in practice adversarial training is still the most popular method for hardening networks – leaders of various computer vision defense competitions and benchmarks utilize adversarial training in their approaches [39, 42, 24].
Adversarial training, in its general form, corresponds to training on the following loss function,
$$\min_{\theta} \sum_{i} \Big[ \alpha\, \ell\big(f_{\theta}(x_i), y_i\big) + (1-\alpha)\, \ell\big(f_{\theta}(x_i + \delta_i), y_i\big) \Big] \qquad (1)$$
where $\ell$ is a differentiable surrogate loss used for training the neural network, such as the cross-entropy loss, $(x_i, y_i)$ is the $i$-th datapoint and its correct label, $f_{\theta}$ is the network with trainable parameters $\theta$, $\alpha$ is a hyperparameter that controls how much weight should be given to training on natural examples, and $\delta_i$ corresponds to the adversarial perturbation for the $i$-th sample. To keep the perturbation unrecognizable to humans, the norm of $\delta_i$ is often bounded. Throughout this paper, we will use the common $\ell_\infty$ norm bound $\|\delta\|_\infty \le \epsilon$. Note that our adversarial training loss merges information from both natural and adversarial examples.
Early adversarial example generation methods required many iterations, since their goal was to help an attacker build an adversarial example using minimal perturbations [32, 26, 6]. However, from a defender's perspective, the goal is to train on fast and bounded adversarial examples. With speed in mind, [11] proposed training on a single-step attack called the Fast Gradient Sign Method (FGSM). FGSM computes the gradient $\nabla_x \ell(f_\theta(x), y)$ and sets $\delta = \epsilon \cdot \mathrm{sign}(\nabla_x \ell(f_\theta(x), y))$, where $\epsilon$ is the perturbation bound. Later, it was shown that stronger attacks such as BIM [21] completely break FGSM adversarially trained models. The BIM attack can be seen as an iterative version of FGSM where, during each iteration, the perturbation is updated using an FGSM-type step but with a step-size $\epsilon_s$ which is usually smaller than $\epsilon$,
$$\delta^{(k+1)} = \delta^{(k)} + \epsilon_s \cdot \mathrm{sign}\big(\nabla_x \ell(f_\theta(x + \delta^{(k)}), y)\big) \qquad (2)$$
where $\delta^{(k)}$ is the perturbation at iteration $k$ of the BIM attack. After every iteration of the BIM attack (equation (2)), $\delta$ is clipped such that $\|\delta\|_\infty \le \epsilon$.
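As a concrete illustration, the FGSM step and the iterative BIM update with $\ell_\infty$ clipping can be sketched in a few lines of NumPy. The `loss_grad` callable is a stand-in for a full forward/backward pass through the network; the function names are ours, not from any particular library.

```python
import numpy as np

def fgsm_perturbation(grad, epsilon):
    """Single-step FGSM: delta = epsilon * sign(gradient of loss w.r.t. input)."""
    return epsilon * np.sign(grad)

def bim_attack(x, y, loss_grad, epsilon, step_size, num_steps):
    """Iterative FGSM (BIM): repeat small signed steps of size step_size,
    clipping the accumulated perturbation back into the l_inf ball of
    radius epsilon after every iteration (equation (2))."""
    delta = np.zeros_like(x)
    for _ in range(num_steps):
        g = loss_grad(x + delta, y)                 # gradient w.r.t. the input
        delta = delta + step_size * np.sign(g)      # FGSM-type step
        delta = np.clip(delta, -epsilon, epsilon)   # keep ||delta||_inf <= epsilon
    return delta
```

With a constant-gradient toy `loss_grad`, the perturbation saturates at `epsilon * sign(grad)`, matching the intuition that BIM walks to the boundary of the ball.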
Adversarial training started blooming when [24] proposed training on adversarial examples generated using the PGD attack, which is a variant of BIM with a random initialization and a projection back onto the $\ell_\infty$ norm ball. Through experiments, they showed that the PGD attack is the strongest first-order adversary, which was later verified by [2]. Consequently, almost all of the successful adversarially trained robust models use the PGD algorithm to generate adversarial examples.
Training on adversarial examples generated using PGD increases the cost of training by a factor of $K$, where $K$ is the number of iterations of the PGD attack (i.e., the number of times we update $\delta$ using equation (2)). While we will use PGD-$K$ attacks for evaluating the robustness of all our models, due to the high computation cost associated with PGD adversarial training, we perform most of our adversarial training by modifying a recently proposed algorithm for speeding up adversarial training [29]. A recent study [1] suggested that a well-tuned single-step adversarial training can defend against strong PGD adversarial examples. However, the method in [1] depends heavily on a domain-specific cyclic learning rate schedule, a step-size which is greater than the perturbation bound, and early stopping based on frequent examination. Also, they only justify their results empirically, without providing intuition on why this rather unconventional setup is needed.
Normalization layers such as batch normalization [17] and instance normalization [35] have become important modules in modern neural networks. Normalization layers standardize their input to have zero mean and unit variance, and then shift these statistics using scaling and bias parameters.
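A minimal sketch of this standardize-then-shift computation, assuming NCHW feature maps and using NumPy as a stand-in for a deep learning framework:

```python
import numpy as np

def normalize(x, gamma, beta, axes=(0, 2, 3), eps=1e-5):
    """Standardize feature maps x of shape (N, C, H, W) to zero mean and
    unit variance over the given axes, then shift the statistics with a
    scale gamma and bias beta. axes=(0, 2, 3) gives batch-norm statistics;
    axes=(2, 3) would give instance-norm statistics."""
    mu = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)   # standardized activations
    return gamma * x_hat + beta             # learned shift of the statistics
```

With `gamma = 1` and `beta = 0`, the output has (approximately) zero mean and unit variance per channel.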
[43] suggest that the scaling and bias parameters can be even more important than the standardization itself. Conditional normalization, where scaling and bias are adaptively determined by latent factors, has been shown to be powerful in many computer vision tasks including style transfer [16, 10] and generative adversarial networks [20].
3 Adaptive Networks
We introduce adaptive networks with conditional normalization modules in this section. Our motivation for adding conditional normalization modules is twofold. First, by introducing adaptive layers conditioned on inputs, we can “adapt” a trained network to be more robust to an individual input sample without requiring any information about its class label – a useful trait for robustness evaluation.
Second, conditional normalization can increase the expressiveness and effective capacity of the network, which has been shown to have a positive effect on model robustness: adversarially trained models with more expressive capacity are more robust than their less expressive alternatives [24, 29]. At a high level, these conditional normalization modules can be considered as adding multi-branch structures to a network, which is known to be effective in improving accuracy on validation examples [15]. As we will see in the experiments, our normalization module indeed improves clean validation accuracy and is more effective in practice than simply widening or concatenating features: the adversarially trained adaptive nets have higher validation accuracy and robustness than networks with more trainable parameters.
Below, we show how to create an adaptive network by adding conditional normalization modules to the wide residual network (WRN) [41] architecture.
3.1 Network architecture
Let $X \in \mathbb{R}^{N \times C \times H \times W}$ represent the feature maps of a convolutional layer for a minibatch of samples, where $N$ is the batch size, $C$ is the width of the layer (number of channels), and $H$ and $W$ are the feature map's height and width. If $x_{nchw}$ denotes the element at height $h$, width $w$ of the $c$-th channel from the $n$-th sample, the conditional normalization module transforms the feature maps as,
$$\hat{x}_{nchw} = \gamma_c(z)\, \bar{x}_{nchw} + \beta_c(z) \qquad (3)$$
where $\bar{x}_{nchw}$ is the standardized feature and $\gamma(z), \beta(z)$ are the scale and bias parameters of the normalization module. The network with conditional normalization becomes adaptive to the latent factor $z$, as $\gamma(z)$ and $\beta(z)$ are outputs of convolutional networks with trainable parameters. Equation (3) represents normalization in a general form: when the latent factor $z$ is a style image and $x$ is normalized by its mean and variance, equation (3) becomes adaptive instance normalization for image style transfer [16]; when the latent factor is a latent code like random noise, equation (3) becomes the building module for the generator in StyleGAN [20]. We provide details on how we use the input sample as the latent factor below.
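The general form of equation (3) can be sketched as follows, with `gamma_net` and `beta_net` as hypothetical stand-ins for the trainable networks that map a latent factor `z` to per-channel scale and bias (batch statistics are assumed for the standardization step):

```python
import numpy as np

def conditional_norm(x, gamma_net, beta_net, z, eps=1e-5):
    """Equation (3) in spirit: standardize x of shape (N, C, H, W) per
    channel, then apply a scale and bias that are *outputs of a network*
    conditioned on a latent factor z, rather than fixed learned parameters.
    gamma_net and beta_net stand in for the meta-network in the text."""
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    sigma = x.std(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mu) / (sigma + eps)                # standardized features
    gamma = gamma_net(z).reshape(1, -1, 1, 1)       # one scale per channel
    beta = beta_net(z).reshape(1, -1, 1, 1)         # one bias per channel
    return gamma * x_hat + beta
```

Swapping `gamma_net`/`beta_net` for networks fed by a style image or a noise code recovers the adaptive-instance-norm and StyleGAN special cases described above.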
In our experiments, we add our conditional normalization module to wide residual networks (WRNs) [41] to create adaptive networks for classification. WRNs are a derivative of ResNets [14], and are one of the state-of-the-art architectures used for image classification. A WRN is a stack of residual blocks (fig. 1 (a)). To specify WRNs, we follow [41] and denote the architecture as WRN-$d$-$k$, where $d$ represents the depth and $k$ represents the widening factor of the network.
The WRN architecture for the CIFAR-10 and CIFAR-100 datasets we use in this paper consists of a stack of three groups of residual blocks. There is a downsampling layer between two consecutive groups, and the number of channels (the width of a convolutional layer) is doubled after downsampling. In the three groups, the widths of the convolutional layers are $16k$, $32k$, and $64k$, respectively. Each group contains $n$ residual blocks, and each residual block contains two convolutional layers equipped with ReLU activation and batch normalization. There is a convolutional layer before the three groups of residual blocks, and there is a global average pooling layer, a fully-connected layer, and a softmax layer after the three groups.
We add conditional normalization to the first residual block of each of the three groups. The normalization module is applied between the two convolutional layers in a block, as shown in fig. 1 (b). The inputs to the conditional normalization module are the feature maps produced by the first convolutional layer. Our conditional normalization module consists of a three-layer convolutional network: two convolutional layers, and one convolutional layer to match the dimensions of the three different residual blocks. We use average pooling as the last layer to obtain $\gamma(z)$ and $\beta(z)$ for equation (3). Our adaptive network is only slightly larger than the original WRN, and becomes more robust when adversarially trained, as shown in section 5.
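A shape-level sketch of the modified residual block, with 1×1 channel mixes standing in for the 3×3 convolutions, and simple callables standing in for the meta-network that predicts per-sample scale and bias (all names here are ours, not the paper's):

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution as a channel mix: x is (N, C_in, H, W), w is (C_out, C_in)."""
    return np.einsum('oc,nchw->nohw', w, x)

def residual_block_with_cond_norm(x, w1, w2, gamma_fn, beta_fn):
    """Sketch of the modified residual block: conv -> conditional norm
    (scale/bias predicted from the first conv's own output, pooled to a
    latent vector) -> ReLU -> conv, plus the identity shortcut."""
    h = conv1x1(x, w1)                          # first convolution
    mu = h.mean(axis=(0, 2, 3), keepdims=True)
    sd = h.std(axis=(0, 2, 3), keepdims=True) + 1e-5
    h_hat = (h - mu) / sd                       # standardize
    z = h.mean(axis=(2, 3))                     # pooled latent factor, (N, C)
    gamma = gamma_fn(z).reshape(h.shape[0], -1, 1, 1)   # per-sample scale
    beta = beta_fn(z).reshape(h.shape[0], -1, 1, 1)     # per-sample bias
    h = np.maximum(gamma * h_hat + beta, 0.0)   # conditional norm + ReLU
    return x + conv1x1(h, w2)                   # residual connection
```

The key point this sketch illustrates is that `gamma` and `beta` carry a batch dimension, so each sample gets its own normalization parameters.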
4 Adversarial training
We briefly review the adversarial training algorithm we will use to make our adaptive networks robust, and discuss the "tricks" we found useful in improving these algorithms. We then introduce a variant of single-step adversarial training that can be coupled with a standard natural training schedule without extra tuning, and shed light on why this single-step method works while conventional single-step methods fail to become robust against PGD attacks. Finally, we discuss an alternative objective function for adversarial training, as adaptive networks can complement other active research directions in improving the practical robustness of networks.
PGD adversarial training. Well-known robust networks on MNIST and CIFAR-10 were adversarially trained by [24] by setting $\alpha = 0$ in equation (1) and only training on adversarial examples. Training just on adversarial examples is justified from a robust optimization framework, and modeled as a two-player constant-sum game between the adversary, which is in charge of the perturbation $\delta$, and the classifier with network parameters $\theta$. Formally, we consider adversarial training based on the following minimax formulation,
$$\min_{\theta} \sum_{i} \max_{\|\delta_i\|_\infty \le \epsilon} \ell\big(f_{\theta}(x_i + \delta_i), y_i\big) \qquad (4)$$
[24] solved the optimization problem in equation (4) in an alternating fashion. Before each minimization step on the network parameters $\theta$, they compute $\delta$ using a PGD-$K$ attack on the fly. Every perturbation update step of the PGD-$K$ attack (equation (2)) requires computing $\nabla_x \ell(f_{\theta_j}(x_j + \delta_j^{(k)}), y_j)$, where $\delta_j^{(k)}$ is the adversarial perturbation of the $j$-th minibatch after the previous $k$ update steps, and $\theta_j$ represents the network parameters at the $j$-th minimization iteration. Computing this gradient, required for every step of PGD, needs a complete forward and backward pass on the network. As a result, every iteration of PGD adversarial training is roughly $K$ times more expensive than an iteration of natural training. A typical value used for $K$ is 7 to train a robust model for the CIFAR-10 benchmark [24].
Fast adversarial training. To speed up the training of robust models, we adopt a fast adversarial training algorithm recently proposed by [29]. [29] showed that they can achieve robustness comparable to PGD adversarial training [24] on the datasets of our interest (CIFAR-10 and CIFAR-100) at roughly the same cost as standard (non-robust) training.
The fast algorithm (Free) maintains a perturbation parameter $\delta$ (of the same shape as a minibatch of inputs) which is updated once during every minimization iteration. To accelerate robust training, Free applies simultaneous updates to the network parameters $\theta$ and the perturbation $\delta$, which makes its computation cost almost the same as natural training. In the $j$-th minimization iteration, both $\nabla_\theta \ell(f_{\theta_j}(x_j + \delta_j), y_j)$ and $\nabla_x \ell(f_{\theta_j}(x_j + \delta_j), y_j)$ are computed for the current minibatch and network parameters from a single pass. Then $\theta$ and $\delta$ are updated as a gradient descent step on $\theta$ and an FGSM-type ascent step on $\delta$, with $\delta$ clipped to satisfy $\|\delta\|_\infty \le \epsilon$.
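A toy sketch of this simultaneous update on a small differentiable model; `grad_fn` is an assumed helper returning both gradients from one pass, the whole dataset plays the role of a single minibatch, and `ls_grad` is an illustrative least-squares instance, none of which comes from [29]:

```python
import numpy as np

def free_adversarial_training(X, y, grad_fn, epsilon, lr, step_size, m, epochs):
    """Toy sketch of 'free' adversarial training. grad_fn(theta, X_adv, y)
    returns (gradient w.r.t. theta, gradient w.r.t. the inputs) from one
    combined pass. The batch is replayed m times in a row; theta descends
    on the loss while the persistent perturbation delta ascends on it."""
    theta = np.zeros(X.shape[1])
    delta = np.zeros_like(X)                   # perturbation persists across replays
    for _ in range(epochs):
        for _ in range(m):                     # replay the same batch m times
            g_theta, g_x = grad_fn(theta, X + delta, y)
            theta = theta - lr * g_theta                  # descent step on theta
            delta = delta + step_size * np.sign(g_x)      # FGSM-style ascent on delta
            delta = np.clip(delta, -epsilon, epsilon)     # l_inf projection
    return theta, delta

# Illustrative model: least squares, loss = 0.5 * ||X @ theta - y||^2.
def ls_grad(theta, Xa, y):
    r = Xa @ theta - y
    return Xa.T @ r, np.outer(r, theta)        # grads w.r.t. theta and inputs
```

The point of the sketch is the control flow: one gradient computation drives both the model update and the perturbation update, so robust training costs about the same as natural training.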
In Free, each minibatch is replayed $m$ times. For example, if $m = 2$, we move on to the next minibatch every other step, and therefore the data for the first two iterations would be the same. Since we train on the same minibatch $m$ times in a row, the hyperparameter $m$ is more or less analogous to the number of iterations $K$ of the PGD training algorithm. We use the same number of minibatch updates for Free adversarial training as for natural training on clean images, i.e., we train Free for $1/m$ times the number of epochs in total. Free can achieve robust accuracy similar to PGD adversarially trained models. In our modification, we apply two "tricks" which we found to be particularly effective when combined with free adversarial training: initializing with the naturally trained model and applying a larger step-size for updating perturbations. We built stronger baselines with these techniques, which can be boosted even further with our adaptive networks.
Single-step adversarial training. The Fast Gradient Sign Method (FGSM) [11] is one of the most popular single-step methods for generating adversarial examples. With a random initialization of the perturbation, Random FGSM (RFGSM) is similar to doing one step of the PGD algorithm. [24] showed that robust models adversarially trained with FGSM and RFGSM have almost zero robust accuracy under PGD attacks. A more recent preprint [1] suggested that RFGSM-based training can defend against PGD attacks when combined with cyclic learning rates and early stopping based on examining the robust accuracy on a validation dataset. The RFGSM method in [1] provides an alternative way to train robust models besides PGD adversarial training [24] and fast adversarial training [29] on benchmark datasets such as the CIFARs. However, it may be difficult to generalize to problems without special learning rate schedules and to problems where we cannot perform online validation for early stopping.
We introduce a variant of RFGSM that works well even with a normal training schedule. We make two key modifications to classical RFGSM. First, instead of initializing the perturbation from a uniform random value between $-\epsilon$ and $\epsilon$, we initialize from a normal distribution with zero mean and variance $\sigma^2$. We find results to be rather insensitive to the choice of $\sigma$ over a range of values, and use the same $\sigma$ in all experiments. Second, we do not clip the perturbation after the FGSM update. Note that the perturbation is still bounded to some extent, as the step-size of the FGSM update is $\epsilon$. In the proposed variant, the initialization noise can be viewed as boosting the training samples rather than the FGSM update.
Classical RFGSM may fail because adversarial examples generated by FGSM during training are likely to fall on the boundary of the bounded ball. After training on those adversarial examples, the loss surface becomes smooth at the boundary, but the cross-entropy loss may take on large values within the ball, which can be exploited by multi-step methods like PGD. As shown in fig. 2, the proposed RFGSM makes the loss surface smoother and hence harder to attack. Even for a difficult sample (validation example id 1), where there are adversarial examples for models trained by both the classical RFGSM and the proposed RFGSM, the loss surface of our proposed RFGSM is smoother.
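The proposed variant can be sketched as follows, assuming (per the text) an FGSM step of size $\epsilon$ and no projection after the step; the Gaussian standard deviation `sigma` and the function name are our notation:

```python
import numpy as np

def rfgsm_variant(x, y, loss_grad, epsilon, sigma, rng):
    """Proposed single-step variant: initialize the perturbation from a
    zero-mean Gaussian with std sigma (instead of Uniform(-eps, eps)),
    take one FGSM step of size epsilon, and do NOT clip afterwards --
    the FGSM part is bounded by the step itself, while the Gaussian
    initialization acts like data augmentation on the training sample."""
    delta0 = rng.normal(0.0, sigma, size=x.shape)   # Gaussian init, not uniform
    g = loss_grad(x + delta0, y)                    # gradient at the noisy point
    return delta0 + epsilon * np.sign(g)            # no projection / clipping
```

With `sigma = 0` this reduces to plain FGSM, which makes the role of the random initialization easy to isolate in experiments.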
TRADES objective. The proposed adaptive network is complementary to the choice of objective in adversarial training. Besides the minimax problem in equation (4), we can also train adaptive networks with the TRADES objective proposed in [42]. TRADES achieves impressive robust accuracy on the CIFAR10 dataset by combining supervised training and virtual adversarial training as,
$$\min_{\theta} \sum_{i} \Big[ \ell\big(f_{\theta}(x_i), y_i\big) + \frac{1}{\lambda} \max_{\|\delta_i\|_\infty \le \epsilon} \mathrm{KL}\big(f_{\theta}(x_i) \,\|\, f_{\theta}(x_i + \delta_i)\big) \Big] \qquad (5)$$
where $1/\lambda$ controls the trade-off between robustness and natural accuracy. We follow [42] for training algorithms and parameter settings in our experiments.
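A sketch of a TRADES-style loss evaluated on predicted class probabilities, using KL divergence for the virtual-adversarial term; the exact surrogate used in [42] may differ in detail, and all names here are ours:

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """KL(p || q) for rows of probability vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def trades_loss(probs_clean, probs_adv, y_onehot, lam):
    """TRADES-style objective in the spirit of equation (5): cross-entropy
    on the clean prediction plus (1/lam) * KL between the clean and
    adversarial predictions; 1/lam trades robustness for natural accuracy."""
    ce = -np.sum(y_onehot * np.log(np.clip(probs_clean, 1e-12, 1.0)), axis=-1)
    return np.mean(ce + kl_div(probs_clean, probs_adv) / lam)
```

When the adversarial prediction matches the clean one, the KL term vanishes and the loss reduces to ordinary supervised cross-entropy, which is exactly the natural-accuracy end of the trade-off.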
Row #  (Robust) model  Natural  PGD-20  PGD-100  # Params (M)
1  Natural  94.10%  0.00%  0.00%  5.85
2  PGD-7 [24]  83.84%  40.03%  39.38%  5.85
3  Free-10 [29]  81.04%  40.56%  40.03%  5.85
4  Free-10-adaptive  85.00%  43.16%  42.68%  6.05
5  Free-10-lstep  77.75%  45.10%  44.77%  5.85
6  Free-10-WRN-28-5  77.81%  45.99%  45.77%  9.13
7  Free-10-init  80.60%  46.88%  46.67%  5.85
8  Free-10-adaptive  80.99%  48.09%  47.87%  6.05
5 Experiments
In this section, we train robust models on CIFAR-10 and CIFAR-100. In all the experiments, we train WRNs without dropout for 120 epochs with minibatch size 256. We start with learning rate 0.1 and decrease the learning rate by a factor of 10 at epochs 60 and 90. We use weight decay $10^{-4}$ and momentum 0.9. For evaluating the robustness of the models, we attack them with PGD-$K$ attacks, using perturbation bound $\epsilon$ and a fixed attack step-size while varying the number of attack iterations $K$.
5.1 Quantitative evaluation and ablation study
We summarize our quantitative evaluation on CIFAR-10 and CIFAR-100 in tables 1-5. In tables 1-4, unless otherwise explicitly specified through the name of the model, the architecture used for producing these results is WRN-28-4. We report validation accuracy on natural images and on adversarial images generated using PGD attacks with 20 iterations (PGD-20) and 100 iterations (PGD-100). We also compare our method with adversarially trained robust models following [24] and [29]. Note that the PGD-7 adversarially trained model [24] requires considerably more training time than natural training on clean images, while the Free-10 models [29] have a computation cost similar to natural training. Models with the suffix "small" are adversarially trained using a smaller step-size; the adversarially trained models without this suffix are trained with a larger step-size.
Row #  (Robust) model  Natural  PGD-20  PGD-100  # Params (M)
1  Natural  74.84%  0.00%  0.00%  5.87
2  PGD-7 [24]  57.18%  18.38%  18.13%  5.87
3  Free-10 [29]  54.18%  19.21%  18.98%  5.87
4  Free-10-adaptive  61.19%  21.95%  21.68%  6.07
5  Free-10-lstep  50.52%  23.08%  23.02%  5.87
6  Free-10-WRN-28-5  51.02%  23.12%  23.03%  9.16
7  Free-10-init  55.93%  24.86%  24.61%  5.87
8  Free-10-adaptive  57.26%  25.86%  25.69%  6.07
Advantage of adaptive networks. We first evaluate robust models trained with the step-size for perturbation updates used in [24] (rows 2-4 in tables 1 and 2). We can train a robust WRN-28-4 with PGD-7 [24] that achieves about 40% accuracy under strong PGD attacks. Our alternative adversarial training mechanism, Free-10 [29], achieves slightly better robust accuracy under PGD attacks with a drop in natural accuracy on clean validation images. Since Free-10 is significantly faster than PGD adversarial training, we also use it to adversarially train our adaptive networks. Our adaptive network with conditional normalization built on WRN-28-4 (Free-10-adaptive, row 4) outperforms the PGD adversarially trained WRN-28-4 (PGD-7, row 2) and Free-10 (row 3) in both natural accuracy and robust accuracy, illustrating the advantage of our adaptive networks.
Strong baselines and effectiveness of our "tricks" in adversarial training. We explore "tricks" to improve the performance of adversarial training. As shown in tables 1 and 2, comparing Free-10 (row 3) and Free-10-lstep (row 5) shows that the larger step-size used for training does improve the robustness of free training, but again at the cost of decreased natural validation accuracy.
Note that our Free-10-adaptive model has slightly more parameters than the adversarially trained PGD-7 and Free-10 models. For this reason, we compare against higher-capacity models to ensure that the superiority of our adaptive network is not solely due to having a (slightly) larger number of parameters. To create strong, high-capacity baselines we adversarially train a larger model, WRN-28-5 (row 6), and a WRN-28-4 with a naturally trained model as initialization (row 7). Our adaptive network is slightly larger than the non-adaptive WRN-28-4, and much smaller than WRN-28-5. A good initialization surprisingly helps both natural accuracy and robust accuracy. Our adaptive network outperforms the best strong baseline in both natural accuracy and robust accuracy.
(Robust) model  Natural  PGD-20  PGD-100
Natural  94.90%  0.00%  0.00%
Non-adaptive  84.44%  53.74%  53.18%
Adaptive  84.79%  54.98%  54.76%
(Robust) model  Natural  PGD-20  PGD-100
Natural  94.10%  0.00%  0.00%
PGD-7  83.84%  40.03%  39.38%
RFGSM  85.81%  0.11%  0.00%
Our RFGSM  84.03%  38.71%  37.99%
Adaptive  84.87%  39.95%  38.92%
TRADES objective and higher robustness. In table 3, we combine the proposed method with the TRADES objective [42], since our adaptive network is complementary to the objective design of adversarial training. We can achieve better robust accuracy on the CIFAR-10 dataset with the TRADES objective, and our adaptive network performs better than the non-adaptive network. Note that the TRADES method applies PGD-10 to generate adversarial examples during adversarial training, which is slower than the PGD-7 of [24], and much slower than the fast algorithm we use.
RFGSM and the proposed variant. We present experimental results on RFGSM adversarial training in table 4. We halved the number of epochs so that RFGSM training completes in a time similar to natural training and free adversarial training. Classical RFGSM with uniform initialization and norm clipping cannot provide robustness against strong PGD attacks. The proposed RFGSM variant can defend against PGD attacks, and achieves robust accuracy comparable to PGD adversarial training when combined with our adaptive network. Though our RFGSM results are worse than our best robust accuracy in table 1, where we use fast adversarial training with "tricks" to train the adaptive network, the proposed variant works well with standard training schedules on the CIFAR benchmark, which is complementary to the recent interest in replacing PGD training with RFGSM training.
(Robust) model  Natural  PGD-20  PGD-100  # Params (M)
Natural  94.76%  0.00%  0.00%  46.16
PGD-7 [24]  87.3%  45.8%  45.3%  45.90
Free-8 [29]  85.96%  46.82%  46.19%  45.90
Free-10  79.45%  48.03%  47.9%  46.16
Free-10-init  84.03%  50.23%  49.93%  46.16
Free-10-adaptive  84.39%  50.93%  50.68%  47.28
5.2 Larger network and previous benchmark
In table 5, we report results on a larger network, WRN-34-10, which is widely used for the CIFAR-10 benchmark. We first compare directly with the accuracy values reported in [24] and [29] by training on the objective in equation (4). Our adaptive network achieves better robust accuracy, with more than a 3% improvement. Moreover, our adaptive network outperforms the strong baselines we achieved with "tricks" (Free-10 and Free-10-init) in both natural accuracy and robust accuracy.
5.3 Training curves and qualitative analysis
We plot the training and validation accuracy of the Free-10, Free-10-adaptive, and PGD-7 adversarially trained models after each epoch in fig. 3. The training accuracy of robust models is computed using adversarial examples seen during training, and does not correspond to the natural training accuracy; it can be thought of as robustness on training examples. In figs. 2(d) and 2(a), the PGD-7 model fits the adversarial examples built for the training samples to a rather high accuracy, while Free-10 seems to never overfit the training-set adversarial examples. The training accuracy of Free-10 [29] is quite close to the final adversarial validation accuracy in figs. 2(f) and 2(c). The natural validation accuracy of PGD-7 increases faster than that of Free-10 at the beginning, while the accuracies at the end of training become close, as shown in figs. 2(e) and 2(b). Free-10 consistently improves robust accuracy against adversarial validation samples, while PGD-7 seems to saturate after a fast increase at the beginning (see figs. 2(f) and 2(c)). Our adaptive network (blue curve) always has higher natural and robust validation accuracy than the non-adaptive WRN-28-4 models, except for a short range around epoch 60 in figs. 2(c) and 2(b), where the accuracy of the adaptive network decreases. Tuning the learning rate could potentially prevent this decrease and further boost the performance of adaptive networks.
[34] presented an interesting side effect of robust models: largely perturbed adversarial examples for adversarially robust models align with human perception. That is, they "look" like the class to which they are misclassified. We use PGD-50 to generate adversarial images with large perturbations. The images generated for our adversarially trained adaptive nets have characteristics that align well with human perception (fig. 4).
6 Conclusion
Inspired by recent research in conditional normalization [16, 20] and the properties of robust models [24, 34, 28], we introduced an adaptive normalization module conditioned on inputs for boosting the robustness of networks. Our adaptive networks, combined with a fast adversarial training algorithm, can effectively train robust models that outperform their non-adaptive counterparts and also non-adaptive networks with more parameters. Our study on adversarial training presents several "tricks" that can be widely used to improve the training of robust models. We also introduce a variant of single-step adversarial training that achieves competitive robustness against multi-step attacks. We verify the effectiveness and efficiency of adaptive networks and of our adversarial training with experiments on the CIFAR-10 and CIFAR-100 benchmarks with WRN networks.
References
 [1] (2019) Fast is better than free: revisiting adversarial training. OpenReview preprint. Cited by: §2, §4.
 [2] (2018) Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. ICML. Cited by: §1, §2.
 [3] (2017) Synthesizing robust adversarial examples. arXiv preprint arXiv:1707.07397. Cited by: §1.
 [4] (2013) Evasion attacks against machine learning at test time. In ECML-PKDD, pp. 387–402. Cited by: §1.
 [5] (2017) Adversarial examples are not easily detected: bypassing ten detection methods. In ACM Workshop on Artificial Intelligence and Security, pp. 3–14. Cited by: §1.
 [6] (2017) Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pp. 39–57. Cited by: §2.
 [7] (2017) Parseval networks: improving robustness to adversarial examples. In ICML, pp. 854–863. Cited by: §1.
 [8] (2019) Certified adversarial robustness via randomized smoothing. ICML. Cited by: §2.
 [9] (2017) Modulating early visual processing by language. In Advances in Neural Information Processing Systems, pp. 6594–6604. Cited by: §1.
 [10] (2017) A learned representation for artistic style. ICLR. Cited by: §1, §2.
 [11] (2015) Explaining and harnessing adversarial examples. ICLR. Cited by: §1, §2, §4.
 [12] (2014) Generative adversarial nets. In NIPS, Cited by: §1.
 [13] (2019-04-01) Hackers trick a Tesla into veering into the wrong lane. MIT Technology Review. Cited by: §1.
 [14] (2015) Deep residual learning for image recognition. In CVPR. External Links: arXiv:1512.03385. Cited by: §3.1.
 [15] (2017) Densely connected convolutional networks. In CVPR, Cited by: §3.
 [16] (2017) Arbitrary style transfer in real-time with adaptive instance normalization. In CVPR, pp. 1501–1510. Cited by: §1, §2, §3.1, §6.
 [17] (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In ICML, pp. 448–456. Cited by: §1, §2.
 [18] (2018) Improving dnn robustness to adversarial attacks using jacobian regularization. In ECCV, pp. 514–529. Cited by: §1.
 [19] (2018) Adversarial logit pairing. arXiv preprint arXiv:1803.06373. Cited by: §1.
 [20] (2019) A style-based generator architecture for generative adversarial networks. CVPR. Cited by: §1, §2, §3.1, §6.
 [21] (2016) Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533. Cited by: §2.
 [22] (2017) No need to worry about adversarial examples in object detection in autonomous vehicles. arXiv preprint arXiv:1707.03501. Cited by: §1.
 [23] (2018) Characterizing adversarial subspaces using local intrinsic dimensionality. arXiv preprint arXiv:1801.02613. Cited by: §1.
 [24] (2017) Towards deep learning models resistant to adversarial attacks. ICLR. Cited by: §1, §1, §2, §2, §3, Table 1, §4, §4, §4, §5.1, §5.1, §5.1, §5.2, Table 2, Table 5, §6.
 [25] (2017) MagNet: a two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 135–147. Cited by: §1.
 [26] (2016) DeepFool: a simple and accurate method to fool deep neural networks. In CVPR, pp. 2574–2582. Cited by: §2.
 [27] (2018) Certified defenses against adversarial examples. arXiv preprint arXiv:1801.09344. Cited by: §2.
 [28] (2018) Adversarially robust generalization requires more data. In NeurIPS, pp. 5014–5026. Cited by: §1, §6.
 [29] (2019) Adversarial training for free. NeurIPS. Cited by: §2, §3, Table 1, §4, §4, §5.1, §5.1, §5.2, §5.3, Table 2, Table 5.
 [30] (2018) Universal adversarial training. arXiv preprint arXiv:1811.11304. Cited by: §1.
 [31] (2018) Is robustness the cost of accuracy?–a comprehensive study on the robustness of 18 deep image classification models. In ECCV, pp. 631–648. Cited by: §1.
 [32] (2013) Intriguing properties of neural networks. ICLR. Cited by: §1, §2.
 [33] (2017) Evaluating robustness of neural networks with mixed integer programming. arXiv preprint arXiv:1711.07356. Cited by: §2.
 [34] (2018) Robustness may be at odds with accuracy. ICLR. Cited by: §1, Figure 4, §5.3, §6.
 [35] (2016) Instance normalization: the missing ingredient for fast stylization. CoRR abs/1607.08022. Cited by: §1, §2.
 [36] (2018) MixTrain: scalable training of formally robust neural networks. arXiv preprint arXiv:1811.02625. Cited by: §2.
 [37] (2018) Scaling provable adversarial defenses. In NeurIPS, pp. 8400–8409. Cited by: §2.
 [38] (2019) Training for faster adversarial robustness verification via inducing relu stability. ICLR. Cited by: §2.
 [39] (2019) Feature denoising for improving adversarial robustness. CVPR. Cited by: §1, §2.
 [40] (2017) Feature squeezing: detecting adversarial examples in deep neural networks. arXiv preprint arXiv:1704.01155. Cited by: §1.
 [41] (2016) Wide residual networks. arXiv preprint arXiv:1605.07146. Cited by: §3.1, §3.
 [42] (2019) Theoretically principled trade-off between robustness and accuracy. ICML. Cited by: §1, §1, §2, §4, §5.1, Table 3.
 [43] (2019) Fixup initialization: residual learning without normalization. ICLR. Cited by: §2.