mlsys2020
Enhancing model robustness under new and even adversarial environments is a crucial milestone toward building trustworthy machine learning systems. Current robust training methods such as adversarial training explicitly use an "attack" (e.g., $\ell_\infty$-norm bounded perturbation) to generate adversarial examples during model training for improving adversarial robustness. In this paper, we take a different perspective and propose a new framework called SPROUT, self-progressing robust training. During model training, SPROUT progressively adjusts the training label distribution via our proposed parametrized label smoothing technique, making training free of attack generation and more scalable. We also motivate SPROUT using a general formulation based on vicinity risk minimization, which includes many robust training methods as special cases. Compared with state-of-the-art adversarial training methods (PGD-$\ell_\infty$ and TRADES) under $\ell_\infty$-norm bounded attacks and various invariance tests, SPROUT consistently attains superior performance and is more scalable to large neural networks. Our results shed new light on scalable, effective and attack-independent robust training methods.
While deep neural networks (DNNs) have achieved unprecedented performance on a variety of datasets and across domains, developing better training algorithms that are capable of strengthening model robustness is the next crucial milestone toward trustworthy and reliable machine learning systems. In recent years, DNNs trained by standard algorithms (i.e., the natural models) are shown to be vulnerable to adversarial input perturbations (Biggio et al., 2013; Szegedy et al., 2014). Adversarial examples crafted by designed input perturbations can easily cause erroneous decision making of natural models (Goodfellow et al., 2015) and thus intensify the demand for robust training methods.
State-of-the-art robust training algorithms are primarily based on the methodology of adversarial training (Goodfellow et al., 2015; Madry et al., 2018), which calls specific attack algorithms to generate adversarial examples during model training for learning robust models. Albeit effective, these methods have the following limitations: (i) poor scalability – the process of generating adversarial examples incurs considerable computation overhead. For instance, our experiments show that, with the same computation resources, standard adversarial training (with 7 attack iterations per sample in every minibatch) of Wide ResNet on CIFAR-10 consumes 10 times more clock time per training epoch when compared with standard training; (ii) attack specificity – adversarially trained models are usually most effective against the same attack they were trained on, and the robustness may not generalize well to other types of attacks (Tramèr and Boneh, 2019; Kang et al., 2019); (iii) preference toward wider networks – adversarial training is more effective when the networks have sufficient capacity (e.g., having more neurons in network layers) (Madry et al., 2018).

To address the aforementioned limitations, in this paper we propose a new robust training method named SPROUT, which is short for self-progressing robust training. We motivate SPROUT by introducing a general framework that formulates robust training objectives via vicinity risk minimization (VRM), which includes many robust training methods as special cases. It is worth noting that the robust training methodology of SPROUT is fundamentally different from adversarial training, as SPROUT features a self-adjusted label distribution during training instead of attack generation. In addition to our proposed parametrized label smoothing technique for progressive adjustment of the training label distribution, SPROUT also adopts Gaussian augmentation and Mixup (Zhang et al., 2018) to further enhance robustness. We show that they offer a complementary gain in robustness. In contrast to adversarial training, SPROUT removes the need for attack generation and thus makes its training scalable by a significant factor, while attaining better or comparable robustness performance on a variety of experiments. We also show distinctive features of SPROUT: it can find robust models from either randomly initialized models or pretrained models, and its robustness performance is less sensitive to network width. Our implementation is publicly available at https://github.com/IBM/SPROUT.
Multi-dimensional performance enhancement. To illustrate the advantage of SPROUT over adversarial training and its variations, Figure 5 compares the model performance of different training methods along the following five dimensions summarized from our experimental results: (i) Clean Acc – standard test accuracy, (ii) L_inf Acc – accuracy under $\ell_\infty$-norm projected gradient descent (PGD) attack (Madry et al., 2018), (iii) C&W Acc – accuracy under $\ell_2$-norm Carlini-Wagner (C&W) attack, (iv) scalability – per-epoch clock runtime, and (v) invariance – invariant transformation tests including rotation, brightness, contrast and gray images. Compared to PGD-based adversarial training (Madry et al., 2018) and TRADES (Zhang et al., 2019), SPROUT attains at least 20% better L_inf Acc, 2% better Clean Acc, 5× faster runtime (scalability), and 2% better invariance, while maintaining C&W Acc, suggesting a new robust training paradigm that is scalable and comprehensive.
We further summarize our main contributions as follows:
We propose SPROUT, a self-progressing robust training method composed of three modules that are efficient and free of attack generation: parametrized label smoothing, Gaussian augmentation, and Mixup. Together they attain state-of-the-art robustness performance and are scalable to large-scale networks.
We will show that these modules are complementary in enhancing robustness. We also perform an ablation study to demonstrate that our proposed parametrized label smoothing technique contributes the major gain in robustness.
To provide technical explanations for SPROUT, we motivate its training methodology based on the framework of vicinity risk minimization (VRM). We show that many robust training methods, including attack-specific and attack-independent approaches, can be characterized as a specific form of VRM. The superior performance of SPROUT provides new insights on developing efficient robust training methods and theoretical analysis via VRM.
We evaluate the multi-dimensional performance of different training methods on (wide) ResNet and VGG networks using CIFAR-10 and ImageNet datasets. Notably, although SPROUT is attack-independent during training, we find that SPROUT significantly outperforms two major adversarial training methods, PGD adversarial training (Madry et al., 2018) and TRADES (Zhang et al., 2019), against the same type of attacks they used during training. Moreover, SPROUT is more scalable and runs at least 5× faster than adversarial training methods. It also attains higher clean accuracy, generalizes better to various invariance tests, and is less sensitive to network width.
Methods  attack-specific  

Natural  
GA (Zantedeschi et al., 2017)  
LS (Szegedy et al., 2016)  
Adversarial training (Madry et al., 2018)  
TRADES (Zhang et al., 2019)  
Stable training (Zheng et al., 2016)  
Mixup (Zhang et al., 2018)  
LS+GA (Shafahi et al., 2019a)  
Bilateral Adv Training (Wang and Zhang, 2019)  (one or two step)  
SPROUT (ours)  Dirichlet 
Attack-specific robust training. The seminal work of adversarial training with a first-order attack algorithm for generating adversarial examples (Madry et al., 2018) has greatly improved adversarial robustness under the same threat model (e.g., $\ell_\infty$-norm bounded perturbations) as the attack algorithm. It has since inspired many advanced adversarial training algorithms with improved robustness. For instance, TRADES (Zhang et al., 2019) is designed to minimize a theoretically-driven upper bound on the prediction error of adversarial examples. Liu and Hsieh (2019) combined adversarial training with GAN to further improve robustness. Bilateral adversarial training (Wang and Zhang, 2019) finds robust models by adversarially perturbing the data samples as well as the associated data labels. A feature-scattering based adversarial training method is proposed in (Zhang and Wang, 2019). Different from attack-specific robust training methods, our proposed SPROUT is free of attack generation, yet it can outperform attack-specific methods. Another line of recent works uses an adversarially trained model along with additional unlabeled data (Carmon et al., 2019; Stanforth et al., 2019) or self-supervised learning with adversarial examples (Hendrycks et al., 2019) to improve robustness, which in principle can also be used in SPROUT but is beyond the scope of this paper.

Attack-independent robust training. Here we discuss related works on Gaussian data augmentation, Mixup and label smoothing. Gaussian data augmentation during training is a commonly used baseline method to improve model robustness (Zantedeschi et al., 2017). Liu et al. (2018a, b, 2019) demonstrated that additive Gaussian noise at both input and intermediate layers can improve robustness. Cohen et al. (2019) showed that Gaussian augmentation at the input layer can lead to certified robustness, which can also be incorporated in the training objective (Zhai et al., 2020). Mixup (Zhang et al., 2018) and its variants (Verma et al., 2018; Thulasidasan et al., 2019) are a recently proposed approach to improve model robustness and generalization by training a model on convex combinations of data sample pairs and their labels. Label smoothing was originally proposed in (Szegedy et al., 2016)
as a regularizer to stabilize model training. The main idea is to replace one-hot encoded labels by assigning non-zero (e.g., uniform) weights to every label other than the original training label. Although label smoothing is also shown to benefit model robustness (Shafahi et al., 2019a; Goibert and Dohmatob, 2019), its robustness gain is relatively marginal when compared to adversarial training. In contrast to currently used static (i.e., pre-defined) label smoothing functions, in SPROUT we propose a novel parametrized label smoothing scheme, which enables adaptive sampling of training labels from a parameterized distribution on the label simplex. The parameters of the label distribution are progressively adjusted according to the updates of model weights.

The task of supervised learning is essentially learning a $K$-class classification function $f$ that has a desirable mapping between a data sample $x$ and the corresponding label $y$. Consider a loss function $L$ that penalizes the difference between the prediction $f(x)$ and the true label $y$ from an unknown data distribution $P$, $(x, y) \sim P$. The population risk can be expressed as

$$R(f) = \int L(f(x), y)\, P(x, y)\, dx\, dy \tag{1}$$

However, as the distribution $P$ is unknown, in practice machine learning uses empirical risk minimization (ERM) with the empirical data distribution of $n$ training data $\{(x_i, y_i)\}_{i=1}^{n}$

$$P_{\delta}(x, y) = \frac{1}{n} \sum_{i=1}^{n} \delta(x = x_i, y = y_i) \tag{2}$$

to approximate $P(x, y)$, where $\delta$ is a Dirac mass. Notably, a more principled approach is to use Vicinity Risk Minimization (VRM) (Chapelle et al., 2001), defined as

$$P_{\nu}(\tilde{x}, \tilde{y}) = \frac{1}{n} \sum_{i=1}^{n} \nu(\tilde{x}, \tilde{y} \mid x_i, y_i) \tag{3}$$

where $\nu$ is a vicinity distribution that measures the probability of finding the virtual sample-label pair $(\tilde{x}, \tilde{y})$ in the vicinity of the training pair $(x_i, y_i)$. Therefore, ERM can be viewed as a special case of VRM when $\nu = \delta$. VRM has also been used to motivate Mixup training (Zhang et al., 2018). Based on VRM, we propose a general framework that encompasses the objectives of many robust training methods as the following generalized cross entropy loss:

$$L(\tilde{x}, \tilde{y}; \theta) = -\sum_{k=1}^{K} \log g\big(f(\tilde{x})_k\big)\, h(\tilde{y}_k) \tag{4}$$

where $f(\tilde{x})_k$ is the model's $k$-th class prediction probability on $\tilde{x}$, $g(\cdot)$ is a mapping adjusting the probability output, and $h(\cdot)$ is a mapping adjusting the training label distribution. When $\tilde{x} = x$, $\tilde{y} = y$ and $g = h = I$, where $I$ denotes the identity mapping function, the loss in (4) degenerates to the conventional cross entropy loss.
Based on the general VRM loss formulation in (4), in Table 1 we summarize a large body of robust training methods in terms of different expressions of $g(\cdot)$, $h(\cdot)$ and $\nu(\tilde{x}, \tilde{y} \mid x, y)$.
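As a minimal sketch of the generalized loss in (4), the plain-Python snippet below treats the mapping on the predicted probabilities and the mapping on the labels as ordinary functions (called `g` and `h` here) and shows that identity mappings recover the usual cross entropy. The function names, the toy probabilities and the smoothing value 0.1 are illustrative assumptions, not taken from the SPROUT implementation.

```python
import math

def generalized_cross_entropy(probs, label, g=lambda p: p, h=lambda y: y):
    """Return -sum_k log(g(probs)_k) * h(label)_k for a single sample.

    probs: predicted class probabilities; label: target distribution.
    g adjusts the probability output, h adjusts the label distribution.
    """
    p = g(probs)
    y = h(label)
    # Skip zero-weight classes so log(0) is never evaluated for them.
    return -sum(yk * math.log(pk) for pk, yk in zip(p, y) if yk > 0)

def uniform_smoothing(alpha, num_classes):
    """Build h(y) = (1 - alpha) * y + alpha / K (uniform label smoothing)."""
    return lambda y: [(1 - alpha) * yk + alpha / num_classes for yk in y]

probs = [0.7, 0.2, 0.1]      # hypothetical model output
one_hot = [1.0, 0.0, 0.0]

# Identity mappings: conventional cross entropy, i.e. -log(0.7).
plain = generalized_cross_entropy(probs, one_hot)
# Swapping h for uniform smoothing gives the label-smoothing objective.
smoothed = generalized_cross_entropy(probs, one_hot, h=uniform_smoothing(0.1, 3))
```

Other rows of Table 1 would correspond to other choices of `g`, `h` and of how the input sample is drawn; only the label-smoothing case is shown here.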
For example, the vanilla adversarial training in (Madry et al., 2018) aims to minimize the loss of adversarial examples generated by the (multistep) PGD attack with perturbation budget , denoted by . Its training objective can be rewritten as , and . In addition to adversarial training only on perturbed samples of , Wang and Zhang (2019) designs adversarial label perturbation where it uses , and is a mixing parameter. TRADES (Zhang et al., 2019) improves adversarial training with an additional regularization on the clean examples, which is equivalent to replacing the label mapping function from identity to . Label smoothing (LS) alone is equivalent to the setup that , , and , where
the smoothing distribution is often set as a uniform vector with value $1/K$ for a $K$-class supervised learning task. Joint training with Gaussian augmentation (GA) and label smoothing (LS) as studied in (Shafahi et al., 2019a) is equivalent to the case when , and . We defer the connection between SPROUT and VRM to the next section.

In this section, we formally introduce SPROUT, a novel robust training method that automatically finds a better vicinal risk function during training in a self-progressing manner.
To stabilize training and improve model generalization, Szegedy et al. (2016) introduced label smoothing, which converts "one-hot" label vectors into "one-warm" vectors representing low-confidence classification, in order to prevent a model from making over-confident predictions. Specifically, the one-hot encoded label $y$ is smoothed using

$$\tilde{y} = (1 - \alpha)\, y + \alpha\, u \tag{5}$$

where $\alpha \in [0, 1]$ is the smoothing parameter. A common choice is the uniform distribution $u = \frac{1}{K}\mathbf{1}$, where $K$ is the number of classes. Later works (Wang and Zhang, 2019; Goibert and Dohmatob, 2019) use an attack-driven label smoothing function to further improve adversarial robustness. However, both uniform and attack-driven label smoothing disregard the inherent correlation between labels. To address the label correlation, we propose to use the Dirichlet distribution parametrized by $\beta \in \mathbb{R}_{+}^{K}$ for label smoothing. Our SPROUT learns to update $\beta$ to find a training label distribution that is most uncertain to a given model $\theta$, by solving

$$\max_{\beta}\; L(\tilde{x}, \tilde{y}; \theta) \tag{6}$$

where $\tilde{y} = \text{Dirichlet}((1 - \alpha)\, y + \alpha \beta)$. Notably, instead of using a pre-defined or attack-driven function for $u$ in label smoothing, our Dirichlet approach automatically finds a label simplex by optimizing $\beta$. The Dirichlet distribution indeed takes label correlation into consideration, as its generated label $s = [s_1, \ldots, s_K]$ has the statistical properties

$$\mathbb{E}[s_i] = \frac{\beta_i}{\beta_0}, \qquad \text{Cov}[s_i, s_j] = \frac{-\beta_i \beta_j}{\beta_0^{2}(\beta_0 + 1)}, \qquad \sum_{k=1}^{K} s_k = 1 \tag{7}$$

where $\beta_0 = \sum_{k=1}^{K} \beta_k$ and $i, j \in \{1, \ldots, K\}$, $i \neq j$. Moreover, one-hot label and uniform label smoothing are our special cases when $\beta = y$ and $\beta = u$, respectively. Our Dirichlet label smoothing co-trains with the update in model weights $\theta$ during training (see Algorithm 1).
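The label statistics referred to in (7) are the standard moments of a Dirichlet distribution. The sketch below computes the mean vector and covariance matrix for a given concentration vector and draws one label sample through normalized Gamma variates. The concentration values are hypothetical, and the sampler is a generic textbook construction rather than the routine used in the SPROUT code.

```python
import random

def dirichlet_mean(beta):
    """Mean of s ~ Dirichlet(beta): beta_i / beta_0."""
    beta0 = sum(beta)
    return [b / beta0 for b in beta]

def dirichlet_cov(beta):
    """Covariance matrix of s ~ Dirichlet(beta)."""
    beta0 = sum(beta)
    denom = beta0 ** 2 * (beta0 + 1)
    k = len(beta)
    return [[(beta[i] * (beta0 - beta[i]) if i == j else -beta[i] * beta[j]) / denom
             for j in range(k)] for i in range(k)]

def sample_dirichlet(beta, rng):
    """Draw one label vector on the simplex via normalized Gamma variates."""
    draws = [rng.gammavariate(b, 1.0) for b in beta]
    total = sum(draws)
    return [d / total for d in draws]

beta = [2.0, 1.0, 1.0]            # hypothetical concentration parameters
mean = dirichlet_mean(beta)
cov = dirichlet_cov(beta)
label = sample_dirichlet(beta, random.Random(0))
```

Every off-diagonal covariance is negative and each row of the matrix sums to zero, which reflects the constraint that the label entries sum to one.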
Gaussian augmentation. Adding Gaussian noise to data samples during training is a common practice to improve model robustness. Its corresponding vicinal function is the Gaussian vicinity function , where
is the variance of a standard normal random vector. However, the gain of Gaussian augmentation in robustness is marginal when compared with adversarial training (see our ablation study).
Shafahi et al. (2019a) find that combining uniform or attack-driven label smoothing with Gaussian smoothing can further improve adversarial robustness. Therefore, we propose to incorporate Gaussian augmentation with Dirichlet label smoothing. The joint vicinity function becomes . Training with this vicinity function means drawing labels from the Dirichlet distribution for the original data sample and its neighborhood characterized by Gaussian augmentation.

Mixup. To further improve model generalization, SPROUT also integrates Mixup (Zhang et al., 2018), which performs convex combination on pairs of training data samples (in a minibatch) and their labels during training. The vicinity function of Mixup is , where
is the mixing parameter drawn from the Beta distribution and
is the shape parameter. The Mixup vicinity function can be generalized to multiple data sample pairs. Unlike Gaussian augmentation, which is irrespective of the label (i.e., only adding noise to ), Mixup aims to augment data samples on the line segments of training data pairs and assign them convexly combined labels during training.

Vicinity function of SPROUT. With the aforementioned techniques integrated in SPROUT, the overall vicinity function of SPROUT can be summarized as .
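To make the two input-side augmentations concrete, the following sketch adds Gaussian noise to each sample and then forms a Mixup pair with a mixing weight drawn from a symmetric Beta distribution. The order in which the two steps are composed, the shape value 0.2, the noise level 0.1 and all function names are assumptions made for illustration; they are not read off the SPROUT vicinity function.

```python
import random

def gaussian_augment(x, sigma, rng):
    """Add i.i.d. Gaussian noise with standard deviation sigma to each feature."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]

def mixup(x_i, y_i, x_j, y_j, lam):
    """Convex combination of two samples and of their label vectors."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y

rng = random.Random(0)
shape = 0.2    # hypothetical Beta shape parameter
sigma = 0.1    # hypothetical noise level
lam = rng.betavariate(shape, shape)   # mixing weight in [0, 1]

x_i, y_i = [0.0, 1.0], [1.0, 0.0]
x_j, y_j = [1.0, 0.0], [0.0, 1.0]
x_mix, y_mix = mixup(gaussian_augment(x_i, sigma, rng), y_i,
                     gaussian_augment(x_j, sigma, rng), y_j, lam)
```

The noise only touches the inputs, while the mixing weight is shared by inputs and labels, so the mixed label stays a valid probability vector.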
In the experiment, we will show that Dirichlet label smoothing, Gaussian augmentation and Mixup are complementary to enhancing robustness by showing their diversity in input gradients.
Using the VRM framework, the training objective of SPROUT is

$$\min_{\theta} \max_{\beta} \sum_{i=1}^{n} L\big(\nu(\tilde{x}_i, \tilde{y}_i \mid x_i, y_i, \beta); \theta\big) \tag{8}$$

where $\theta$ denotes the model weights, $n$ is the number of training data, $L$ is the generalized cross entropy loss defined in (4) and $\nu(\tilde{x}_i, \tilde{y}_i \mid x_i, y_i, \beta)$ is the vicinity function of SPROUT. Our SPROUT algorithm co-trains $\theta$ and $\beta$ via stochastic gradient descent/ascent to solve the outer minimization problem on $\theta$ and the inner maximization problem on $\beta$. In particular, for calculating the gradient of the parameter $\beta$, we use the PyTorch implementation based on (Figurnov et al., 2018). SPROUT can either train a model from scratch with randomly initialized $\theta$ or strengthen a pretrained model. We find that training from either randomly initialized or pretrained natural models using SPROUT can yield substantially robust models that are resilient to large perturbations (see Appendix). The training steps of SPROUT are summarized in Algorithm 1.

We also note that our min-max training methodology is different from the min-max formulation in adversarial training (Madry et al., 2018), which is $\min_{\theta} \sum_{i=1}^{n} \max_{\delta_i : \|\delta_i\|_p \le \epsilon} L(x_i + \delta_i, y_i; \theta)$, where $\|\delta_i\|_p$ denotes the $\ell_p$ norm of the adversarial perturbation $\delta_i$. While the outer minimization step for optimizing $\theta$ can be identical, the inner maximization of adversarial training requires running a multi-step PGD attack to find adversarial perturbations for each data sample in every minibatch (iteration) and epoch, which is attack-specific and time-consuming (see our scalability analysis in Table 6). On the other hand, our inner maximization is over the Dirichlet parameter $\beta$, which is attack-independent and only requires single-step stochastic gradient ascent with a minibatch to update $\beta$. We have explored multi-step stochastic gradient ascent on $\beta$ and found no significant performance enhancement but increased computation time.
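The alternating update pattern described above, one descent step on the model weights and one ascent step on the label parameter per iteration, can be illustrated on a toy saddle problem. The quadratic objective below is a stand-in chosen only because its gradients are easy to write by hand; it is not the SPROUT loss, and the learning rate and step count are arbitrary assumptions.

```python
def grad_theta(theta, beta):
    # d/d(theta) of f(theta, beta) = 0.5*theta**2 + theta*beta - 0.5*beta**2
    return theta + beta

def grad_beta(theta, beta):
    # d/d(beta) of the same toy objective
    return theta - beta

def descent_ascent(theta, beta, lr=0.1, steps=200):
    """One descent step on theta and one ascent step on beta per iteration."""
    for _ in range(steps):
        g_theta = grad_theta(theta, beta)
        g_beta = grad_beta(theta, beta)
        theta -= lr * g_theta   # outer minimization over theta
        beta += lr * g_beta     # inner maximization over beta (single step)
    return theta, beta

theta_final, beta_final = descent_ascent(theta=1.0, beta=-1.0)
```

On this toy objective both variables approach the saddle point at the origin. In a real training loop the two gradients would come from backpropagation through the minibatch loss.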
Advantages of SPROUT. Compared to adversarial training, the training of SPROUT is more efficient and scalable, as it only requires one additional backpropagation to update $\beta$ in each iteration (see Table 6 for a runtime analysis). As highlighted in Figure 5, SPROUT is also more comprehensive, as it automatically improves robustness in multiple dimensions owing to its self-progressing training methodology. Moreover, we find that SPROUT significantly outperforms adversarial training and attains a larger gain in robustness as network width increases (see Figure 11), which makes SPROUT a promising approach to support robust training for a much larger set of network architectures.
Dataset and network structure. We use CIFAR-10 (Krizhevsky et al., 2009) and ImageNet (Deng et al., 2009) for performance evaluation. For CIFAR-10, we use both standard VGG-16 (Simonyan and Zisserman, 2015) and Wide ResNet. The Wide ResNet models are pretrained PGD robust models given by adversarial training and TRADES. For VGG-16, we implement adversarial training with the standard hyperparameters and train TRADES using the official implementation. For ImageNet, we use ResNet-152. All our experiments were implemented in PyTorch 1.2 and conducted using dual Intel E5-2640 v4 CPUs (2.40GHz) with 512 GB memory and a GTX 1080 GPU.

Implementation details. As suggested in Mixup (Zhang et al., 2018), we set the Beta distribution parameter when sampling the mixing parameter . For Gaussian augmentation, we set , which is within the suggested range in (Zantedeschi et al., 2017). Also, we set the label smoothing parameter . A parameter sensitivity analysis on and is given in the Appendix. Unless specified otherwise, for SPROUT we set the model initialization to be a natural model. An ablation study of model initialization is given in the ablation study.
White-box attacks. On CIFAR-10, we compare the model accuracy under strength of white-box $\ell_\infty$-norm bounded non-targeted PGD attack, which is considered the strongest first-order adversary (Madry et al., 2018) with an $\ell_\infty$-norm constraint (normalized between 0 and 1). All PGD attacks are implemented with random starts and we run PGD attack with 20, 100 and 200 steps in our experiments. To be noted, we use both (step PGD with step size . As suggested, we test our model under PGD with different numbers of steps and multiple random restarts. In Table 2, we find SPROUT achieves 62.24% and 66.23% robust accuracy on VGG-16 and Wide ResNet respectively, while TRADES and adversarial training are 10-20% worse than SPROUT. Moreover, we report the results of C&W attack (Carlini and Wagner, 2017) in the Appendix. Next, we compare against the $\ell_2$-norm based C&W attack by using the default attack setting with 10 binary search steps and 1000 iterations per step to find successful perturbations while minimizing their $\ell_2$ norm. SPROUT can achieve 85.21% robust accuracy under constraint while Adv train and TRADES achieve 77.76% and 82.58% respectively. It verifies that SPROUT can improve $\ell_\infty$ robustness by a large margin without degrading $\ell_2$ robustness. SPROUT's accuracy under C&W attack is similar to TRADES and is better than both natural and adversarial training. The results also suggest that the attack-independent and self-progressing training nature of SPROUT can avoid the drawback of failing to provide comprehensive robustness to multiple and simultaneous norm attacks in adversarial training (Tramèr and Boneh, 2019; Kang et al., 2019).
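For readers unfamiliar with the evaluation attack, the sketch below runs a non-targeted PGD attack with a random start under an $\ell_\infty$ budget against a toy linear classifier with logistic loss. The model, its weights, the budget 0.03, the step size 0.01 and the 20 steps are illustrative assumptions; the experiments in this section attack deep networks, where the input gradient comes from backpropagation instead of the closed form used here.

```python
import math
import random

def logistic_loss(w, x, y):
    """Logistic loss of a linear score w.x for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log1p(math.exp(-margin))

def loss_grad_x(w, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def pgd_linf(w, x, y, eps, step_size, steps, rng):
    # Random start inside the eps-ball, clipped to the valid pixel range [0, 1].
    x_adv = [min(1.0, max(0.0, xi + rng.uniform(-eps, eps))) for xi in x]
    for _ in range(steps):
        grad = loss_grad_x(w, x_adv, y)
        for k in range(len(x_adv)):
            moved = x_adv[k] + step_size * sign(grad[k])       # ascent step
            moved = min(x[k] + eps, max(x[k] - eps, moved))    # project to ball
            x_adv[k] = min(1.0, max(0.0, moved))               # keep valid pixels
    return x_adv

rng = random.Random(0)
w, x, y = [2.0, -1.0, 0.5], [0.5, 0.5, 0.5], 1   # hypothetical model and input
x_adv = pgd_linf(w, x, y, eps=0.03, step_size=0.01, steps=20, rng=rng)
```

Running the attack several times with different seeds and keeping the worst case corresponds to the multiple-random-restart evaluation mentioned above.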

VGG16  WideResNet 20  

Methods  No attack  PGD  PGD  PGD  10 PGD  No attack  PGD  PGD  PGD  10 PGD 
Natural train  93.34%  0.6%  0.1%  0.0%  0.0%  95.93%  0.0%  0.0%  0.0%  0.0% 
Adv train (Madry et al., 2018)  80.32%  36.63%  36.29%  36.01%  36.8%  87.25%  45.91%  45.32%  45.02%  44.98% 
TRADES (Zhang et al., 2019)  84.85%  38.81%  38.21%  37.95%  37.94%  84.92%  56.23%  56.13%  55.96%  56.01% 
SPROUT (ours)  89.15%  62.24%  58.93%  57.9%  58.08%  90.56%  66.23%  64.58%  64.30%  64.32% 
Transfer attack. We follow the criterion of evaluating transfer attacks in (Athalye et al., 2018) to inspect whether the models trained by SPROUT will cause the issue of obfuscated gradients and give a false sense of robustness. We generate 10,000 PGD adversarial examples from CIFAR-10 natural models with and evaluate their attack performance on the target model. Table 3 shows SPROUT achieves the best accuracy when compared with adversarial training and TRADES, suggesting the effectiveness of SPROUT in defending against both white-box and transfer attacks.
ImageNet results. As many ImageNet class labels carry similar semantic meanings, to generate meaningful adversarial examples for robustness evaluation, here we follow the same setup as in (Athalye et al., 2018) that adopts PGD attacks with randomly targeted labels. Table 4 compares the robust accuracy of natural and SPROUT models. SPROUT greatly improves the robust accuracy across different values. For example, when , SPROUT boosts the robust accuracy of the natural model by over . When , a considerably large adversarial perturbation on ImageNet, SPROUT still attains about robust accuracy while the natural model merely has about robust accuracy. Moreover, comparing the clean accuracy, SPROUT is about 4% worse than the natural model but is substantially more robust. We omit the comparison to adversarial training methods as we are unaware of any public pretrained robust ImageNet models of the same architecture (ResNet-152) prior to the time of our submission, and it is computationally demanding for us to train and fine-tune such large-scale networks with adversarial training. On our machine, training a natural model takes 31,158.7 seconds and training SPROUT takes 59,201.6 seconds. Compared with the runtime analysis, SPROUT has much better scalability than adversarial training and TRADES. However, instead of ResNet-152, we use SPROUT to train the same ResNet-50 model as the pretrained Free Adv Train network and compare their performance in the Appendix.
To further verify the superior robustness using SPROUT, we visualize the loss landscape of different training methods in Figure 10. Following the implementation in (Engstrom et al., 2018), we vary the data input along a linear space defined by the sign of the input gradient and a random Rademacher vector, where the x and y axes represent the magnitude of the perturbation added in each direction and the z-axis represents the loss. One can observe that the loss surface of SPROUT is smoother. Furthermore, it attains smaller loss variation compared with other robust training methods. The results provide strong evidence for the capability of finding more robust models via SPROUT.
In addition to norm-bounded adversarial attacks, here we also evaluate model robustness against different kinds of input transformations using CIFAR-10 and Wide ResNet. Specifically, we change rotation (by 10 degrees), brightness (increase the brightness factor to 1.5), contrast (increase the contrast factor to 2) and convert inputs into grayscale (average all RGB pixel values). The model accuracy under these invariance tests is summarized in Table 6. The results show that SPROUT outperforms adversarial training and TRADES. Interestingly, the natural model attains the best accuracy despite the fact that it lacks adversarial robustness, suggesting a potential trade-off between accuracy in these invariance tests and norm-based adversarial robustness.
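Three of the four invariance transformations can be sketched on a toy image stored as a list of RGB tuples in [0, 1]. Only the factors 1.5 and 2 and the "average all RGB values" rule come from the text; the exact brightness and contrast formulas below (scaling, and stretching around the mean gray level) are assumed common definitions, and rotation is left out because it needs an interpolation routine.

```python
def clamp01(v):
    return min(1.0, max(0.0, v))

def adjust_brightness(pixels, factor):
    """Scale every channel value by factor and clamp to [0, 1]."""
    return [tuple(clamp01(c * factor) for c in px) for px in pixels]

def to_grayscale(pixels):
    """Replace each pixel's channels by the average of its R, G, B values."""
    return [(sum(px) / 3.0,) * 3 for px in pixels]

def adjust_contrast(pixels, factor):
    """Stretch each channel value away from the image's mean gray level."""
    mean = sum(sum(px) / 3.0 for px in pixels) / len(pixels)
    return [tuple(clamp01(mean + factor * (c - mean)) for c in px) for px in pixels]

image = [(0.2, 0.4, 0.6), (0.5, 0.5, 0.5)]   # toy 2-pixel RGB image
bright = adjust_brightness(image, 1.5)
contrast = adjust_contrast(image, 2.0)
gray = to_grayscale(image)
```

An invariance test then reports the accuracy of a trained model on the transformed test images.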
SPROUT enjoys great scalability over adversarial training based algorithms because its training requires far fewer backpropagations per iteration, which is a dominating factor that contributes to the considerable runtime of adversarial training. Table 6 benchmarks the runtime of different training methods for 10 epochs. On CIFAR-10, the runtime of adversarial training and TRADES is about 5× more than SPROUT. We also report the runtime analysis using the default implementation of Free Adv Train (Shafahi et al., 2019b). Its 10-epoch runtime with the replay parameter is similar to TRADES. But we also note that Free Adv Train may require fewer epochs when training to convergence.
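A back-of-the-envelope count of gradient passes per minibatch gives a feel for where the runtime gap comes from. The counting rule below is a simplifying assumption (one pass per PGD step plus one weight update for adversarial training, one weight update plus one label-parameter update for SPROUT), and the method labels are made up for this sketch. Measured wall-clock ratios such as those in Table 6 also depend on data loading and other overheads, so they need not match this count.

```python
def passes_per_iteration(method, attack_steps=0):
    """Rough count of forward/backward passes per minibatch iteration."""
    if method == "natural":
        return 1
    if method == "pgd_adv_train":
        # one pass per attack step, plus one pass to update the weights
        return attack_steps + 1
    if method == "sprout":
        # one pass for the weights, one additional pass for the label parameter
        return 2
    raise ValueError(f"unknown method: {method}")

# 7 attack steps is the setting quoted for standard adversarial training.
ratio = passes_per_iteration("pgd_adv_train", attack_steps=7) / passes_per_iteration("sprout")
```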
Dissecting SPROUT. Here we perform an ablation study using VGG-16 and CIFAR-10 to investigate and factorize the robustness gain in SPROUT's three modules: Dirichlet label smoothing (Dirichlet), Gaussian augmentation (GA) and Mixup. We implement all combinations of these techniques and include uniform label smoothing (LS) (Szegedy et al., 2016) as another baseline. Their accuracies under PGD 0.03 attack are shown in Table 7. We highlight some important findings as follows.
Dirichlet outperforms uniform LS by a significant factor, suggesting the importance of our proposed selfprogressing label smoothing in improving adversarial robustness.
Comparing the performance of individual modules alone (GA, Mixup and Dirichlet), our proposed Dirichlet attains the best robust accuracy, suggesting its crucial role in training robust models.
No other combinations can outperform SPROUT. Moreover, the robustness gains from GA, Mixup and Dirichlet appear to be complementary, as SPROUT's accuracy is close to the sum of their individual accuracies. To justify their diversity in robustness, we compute the cosine similarity of their pairwise input gradients and find that they are indeed quite diverse and thus can promote robustness when used together. The details are given in the Appendix.
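The diversity check mentioned above compares input gradients through their cosine similarity. A minimal sketch of that measure on flattened gradient vectors follows; the three vectors are made-up placeholders, not gradients from any trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equally sized vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

grad_a = [1.0, 0.0, 0.0]   # hypothetical flattened input gradients
grad_b = [0.0, 2.0, 0.0]
grad_c = [3.0, 0.0, 0.0]

orthogonal = cosine_similarity(grad_a, grad_b)
aligned = cosine_similarity(grad_a, grad_c)
```

Values near zero indicate gradients pointing in unrelated directions, which is the sense in which the modules are called diverse; values near one indicate redundant directions.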
Effect on network width. It was shown in (Madry et al., 2018) that adversarial training (Adv Train) takes effect when a network has sufficient capacity, which can be achieved by increasing network width. Figure 11 compares the robust accuracy of SPROUT and Adv Train with varying network width on Wide ResNet and CIFAR-10. When the network has width = 1 (i.e., a standard ResNet-34 network (He et al., 2016)), the robust accuracies of SPROUT and Adv Train are both relatively low (less than 47%). However, as the width increases, SPROUT soon attains significantly better robust accuracy than Adv Train by a large margin (roughly 15%). Since SPROUT is more effective in boosting robust accuracy as network width varies, the results also suggest that SPROUT can better support robust training for a broader range of network structures.
This paper introduced SPROUT, a self-progressing robust training method motivated by vicinity risk minimization. When compared with state-of-the-art adversarial training based methods, our extensive experiments showed that the proposed self-progressing Dirichlet label smoothing technique in SPROUT plays a crucial role in finding substantially more robust models against $\ell_\infty$-norm bounded PGD attacks and simultaneously makes the corresponding model more generalizable to various invariance tests. We also find that SPROUT can strengthen a wider range of network structures, as it is less sensitive to network width changes. Moreover, SPROUT's self-adjusted learning methodology not only makes its training free of attack generation but also makes it a scalable solution for large networks. Our results shed new light on devising comprehensive and robust training methods that are attack-independent and scalable.
This work was done during Minhao Cheng's internship at IBM Research. Cho-Jui Hsieh and Minhao Cheng are partially supported by the National Science Foundation (NSF) under IIS-1901527 and IIS-2008173, and by the Army Research Lab under W911NF-20-2-0158.
(Deng et al., 2009) IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255.
(Engstrom et al., 2018) Evaluating and understanding the robustness of adversarial logit pairing. arXiv preprint arXiv:1807.10272.
(Madry et al., 2018) Towards deep learning models resistant to adversarial attacks. International Conference on Learning Representations.
(Verma et al., 2018) Manifold mixup: better representations by interpolating hidden states. International Conference on Machine Learning.
(Zantedeschi et al., 2017) ACM Workshop on Artificial Intelligence and Security, pp. 39–49.

Based on the statistical properties of the Dirichlet distribution in (7), we use the final Dirichlet parameter learned from Algorithm 1 with CIFAR-10 and VGG-16 to display the matrix of its pairwise product in Figure 12. The value in each entry is proportional to the absolute value of the label covariance in (7). We observe some clustering effect of class labels in CIFAR-10, such as relatively high values among the group of {airplane, auto, ship, truck} and relatively low values among the group of {bird, cat, deer, dog}. Moreover, since the parameter is progressively adjusted and co-trained during model training, and the final parameter is clearly not uniformly distributed, the results also validate the importance of using parametrized label smoothing to learn to improve robustness.
We perform a sensitivity analysis of the Beta distribution parameter used for Mixup and the label smoothing parameter of SPROUT in Figure 13. When fixing the Beta parameter, we find that setting the smoothing parameter too large may affect robust accuracy, as the resulting training label distribution could be too uncertain to train a robust model. Similarly, when fixing the smoothing parameter, setting the Beta parameter too large may also affect robust accuracy.
As suggested by Madry et al. (2018), a PGD attack with multiple random starts is a stronger way to evaluate robustness. In Table 9, we therefore run the following experiment on CIFAR10 and Wide ResNet: with 20 attack iterations and the number of random starts varying from 1 to 10, the model trained by SPROUT still attains at least 61% accuracy against PGD attack. In contrast, the accuracy of adversarial training and TRADES can be as low as 45.21% and 56.7%, respectively, so the robust accuracy of SPROUT remains clearly higher than that of the other methods. We conclude that increasing the number of random starts may further reduce the robust accuracy of all methods by a small margin, but the observed robustness rankings and trends among the methods are unchanged. We also evaluate one additional attack setting, a 100-step PGD attack with 10 random restarts, under which SPROUT still achieves 61.18% robust accuracy.
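Table 9: Robust accuracy on CIFAR10 and Wide ResNet under PGD attack with 20 iterations and a varying number of random starts.

Number of random starts  1       3       5       8       10
Adversarial training     45.88%  45.67%  45.52%  45.52%  45.21%
TRADES                   57.02%  56.84%  56.77%  56.7%   56.7%
SPROUT                   64.58%  62.53%  61.98%  61.38%  61.00%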
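The evaluation protocol behind this table, PGD with several random starts where the strongest perturbation is kept, can be sketched as follows. This is a minimal illustration on a toy logistic model with an analytic input gradient, not the evaluation code used for the reported numbers; the model, the step size, and all function names are assumptions, and the clipping to a valid pixel range used for images is omitted.

```python
import numpy as np

def loss_and_grad(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. x (y in {0, 1})."""
    z = x @ w + b
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad_x = (p - y) * w
    return loss, grad_x

def pgd_linf(w, b, x, y, eps, step, n_steps, n_restarts, rng):
    """L-infinity PGD with random starts; returns the highest-loss point found."""
    best_x, best_loss = x.copy(), loss_and_grad(w, b, x, y)[0]
    for _restart in range(n_restarts):
        # Random start inside the eps-ball around the clean input.
        x_adv = x + rng.uniform(-eps, eps, size=x.shape)
        for _step in range(n_steps):
            _, g = loss_and_grad(w, b, x_adv, y)
            x_adv = x_adv + step * np.sign(g)         # gradient-sign ascent step
            x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
        loss = loss_and_grad(w, b, x_adv, y)[0]
        if loss > best_loss:                          # keep the strongest restart
            best_x, best_loss = x_adv, loss
    return best_x, best_loss

rng = np.random.default_rng(0)
w, b = rng.normal(size=5), 0.0   # toy model, for illustration only
x, y, eps = rng.normal(size=5), 1.0, 0.1
clean_loss = loss_and_grad(w, b, x, y)[0]
x_adv, adv_loss = pgd_linf(w, b, x, y, eps=eps, step=0.02,
                           n_steps=20, n_restarts=5, rng=rng)
print(clean_loss, adv_loss)
```

For this linear toy model every restart converges to the same corner of the ball, so extra restarts add nothing; for a deep network the loss surface is non-concave, which is why the number of random starts can change the measured robust accuracy.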
To further test robustness under the attack's norm constraint, we replace the cross-entropy loss with the C&W loss (Carlini and Wagner, 2017) in the PGD attack. Similar to the PGD attack results, Figure 14 shows that although SPROUT has slightly worse accuracy at small perturbation budgets, it attains much higher robust accuracy at larger ones.
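One common margin-style form of the C&W loss, written as an objective for the attacker to maximize, is sketched below. The exact variant and the cap `kappa` used in the experiments are not stated here, so both are assumptions, as is the function name.

```python
import numpy as np

def cw_margin_loss(logits, labels, kappa=50.0):
    """Untargeted margin loss: min(max_{i != y} z_i - z_y, kappa), one value per sample."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    idx = np.arange(logits.shape[0])
    true_logit = logits[idx, labels]
    masked = logits.copy()
    masked[idx, labels] = -np.inf      # exclude the true class from the max
    best_other = masked.max(axis=1)
    return np.minimum(best_other - true_logit, kappa)

# Hypothetical logits for two samples and three classes.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.3, 1.2]])
labels = np.array([0, 1])
margins = cw_margin_loss(logits, labels)
print(margins)
```

A positive margin means some wrong class already has a larger logit than the true class, so the sample is misclassified. Unlike cross-entropy, this loss depends only on logit differences, which makes it less sensitive to the scale of the softmax outputs.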
Here we compare the performance of SPROUT with a pretrained robust ResNet50 model on ImageNet, shared by the authors of the free adversarial training method (Free Adv Train) (Shafahi et al., 2019b). We find that SPROUT obtains robust accuracy similar to that of Free Adv Train when the perturbation budget is small. As the budget becomes larger, Free Adv Train has better accuracy. However, compared with the performance of ResNet152 in Table 4, SPROUT’s clean accuracy on ResNet50 drops by roughly 13%, indicating a large performance gap that intuitively should not be present. Therefore, based on the current results, we postulate that the training parameters of SPROUT for ResNet50 may not have been fully optimized (we use the default training parameters of ResNet152 for ResNet50), and that SPROUT may yield larger gains in robust accuracy as the ResNet models become deeper.
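Table 11: Average cosine similarity of pairwise input gradients between the SPROUT modules (CIFAR10, VGG16).

              Dirichlet LS  Mixup   GA
Dirichlet LS  NA            0.1023  0.0163
Mixup         0.1023        NA      0.0111
GA            0.0163        0.0111  NA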
To show that the three modules in SPROUT (Dirichlet LS, GA, and Mixup) lead to complementary robustness gains, we perform a diversity analysis motivated by Kariyappa and Qureshi (2019): we measure the similarity of their pairwise input gradients and report the average cosine similarity over 10,000 data samples, using CIFAR10 and VGG16, in Table 11. The pairwise similarity between modules is indeed quite small (at most 0.1023). The Mixup-GA similarity is the smallest among all pairs, since Mixup performs both label and data augmentation based on convex combinations of training data, whereas GA only performs random data augmentation. The Dirichlet LS-GA similarity is the second smallest (and close to the Mixup-GA similarity), since Dirichlet LS progressively adjusts the training label while GA only randomly perturbs the training sample. The Dirichlet LS-Mixup similarity is relatively high because Mixup depends on the training samples and their labels, while Dirichlet LS depends on them as well as on the model weights. These results show that the input gradients of the three modules are diverse, pointing in very different directions. Therefore, SPROUT enjoys complementary robustness gains when these techniques are combined.
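The similarity measure used in this analysis can be sketched as below: per-sample input gradients from two models are flattened, their cosine similarity is computed, and the result is averaged over samples. Random arrays stand in for real gradients here; the function name and the array shapes are assumptions, and whether the signed or the absolute similarity is averaged in the reported numbers is not specified in the text.

```python
import numpy as np

def mean_cosine_similarity(grads_a, grads_b, eps=1e-12):
    """Average per-sample cosine similarity between two batches of input gradients."""
    a = np.asarray(grads_a, dtype=float).reshape(len(grads_a), -1)
    b = np.asarray(grads_b, dtype=float).reshape(len(grads_b), -1)
    dots = (a * b).sum(axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(dots / (norms + eps)))   # eps guards against zero gradients

# Stand-in "gradients" shaped like a batch of 100 CIFAR-sized inputs.
rng = np.random.default_rng(0)
g1 = rng.normal(size=(100, 3, 32, 32))
g2 = rng.normal(size=(100, 3, 32, 32))

sim_same = mean_cosine_similarity(g1, g1)    # identical directions
sim_indep = mean_cosine_similarity(g1, g2)   # unrelated directions
print(sim_same, sim_indep)
```

A value near 1 means two training schemes push the input gradient in the same direction, while a value near 0 means the directions are nearly orthogonal, which is the sense in which the modules are called diverse.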
To ensure that the robustness of SPROUT is not an artifact of running too few iterations of the PGD attack (Athalye et al., 2018), Figure 15 shows the robust accuracy with the number of PGD attack steps varying from 10 to 500 on Wide ResNet and CIFAR10. The results show stable performance for all training methods once the number of attack steps exceeds 100, and SPROUT clearly outperforms the other methods by a large margin.
Figure 16 compares the effect of model initialization using CIFAR10 and VGG16 under PGD attack, where a legend entry of the form A+B means using Model A as the initialization and training with Method B. Interestingly, Natural+SPROUT attains the best robust accuracy over part of the tested range of attack strengths. TRADES+SPROUT and Random+SPROUT also exhibit strong robustness, since their training objective involves the loss on both clean and adversarial examples. In contrast, Adv Train+SPROUT does not enjoy this benefit, since adversarial training only aims to minimize the adversarial loss. This finding is also unique to SPROUT, as neither Natural+Adv Train nor Natural+TRADES can boost robust accuracy. Our results provide novel perspectives on improving robustness and also indicate that SPROUT is indeed a new robust training method that differs vastly from methods based on adversarial training.
Moreover, SPROUT performs better when initialized with a pretrained natural model. For this reason, Figure 16 also includes other kinds of initialization: random, adversarial training, and TRADES.