1 Introduction
The past few years have seen intense research interest in making models robust to adversarial examples [36]. Yet despite a wide range of proposed defenses, the state of the art in adversarial robustness is far from satisfactory. Recent work points towards sample complexity as a possible reason for the small gains in robustness: Schmidt et al. [34]
show that in a simple model, learning a classifier with nontrivial adversarially robust accuracy requires substantially more samples than achieving good "standard" accuracy. Furthermore, recent empirical work obtains promising gains in robustness via transfer learning of a robust classifier from a larger labeled dataset
[15]. While both theory and experiments suggest that more training data leads to greater robustness, following this suggestion can be difficult due to the cost of gathering additional data, and especially of obtaining high-quality labels. To alleviate the need for carefully labeled data, in this paper we study adversarial robustness through the lens of semi-supervised learning. Our approach is motivated by two basic observations. First, adversarial robustness essentially asks that predictors be stable around naturally occurring inputs. Learning to meet such a stability constraint does not inherently require labels. Second, the added requirement of robustness fundamentally alters the regime where semi-supervision is useful. Prior work on semi-supervised learning mostly focuses on the regime where labeled data provides only poor accuracy. In our adversarial setting, however, the labeled data alone already produce accurate (but not robust) classifiers. We can use such classifiers on the unlabeled data to obtain useful
pseudo-labels, which directly suggests the use of self-training, one of the oldest frameworks for semi-supervised learning [32], consisting of applying a supervised training method to the pseudo-labeled data. We provide theoretical and experimental evidence that self-training is effective for adversarial robustness. On the theoretical side, we consider the simple d-dimensional Gaussian model of [34] with ℓ∞ perturbations of magnitude ε. We scale the model so that n₀ labeled examples allow learning a classifier with nontrivial standard accuracy, while roughly n₀√d examples are necessary for attaining any nontrivial robust accuracy. This implies a sample complexity gap in the high-dimensional regime d ≫ n₀². In this regime, we prove that self-training with roughly n₀√d unlabeled examples and just n₀ labels achieves high robust accuracy. Our analysis provides a refined perspective on the sample complexity barrier in this model: the increased sample requirement is exclusively on unlabeled data.
On the empirical side, we propose and experiment with robust self-training (RST), a natural extension of self-training for robustness. RST uses standard supervised training to obtain pseudo-labels, and then feeds the pseudo-labeled data into a supervised training algorithm that targets adversarial robustness. We use TRADES [45] for heuristic robustness, and stability training [46] combined with randomized smoothing [6] for certified robustness.
For CIFAR-10 [17], we obtain 500K unlabeled images by mining the 80 Million Tiny Images dataset [38] with an image classifier. Using RST on the CIFAR-10 training set augmented with this additional unlabeled data, we outperform state-of-the-art heuristic ℓ∞ robustness against strong iterative attacks by 7%. In terms of certified ℓ₂ robustness, RST outperforms our fully supervised baseline by 3–5% and beats previous state-of-the-art numbers by 10%. Finally, we also match the state-of-the-art certified ℓ∞ robustness while substantially improving on the corresponding standard accuracy. We show that some natural alternatives, such as virtual adversarial training [24] and aggressive data augmentation, do not perform as well as RST. We also study the sensitivity of RST to the amount and relevance of the unlabeled data.
Experiments on SVHN show similar gains in robustness with RST on semi-supervised data. Here, we apply RST by removing the labels from the 531K extra training images, and see significant increases in robust accuracies compared to the baseline that only uses the labeled 73K core training set. Swapping the pseudo-labels for the true SVHN extra labels increases these accuracies by at most one additional percentage point. This confirms that the majority of the benefit from extra data comes from the inputs and not the labels.
2 Setup
Semi-supervised classification task.
We consider the task of mapping an input x ∈ X to a label y ∈ Y. Let P denote the underlying distribution of (x, y) pairs, and let P_x denote its marginal on x. The training data consists of (i) labeled examples (x₁, y₁), …, (xₙ, yₙ) drawn i.i.d. from P, and (ii) unlabeled examples x̃₁, …, x̃_ñ drawn i.i.d. from P_x. The goal is to learn a classifier f_θ in a model family parameterized by θ.
Error metrics.
The standard quality metric for a classifier f_θ is its error probability,
(1)   err_standard(θ) := P_{(x,y)∼P} (f_θ(x) ≠ y).
We also evaluate classifiers on their performance on adversarially perturbed inputs. In this work, we allow perturbations in an ℓ_p norm ball of radius ε around the input, and define the corresponding robust error probability,
(2)   err^{ε,p}_robust(θ) := P_{(x,y)∼P} (∃ x′ ∈ B^p_ε(x) : f_θ(x′) ≠ y),  where  B^p_ε(x) := {x′ : ‖x′ − x‖_p ≤ ε}.
In this paper we study p = ∞ and p = 2.
Self-training.
Consider a supervised learning algorithm A that maps a dataset (X, Y) to a parameter θ̂. Self-training is the straightforward extension of A to the semi-supervised setting, and consists of the following two steps. First, obtain an intermediate model θ̂_intermediate = A(X, Y), and use it to generate pseudo-labels ỹᵢ = f_{θ̂_intermediate}(x̃ᵢ) for the unlabeled data. Second, combine the data and pseudo-labels to obtain a final model θ̂_final = A([X, X̃], [Y, Ỹ]).
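The two-step procedure above can be sketched generically in a few lines. The nearest-centroid learner below is purely illustrative (any supervised `fit` routine works in its place):

```python
import numpy as np

def self_train(fit, X, y, X_unlabeled):
    """Generic self-training: fit on labeled data, pseudo-label the
    unlabeled inputs, then refit on the combined dataset."""
    predict = fit(X, y)                      # intermediate model
    y_pseudo = predict(X_unlabeled)          # pseudo-labels
    X_all = np.concatenate([X, X_unlabeled])
    y_all = np.concatenate([y, y_pseudo])
    return fit(X_all, y_all)                 # final model

def fit_centroid(X, y):
    """Toy supervised learner: nearest class-mean classifier for labels in {-1, +1}."""
    mu_pos, mu_neg = X[y == 1].mean(axis=0), X[y == -1].mean(axis=0)
    def predict(Z):
        d_pos = np.linalg.norm(Z - mu_pos, axis=1)
        d_neg = np.linalg.norm(Z - mu_neg, axis=1)
        return np.where(d_pos < d_neg, 1, -1)
    return predict
```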
3 Theoretical results
In this section, we consider a simple high-dimensional model studied in [34], which is the only known formal example of an information-theoretic sample complexity gap between standard and robust classification. For this model, we demonstrate the value of unlabeled data: a simple self-training procedure achieves high robust accuracy, even though achieving nontrivial robust accuracy using the labeled data alone is impossible.
Gaussian model.
We consider a binary classification task with X = ℝ^d and Y = {−1, 1}, where y is uniform on Y and
x | y ∼ N(yμ, σ²I)
for a vector μ ∈ ℝ^d and coordinate noise variance σ². We are interested in the standard error (1) and the robust error (2) for ℓ∞ perturbations of size ε.
Parameter setting.
We choose the model parameters to meet the following desiderata: (i) some (difficult to learn) classifier achieves very high robust and standard accuracies, (ii) using n₀ examples we can learn a classifier with nontrivial standard accuracy, and (iii) we require much more than n₀ examples to learn a classifier with nontrivial robust accuracy. As shown in [34], the following parameter setting meets the desiderata,
(3)   ‖μ‖₂² = d  and  σ² = n₀√d,  with ε ∈ (0, 1) a fixed constant.
When interpreting this setting it is useful to think of n₀ as fixed and of d as a large number, i.e. a highly overparameterized regime.
3.1 Supervised learning in the Gaussian model
We briefly recapitulate the sample complexity gap described in [34] for the fully supervised setting.
Learning a simple linear classifier.
We consider linear classifiers of the form f_θ(x) = sign(xᵀθ). Given labeled data (X, Y) = (x₁, y₁), …, (xₙ, yₙ), we form the following simple averaging classifier,
(4)   θ̂(X, Y) := (1/n) Σᵢ₌₁ⁿ yᵢxᵢ.
This classifier achieves nontrivial standard accuracy using n₀ examples; see Section A.2 for a proof of the following (as well as detailed rates of convergence).
Proposition 1.
There exists a numerical constant c such that for all n ≥ c, the classifier θ̂ of (4) attains nontrivial standard accuracy, i.e. standard error strictly below 1/2.
Moreover, as the following theorem states, no learning algorithm can produce a classifier with nontrivial robust error without observing roughly n₀√d examples. Thus, a sample complexity gap forms as d grows.
Theorem 1 ([34]).
Let A be any learning rule mapping a dataset (X, Y) of n examples to a classifier A(X, Y). Then,
(5)   E err^{ε,∞}_robust(A(X, Y)) ≥ 1/2 − Õ(n / (n₀√d)),
where the expectation is with respect to the random draw of (X, Y) as well as possible randomization in A.
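The gap that Proposition 1 and Theorem 1 describe can be made concrete numerically. The sketch below instantiates the model with an all-ones μ and an illustrative scaling of σ and ε (chosen for demonstration, not necessarily the exact setting (3)), and evaluates the averaging classifier (4) using the closed-form Gaussian error probabilities of Section A.1: a handful of labels gives nontrivial standard error, yet robust error worse than chance, while the true direction μ is both accurate and robust:

```python
import numpy as np
from math import erf, sqrt

def Q(t):  # Gaussian tail probability P(Z > t) for Z ~ N(0, 1)
    return 0.5 * (1 - erf(t / sqrt(2)))

def errors(theta, mu, sigma, eps):
    """Closed-form standard and ell_inf-robust error of sign(x @ theta)."""
    l2 = np.linalg.norm(theta)
    return (Q(mu @ theta / (sigma * l2)),
            Q((mu @ theta - eps * np.abs(theta).sum()) / (sigma * l2)))

rng = np.random.default_rng(0)
d, n, eps = 10_000, 25, 0.5
sigma = (n * d) ** 0.25          # illustrative scaling: sigma^2 = sqrt(n * d)
mu = np.ones(d)

y = rng.choice([-1, 1], size=n)
X = y[:, None] * mu + sigma * rng.normal(size=(n, d))
theta_hat = (y[:, None] * X).mean(axis=0)   # the averaging classifier (4)

std_hat, rob_hat = errors(theta_hat, mu, sigma, eps)  # nontrivial / near-chance
std_opt, rob_opt = errors(mu, mu, sigma, eps)         # both near zero
```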
3.2 Semisupervised learning in the Gaussian model
We now consider the semi-supervised setting with n₀ labeled examples and ñ additional unlabeled examples. We apply the self-training methodology described in Section 2 to the simple learning rule (4): our intermediate classifier is θ̂_intermediate = θ̂(X, Y), and we use it to generate pseudo-labels ỹᵢ = sign(x̃ᵢᵀθ̂_intermediate) for the unlabeled data. We then apply learning rule (4) to the pseudo-labeled data to obtain our final semi-supervised classifier θ̂_final. The following theorem guarantees that θ̂_final achieves high robust accuracy.
Theorem 2.
There exists a numerical constant c such that, with n₀ ≥ c labeled examples and ñ ≥ c·n₀√d additional unlabeled examples, the expected robust error E err^{ε,∞}_robust(θ̂_final) vanishes as d grows.
Compared to the fully supervised case, the self-training classifier requires only a constant factor more input examples, and roughly a factor of √d fewer labels. Intuitively, the self-trained classifier succeeds because the intermediate classifier produces labels that are (by Proposition 1) correct strictly more often than not. As ñ grows, the noise averages out while a nonzero signal component remains, and so the angle between θ̂_final and μ goes to zero. By virtue of our parameter scaling, this guarantees very high robust and standard accuracies. We provide a rigorous proof and rates of convergence in Section A.4. We remark that other learning techniques, such as EM and PCA, can also leverage unlabeled data in this model. The self-training procedure we describe is similar to 2 steps of EM [8].
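This intuition is easy to simulate. The sketch below uses an illustrative parameter scaling (all-ones μ; not necessarily the exact setting (3)): self-training on pseudo-labels sharply increases alignment with μ and drives the closed-form robust error of Section A.1 down, even though the intermediate classifier is far from robust:

```python
import numpy as np
from math import erf, sqrt

def Q(t):  # Gaussian tail probability P(Z > t)
    return 0.5 * (1 - erf(t / sqrt(2)))

def robust_err(theta, mu, sigma, eps):
    """Closed-form ell_inf robust error of sign(x @ theta) in the Gaussian model."""
    return Q((mu @ theta - eps * np.abs(theta).sum())
             / (sigma * np.linalg.norm(theta)))

rng = np.random.default_rng(0)
d, n_lab, n_unlab = 2_000, 25, 2_000
sigma, eps = (n_lab * d) ** 0.25, 0.5   # illustrative scaling: sigma^2 = sqrt(n * d)
mu = np.ones(d)

def sample(n):
    y = rng.choice([-1, 1], size=n)
    return y[:, None] * mu + sigma * rng.normal(size=(n, d)), y

X, y = sample(n_lab)
theta_int = (y[:, None] * X).mean(axis=0)          # learning rule (4) on true labels

Xu, _ = sample(n_unlab)
y_pseudo = np.sign(Xu @ theta_int)                 # pseudo-labels from theta_int
theta_fin = (y_pseudo[:, None] * Xu).mean(axis=0)  # learning rule (4) on pseudo-labels

cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
```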
In Section A.5 we study a setting where only a fraction α of the unlabeled data are relevant to the task, modeling the irrelevant data as having no signal component. We show that for any fixed α > 0 high robust accuracy is still possible, but the required number of relevant examples grows as α shrinks. This demonstrates that irrelevant data can significantly impede self-training, but does not stop it completely.
4 Semi-supervised learning of robust neural networks
Existing adversarially robust training methods are designed for the supervised setting. In this section, we adapt the self-training framework described in Section 2 so that these methods can leverage additional unlabeled data.
4.1 Robust self-training
Meta-Algorithm 1 summarizes robust self-training. In contrast to standard self-training, we use a different supervised learning method in each stage, since the intermediate and final classifiers have different goals. In particular, the only goal of the intermediate classifier is to generate high-quality pseudo-labels for the (non-adversarial) unlabeled data. Therefore, we perform standard training in the first stage and robust training in the second. A hyperparameter w allows us to upweight the labeled data, which in some cases may be more relevant to the task, and will usually have more accurate labels.
4.2 Instantiating robust self-training
Each stage of robust self-training performs supervised learning, allowing us to borrow ideas from the literature on supervised standard and robust training. We consider neural networks of the form f_θ(x) = argmax_c p_θ(c | x), where p_θ(· | x) is a probability distribution over the class labels.
Standard loss.
As is common, we use the multiclass logarithmic loss for standard supervised learning, L_standard(θ; x, y) := −log p_θ(y | x).
Robust loss.
For the supervised robust loss, we use a robustness-promoting regularization term proposed in [45] and closely related to earlier proposals in [46, 24, 16]. The robust loss is
(6)   L(θ; x, y) := L_standard(θ; x, y) + β · L_robust(θ; x),  where  L_robust(θ; x) := max_{x′ ∈ B^∞_ε(x)} D_KL(p_θ(· | x) ‖ p_θ(· | x′)).
The regularization term L_robust forces predictions to remain stable within B^∞_ε(x), and the hyperparameter β balances the robustness and accuracy objectives.¹ We consider two approximations of the maximization in L_robust.
¹Zhang et al. [45] write the regularization term as D_KL(p_θ(· | x′) ‖ p_θ(· | x)), i.e. with x′ rather than x taking the role of the label, but their open-source implementation follows (6).
Adversarial training: a heuristic defense via approximate maximization.
We focus on ℓ∞ perturbations and use the projected gradient method to approximate the regularization term of (6),
(7)   L^{at}_robust(θ; x) := D_KL(p_θ(· | x) ‖ p_θ(· | x_PG)),
where x_PG is obtained via projected gradient ascent on x′ ↦ D_KL(p_θ(· | x) ‖ p_θ(· | x′)). Empirically, performing approximate maximization during training is effective in finding classifiers that are robust to a wide range of attacks [23].
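For a linear softmax model, the KL term and its gradient in x′ are available in closed form, so the projected gradient approximation can be sketched in a few lines (an illustrative toy, not the training code used in the paper):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def kl(p, q):
    return float(np.sum(p * (np.log(p) - np.log(q))))

def pg_regularizer(W, x, eps, steps=20, lr=0.05, seed=0):
    """Approximate max_{x' in the ell_inf ball B(x, eps)} KL(p(x) || p(x'))
    by projected gradient ascent, for the model p(x) = softmax(W @ x)."""
    p = softmax(W @ x)
    rng = np.random.default_rng(seed)
    x_adv = x + eps * rng.uniform(-1, 1, size=x.shape)  # random start in the ball
    for _ in range(steps):
        q = softmax(W @ x_adv)
        grad = W.T @ (q - p)                # closed-form gradient of the KL in x'
        x_adv = x_adv + lr * np.sign(grad)  # signed (ell_inf-style) ascent step
        x_adv = x + np.clip(x_adv - x, -eps, eps)       # project back to the ball
    return kl(p, softmax(W @ x_adv)), x_adv
```

In real training the inner maximization runs on the network's computation graph; the signed-step-plus-projection structure is the same.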

Stability training: a certified defense via randomized smoothing.
Alternatively, we consider stability training [46, 21], where we replace maximization over small perturbations with much larger additive random noise drawn from N(0, σ²I),
(8)   L^{stab}_robust(θ; x) := E_{x′∼N(x, σ²I)} D_KL(p_θ(· | x) ‖ p_θ(· | x′)).
Let θ̂ be the classifier obtained by minimizing the resulting objective. At test time, we use the following smoothed classifier,
(9)   g_θ̂(x) := argmax_c P_{x′∼N(x, σ²I)} (f_θ̂(x′) = c).
Improving on previous work [19, 21], Cohen et al. [6] prove that robustness of f_θ̂ to large random perturbations (the goal of stability training) implies certified ℓ₂ adversarial robustness of the smoothed classifier g_θ̂.
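A Monte Carlo sketch of the smoothed prediction (9), for an arbitrary base classifier f (illustrative only; obtaining a certificate additionally requires the statistical machinery of [6]):

```python
import numpy as np

def smoothed_predict(f, x, sigma, n=2000, seed=0):
    """Monte Carlo estimate of the smoothed classifier g(x): the class that
    the base classifier f predicts most often under N(x, sigma^2 I) noise."""
    rng = np.random.default_rng(seed)
    noisy = x + sigma * rng.normal(size=(n, x.shape[0]))
    preds = np.array([f(z) for z in noisy])
    classes, counts = np.unique(preds, return_counts=True)
    return int(classes[np.argmax(counts)])
```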
5 Experiments
In this section, we empirically evaluate robust self-training (RST) and show that it leads to consistent and significant improvements in robust accuracy, on both CIFAR-10 [17] and SVHN [43], and with both adversarial training and stability training. For CIFAR-10, we mine unlabeled data from 80 Million Tiny Images and study in depth the strengths and limitations of RST. For SVHN, we simulate unlabeled data by removing labels, and show that with RST the harm of removing the labels is small. This indicates that most of the gain comes from additional inputs rather than additional labels. Our experiments build on open source code from [45, 6].
Evaluating heuristic defenses.
Evaluating certified defenses.
For models trained against random noise, we evaluate the certified robust accuracy of the corresponding smoothed classifier against ℓ₂ attacks. We perform the certification using the randomized smoothing protocol described in [6].
Evaluating variability.
We repeat training 3 times and report accuracy as X ± Y, with X the median across runs and Y half the difference between the minimum and maximum.
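This reporting convention amounts to:

```python
def summarize(runs):
    """Median across runs, plus-or-minus half the min-max spread."""
    runs = sorted(runs)
    median = runs[len(runs) // 2]            # odd number of runs (here, 3)
    half_range = (runs[-1] - runs[0]) / 2
    return median, half_range
```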
5.1 CIFAR-10
5.1.1 Sourcing unlabeled data
To obtain unlabeled data distributed similarly to the CIFAR-10 images, we use the 80 Million Tiny Images (80M-TI) dataset [38], of which CIFAR-10 is a manually labeled subset. However, most images in 80M-TI do not correspond to CIFAR-10 image categories. To select relevant images, we train an 11-way classifier to distinguish the ten CIFAR-10 classes and an additional 'non-CIFAR-10' class, using a Wide ResNet 28-10 model [44] (the same as in our experiments below). For each class, we select an additional 50K images from 80M-TI using the trained model's predicted scores, excluding any image close to the CIFAR-10 test set. This yields 500K unlabeled images, which we add to the 50K CIFAR-10 training set when performing RST. We provide a detailed description of the data sourcing process in Section B.6.
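The per-class selection step can be sketched as follows, assuming an (N, 11) array of predicted scores with class 10 as the 'non-CIFAR-10' reject class (the function and array names are ours, for illustration):

```python
import numpy as np

def select_per_class(scores, k):
    """Pick the k highest-scoring candidates for each of the 10 CIFAR-10
    classes, among images whose predicted class is that class."""
    pred = scores.argmax(axis=1)
    selected = {}
    for c in range(10):                      # skip the reject class (index 10)
        idx = np.where(pred == c)[0]
        top = idx[np.argsort(scores[idx, c])[::-1][:k]]
        selected[c] = top
    return selected
```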
Table 1: CIFAR-10 test accuracy (%) under ℓ∞ attacks of radius ε = 8/255. PG denotes projected gradient attacks (see Section B.3); dashes mark attacks not evaluated against a model.

Model                   | PG_ours | PG_TRADES | PG_Madry | CW [4] | Best attack | No attack
RST_adv (ours)          | 63.1    | 63.1      | 62.5     | 64.9   | 62.5 ± 0.1  | 89.7 ± 0.1
TRADES [45]             | 55.8    | 56.6      | 55.4     | 65.0   | 55.4        | 84.9
Adv. pre-training [15]  | 57.4    | 58.2      | 57.7     | –      | 57.4        | 87.1
Madry et al. [23]       | 45.8    | –         | –        | 47.8   | 45.8        | 87.3
Standard self-training  | –       | 0.3       | 0        | –      | 0           | 96.4
5.1.2 Benefit of unlabeled data
We perform robust self-training using the unlabeled data described above, with a Wide ResNet 28-10 architecture for both the intermediate pseudo-label generator and the final robust model. For adversarial training, we compute x_PG exactly as in [45], and denote the resulting model RST_adv. For stability training, we set the additive noise variance to a fixed σ² and denote the result RST_stab. We provide training details in Section B.1.
Robustness of RST_adv against strong attacks.
In Table 1, we report the accuracy of RST_adv and the best models in the literature against various strong attacks at ε = 8/255 (see Section B.3 for details). PG_TRADES and PG_Madry correspond to the attacks used in [45] and [23] respectively, and we apply the Carlini-Wagner attack CW [4] on random test examples, using the implementation [27] that searches over attack hyperparameters. We also tune a PG attack against RST_adv (to maximally reduce its accuracy), which we denote PG_ours.
RST_adv gains 7% over TRADES [45], which we can directly attribute to the unlabeled data (see Section B.4). In Section C.6 we also show that this gain holds over different attack radii. The model of Hendrycks et al. [15]
is based on ImageNet adversarial pre-training and is less directly comparable to ours due to the difference in external data and training method. Finally, we perform standard self-training using the unlabeled data; it offers a moderate 0.4% improvement in standard accuracy over the intermediate model, but is not adversarially robust; see Section C.5.
Certified robustness of RST_stab.
Figure 1a shows the certified ℓ₂ robust accuracy as a function of perturbation radius for different models. We compare with [6], which has the highest previously reported certified accuracy, and with a baseline model that we trained using only the CIFAR-10 training set and the same training configuration as RST_stab. RST_stab improves on our baseline by 3–5%; the gains of the baseline over the previous state of the art are due to a combination of better architecture, hyperparameters, and training objective (see Section B.5). The certified ℓ₂ robustness of RST_stab is strong enough to imply state-of-the-art certified ℓ∞ robustness via elementary norm bounds. In Figure 1b we compare to the state of the art in certified ℓ∞ robustness, showing a 10% improvement over single models and performance on par with the cascade approach of [41]. We also substantially outperform the cascade model's standard accuracy.
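The elementary norm bound in question: since ‖δ‖₂ ≤ √D · ‖δ‖∞ for δ ∈ ℝ^D, a certified ℓ₂ radius r implies a certified ℓ∞ radius r/√D. For CIFAR-10 inputs, D = 32·32·3:

```python
from math import sqrt

# An ell_inf ball of radius eps is contained in the ell_2 ball of radius
# eps * sqrt(D), so an ell_2 certificate of radius r yields an ell_inf
# certificate of radius r / sqrt(D).
D = 32 * 32 * 3  # CIFAR-10 input dimension

def linf_radius_from_l2(r, dim=D):
    return r / sqrt(dim)
```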
5.1.3 Comparison to alternatives and ablation studies
Consistency-based semi-supervised learning (Section C.1).
Virtual adversarial training (VAT), a state-of-the-art method for (standard) semi-supervised training of neural networks [24, 26], is easily adapted to the adversarially robust setting. We train models using adversarial- and stability-flavored adaptations of VAT, and compare them to their robust self-training counterparts. We find that the VAT approach offers only limited benefit over fully supervised robust training, and that robust self-training attains 3–6% higher accuracy.
Data augmentation (Section C.2).
In the low-data, standard-accuracy regime, strong data augmentation is competitive with and complementary to semi-supervised learning [7, 42], as it effectively increases the sample size by generating different plausible inputs. It is therefore natural to compare state-of-the-art data augmentation (on the labeled data only) to robust self-training. We consider two popular schemes: Cutout [10] and AutoAugment [7]. While they provide significant benefit to standard accuracy, neither augmentation scheme improves performance when combined with robust training.
Relevance of unlabeled data (Section C.3).
The theoretical analysis in Section 3 suggests that self-training performance may degrade significantly in the presence of irrelevant unlabeled data; other semi-supervised learning methods share this sensitivity [26]. To measure the effect on robust self-training, we mix our unlabeled dataset with different amounts of random images from 80M-TI and compare the performance of the resulting models. We find that stability training is more sensitive than adversarial training, and that both methods still yield noticeable robustness gains with roughly 50% relevant data.
Amount of unlabeled data (Section C.4).
Finally, we perform robust self-training with varying amounts of unlabeled data and make two main observations. First, 100K unlabeled examples provide roughly half the gain of 500K unlabeled examples, indicating diminishing returns as the data amount grows. However, as we report in Appendix C.4, hyperparameter tuning issues make it difficult to assess how performance trends with data amount.
5.2 Street View House Numbers (SVHN)
The SVHN dataset [43] is naturally split into a core training set of about 73K images and an 'extra' training set of about 531K easier images. In our experiments, we compare three settings: (i) robust training on the core training set only, (ii) robust self-training with the core training set and the extra training images stripped of their labels, and (iii) robust training on all of the labeled SVHN training data. As for CIFAR-10, we experiment with both adversarial training and stability training.
Beyond validating the benefit of additional data, our SVHN experiments measure the loss inherent in using pseudo-labels in lieu of true labels. Figure 2 summarizes the results: the unlabeled extra data provides significant gains in robust accuracy, and the loss from using pseudo-labels is below 1%. This reaffirms our intuition that in the regimes of interest, accurate labels are not crucial for improving robustness. We give a detailed account of our SVHN experiments in Appendix D, where we also compare our results to the literature.
6 Discussion
6.1 Related work
Semi-supervised learning.
Within the rich semi-supervised learning literature, a recent successful family of approaches enforces consistency in the model's predictions under various perturbations of the unlabeled data [24, 42], or across training [37, 33, 18]. While some authors show modest gains from variants of self-training [20], the more sophisticated approaches based on consistency are considered more successful [26]. However, most previous work on semi-supervised learning considers a regime where labeled data is scarce and standard supervised learning cannot attain good accuracy. In this work, we consider the very different regime of adversarial robustness, and observe that robust self-training outperforms consistency-based regularization, even though the latter is naturally applicable to the robust setting. We note that there are several other approaches to semi-supervised learning, such as transductive SVMs, graph-based methods, and generative modeling, surveyed in [5, 47].
Training robust classifiers.
Adversarial examples first appeared in [36], and prompted a host of "defenses" and "attacks". While several defenses were broken by subsequent attacks [4, 1, 3], the general approach of adversarial training [23, 35, 45] empirically seems to offer gains in robustness. Other lines of work attain certified robustness, though often at a cost to empirical robustness compared to heuristics [29, 40, 30, 41, 14]. Recent work by Hendrycks et al. [15] shows that even though pre-training has limited value for standard accuracy on benchmarks, adversarial pre-training is effective. We complement this work by showing that a similar conclusion holds for semi-supervised learning (both practically and theoretically in a stylized model), and extends to certified robustness as well.
Barriers to robustness.
Schmidt et al. [34] show a sample complexity barrier to robustness in a stylized setting. We observed that in this model, unlabeled data is as useful for robustness as labeled data. This observation led us to experiment with robust semi-supervised learning. Recent work also suggests other barriers to robustness: Montasser et al. [25] show settings where improper learning and surrogate losses are crucial, in addition to more samples; Bubeck et al. [2] and Degwekar and Vaikuntanathan [9] show possible computational barriers; Gilmer et al. [13] show a high-dimensional model where robustness is a consequence of any nonzero standard error, while Tsipras et al. [39] and Fawzi et al. [12]
show a setting where robust and standard errors are at odds. Studying ways to overcome these additional theoretical barriers may translate to more progress in practice.
6.2 Conclusion
We show that unlabeled data closes a sample complexity gap in a stylized model and that robust selftraining (RST) is consistently beneficial in practice. Our findings open up a number of avenues for further research. Theoretically, are many labels ever necessary for adversarial robustness? Practically, what is the best way to leverage unlabeled data for robustness, and can semisupervised learning similarly benefit alternative notions of robustness? As data scales grow, computational capacities increase and machine learning moves beyond minimizing average error, we expect unlabeled data to provide continued benefit.
Acknowledgments
YC was supported by the Stanford Graduate Fellowship. AR was supported by a Google Fellowship and an Open Philanthropy AI Fellowship. PL was supported by the Open Philanthropy Project Award. JCD was supported by NSF CAREER award 1553086, the Sloan Foundation, and ONR YIP N00014-19-1-2288.
References
 Athalye et al. [2018] A. Athalye, N. Carlini, and D. Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arXiv preprint arXiv:1802.00420, 2018.
 Bubeck et al. [2019] S. Bubeck, E. Price, and I. Razenshteyn. Adversarial examples from computational constraints. In International Conference on Machine Learning (ICML), 2019.
 Carlini and Wagner [2017a] N. Carlini and D. Wagner. Adversarial examples are not easily detected: Bypassing ten detection methods. arXiv, 2017a.
 Carlini and Wagner [2017b] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy, pages 39–57, 2017b.
 Chapelle et al. [2006] O. Chapelle, A. Zien, and B. Scholkopf. Semi-Supervised Learning. MIT Press, 2006.
 Cohen et al. [2019] J. M. Cohen, E. Rosenfeld, and J. Z. Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning (ICML), 2019.
 Cubuk et al. [2019] E. D. Cubuk, B. Zoph, D. Mane, V. Vasudevan, and Q. V. Le. Autoaugment: Learning augmentation policies from data. In Computer Vision and Pattern Recognition (CVPR), 2019.
 Dasgupta and Schulman [2007] S. Dasgupta and L. Schulman. A probabilistic analysis of EM for mixtures of separated, spherical Gaussians. Journal of Machine Learning Research (JMLR), 8, 2007.
 Degwekar and Vaikuntanathan [2019] A. Degwekar and V. Vaikuntanathan. Computational limitations in robust classification and winwin results. arXiv preprint arXiv:1902.01086, 2019.
 DeVries and Taylor [2017] T. DeVries and G. W. Taylor. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017.
 Engstrom et al. [2018] L. Engstrom, A. Ilyas, and A. Athalye. Evaluating and understanding the robustness of adversarial logit pairing. arXiv preprint arXiv:1807.10272, 2018.
 Fawzi et al. [2018] A. Fawzi, O. Fawzi, and P. Frossard. Analysis of classifiers’ robustness to adversarial perturbations. Machine Learning, 107(3):481–508, 2018.
 Gilmer et al. [2018] J. Gilmer, L. Metz, F. Faghri, S. S. Schoenholz, M. Raghu, M. Wattenberg, and I. Goodfellow. Adversarial spheres. arXiv preprint arXiv:1801.02774, 2018.
 Gowal et al. [2018] S. Gowal, K. Dvijotham, R. Stanforth, R. Bunel, C. Qin, J. Uesato, T. Mann, and P. Kohli. On the effectiveness of interval bound propagation for training verifiably robust models. arXiv preprint arXiv:1810.12715, 2018.
 Hendrycks et al. [2019] D. Hendrycks, K. Lee, and M. Mazeika. Using pretraining can improve model robustness and uncertainty. In International Conference on Machine Learning (ICML), 2019.
 Kannan et al. [2018] H. Kannan, A. Kurakin, and I. Goodfellow. Adversarial logit pairing. arXiv preprint arXiv:1803.06373, 2018.
 Krizhevsky [2009] A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
 Laine and Aila [2017] S. Laine and T. Aila. Temporal ensembling for semisupervised learning. In International Conference on Learning Representations (ICLR), 2017.
 Lecuyer et al. [2019] M. Lecuyer, V. Atlidakis, R. Geambasu, D. Hsu, and S. Jana. Certified robustness to adversarial examples with differential privacy. In In IEEE Symposium on Security and Privacy (SP), 2019.
 Lee [2013] D. Lee. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In International Conference on Machine Learning (ICML), 2013.
 Li et al. [2018] B. Li, C. Chen, W. Wang, and L. Carin. Secondorder adversarial attack and certifiable robustness. arXiv preprint arXiv:1809.03113, 2018.
 Loshchilov and Hutter [2017] I. Loshchilov and F. Hutter. SGDR: Stochastic gradient descent with warm restarts. In International Conference on Learning Representations (ICLR), 2017.
 Madry et al. [2018] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations (ICLR), 2018.
 Miyato et al. [2018] T. Miyato, S. Maeda, S. Ishii, and M. Koyama. Virtual adversarial training: a regularization method for supervised and semi-supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
 Montasser et al. [2019] O. Montasser, S. Hanneke, and N. Srebro. VC classes are adversarially robustly learnable, but only improperly. arXiv preprint arXiv:1902.04217, 2019.
 Oliver et al. [2018] A. Oliver, A. Odena, C. A. Raffel, E. D. Cubuk, and I. Goodfellow. Realistic evaluation of deep semisupervised learning algorithms. In Advances in Neural Information Processing Systems (NeurIPS), pages 3235–3246, 2018.
 Papernot et al. [2018] N. Papernot, F. Faghri, N. C., I. Goodfellow, R. Feinman, A. Kurakin, C. X., Y. Sharma, T. Brown, A. Roy, A. M., V. Behzadan, K. Hambardzumyan, Z. Z., Y. Juang, Z. Li, R. Sheatsley, A. G., J. Uesato, W. Gierke, Y. Dong, D. B., P. Hendricks, J. Rauber, and R. Long. Technical report on the cleverhans v2.1.0 adversarial examples library. arXiv preprint arXiv:1610.00768, 2018.
 Paszke et al. [2017] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in PyTorch, 2017.
 Raghunathan et al. [2018a] A. Raghunathan, J. Steinhardt, and P. Liang. Certified defenses against adversarial examples. In International Conference on Learning Representations (ICLR), 2018a.
 Raghunathan et al. [2018b] A. Raghunathan, J. Steinhardt, and P. Liang. Semidefinite relaxations for certifying robustness to adversarial examples. In Advances in Neural Information Processing Systems (NeurIPS), 2018b.
 Recht et al. [2018] B. Recht, R. Roelofs, L. Schmidt, and V. Shankar. Do CIFAR-10 classifiers generalize to CIFAR-10? arXiv, 2018.
 Rosenberg et al. [2005] C. Rosenberg, M. Hebert, and H. Schneiderman. Semisupervised selftraining of object detection models. In Proceedings of the Seventh IEEE Workshops on Application of Computer Vision, 2005.
 Sajjadi et al. [2016] M. Sajjadi, M. Javanmardi, and T. Tasdizen. Regularization with stochastic transformations and perturbations for deep semisupervised learning. In Advances in Neural Information Processing Systems (NeurIPS), pages 1163–1171, 2016.
 Schmidt et al. [2018] L. Schmidt, S. Santurkar, D. Tsipras, K. Talwar, and A. Madry. Adversarially robust generalization requires more data. In Advances in Neural Information Processing Systems (NeurIPS), pages 5014–5026, 2018.
 Sinha et al. [2018] A. Sinha, H. Namkoong, and J. Duchi. Certifiable distributional robustness with principled adversarial training. In International Conference on Learning Representations (ICLR), 2018.
 Szegedy et al. [2014] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014.
 Tarvainen and Valpola [2017] A. Tarvainen and H. Valpola. Mean teachers are better role models: Weightaveraged consistency targets improve semisupervised deep learning results. In Advances in neural information processing systems, pages 1195–1204, 2017.
 Torralba et al. [2008] A. Torralba, R. Fergus, and W. T. Freeman. 80 million tiny images: A large data set for nonparametric object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(11):1958–1970, 2008.
 Tsipras et al. [2019] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, and A. Madry. Robustness may be at odds with accuracy. In International Conference on Learning Representations (ICLR), 2019.
 Wong and Kolter [2018] E. Wong and J. Z. Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In International Conference on Machine Learning (ICML), 2018.
 Wong et al. [2018] E. Wong, F. Schmidt, J. H. Metzen, and J. Z. Kolter. Scaling provable adversarial defenses. arXiv preprint arXiv:1805.12514, 2018.
 Xie et al. [2019] Q. Xie, Z. Dai, E. Hovy, M. Luong, and Q. V. Le. Unsupervised data augmentation. arXiv preprint arXiv:1904.12848, 2019.
 Yuval et al. [2011] N. Yuval, W. Tao, C. Adam, B. Alessandro, W. Bo, and N. A. Y. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
 Zagoruyko and Komodakis [2016] S. Zagoruyko and N. Komodakis. Wide residual networks. In British Machine Vision Conference, 2016.
 Zhang et al. [2019] H. Zhang, Y. Yu, J. Jiao, E. P. Xing, L. El Ghaoui, and M. I. Jordan. Theoretically principled trade-off between robustness and accuracy. In International Conference on Machine Learning (ICML), 2019.
 Zheng et al. [2016] S. Zheng, Y. Song, T. Leung, and I. Goodfellow. Improving the robustness of deep neural networks via stability training. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4480–4488, 2016.
 Zhu et al. [2003] X. Zhu, Z. Ghahramani, and J. D. Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In International Conference on Machine Learning (ICML), pages 912–919, 2003.
Appendix A Theoretical results
A.1 Error probabilities in closed form
We recall our model with $y$ uniform on $\{-1, +1\}$ and $x \mid y \sim \mathcal{N}(y\mu, \sigma^2 I)$. Consider a linear classifier $f_\theta(x) = \operatorname{sign}(\theta^\top x)$. Then the standard error probability is
$$\mathrm{err}_{\text{standard}}(f_\theta) = \mathbb{P}\left(f_\theta(x) \neq y\right) = Q\left(\frac{\theta^\top \mu}{\sigma \|\theta\|_2}\right), \qquad (10)$$
where
$$Q(t) = \frac{1}{\sqrt{2\pi}} \int_t^{\infty} e^{-s^2/2}\,ds$$
is the Gaussian error function. For linear classifier $f_\theta$, input $x$ and label $y$, the strongest adversarial perturbation of $x$ with $\ell_\infty$ norm $\epsilon$ moves coordinate $i$ of $x$ by $-\epsilon\, y \operatorname{sign}(\theta_i)$. The robust error probability is therefore
$$\mathrm{err}_{\text{robust}}(f_\theta) = \mathbb{P}\left(\exists\, \delta,\ \|\delta\|_\infty \le \epsilon : f_\theta(x + \delta) \neq y\right) = Q\left(\frac{\theta^\top \mu - \epsilon \|\theta\|_1}{\sigma \|\theta\|_2}\right). \qquad (11)$$
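The two error probabilities can be sanity-checked numerically. The sketch below uses illustrative values of $d$, $\sigma$, and $\epsilon$ (chosen for the example, not tied to the parameter setting used later) and compares Monte Carlo estimates against the closed-form expressions in equations (10) and (11); for a linear classifier the worst-case $\ell_\infty$ attack is available in closed form by shifting each coordinate by $-\epsilon\, y \operatorname{sign}(\theta_i)$.

```python
import numpy as np
from math import erfc, sqrt

def Q(t):
    # Gaussian tail probability Q(t) = P(N(0, 1) > t)
    return 0.5 * erfc(t / sqrt(2.0))

rng = np.random.default_rng(0)
d, sigma, eps = 100, 2.0, 0.05       # illustrative values only
mu = np.ones(d)                      # a dense mean vector
theta = mu + rng.normal(size=d)      # an arbitrary dense linear classifier

# Closed-form error probabilities, equations (10) and (11)
std_err = Q(theta @ mu / (sigma * np.linalg.norm(theta)))
rob_err = Q((theta @ mu - eps * np.abs(theta).sum()) / (sigma * np.linalg.norm(theta)))

# Monte Carlo: draw (x, y); the strongest l_inf attack moves coordinate i
# of x by -eps * y * sign(theta_i)
n = 50_000
y = rng.choice([-1.0, 1.0], size=n)
x = y[:, None] * mu + sigma * rng.normal(size=(n, d))
std_mc = np.mean(np.sign(x @ theta) != y)
x_adv = x - (eps * y)[:, None] * np.sign(theta)[None, :]
rob_mc = np.mean(np.sign(x_adv @ theta) != y)
```

Because the attack only ever decreases the margin $y\,\theta^\top x$, every standard error is also a robust error, so the Monte Carlo robust error is never below the standard one.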
In this model, standard and robust accuracies align in the sense that any highly accurate standard classifier with dense $\theta$ will necessarily also be robust. Moreover, for dense $\mu$ (with $\|\mu\|_1 = \Theta(\sqrt{d}\,\|\mu\|_2)$), good linear estimators will typically be dense as well, in which case $\|\theta\|_1 \approx \sqrt{d}\,\|\theta\|_2$ and the margin $\theta^\top \mu / \|\theta\|_2$ determines both standard and robust accuracies. Our analysis will consequently focus on understanding the quantity $\theta^\top \mu / \|\theta\|_2$.
A.1.1 Optimal standard accuracy and parameter setting
We note that for a given problem instance, the classifier that minimizes the standard error is simply $\theta^\star = \mu$. Its standard error is $Q\left(\|\mu\|_2 / \sigma\right)$.
Recall our parameter setting,
(12) 
Under this setting, we have
Therefore, in the high-dimensional regime, the classifier $\theta^\star = \mu$ achieves essentially perfect accuracies, both standard and robust. We will show that estimating $\mu$ from labeled data and a large number of unlabeled data allows us to approach the performance of $\theta^\star$, without prior knowledge of $\mu$.
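As a concrete check of this claim, the following sketch evaluates the standard and robust errors of $\theta = \mu$ in closed form for a dense $\mu$ with unit entries; the dimension, noise levels, and perturbation size are illustrative choices for the example, not the paper's setting (12).

```python
import numpy as np
from math import erfc, sqrt

def Q(t):
    # Gaussian tail probability Q(t) = P(N(0, 1) > t)
    return 0.5 * erfc(t / sqrt(2.0))

# Illustrative parameters (not the paper's setting): dense mu with unit entries,
# so ||mu||_2 = sqrt(d) and ||mu||_1 = d
d, eps = 10_000, 0.1
mu = np.ones(d)

for sigma in (d ** 0.25, 2 * d ** 0.25):
    # Standard and robust errors of theta = mu, via equations (10) and (11)
    std = Q(np.linalg.norm(mu) / sigma)
    rob = Q((mu @ mu - eps * np.abs(mu).sum()) / (sigma * np.linalg.norm(mu)))
    print(f"sigma = {sigma:5.1f}: standard error = {std:.2e}, robust error = {rob:.2e}")
```

For these noise levels both error probabilities are astronomically small, illustrating that $\theta = \mu$ is essentially perfect in high dimension.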
A.2 Performance of supervised estimator
Given labeled data set $(x_1, y_1), \ldots, (x_n, y_n)$, we consider the linear classifier $f_{\hat\theta}$ given by
$$\hat\theta = \frac{1}{n} \sum_{i=1}^{n} y_i x_i.$$
In the following lemma we give a tight concentration bound for $\mu^\top \hat\theta / \|\hat\theta\|_2$, which determines the standard and robust error probabilities of $f_{\hat\theta}$ via equations (10) and (11), respectively.
Lemma 1.
There exist numerical constants such that under parameter setting (12) and ,
Proof.
We have
To lower bound the random variable
we consider its squared inverse, and decompose it as follows. To obtain concentration bounds, we note that
Therefore, standard concentration results give
(13) 
Assuming that the two events and hold, we have
Substituting the parameter setting (12), we have that for sufficiently large,
for some numerical constant. For this to imply the bound stated in the lemma we also need to hold, but this is already implied by
Substituting the parameter setting into the concentration bounds (13), we have by the union bound that the desired upper bound fails to hold with probability at most
for another numerical constant and . ∎
As an immediate corollary to Lemma 1, we obtain the sample complexity upper bounds cited in the main text. See 1
Proof.
For the case we take sufficiently large such that by Lemma 1 we have
for an appropriate . Therefore by the expression (10) for the standard error probability (and the fact that it is never more than 1), we have
for appropriate . Similarly, for the case we apply Lemma 1 combined with to write
with probability . Therefore, using the expression (11) and , we have (using )
for sufficiently large . ∎
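The behavior of the supervised estimator can be illustrated numerically. The sketch below assumes the label-weighted sample-mean rule $\hat\theta = \frac{1}{n}\sum_i y_i x_i$ described above and uses illustrative dimensions and noise level (not the setting (12)); it tracks the margin quantity $\mu^\top \hat\theta / \|\hat\theta\|_2$, which by Cauchy–Schwarz is at most $\|\mu\|_2$ and approaches it as $n$ grows.

```python
import numpy as np

rng = np.random.default_rng(1)
d, sigma, n = 10_000, 10.0, 50       # illustrative values, not the setting (12)
mu = np.ones(d)

# Supervised learning rule: label-weighted sample mean of the inputs
y = rng.choice([-1.0, 1.0], size=n)
x = y[:, None] * mu + sigma * rng.normal(size=(n, d))
theta_hat = (y[:, None] * x).mean(axis=0)

# Margin quantity determining standard and robust errors via (10) and (11);
# by Cauchy-Schwarz it is at most ||mu||_2, with equality at theta_hat = mu
margin = mu @ theta_hat / np.linalg.norm(theta_hat)
print(f"margin = {margin:.1f}  (||mu||_2 = {np.linalg.norm(mu):.1f})")
```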
A.3 Lower bound
We now briefly explain how to translate the sample complexity lower bound of Schmidt et al. [34] into our parameter setting. See 1
Proof.
The setting of our theorem is identical to that of Theorem 11 in Schmidt et al. [34], which shows that
Using , implies and therefore
Moreover
∎
A.4 Performance of semi-supervised estimator
We now consider the semi-supervised setting, our primary object of study in this paper. We consider the self-training estimator that in the first stage uses the labeled examples to construct the intermediate estimator
$$\hat\theta_{\text{intermediate}} = \frac{1}{n} \sum_{i=1}^{n} y_i x_i,$$
and then uses it to produce pseudo-labels
$$\tilde y_i = \operatorname{sign}\big(\tilde x_i^\top \hat\theta_{\text{intermediate}}\big)$$
for the unlabeled data points $\tilde x_1, \ldots, \tilde x_{\tilde n}$. In the second and final stage of self-training, we employ the same simple learning rule on the pseudo-labeled data and construct
$$\hat\theta_{\text{final}} = \frac{1}{\tilde n} \sum_{i=1}^{\tilde n} \tilde y_i \tilde x_i.$$
The following result shows a high-probability bound for the final estimator, analogous to the one obtained for the fully supervised estimator in Lemma 1 (with different constant factors).
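The two-stage procedure can be illustrated end to end. The sketch below assumes the label-weighted sample-mean rule for both stages and uses illustrative parameters (not the setting (12)); it compares the margin $\mu^\top \theta / \|\theta\|_2$ of the intermediate and final estimators. With few labels the intermediate margin is mediocre, yet the pseudo-labels are already almost always correct, so retraining on abundant unlabeled data improves the margin.

```python
import numpy as np

rng = np.random.default_rng(2)
d, sigma = 2_000, 5.0                 # illustrative values, not the setting (12)
n_labeled, n_unlabeled = 20, 2_000
mu = rng.choice([-1.0, 1.0], size=d)  # a dense mean vector

def sample(n):
    y = rng.choice([-1.0, 1.0], size=n)
    x = y[:, None] * mu + sigma * rng.normal(size=(n, d))
    return x, y

def margin(theta):
    # quantity determining standard and robust errors via (10) and (11)
    return mu @ theta / np.linalg.norm(theta)

# Stage 1: intermediate estimator from the labeled examples
x_l, y_l = sample(n_labeled)
theta_int = (y_l[:, None] * x_l).mean(axis=0)

# Pseudo-labels for the unlabeled points (true labels y_u stay hidden)
x_u, y_u = sample(n_unlabeled)
y_pseudo = np.sign(x_u @ theta_int)

# Stage 2: the same learning rule applied to the pseudo-labeled data
theta_final = (y_pseudo[:, None] * x_u).mean(axis=0)

print(f"pseudo-label accuracy: {np.mean(y_pseudo == y_u):.3f}")
print(f"margin: intermediate {margin(theta_int):.1f} -> final {margin(theta_final):.1f}")
```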
Lemma 2.
Proof.
The proof follows a similar argument to the one used to prove Lemma 1, except that now we have to take care of the fact that the noise component in the final estimator is not entirely Gaussian. Let $b_i$ be the indicator that the $i$th pseudo-label is incorrect, and let
We may write the final estimator as
where independent of each other. Defining
we have the decomposition and bound