1 Introduction
As deployment of machine learning systems in the real world has steadily increased over recent years, the trustworthiness of these systems is of crucial importance. This is particularly the case for safety-critical applications. For example, the vision system in a self-driving car should correctly classify an obstacle or human irrespective of their orientation. Besides being relevant from a security perspective, a measure for spatial invariance also helps to gauge interpretability and reliability of a model. If an image of a child rotated by a small angle is classified as a trash can, can we really trust the system in the wild?
As neural networks have been shown to be expressive both theoretically [18, 4, 15] and empirically [48], in this work we study to what extent standard neural network predictors can be made invariant to small rotations and translations. In contrast to enforcing conventional invariance on entire group orbits, we weaken the goal to invariance on smaller so-called transformation sets. This requirement reflects the aim to be invariant to transformations that do not affect the labeling by a human. At test time we assess transformation set invariance by computing the prediction accuracy on the worst-case (adversarial) transformation in the (small) transformation set of each image in the test data. The higher this worst-case prediction accuracy of a model is, the more spatially robust we say it is. Importantly, we use the same terminology as in the very active field of adversarially robust learning [40, 29, 23, 33, 6, 26, 37, 39, 35, 44, 28], but we consider adversarial examples with respect to spatial instead of $\ell_p$ transformations of an image.
Recently, it was observed in [11, 13, 34, 20, 14, 2] that worst-case prediction performance drops dramatically for neural network classifiers obtained using standard training, even for rather small transformation sets. In this context, we examine the effectiveness of regularization that explicitly encourages the predictor to be constant for transformed versions of the same image, which we refer to as being invariant on the transformation sets. Broadly speaking, there are two approaches to encourage invariance of neural network predictors. On the one hand, the relative simplicity of the mathematical model for rotations and translations has led to carefully hand-engineered architectures that incorporate spatial invariance directly [19, 24, 8, 27, 45, 43, 12, 41]. On the other hand, augmentation-based methods [3, 47] constitute an alternative approach to encourage desired invariances on transformation sets. Specifically, the idea is to augment the training data by a random or smartly chosen transformation of every image for which the predictor output is enforced to be close to the output of the original image. This invariance-inducing regularization term is then added to the cross-entropy loss for backpropagation.
While augmentation-based methods can be used out of the box whenever it is possible to generate samples in the transformation set of interest, it is unclear how they compare to architectures that are tuned for the particular type of transformation using prior knowledge. Studying robustness against spatial transformations in particular allows us to compare the robust performance of these two approaches, as spatial-equivariant networks have been somewhat successful in enforcing invariance. In contrast, this cannot be claimed for higher-dimensional $\ell_p$-type perturbations. In the empirical sections of this paper, we hence want to explore the following questions:

To what extent can augmentation and regularization based methods improve spatial robustness of common deep neural networks?

How does augmentation-based invariance-inducing regularization perform in the case of small spatial transformations compared to representative specialized architectures designed to achieve invariance against entire transformation groups?
As a justification for employing this form of invariance-inducing regularization, we prove in Sec. 2 that when perturbations come from transformation groups, predictors that optimize the robust loss are in fact invariant on the set of transformed images. Although recent works show a fundamental tradeoff between robust and standard accuracy in constructed perturbation settings [42, 49, 36], we additionally show that the situation is fundamentally different for spatial transformations due to their group structure.
For the empirical study, we implemented various augmentation-based training methods as described in Sec. 3. In Sec. 4, we compare spatial robustness for augmentation-based methods and specialized neural network architectures on CIFAR-10 and SVHN. Although group invariance should automatically imply robust predictions against all transformations in the group, group-equivariant networks have mostly been evaluated on random rather than adversarially chosen transformations. In experiments with CIFAR-10 and SVHN, we find that regularized methods reduce the adversarial error relative to previously proposed augmentation-based methods (including adversarial training) without requiring additional computational resources. Furthermore, they even outperform representative handcrafted networks that were explicitly designed for invariance.
2 Theoretical results for invariance-inducing regularization
In this section, we first introduce our notion of transformation sets and formalize robustness against a small range of translations and rotations. We then prove that, on a population level, constraining or regularizing for transformation set invariance yields models that minimize the robust loss. Moreover, when the label distribution is constant on each transformation set, we show that the set of robust minimizers not only minimizes the natural loss but, under mild conditions on the distribution over the transformations, is even equivalent to the set of natural minimizers.
Although the framework can be applied to general problems and transformation groups, we consider image classification for concreteness. In the following, the observed images and the one-hot vectors of the multiclass labels are random variables drawn from a joint distribution. The predictor, a function in some function space (e.g. deep neural networks in our experiments), maps the input image to a logit vector that is then used for prediction via a softmax layer.
2.1 Transformation sets
Invariance with respect to spatial transformations is often thought of in terms of group equivariance of the representation and prediction. Instead of invariance with respect to all spatial transformations in a group, we impose a weaker requirement, namely invariance on transformation sets, defined as follows. For a given image, we consider a compact subset of images in the support of the image distribution that can be obtained by transformation of that image. Such a subset is called a transformation set. For example, in the case of rotations, the transformation set corresponds to the set of observed images in a dataset that are different versions of the same image and that can be obtained from one another by small rotations.
By the technical assumption on the space of real images that the sampling operator is bijective, the mapping from an image to its transformation set is bijective. We can hence define a set of transformation sets for a given transformation group. Importantly, the bijectivity assumption also implies that the transformation sets of different images are disjoint. The above definition is distribution dependent and partitions the support of the distribution. More details on the aforementioned concepts and definitions can be found in Sec. A.1 in the Appendix.
We say that a function $f$ is (transformation-)invariant if $f(x) = f(x')$ for all images $x, x'$ that belong to the same transformation set, and we consider the class of all such functions. Using this notation, fitting a model with high accuracy under worst-case “small” transformations of the input can be mathematically captured by the robust optimization formulation [5] of minimizing the robust loss
$$\min_{f \in \mathcal{F}} \; \mathbb{E}_{X,Y} \sup_{x' \in G_X} \ell(f(x'), Y) \qquad (1)$$
in some function space $\mathcal{F}$, where $G_X$ denotes the transformation set containing the image $X$, $Y$ is the label and $\ell$ is the loss. We call the solution of this problem the (spatially) robust minimizer. While adversarial training aims to optimize the empirical version of Eq. (1), the converged predictor might be far from the global population minimum, in particular in the case of nonconvex optimization landscapes encountered when training neural networks. Furthermore, we show in the following section that for robustness over transformation sets, constraining the model class to invariant functions leads to the same optimizer of the robust loss. These facts motivate invariance-inducing regularization, which we then show to exhibit improved robust test accuracy in practice.
2.2 Regularization to encourage invariance
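For any regularizer $R$, we define the corresponding constrained set of functions $V(R)$ as the set of functions in the function space for which $R$ vanishes on the support of the data distribution. When $R$ is the supremum, over the transformation set, of a semimetric between the outputs at the original and the transformed image (a semimetric satisfies all conditions for a metric except that it need not satisfy the triangle inequality), $V(R)$ coincides with the class of invariant functions. We now consider constrained optimization problems of the form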
(O1)  
(O2) 
The following theorem shows that (O1), (O2) are equivalent to (1) if the set of all invariant functions is a subset of the function space .
Theorem 1.
The proof of Theorem 1 can be found in the Appendix in Sec. A.2. Since exact projection onto the constrained set is in general not achievable for neural networks, an alternative method to induce invariance is to relax the constraints by only requiring the regularizer to be small. Using Lagrangian duality, (O1) and (O2) can then be rewritten in penalized form, for some scalar penalty weight, as
(2)  
(3) 
In Sec. 2.4 we discuss how ordinary adversarial training, and modified variants that have been proposed thereafter, can be viewed as special cases of Eqs. (2) and (3). On the other hand, the constrained regularization formulation corresponds to restricting the function space and is hence comparable with handcrafted network architecture design as described in Sec. 3.1.
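For orientation, one plausible way to write the two penalized objectives is sketched below. The notation is an assumption made for this sketch: $f$ is the predictor in function space $\mathcal{F}$, $\ell$ the loss, $G_X$ the transformation set containing $X$, $R$ the regularizer and $\lambda \geq 0$ the penalty weight. The first line penalizes the natural loss and the second the robust loss.

```latex
% Hedged sketch under assumed notation; not a verbatim copy of Eqs. (2) and (3).
\begin{align*}
  &\min_{f \in \mathcal{F}} \; \mathbb{E}_{X,Y}\big[\, \ell(f(X), Y) + \lambda\, R(f, X, Y) \,\big], \\
  &\min_{f \in \mathcal{F}} \; \mathbb{E}_{X,Y}\Big[\, \sup_{x' \in G_X} \ell(f(x'), Y) + \lambda\, R(f, X, Y) \,\Big].
\end{align*}
```

Setting $\lambda = 0$ in the second line recovers plain adversarial training on the robust loss, which is consistent with viewing adversarial training as a special case of the penalized formulation.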
2.3 Tradeoff between natural and robust accuracy
Even though high robust accuracy (1) might be the main goal in some applications, one might wonder whether the robust minimizer exhibits lower accuracy on untransformed images (natural accuracy) [42, 49]. In this section we address this question and identify the conditions for transformation set perturbations under which minimizing the robust loss does not lead to decreased natural accuracy. Notably, natural accuracy can even increase under mild assumptions.
One reason why adversarial examples have attracted a lot of interest is that the prediction of a given classifier can change within a perturbation set in which all images appear the same to the human eye. Mathematically, in the case of transformation sets, the latter can be modeled by a property of the true distribution. Namely, it translates into the conditional distribution of the label $Y$ given the image $X$ being constant for all images belonging to the same transformation set. In other words, $Y$ is conditionally independent of $X$ given the transformation set $G_X$ that contains $X$, i.e. $Y \perp X \mid G_X$. Under this assumption the next theorem shows that there is no tradeoff in natural accuracy for the transformation robust minimizer.
Theorem 2 (Tradeoff natural vs. robust accuracy).
Under the assumption of Theorem 1 and if the conditional independence property above holds, the adversarial minimizer also minimizes the natural loss. If, moreover, the distribution of the images within each transformation set has the whole transformation set as its support and the loss is injective, then every minimizer of the natural loss also has to be invariant.
As a consequence, minimizing the constrained optimization problem (O1) could potentially help in finding the optimal solution to minimize standard test error. Practically, the assumption on the distribution of the transformation sets corresponds to assuming nonzero inherent transformation variance in the natural distribution of the dataset. In practice, we indeed observe a boost in natural accuracy for robust invariance-inducing methods in Sec. 4 on SVHN, which for this reason is a commonly used benchmark dataset for spatial-equivariant networks.
One might wonder how this result relates to several recent publications such as [42, 49] that presented toy examples for which the robust solution must have higher natural loss than the Bayes optimal solution even in the infinite data limit. On a fundamental level, $\ell_p$ perturbation sets are of a different nature compared to transformation sets on generic distributions of the images. In the distribution considered in [42, 49], there is no unique mapping from an image to a perturbation set and thus the conditional independence property does not hold in general.
2.4 Different regularizers and practical implementation
In order to improve robustness against spatial transformations we consider different choices of the regularizer in the regularized objectives (2) and (3) that we then compare empirically in Sec. 4. This allows us to view a number of variants of adversarial training in a unified framework. Broadly speaking, each approach listed below consists of first searching for an adversarial example according to some mechanism, which is then included in a regularizing function, often some weak notion of distance between the prediction at the original example and at the new example. The following choices of regularizers involve the maximization of a regularizing function over the transformation set: the squared $\ell_2$ distance between the (logit) vectors of the original and the transformed image, and the KL divergence on the softmax of the two (logit) vectors. In all cases we refer to the maximizer as an adversarial example that is found using defense mechanisms as discussed in Section 3.3. Note that for the $\ell_2$ and KL regularizers the assumption in Theorem 1 is satisfied.
Instead of performing a maximization of the regularizing function to find the adversarial example, we can also choose it in alternative ways. The following variants are explored in the paper, two of which are reminiscent of previous work.
The last regularizer suggests using an additive penalty on top of data augmentation, with either one or even multiple random draws, where the penalty can be any of the above semimetrics between the outputs at the original and the augmented image, such as the $\ell_2$ distance or the KL divergence. Albeit suboptimal, the experimental results in Section 4 suggest that simply adding the regularization penalty on top of randomly drawn data matches general adversarial training in terms of robust prediction at a fraction of the computational cost. In addition, Theorem 2 suggests that even when the goal is to improve standard accuracy and one expects inherent variance of nuisance factors in the data distribution, it is likely helpful to use regularized data augmentation instead of vanilla data augmentation. Empirically we observe this on the SVHN dataset in Section 4.
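The penalized loss on a natural and a transformed batch can be sketched as follows. This is a minimal NumPy illustration rather than the TensorFlow implementation used in the experiments: the function names, the direction of the KL divergence and the choice to evaluate the cross-entropy on the transformed batch (the “rob” variant) are assumptions made for the sketch.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax along the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, onehot):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    z = logits - logits.max(axis=-1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return float(-(onehot * log_p).sum(axis=-1).mean())

def l2_penalty(logits_nat, logits_adv):
    """Mean squared l2 distance between the two logit vectors."""
    return float(((logits_nat - logits_adv) ** 2).sum(axis=-1).mean())

def kl_penalty(logits_nat, logits_adv, eps=1e-12):
    """Mean KL(softmax(nat) || softmax(adv)); the direction is an assumption."""
    p, q = softmax(logits_nat), softmax(logits_adv)
    return float((p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1).mean())

def regularized_loss(logits_nat, logits_adv, onehot, lam, penalty=kl_penalty):
    """Cross-entropy on the transformed batch plus lam times the penalty."""
    return cross_entropy(logits_adv, onehot) + lam * penalty(logits_nat, logits_adv)
```

Here `logits_adv` would be the network output on the transformed images, whether they were found by an adversarial search or drawn at random, and `lam` plays the role of the regularization parameter.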
Adversarial example for spatial transformation sets Since the transformation set $G_X$ of an image $X$ is not a closed group and we do not even know whether the observation lies at the boundary of $G_X$ or in its interior, we cannot solve the maximization constrained to $G_X$ in practice. However, for an appropriate choice of a set $\mathcal{S}$ of transformations, we can instead minimize an upper bound of (1) which reads
$$\min_{f \in \mathcal{F}} \; \mathbb{E}_{X,Y} \sup_{\Delta \in \mathcal{S}} \ell(f(\mathcal{T}(X, \Delta)), Y) \;\geq\; \min_{f \in \mathcal{F}} \; \mathbb{E}_{X,Y} \sup_{x' \in G_X} \ell(f(x'), Y) \qquad (4)$$
where $f$ is the predictor in function space $\mathcal{F}$, $\ell$ is the loss, $\mathcal{S}$ is the set of transformations that we search over and $\mathcal{T}(X, \Delta)$ denotes the transformed image with transformation $\Delta$ (see Sec. A.1 in the Appendix for an explicit construction of the transformation search set). The left hand side in (4) is hence what we aim to solve in practice, where the expectation is over the empirical joint distribution of $(X, Y)$. The relaxation of $G_X$ to the range of transformations of $X$ given by $\mathcal{S}$ is also used for the maximization within the regularizers.
In Figure 1 one pair of example images is shown: the original image (panel (a)) is depicted along with a transformed version (panel (b)) and the respective predictions by a standard neural network classifier.
3 Experimental setup
In our experiments, we compare invariance-inducing regularization incorporated via various augmentation-based methods (as described in Section 2.4) used on standard networks with representative spatial equivariant networks trained using standard optimization procedures.
3.1 Spatial equivariant networks
We compare the robust prediction accuracies from networks trained with the regularizers with three specialized architectures, designed to be equivariant against spatial transformations and translations: (a) G-ResNet44 (GRN) [8] using p4m convolutional layers (90 degree rotations, translations and mirror reflections) on CIFAR-10; (b) Equivariant Transformer Networks (ETN) [41], a generalization of Polar Transformer Networks (PTN) [12], on SVHN; and (c) Spatial Transformer Networks (STN) [19] on SVHN. A more comprehensive discussion of the literature on equivariant networks can be found in Sec. 5. We choose the architectures listed above based on availability of reproducible code and previously reported state-of-the-art standard accuracies on SVHN and CIFAR-10. We train GRN, STN and ETN using standard augmentation as described in Sec. 3.4 (std) and, in addition, with random rotations. Out of curiosity we also trained a “two-stage” STN where we train the localization network separately in a supervised fashion. Specifically, we use a randomly transformed version of the training data, treating the transformation parameters as prediction targets. Details about the implementation and results can be found in Sec. B in the Appendix.
3.2 Transformations
The transformations that we consider in Sec. 4 are small rotations (of up to 30°) and translations in two dimensions of up to 3 px, corresponding to approx. 10% of the image size. For augmentation-based methods we need to generate such small transformations for a given test image. Although the definition of a transformation in the theoretical section using the corresponding continuous image functions is clean, we do not have access to the continuous function in practice since the mapping is in general not bijective. Instead, we use bilinear interpolation, as implemented in TensorFlow and in a differentiable version of a transformer [19] for first-order attack and defense methods.
On top of interpolation, rotation also creates edge artifacts at the boundaries, as the image is only sampled in a bounded set. The empty space that results from translating and rotating an image is filled with black pixels (constant padding) if not noted otherwise. Fig. 1 (b) shows an example. [11] additionally analyze a “black canvas” setting where the images are padded with zeros prior to applying the transformation, ensuring that no information is lost due to cropping. Their experiments show that the reduced accuracy of the models cannot be attributed to this effect. Since both versions yield similar results, we report results on the first version of pad and crop choices, having input images of the same size as the original.
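The combination of rotation, translation, bilinear interpolation and constant black padding can be sketched in plain NumPy as below. This is an illustrative stand-in, not the TensorFlow routine used in the experiments; the function name `transform_image`, the rotation about the image center and the sign convention of the angle are assumptions of the sketch.

```python
import numpy as np

def transform_image(img, dx=0.0, dy=0.0, angle_deg=0.0):
    """Rotate about the image center by angle_deg, then translate by (dx, dy) pixels.

    img has shape (H, W) or (H, W, C). Output pixels whose source location
    falls outside the image are filled with zeros (constant black padding).
    """
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)

    # Inverse mapping: for every output pixel, find its source coordinates.
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    xo, yo = xs - cx - dx, ys - cy - dy
    src_x = cos_t * xo + sin_t * yo + cx
    src_y = -sin_t * xo + cos_t * yo + cy

    x0, y0 = np.floor(src_x).astype(int), np.floor(src_y).astype(int)
    wx, wy = src_x - x0, src_y - y0

    def sample(yi, xi):
        # Read pixel values, returning zero for out-of-bounds locations.
        valid = (yi >= 0) & (yi < h) & (xi >= 0) & (xi < w)
        vals = img[np.clip(yi, 0, h - 1), np.clip(xi, 0, w - 1)].astype(float)
        if img.ndim == 3:
            valid = valid[..., None]
        return vals * valid

    if img.ndim == 3:
        wx, wy = wx[..., None], wy[..., None]
    # Bilinear interpolation between the four neighboring source pixels.
    top = (1 - wx) * sample(y0, x0) + wx * sample(y0, x0 + 1)
    bottom = (1 - wx) * sample(y0 + 1, x0) + wx * sample(y0 + 1, x0 + 1)
    return (1 - wy) * top + wy * bottom
```

With `dx = dy = 0` and `angle_deg = 0` the function returns the input unchanged, and the output always has the same size as the input, matching the pad-and-crop setting described above.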
3.3 Attacks and defenses
The attacks and defenses we choose essentially follow the setup in [11]. The defense refers to the procedure at training time which aims to make the resulting model robust to adversarial examples. It generally differs from the (extensive) attack mechanism performed at evaluation time to assess the model’s robustness due to computational constraints.
Considered attacks First-order methods such as projected gradient descent, which have proven to be most effective for $\ell_p$ transformations, are not optimal for finding adversarial examples with respect to rotations and translations. In particular, our experiments confirm the observations reported in [11] that the most adversarial examples can be found through a grid search. For the grid search attack, the compact perturbation set is discretized to find the transformation resulting in a misclassification with the largest loss. In contrast to the case of $\ell_p$ adversarial examples, this method is computationally feasible for the 3-dimensional spatial parameters. We consider a default grid of 5 values per translation direction and 31 values for rotation, yielding 775 transformed examples that are evaluated for each image. We refer to the accuracy attained under this attack as grid accuracy. To check that the number of transformations in the grid is sufficient, we evaluated a finer grid of 7500 transformations for a subset of the experiments, summarized in Table 11, and found only minor reductions in accuracy compared to the coarser grid. Therefore, we chose the coarser grid for computational reasons.
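The grid search attack and the resulting grid accuracy can be sketched as follows. The evenly spaced, symmetric grid and the helper names (`make_grid`, `grid_attack`, `grid_accuracy`) are assumptions for illustration; `transform_fn`, `loss_fn` and `predict_fn` are placeholders for the image transformation, the per-example loss and the classifier.

```python
import itertools
import numpy as np

def make_grid(max_trans=3.0, max_rot=30.0, n_trans=5, n_rot=31):
    """All (dx, dy, angle) triples on an evenly spaced grid (spacing is an assumption)."""
    shifts = np.linspace(-max_trans, max_trans, n_trans)
    angles = np.linspace(-max_rot, max_rot, n_rot)
    return list(itertools.product(shifts, shifts, angles))

def grid_attack(loss_fn, transform_fn, image, label, grid):
    """Return the grid transformation with the largest loss, and that loss."""
    best_params, best_loss = None, -np.inf
    for params in grid:
        loss = loss_fn(transform_fn(image, *params), label)
        if loss > best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

def grid_accuracy(predict_fn, transform_fn, images, labels, grid):
    """Fraction of images classified correctly under every grid transformation."""
    robust = [
        all(predict_fn(transform_fn(x, *p)) == y for p in grid)
        for x, y in zip(images, labels)
    ]
    return float(np.mean(robust))
```

With the default arguments, `make_grid()` enumerates 5 × 5 × 31 = 775 transformations, in line with the default grid described above. An image counts as robustly classified only if no transformation in the grid changes its prediction.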
Considered defenses For the adversarial example which maximizes either the loss or regularization function, we use the following defense mechanisms:

Spatial PGD: In analogy to common practice for adversarial training as in e.g. [40, 26], the SPGD mechanism uses projected gradient descent with respect to the translation and rotation parameters with projection on the constrained set of transformations. We consider 5 steps of PGD, starting from a random initialization, with separate step sizes (following [11]) for horizontal translation, vertical translation and rotation. A discussion on the discrepancy between SPGD as a defense and attack mechanism can be found in Section C.2.

Random: Data augmentation with a distinct random perturbation per image and iteration. This can be seen as the most naive “adversarial” example as it corresponds to worst-of-$k$ with $k = 1$.
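The random defense is described above as the $k = 1$ case of a worst-of-$k$ search. A minimal sketch of such a search is given below, assuming that worst-of-$k$ means drawing $k$ random transformations and keeping the one with the largest loss. Uniform sampling, the default ranges and the function name are assumptions of the sketch; `loss_fn` and `transform_fn` are placeholders.

```python
import numpy as np

def worst_of_k(loss_fn, transform_fn, image, label, k, rng,
               max_trans=3.0, max_rot=30.0):
    """Sample k random transformations and keep the one with the largest loss."""
    best_image, best_loss = None, -np.inf
    for _ in range(k):
        # Uniform sampling over the allowed ranges is an assumption.
        dx, dy = rng.uniform(-max_trans, max_trans, size=2)
        angle = rng.uniform(-max_rot, max_rot)
        candidate = transform_fn(image, dx, dy, angle)
        loss = loss_fn(candidate, label)
        if loss > best_loss:
            best_image, best_loss = candidate, loss
    return best_image, best_loss
```

Calling it with `k=1` reduces to plain random data augmentation, while larger `k` trades additional forward passes for a stronger training-time adversary.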
3.4 Training details
The experiments are conducted with deep neural networks as the function space, and the loss is the cross-entropy loss. In the main paper we consider the datasets SVHN [32] and CIFAR-10 [22]. For the non-specialized architectures, we train a ResNet-32 [16], implemented in TensorFlow [1]. For the Transformer networks STN and ETN we use a 3-layer CNN as localization network according to the default settings in the provided code of both networks for SVHN and rotMNIST. For a subset of the experiments we also report results for CIFAR-100 [22] in the Appendix.
We train the baseline models with standard data augmentation: random left-right flips and random translations followed by normalization. Below we refer to the models trained in this fashion as “std”. For the models trained with one of the defenses described in Sec. 3.3, we only apply random left-right flipping since translations are part of the adversarial search. The special case of data augmentation (with translations and rotations, i.e. the defense “random”) without regularization is treated as a separate baseline.
For optimization of the empirical training loss, we run standard minibatch SGD with momentum and weight decay. The initial learning rate is divided by a constant factor after half and after three-quarters of the training steps. Independent of the defense method, we fix the number of iterations, with one value for SVHN and CIFAR-10 and another for CIFAR-100. For comparability across all methods, the number of unique original images in each iteration is the same in all cases. For the baselines and adversarial training, we additionally trained with a conventional batch size and report the higher accuracy of both versions. For the regularized methods, the value of the regularization parameter is chosen based on the test grid accuracy. All models are trained using a single GPU on a node equipped with an NVIDIA GeForce GTX 1080 Ti and two 10-core Xeon E5-2630 v4 processors.
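The step-wise learning rate schedule described above can be sketched as follows. The base learning rate, decay factor and number of steps in the usage lines are hypothetical placeholders rather than the values used in the experiments, and applying the division cumulatively at both milestones is an assumption.

```python
def learning_rate(step, total_steps, base_lr, decay_factor):
    """Piecewise-constant schedule with decays at 50% and 75% of training."""
    if step >= 0.75 * total_steps:
        return base_lr / decay_factor ** 2  # after the second decay
    if step >= 0.5 * total_steps:
        return base_lr / decay_factor       # after the first decay
    return base_lr

# Hypothetical values, for illustration only.
schedule = [learning_rate(s, total_steps=100, base_lr=0.1, decay_factor=10.0)
            for s in (0, 49, 50, 74, 75, 99)]
```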
4 Empirical Results
We now compare the natural test accuracy (standard accuracy on the test set, abbreviated as nat) and test grid accuracy (as defined in Sec. 3.3, abbreviated as rob) achieved by standard and regularized (adversarial) training techniques as well as specialized spatial equivariant architectures described in Sec. 3.1. For clarity of presentation, the naming convention we use in the rest of the paper consists of the following components: (a) Reg: refers to what regularizer was used (AT, ALP, $\ell_2$, KL, or KLC as defined in Section 2.4); (b) batch: indicates whether the gradient of the loss is taken with respect to the adversarial examples (rob), natural examples (nat) or both (mix); and (c) def: the mechanism used to find the adversarial example, including random (rnd), worst-of-$k$ (Wo) and spatial PGD (SPGD) as described in Sec. 3.3. Thus, Reg(batch, def) corresponds to using Reg as the regularization function, the examples defined by batch in the gradient of the loss and the defense mechanism def to find the augmented or adversarial examples.
In Table 1, we report results for a subset of the Reg(batch, def) combinations to facilitate comparisons. Tables with many more combinations can be found in Tables 4–9 in the Appendix. We report averages (standard errors are contained in Tables 4–9) computed over five training runs with identical hyperparameter settings. We compare all methods by computing absolute and relative error reductions. It is insightful to present both numbers since the absolute values vary drastically between datasets.
Table 1
             std
SVHN  (nat)  95.48  93.97  96.03  96.13  96.53  96.30  96.14  96.11
      (rob)  18.85  82.60  90.35  92.71  92.55  92.04  92.42  92.32
CIFAR (nat)  92.11  89.93  91.76  90.41  90.53  90.11  89.98  89.85
      (rob)   9.52  58.29  71.17  77.47  77.06  75.9   78.93  77.80

             GRN           ETN           STN
SVHN  (nat)  96.07  95.05  95.53  95.57  95.61  95.55
      (rob)  25.12  84.9   13.15  84.21  36.68  79.28
CIFAR (nat)  93.39  93.08  –      –      –      –
      (rob)  16.85  71.64  –      –      –      –
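A common convention for the relative error reduction, assumed here, divides the absolute reduction in error by the error of the baseline. The short sketch below applies it to two grid accuracies taken from the CIFAR (rob) row of Table 1; the function name is hypothetical.

```python
def error_reductions(acc_baseline, acc_method):
    """Absolute and relative error reduction for accuracies given in percent."""
    err_base = 100.0 - acc_baseline
    err_new = 100.0 - acc_method
    absolute = err_base - err_new
    relative = absolute / err_base  # assumed definition: fraction of baseline error removed
    return absolute, relative

# Two entries from the CIFAR (rob) row of Table 1.
abs_red, rel_red = error_reductions(71.17, 77.47)
```

Under this convention the same absolute gain counts for more when the baseline error is already small, which is why reporting both numbers is informative across datasets.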
Effectiveness of augmentation-based invariance-inducing regularization In Table 1 (top), the three leftmost columns represent unregularized methods, which all perform worse in grid accuracy than regularized methods, and the two rightmost columns represent adversarial examples with respect to the classification cross-entropy loss found via SPGD. When considering the three regularizers (KL, $\ell_2$, ALP) with the same batch and def (here chosen to be “rob” and Wo), regularized adversarial training improves the grid accuracy on both CIFAR-10 and SVHN compared to unregularized adversarial training. The same can be observed when comparing data augmentation and its regularized variants in Table 2. Together with Tables 5 and 6, SPGD seems to be the more efficient defense mechanism compared to worst-of-$k$, even when $k$ is increased, with comparable computation time.
Computational considerations In Figure 2, we plot the grid accuracy vs. the runtime (in hours) for a subset of regularizers and defense mechanisms on CIFAR-10 for clarity of presentation. How much overhead is needed to obtain the reported gains? Comparing AT(rob, Wo) (green line) and ALP(rob, Wo) (red line) shows that significant improvements in grid accuracy can be achieved by regularization with only a small computational overhead. What if we make the defense stronger? While the leap in robust accuracy from worst-of-1 (also referred to as rnd) to worst-of-$k$ with a larger $k$ is quite large, increasing $k$ to 20 only gives diminishing returns while requiring more training time. This observation is summarized for both the KL and ALP regularizers on CIFAR-10 in Table 7. Furthermore, for any fixed training time, regularized methods exhibit higher robust accuracies, where the gap varies with the particular choice of regularizer and defense mechanism.
Comparison with spatial equivariant networks Although the rotation-augmented G-ResNet44 obtains higher grid (SVHN: 84.9, CIFAR-10: 71.64) and natural accuracies (SVHN: 95.05, CIFAR-10: 93.08) than the rotation-augmented ResNet-32 on both SVHN (grid: 82.60, nat: 93.97) and CIFAR-10 (grid: 58.29, nat: 89.93), regularizing standard data augmentation (i.e. regularizers with “rnd”, see Table 2 (right)) using both the $\ell_2$ distance and the KL divergence matches the G-ResNet44 on CIFAR-10 and surpasses it on SVHN in both grid and natural accuracy. The same phenomenon is observed for the augmented ETN and STN on SVHN. (Footnote 3: We had difficulties training both ETN and STN to a satisfactory natural accuracy on CIFAR-10 even after an extensive learning rate and schedule search, so we do not report the numbers here.) In conclusion, regularized augmentation-based methods match or outperform representative end-to-end networks handcrafted to be equivariant to spatial transformations.
Tradeoff natural vs. adversarial accuracy SVHN is one of the main datasets (without artificial augmentation like in rotMNIST [25]) where spatial equivariant networks have reported improvements on natural accuracy. This is due to the inherent orientation variance in the data. In our mathematical framework, this corresponds to the assumption in Theorem 2 that the distribution on each transformation set has the whole transformation set as its support. Furthermore, as all numbers in SVHN have the same label irrespective of small rotations of at most 30 degrees, the first assumption in Theorem 2 is also fulfilled. Tables 1 and 2 confirm the statement in the Theorem that improving robust accuracy need not hurt natural accuracy and may even improve it: for SVHN, adding regularization to samples obtained either via Wo adversarial search or via random transformation (rnd) consistently helps not only robust but also standard accuracy.
Comparing the effects of different regularization parameters on test grid accuracy We study Tables 1 and 2 and attempt to disentangle the effects by varying only one parameter. For example we can observe that, computational cost aside, when fixing the defense to Wo for any regularizer, the robust regularized loss Reg(rob, Wo) does better (or not statistically significantly worse) than Reg(nat, Wo). Furthermore, the KL regularizer generally performs better than $\ell_2$ for a large number of settings. A possible explanation for the latter could be that the KL divergence upper bounds the squared $\ell_2$ loss on the probability simplex and is hence more restrictive.
Choice of the regularization parameter The different regularization methods peak at different values of the regularization parameter in terms of grid accuracy. However, they outperform unregularized methods over a large range of values, suggesting that well-performing values are not difficult to find in practice. This can be seen in Figures 4 and 5 in the Appendix.
There are many more interesting experiments we have conducted for subsets of the defenses and datasets illustrating different phenomena that we observe. For example we have analyzed a finer grid for the grid search attack and evaluated SPGD as an attack mechanism. A detailed discussion of these experiments can be found in Sec. C.2.
5 Related work
Group equivariant networks There are in general two types of approaches to incorporate spatial invariance into the network. In one of the earlier works in the neural net era, Spatial Transformer Networks were introduced [19], which include a localization module that predicts transformation parameters followed by a transformer. Later on, one line of work proposed multiple filters that are discrete group transformations of each other [24, 27, 8, 51, 45]. For continuous transformations, steerability [43, 9] and coordinate transformation [12, 41] based approaches have been suggested. Although these approaches have resulted in improved standard accuracy, it has not been rigorously studied whether or by how much they improve upon regular networks with respect to robust test accuracy.
Regularized training Using penalty regularization to encourage robustness and invariance when training neural networks has been studied in different contexts: for distributional robustness [17], domain generalization [30], adversarial training [31, 21, 49], robustness against simple transformations [7] [50, 46]. These approaches are based on augmenting the training data either statically [17, 30, 7, 46], i.e. before fitting the model, or adaptively in the sense of adversarial training, with different augmented examples per training image generated in every iteration [21, 31, 49].
Robustness against simple transformations Approaches targeting adversarial accuracy for simple transformations have used attacks and defenses in the spirit of PGD (either on transformation space [11] or on input space projecting to the transformation manifold [20]) and simple random or grid search [11, 34]. Recent work [10] has also evaluated some rotation-equivariant networks with different training and attack settings, which reduces direct comparability with e.g. adversarial training based defenses [11].
6 Conclusion
In this work, we have explored how regularized augmentation-based methods compare against specialized spatial equivariant networks in terms of robustness against small translations and rotations. Strikingly, even though augmentation can be applied to encourage any desired invariance, the regularized methods adapt well and perform similarly to or better than specialized networks. Furthermore, we have introduced a theoretical framework incorporating many forms of regularization techniques that have been proposed in the literature. Both theoretically and empirically, we showed that for transformation invariances and under certain practical assumptions on the distribution, there is no tradeoff between natural and adversarial accuracy, which stands in contrast to the debate around $\ell_p$ perturbation sets. In summary, it is advantageous to replace unregularized with regularized training for both augmentation and adversarial defense methods. With regard to the choice of the regularization parameter, we have seen that improvements can be obtained for a large range of values, indicating that this additional hyperparameter is not difficult to tune in practice. In future work, we aim to explore whether specialized architectures can be combined with regularized adversarial training to improve upon the best results reported in this work.
7 Acknowledgements
We thank Ludwig Schmidt for helpful discussion, Nicolai Meinshausen for valuable feedback on the manuscript and Luzius Brogli for initial experiments. FY was supported by the Institute for Theoretical Studies ETH Zurich, the Dr. Max Rössler and Walter Haefner Foundation and the Office of Naval Research Young Investigator Award N000141912288.
References
[1] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[2] Michael A Alcorn, Qi Li, Zhitao Gong, Chengfei Wang, Long Mai, Wei-Shinn Ku, and Anh Nguyen. Strike (with) a pose: Neural networks are easily fooled by strange poses of familiar objects. arXiv preprint arXiv:1811.11553, 2018.
[3] Henry S Baird. Document image defect models. In Structured Document Image Analysis, pages 546–556. Springer, 1992.
[4] Andrew R Barron. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Trans. Info. Theory, 39(3):930–945, 1993.
[5] Aharon Ben-Tal, Laurent El Ghaoui, and Arkadi Nemirovski. Robust optimization, volume 28. Princeton University Press, 2009.
[6] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In Proceedings of the IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.
[7] Gong Cheng, Junwei Han, Peicheng Zhou, and Dong Xu. Learning rotation-invariant and Fisher discriminative convolutional neural networks for object detection. IEEE Transactions on Image Processing, 28(1):265–278, 2019.
[8] Taco Cohen and Max Welling. Group equivariant convolutional networks. In Proceedings of the International Conference on Machine Learning, pages 2990–2999, 2016.
[9] Taco S Cohen, Mario Geiger, Jonas Köhler, and Max Welling. Spherical CNNs. In Proceedings of the International Conference on Learning Representations, 2018.
[10] Beranger Dumont, Simona Maggio, and Pablo Montalvo. Robustness of rotation-equivariant networks to adversarial perturbations. arXiv preprint arXiv:1802.06627, 2018.
[11] Logan Engstrom, Brandon Tran, Dimitris Tsipras, Ludwig Schmidt, and Aleksander Madry. Exploring the landscape of spatial robustness. In Proceedings of the International Conference on Machine Learning, 2019.
[12] Carlos Esteves, Christine Allen-Blanchette, Xiaowei Zhou, and Kostas Daniilidis. Polar transformer networks. In Proceedings of the International Conference on Learning Representations, 2018.
[13] A. Fawzi and P. Frossard. Manitest: Are classifiers really invariant? In British Machine Vision Conference (BMVC), 2015.
[14] Robert Geirhos, Carlos RM Temme, Jonas Rauber, Heiko H Schütt, Matthias Bethge, and Felix A Wichmann. Generalisation in humans and deep neural networks. In Advances in Neural Information Processing Systems, pages 7549–7561, 2018.
[15] Boris Hanin. Universal function approximation by deep neural nets with bounded width and ReLU activations. arXiv preprint arXiv:1708.02691, 2017.
[16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[17] Christina Heinze-Deml and Nicolai Meinshausen. Conditional variance penalties and domain shift robustness. arXiv preprint arXiv:1710.11469, 2017.
[18] Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359–366, 1989.
[19] Max Jaderberg, Karen Simonyan, Andrew Zisserman, et al. Spatial Transformer Networks. In Advances in Neural Information Processing Systems, pages 2017–2025, 2015.
[20] Can Kanbak, Seyed-Mohsen Moosavi-Dezfooli, and Pascal Frossard. Geometric robustness of deep networks: analysis and improvement. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4441–4449, 2018.
[21] Harini Kannan, Alexey Kurakin, and Ian Goodfellow. Adversarial Logit Pairing. arXiv preprint arXiv:1803.06373, 2018.
[22] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical Report 4, University of Toronto, 2009.
[23] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
[24] Dmitry Laptev, Nikolay Savinov, Joachim M Buhmann, and Marc Pollefeys. TI-POOLING: Transformation-invariant pooling for feature learning in convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 289–297, 2016.
[25] Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning, pages 473–480, 2007.
[26] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In Proceedings of the International Conference on Learning Representations, 2018.
[27] Diego Marcos, Michele Volpi, Nikos Komodakis, and Devis Tuia. Rotation equivariant vector field networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 5058–5067, 2017.
[28] Matthew Mirman, Timon Gehr, and Martin Vechev. Differentiable abstract interpretation for provably robust neural networks. In Proceedings of the International Conference on Machine Learning, pages 3575–3583, 2018.
[29] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. Deepfool: A simple and accurate method to fool deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2574–2582, 2016.
[30] Saeid Motiian, Marco Piccirilli, Donald A Adjeroh, and Gianfranco Doretto. Unified deep supervised domain adaptation and generalization. In Proceedings of the IEEE International Conference on Computer Vision, volume 2, page 3, 2017.
[31] Taesik Na, Jong Hwan Ko, and Saibal Mukhopadhyay. Cascade adversarial machine learning regularized with a unified embedding. In Proceedings of the International Conference on Learning Representations, 2018.
[32] Y. Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In NIPS workshop on Deep Learning and Unsupervised Feature Learning, page 5, 2011.
[33] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In Proceedings of the ACM Asia Conference on Computer and Communications Security, pages 506–519. ACM, 2017.
[34] Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. Towards practical verification of machine learning: The case of computer vision systems. arXiv preprint arXiv:1712.01785, 2017.
[35] Aditi Raghunathan, Jacob Steinhardt, and Percy Liang. Certified defenses against adversarial examples. In Proceedings of the International Conference on Learning Representations, 2018.
[36] Aditi Raghunathan, Sang Michael Xie, Fanny Yang, John C. Duchi, and Percy Liang. Adversarial training can hurt generalization. arXiv preprint arXiv:1906.06032, 2019.
[37] Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. In Proceedings of the International Conference on Learning Representations, 2018.
[38] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations, 2015.
[39] Aman Sinha, Hongseok Namkoong, and John Duchi. Certifiable distributional robustness with principled adversarial training. In Proceedings of the International Conference on Learning Representations, 2018.
[40] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In Proceedings of the International Conference on Learning Representations, 2014.
[41] Kai Sheng Tai, Peter Bailis, and Gregory Valiant. Equivariant Transformer Networks. In Proceedings of the International Conference on Machine Learning, 2019.
[42] Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, and Aleksander Madry. Robustness may be at odds with accuracy. In Proceedings of the International Conference on Learning Representations, 2019.
[43] Maurice Weiler, Fred A Hamprecht, and Martin Storath. Learning steerable filters for rotation equivariant CNNs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[44] Eric Wong and Zico Kolter. Provable defenses against adversarial examples via the convex outer adversarial polytope. In Proceedings of the International Conference on Machine Learning, pages 5283–5292, 2018.
[45] Daniel E Worrall, Stephan J Garbin, Daniyar Turmukhambetov, and Gabriel J Brostow. Harmonic networks: Deep translation and rotation equivariance. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5028–5037, 2017.
[46] Qizhe Xie, Zihang Dai, Eduard Hovy, Minh-Thang Luong, and Quoc V. Le. Unsupervised data augmentation. arXiv preprint arXiv:1904.12848, 2019.
[47] Larry S. Yaeger, Richard F. Lyon, and Brandyn J. Webb. Effective training of a neural network character classifier for word recognition. In Advances in Neural Information Processing Systems, pages 807–816, 1997.
[48] Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, and Oriol Vinyals. Understanding deep learning requires rethinking generalization. In Proceedings of the International Conference on Learning Representations, 2015.
[49] Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, and Michael I. Jordan. Theoretically principled trade-off between robustness and accuracy. In Proceedings of the International Conference on Machine Learning, 2019.
[50] Stephan Zheng, Yang Song, Thomas Leung, and Ian Goodfellow. Improving the robustness of deep neural networks via stability training. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4480–4488, 2016.
[51] Yanzhao Zhou, Qixiang Ye, Qiang Qiu, and Jianbin Jiao. Oriented response networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 519–528, 2017.
Appendix A Appendix
a.1 Rigorous definition of transformation sets and choice of
In the following we introduce the concepts that are needed to rigorously define transformation sets that are subsets of the finite-dimensional (sampled) image space . In particular, because rotations by continuous angles are not well-defined for sampled images we need to introduce the space of image functions with elements , i.e. maps Euclidean coordinates in to the RGB intensities of an image. The observed finite-dimensional vector is then a sampled version of an image function . Here we assume that the sampling operator is bijective, with rigorous definitions later in the section.
Next we define subsets in the continuous function space and then transfer the concept back to the finite-dimensional . Let us define the symmetric group of all rotations and horizontal and vertical translations acting on . We denote the elements in the group by , which are uniquely parameterized by and can be represented by a coordinate transform matrix , see e.g. [8]. Two of the three dimensions represent the values for the translations and the third represents the rotation.
The transformed image (function) can be expressed by for each where is the coordinate transform matrix associated with as in [8]. For each , the group orbit is . By definition, the group orbits partition the space and every belongs to a unique orbit.
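As a concrete illustration of the three-parameter representation, the following minimal sketch builds a homogeneous coordinate transform matrix from two translation values and one rotation angle and composes two such matrices. The function names and the rotate-then-translate convention are assumptions made for illustration only; the convention used in [8] may differ (for instance, in whether the matrix or its inverse acts on the coordinates).

```python
import math


def transform_matrix(dx, dy, angle_deg):
    """3x3 homogeneous matrix: rotate about the origin, then translate by (dx, dy).

    Hypothetical helper; the ordering of rotation and translation is an assumption.
    """
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, dx],
            [s, c, dy],
            [0.0, 0.0, 1.0]]


def apply_to_point(m, x, y):
    """Apply the homogeneous matrix m to the Euclidean coordinate (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


def compose(m1, m2):
    """Matrix product m1 * m2, i.e. apply m2 first and m1 second."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]
```

Composing a rotation with its inverse returns the identity matrix, which reflects the group structure that the orbit definition relies on.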
Subsets of orbits
In our setting, requiring invariance in the entire orbit (i.e. with respect to all translations and rotations) is too restrictive. First of all, large transformations rarely occur in nature because of physical laws and common human perspectives (an upside down tower for example). Secondly, in image classification, robustness is usually only required against adversarial attacks which would not fool humans, i.e. lead them to mislabel the image. If the transformation set is too large, this requirement is no longer fulfilled. For this purpose we consider a closed subset of each group orbit. It follows from the group orbit definition that for every it either belongs to one unique or no such set.
As described in the paragraph of Equation (4), when observing a (sampled) image in the training set, we do not know where in its corresponding subset it lies. At the same time, for our augmentation-based methods, we do not want the set of transformations that we search over (transformation search set for short) to be image dependent. Instead, in this construction we aim to find to be the smallest set of transformations such that (4) is satisfied. For this purpose, it suffices that the effective search set of images for any image covers the corresponding subset for all , i.e.
Here we give an explicit construction of using the maximal transformation for each subset that is needed to transform an image of the subset to another. In particular, we define the maximal transformation vector by the element-wise maximum over all such maximum transformations for . Although the subsets themselves for each image are not known, using prior knowledge in each application one can usually estimate the largest possible range of transformations against which robustness is desired or required. For example, for images one could use experiments with humans to determine for which range of angles their reaction time to correctly label each image stays approximately constant. The maximal vector can now be used to determine the minimal set of transformations . A simplified illustration for when consists of just one orbit (corresponding for example to one image function and all its rotated variants) can be found in Figure 3.
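The element-wise maximum described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the construction used in the paper: each subset is summarized by a hypothetical triple (horizontal shift, vertical shift, rotation angle) of its largest required transformation, and the resulting search set is assumed to be the symmetric box spanned by the maximal vector.

```python
def maximal_transformation(per_subset_maxima):
    """Element-wise maximum over per-subset (dx, dy, angle) magnitudes."""
    return tuple(max(abs(t[i]) for t in per_subset_maxima) for i in range(3))


def in_search_set(t, t_max):
    """Membership test for the (assumed) symmetric box [-t_max, t_max]."""
    return all(abs(t[i]) <= t_max[i] for i in range(3))


# Hypothetical per-subset maxima: (pixels, pixels, degrees).
maxima = [(2.0, 1.0, 10.0), (3.0, 0.5, 25.0), (1.0, 3.0, 30.0)]
t_max = maximal_transformation(maxima)
```

Because the box is defined by the largest value in every coordinate, the same image-independent search set covers every subset, which is the property the construction is after.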
Sampling issues
In reality, the observed image is not a function on but a vector that is the result of sampling an image function . We use to denote the sampling operator and hence . Then the space of observed finite-dimensional images is the range space of . In order to counter the problem that the sampling operator is in general not injective, we add another constraint to by requiring that is bijective so that the quantity is well-defined. That is, for a finite-dimensional image , there exists exactly one possible continuous image . As a consequence, if and a transformed version exist in , then . This is a rather technical assumption that is typically fulfilled in practice. In the main text, we also refer to as the image corresponding to the sampled image transformed by the group element .
We can now define specific to be the subsets of such that with , the set corresponds to the support of the marginal distribution on . We refer to as transformation sets. By definition of and bijectivity of , there is an injective mapping from any to the set of transformation sets .
a.2 Proof of Theorem 1
Please refer to Section A.1 for the necessary notation for this section. Furthermore, define .
We prove the first statement of the theorem by contradiction. Let be the minimizer of and let us assume that and in particular that it is constant on all transformation sets except and the marginal distribution over that can be defined as for any , is discrete (for simplicity of presentation) and has nonzero probability.
Let us assume that there is at least one transformation set , on which is not constant, and collect all different values in the set (with cardinality strictly bigger than 1, since the function is not constant there) and denote the distribution over by . Since there is a unique mapping that maps each to a unique transformation (see Section A.1), we can lower bound the robust loss as follows for any :
(5) 
where the inequality follows from
The right-hand side is minimized with respect to the set by choosing where is defined as because setting for all and else leads to equality in equation (5) and by assumption that . Moreover, since by assumption, choosing for all implies which contradicts optimality of and thus proves the first statement of the theorem.
For the second statement let us rewrite
By the first statement we know that the set of invariant functions that minimize the robust loss
is nonempty. For all , it holds by definition of that .
a.3 Proof of Theorem 2
On a high level, similar to the proof of Theorem 1, we can construct a minimizer of the natural loss given the assumption that . Since on both losses are equivalent, together with Theorem 1 this shows that the robust minimizer also minimizes the unconstrained natural loss.
Assume minimizes , and in particular, it is constant on all transformation sets except for some . Again by existence of a mapping and by assumption we can write for any
(6)  
We then obtain
(7) 
when setting for all and otherwise. Together with equation (6), we thus have that for all by definition of .
If additionally the support of is equal to and is injective, the inequality (7) becomes a strict inequality for and hence we have which contradicts the definition of being the minimizer of the natural loss.
Appendix B Two-stage STN
Since STNs are known to be sensitive to hyperparameter settings and thus difficult to train end-to-end [41], we apply the following two-stage procedure to simulate their functionality: (1) we first train a ResNet32 as a localization regression network (LocNet) to predict the attack perturbation separately, learning from a training set that contains perturbed images and uses the transformations as prediction targets; (2) at the same time we train a ResNet32 classifier with data augmentation, namely random translations and rotations; (3) during the test phase, the output of the LocNet is used by a spatial transformer module that transforms the image before it enters the pretrained classifier. We refer to this two-stage STN as STN+.
LocNet and Classifier
For the classifiers, we take the two models trained on CIFAR10 and SVHN using standard data augmentation and random rotations from our previous experiments. Since we do not expect the regressors (or LocNets) to be perfect in terms of prediction capability, there will still be some transformation left after the regression stage. Thus, the classifiers should effectively see a smaller range of transformations than without the inclusion of a LocNet and transformer module. The training procedure used to train the classifiers is described in Section 3.4.
Effect of rendering edges on LocNet
The LocNet is trained on zero-padded inputs (suffix (c)) as well as reflect-padded inputs (suffix (r)) for comparison. The former possibly yields an unfair advantage for this approach compared to other methods, as the neural network can exploit the edges (induced by zero padding) to learn the transformation parameters. Therefore, we also consider reflection padding to assess the effect of the padding on final performance. Nonetheless, zero padding is consistent with the augmentation setting for the end-to-end trained networks and regularized methods and was also the choice considered by [11]. For completeness we also show results when using reflection padding for training the LocNet, although this setting lacks comparability with the other methods since the attacks would have to be reflection-padded as well.
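The difference between the two padding modes can be seen on a single image row. The sketch below is illustrative only: the helper names are hypothetical, and the reflect convention used here (mirroring without repeating the border pixel) is an assumption about how reflection padding is implemented.

```python
def pad_zero(row, k):
    """Zero (constant) padding: adds k zeros on both sides of the row."""
    return [0] * k + list(row) + [0] * k


def pad_reflect(row, k):
    """Reflection padding: mirrors the row about its border pixels.

    Assumes k < len(row); the border pixel itself is not repeated.
    """
    left = [row[i] for i in range(k, 0, -1)]
    right = [row[-i - 1] for i in range(1, k + 1)]
    return left + list(row) + right


row = [5, 6, 7, 8]
zero_padded = pad_zero(row, 2)        # sharp jump from 0 to 5 marks the image border
reflect_padded = pad_reflect(row, 2)  # values continue smoothly across the border
```

With zero padding, the abrupt jump at the border moves with the transformation, which is the kind of edge cue a regressor could exploit; reflection padding removes that jump.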
Minimizing loss of information in the prediction transformation process
In the spatial transformer module we compare two variants of handling the labels predicted by the LocNet. We can either back-transform the transformed image with the negative predicted labels, which will, under the assumption that the regressor successfully learnt object orientations, turn back the image but potentially result in extra padding space before we feed the images into the classifier. Alternatively, we can subtract the predicted transformation from the attack transformation and then use the remaining transformation as the new “attack transformation”. The latter will result in much smaller padding areas if the LocNet is performing well. The experimental results show a large drop in accuracy if we naively transform images twice. We denote the former method as “naive” and the latter as “trick”.
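The two variants differ in how many times the image is resampled. The following sketch only tracks the transformation parameters and is an illustration under assumptions: transformations are hypothetical (dx, dy, angle) triples, the function names are invented, and inverting a transformation is approximated by negating its parameters, as the text describes.

```python
def naive_correction(attack, predicted):
    """'naive': apply the attack, then a second transform with negated predictions.

    Returns the list of transformations applied in sequence to the clean image,
    so the image is resampled (and padded) twice.
    """
    undo = tuple(-p for p in predicted)
    return [attack, undo]


def trick_correction(attack, predicted):
    """'trick': apply only the residual between attack and prediction.

    The clean image is resampled once, with a small transformation whenever
    the prediction is close to the attack.
    """
    residual = tuple(a - p for a, p in zip(attack, predicted))
    return [residual]


attack = (3.0, -2.0, 30.0)     # hypothetical attack parameters
predicted = (2.5, -1.5, 27.0)  # hypothetical LocNet output
```

Note that the residual variant needs access to the attack parameters, which are available in an evaluation setting such as this one where the attack is generated by the experimenter.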
Observed results
For CIFAR10, this two-stage classifier achieved relatively high grid accuracies. However, the obtained accuracies are still lower than expected, given that the LocNet is allowed to learn rotations with a separately trained regressor on the transformed training set. For SVHN we also see a gain compared to adversarial training without regularizer. However, the performance still lags behind the accuracies obtained by the regularizers. The results are summarized in Table 3.
Dataset  STN+(c) trick  STN+(r) trick  STN+(c) naive  STN+(r) naive 
SVHN (nat)  94.92  95.51  94.92  95.51 
(rob)  90.95  90.28  64.91  59.68 
CIFAR10 (nat)  91.29  90.99  91.29  90.99 
(rob)  83.05  84.31  44.88  42.84 
Appendix C More experimental results
In this section we discuss additional experimental results that we collected and analyzed.
c.1 Stability to selection of regularization parameter
[Figures omitted: panels for SVHN and CIFAR10.]
c.2 Additional experimental results
std  std*  AT(rob, Wo10)  AT(mix, Wo)  
SVHN (nat)  95.48 (0.15)  93.97 (0.09)  96.03 (0.03)  96.56 (0.07) 
(rob)  18.85 (1.27)  82.60 (0.23)  90.35 (0.27)  88.83 (0.10) 
CIFAR10 (nat)  92.11 (0.18)  89.93 (0.18)  91.76 (0.23)  93.44 (0.19) 
(rob)  9.52 (0.66)  58.29 (0.60)  71.17 (0.26)  68.14 (0.48) 
CIFAR100 (nat)  70.23 (0.18)  66.62 (0.37)  68.79 (0.34)  73.03 (0.13) 
(rob)  5.09 (0.25)  28.53 (0.25)  38.21 (0.10)  35.93 (0.24) 

Mixed batch experiments
In addition to the results reported in the main text, in this section we also report results on more experiments that use the “mixed batch” setting, meaning that the gradient of the loss is taken with respect to both the adversarial and natural examples. This is common practice in the adversarial example literature [21] and we denote this approach by “mix”. As can be seen in Table 4, for adversarial training a mixed batch improves natural accuracy at the expense of test grid performance. For the regularization methods, we observe a much smaller, and not consistent, effect of the batch type, as can be seen in Table 6. For example, comparing ALP(rob, ) vs. ALP(mix, ) shows that the performance differences are mostly not significant.
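The distinction between the “rob” and “mix” batch types can be sketched as follows. This is a minimal illustration, not the training code used for the experiments: the function names are hypothetical, the classifier outputs are given directly as probability vectors, and the equal weighting of natural and adversarial examples in the mixed batch is an assumption.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the correct label."""
    return -math.log(probs[label])


def batch_loss(nat_probs, adv_probs, labels, mode="mix"):
    """Average loss over a batch.

    mode="rob": only adversarial examples contribute.
    mode="mix": adversarial and natural examples contribute equally (assumed).
    """
    losses = [cross_entropy(p, y) for p, y in zip(adv_probs, labels)]
    if mode == "mix":
        losses += [cross_entropy(p, y) for p, y in zip(nat_probs, labels)]
    elif mode != "rob":
        raise ValueError("mode must be 'rob' or 'mix'")
    return sum(losses) / len(losses)


nat = [[0.5, 0.5]]    # predicted probabilities on the natural example
adv = [[0.25, 0.75]]  # predicted probabilities on its adversarial version
rob_loss = batch_loss(nat, adv, [0], mode="rob")
mix_loss = batch_loss(nat, adv, [0], mode="mix")
```

In the mixed setting the natural examples pull the average toward the natural loss, which is consistent with the observed gain in natural accuracy at some cost in grid accuracy for adversarial training.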






SVHN (nat)  96.16 (0.10)  96.00 (0.02)  96.13 (0.07)  96.14 (0.04)  96.54 (0.01)  
(rob)  90.69 (0.05)  92.27 (0.09)  92.71 (0.09)  92.42 (0.03)  92.62 (0.03)  
CIFAR10 (nat)  89.33 (0.16)  90.83 (0.18)  90.41 (0.05)  89.98 (0.21)  89.82 (0.13)  
(rob)  73.50 (0.19)  77.34 (0.19)  77.47 (0.28)  78.93 (0.23)  78.89 (0.07)  









SVHN (nat)  96.05 (0.04)  96.53 (0.03)  96.41 (0.07)  96.3 (0.09)  96.39 (0.04)  96.11 (0.08)  96.30 (0.09)  
(rob)  92.16 (0.05)  92.55 (0.08)  92.17 (0.11)  92.04 (0.19)  92.48 (0.05)  92.32 (0.17)  92.42 (0.20)  
CIFAR10 (nat)  88.32 (0.13)  90.53 (0.16)  91.13 (0.13)  90.11 (0.25)  90.67 (0.12)  89.85 (0.27)  89.70 (0.10)  
(rob)  75.46 (0.25)  77.06 (0.16)  75.89 (0.23)  75.90 (0.31)  76.72 (0.21)  77.80 (0.17)  77.72 (0.35)  
CIFAR100 (nat)      68.54 (0.27)    68.04 (0.27)  89.82 (0.13)  68.44 (0.39)  
(rob)      49.30 (0.33)    49.98 (0.31)  78.89 (0.07)  52.58 (0.20)  








CIFAR10 (nat)  89.34 (0.16)  90.83 (0.18)  89.33 (0.22)  89.47 (0.04)  90.11 (0.25)  90.62 (0.07)  
(rob)  73.40 (0.19)  77.34 (0.19)  77.52 (0.16)  73.22 (0.14)  75.90 (0.31)  76.78 (0.15) 
std*  (nat, rnd)  KL(nat, rnd)  ALP(rob, rnd)  KL(rob, rnd)  ALP(mix, rnd)  
SVHN (nat)  93.97 (0.09)  96.34 (0.08)  96.16 (0.10)  96.09 (0.06)  96.23 (0.08)  96.19 (0.07) 
(rob)  82.60 (0.23)  90.51 (0.15)  90.69 (0.05)  90.48 (0.16)  90.92 (0.17)  90.48 (0.15) 
CIFAR10 (nat)  89.93 (0.18)  87.80 (0.11)  89.34 (0.16)  88.75 (0.18)  89.47 (0.04)  89.43 (0.28) 
(rob)  58.29 (0.60)  71.60 (0.27)  73.50 (0.19)  71.49 (0.30)  73.22 (0.14)  71.97 (0.11) 
ALP(nat, Wo)  (nat, Wo)  KLC(nat, Wo)  KL(nat, Wo)  
SVHN (nat)  96.39 (0.03)  96.05 (0.04)  96.18 (0.06)  96.00 (0.02) 
(rob)  91.98 (0.13)  92.16 (0.05)  91.99 (0.12)  92.27 (0.09) 
CIFAR10 (nat)  88.78 (0.11)  88.32 (0.13)  89.61 (0.09)  90.83 (0.18) 
(rob)  75.43 (0.13)  75.46 (0.25)  76.15 (0.23)  77.34 (0.19) 
Weakness of first order attack.
Table 10 shows the accuracies of various models trained with SPGD defenses and evaluated against the SPGD and the grid search attack on all datasets. We observe that the SPGD attack constitutes a very weak attack, since the associated accuracies are much larger than for the grid search attack. In other words, the SPGD attack only yields a very loose upper bound on the adversarial accuracy. This stands in stark contrast to attacks and was first noted and discussed in [11]. Interestingly, using the first order method as a defense mechanism proves to be very effective in terms of grid accuracy. When used in combination with a regularizer, this defense yields the largest overall accuracies, as shown and discussed in Section 4. Recall that for computational reasons grid search cannot be used as a defense mechanism. Therefore, the strongest computationally feasible defense does not use the same mechanism as the strongest attack in our setting.
SVHN (nat)  96.27 (0.00)  96.06 (0.10)  96.30 (0.09) 
(grid)  84.81 (0.01)  87.29 (0.09)  92.42 (0.20) 
(SPGD)  95.26 (0.04)  95.46 (0.10)  95.92 (0.13) 
CIFAR10 (nat)  92.19 (0.23)  91.83 (0.19)  89.70 (0.10) 
(grid)  64.26 (0.25)  69.74 (0.27)  77.72 (0.35) 
(SPGD)  88.84 (0.27)  89.87 (0.10)  88.15 (0.21) 
CIFAR100 (nat)  71.11 (0.37)  68.87 (0.19)  68.44 (0.39) 
(grid)  33.40 (0.21)  37.87 (0.12)  52.58 (0.20) 
(SPGD)  65.01 (0.32)  65.56 (0.12)  66.04 (0.40) 
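Why a weaker attack can only give an upper bound on the worst-case accuracy is easy to see in a toy setting. The sketch below is an illustration under assumptions: inputs are scalars, a "transformation" is an additive shift, and the classifier is a sign threshold; none of these stand in for the actual models or the SPGD attack.

```python
def worst_case_accuracy(examples, labels, predict, transform, candidates):
    """Fraction of examples classified correctly under every candidate transformation."""
    correct = 0
    for x, y in zip(examples, labels):
        if all(predict(transform(x, t)) == y for t in candidates):
            correct += 1
    return correct / len(examples)


# Toy stand-ins (all hypothetical).
predict = lambda x: int(x > 0)
transform = lambda x, t: x + t
examples, labels = [0.5, 2.5, -1.0], [1, 1, 0]

full_grid = [-1.0, -0.5, 0.0, 0.5, 1.0]  # exhaustive search over the grid
weak_attack = [0.0, 0.5]                 # an attack that explores fewer candidates

grid_acc = worst_case_accuracy(examples, labels, predict, transform, full_grid)
weak_acc = worst_case_accuracy(examples, labels, predict, transform, weak_attack)
```

Any attack that inspects only a subset of the grid can miss the failing transformation of an example, so its reported accuracy is never below the grid accuracy.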
Stronger grid search attack
To evaluate how much grid accuracy changes with a finer discretization of the perturbation set , we compare the default grid to a finer one for a subset of experiments, summarized in Table 11. Specifically, “(grid775)” shows the test grid accuracy using the default grid containing 5 values per translation direction and 31 values for rotation, yielding a total of 775 transformed examples that are evaluated for each . “(grid7500)” shows the test grid accuracy on a much finer grid with 10 values per translation direction and 75 values for rotation, resulting in 7500 transformed examples. We observe that the test grid accuracy only decreases slightly for the finer grid and that the reduction in accuracy is smaller for ALP than for AT. For computational reasons we use the grid containing 775 values for all other experiments.
SVHN (grid775)  88.83 (0.10)  89.75 (0.17)  92.17 (0.11) 
(grid7500)  88.02 (0.12)  89.29 (0.15)  91.79 (0.12) 
CIFAR10 (grid775)  68.14 (0.48)  70.35 (0.16)  75.89 (0.23) 
(grid7500)  65.69 (0.28)  68.28 (0.16)  74.58 (0.16) 
CIFAR100 (grid775)  35.93 (0.24)  38.21 (0.10)  49.30 (0.33) 
(grid7500)  33.62 (0.23)  36.04 (0.21)  47.95 (0.23) 
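The grid sizes quoted above follow from taking the Cartesian product of the per-parameter values (5 · 5 · 31 = 775 and 10 · 10 · 75 = 7500). A minimal sketch of such a grid is given below; the helper names are hypothetical, and the ranges of ±3 pixels and ±30 degrees as well as the evenly spaced values are assumptions made for illustration, not values taken from this section.

```python
from itertools import product


def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi (inclusive)."""
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def make_grid(max_shift, max_angle, n_shift, n_angle):
    """All (dx, dy, angle) combinations on an evenly spaced grid."""
    shifts = linspace(-max_shift, max_shift, n_shift)
    angles = linspace(-max_angle, max_angle, n_angle)
    return list(product(shifts, shifts, angles))


default_grid = make_grid(3.0, 30.0, n_shift=5, n_angle=31)
fine_grid = make_grid(3.0, 30.0, n_shift=10, n_angle=75)
```

Since every test image is evaluated on every grid point, the finer grid costs roughly ten times as many forward passes per image, which motivates using the 775-point grid elsewhere.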
c.3 Regularization effect on range of incorrect angles