Certified Defense via Latent Space Randomized Smoothing with Orthogonal Encoders

08/01/2021 ∙ by Huimin Zeng, et al. ∙ University of Maryland ∙ Technische Universität München

Randomized Smoothing (RS), one of the few provable defenses, has shown great effectiveness and scalability in defending against ℓ_2-norm adversarial perturbations. However, the Monte Carlo (MC) sampling that RS requires for evaluation is computationally expensive. To address this issue, we investigate performing randomized smoothing and establishing the robustness certification in the latent space of a network, so that the overall dimensionality of the tensors involved in the computation can be drastically reduced. To this end, we propose Latent Space Randomized Smoothing. In addition, we use orthogonal modules, whose Lipschitz property is known for free by design, to propagate the certified radius estimated in the latent space back to the input space, providing valid certifiable regions for the test samples in the input space. Experiments on CIFAR10 and ImageNet show that our method achieves competitive certified robustness with a significant improvement in efficiency during the test phase.




1 Introduction

Deep neural networks (DNNs) show impressive performance in many tasks, such as image recognition, language understanding and audio processing. However, it is also widely known that deep neural networks can be vulnerable to adversarially perturbed input examples [Szegedy et al., 2013]. Therefore, it is important to have strong defenses against such adversarial examples, especially in security-critical scenarios. Empirical defenses have been proposed to train robust classifiers Carlini and Wagner [2017], Kannan et al. [2018], Kurakin et al. [2016], Shaham et al. [2018]. Despite their success in defending against certain attacks, they offer no worst-case performance guarantees. Another line of work Raghunathan et al. [2018], Wong and Kolter [2017], called certified defense, provides provable robustness: a bounded perturbation of the input will not cause a large change in the output. However, existing certified defenses fail to scale to large datasets or to apply to arbitrary model types.

Input Space Randomized Smoothing (IS-RS) Cohen et al. [2019] provides certified robustness for a smoothed classifier, obtained by smoothing a base classifier with Gaussian augmentations in the input space. Compared with other certified defenses, IS-RS provides certified robustness that scales to large datasets and large network architectures. However, in this work, we argue that the efficiency of IS-RS can be further improved. There are three sources of inefficiency in IS-RS: (1) the depth of the network that the Gaussian noise has to propagate through (forward/backward); (2) the large number of Gaussian samples needed for a sufficiently accurate empirical estimate of the expectation over the Gaussian distribution. (1) and (2) are the most important factors. In addition, there is a minor issue: (3) the high dimensionality of the space in which we sample Gaussian noise. These three factors negatively affect the efficiency of randomized smoothing in training, inference and the final robustness evaluation. It is true that training and inference may not require many Gaussian samples per example, but over the entire training set and test set, the overall number of Gaussian noise samples needed is large. More importantly, certifying each data point requires a large number of Gaussian noise samples to maintain high confidence (usually over 100,000 for a single image). Therefore, it is crucial to increase the efficiency of IS-RS.

Motivated by these sources of inefficiency in IS-RS, we propose Latent Space Randomized Smoothing (LS-RS), which achieves efficient certified robustness for a classifier smoothed by injecting Gaussian noise in the (compact) latent space. In order to define the latent space of a network more precisely, we split a normal neural network $f$ into two sub-networks, as shown in Figure 1: an encoder $f_1$ and a classifier $f_2$. The latent space is the space that links the output feature space of $f_1$ and the input feature space of $f_2$. Under this setting, the forward pass of any given input $x$ is computed as $f_2(f_1(x))$, and the output of $f_1$, i.e. $f_1(x)$, is smoothed with Gaussian noise to obtain a smoothed function.

Figure 1: Split a network into two parts.

LS-RS is computationally more efficient than IS-RS for the following reasons. (1) LS-RS can be implemented at deeper layers, so the effective depth required for forward/backward propagation of the Gaussian-augmented samples can be significantly reduced. (2) LS-RS can be implemented on a compact latent space, with a representation size potentially much smaller than the size of the input layer, which mitigates the inefficiency caused by the high dimensionality of the space in which Gaussian noise is injected. In fact, LS-RS provides great flexibility in choosing the latent space, through which the dimensionality of the latent space can be easily controlled.

There are a few technical challenges in implementing an LS-RS algorithm. Challenge I: A certified robustness guarantee against adversarial perturbations in the input space does not exist for an LS-RS model. The reason is that the Gaussian noise is sampled in the latent space instead of the input space. Consequently, the robustness certification of randomized smoothing can only be established for the latent representation $f_1(x)$ instead of the input $x$, which is of little practical use, since adversarial perturbations are usually crafted in the input space. Challenge II: Even if a certified radius in the latent space is obtained, mapping it back to the input space requires a tight characterization of the Lipschitz constant of the sub-network (encoder) $f_1$, which maps the data in the input space to the latent space. Computing the Lipschitz constant of a network is usually computationally difficult. Challenge III: The Lipschitz constant of a linear layer is an upper bound on the singular values across all singular vectors (i.e., spectral components). An algorithm to compute the Lipschitz constant of the sub-network $f_1$ might require computing the SVD of convolutional layers, which is too expensive and thus defeats our purpose of improving the efficiency of IS-RS. More seriously, under bad condition numbers, i.e., when the singular values vary drastically across spectral components, the Lipschitz bound of the network might be too loose to yield a competitive certified radius in the input space.

In this paper, to solve the aforementioned challenges in implementing LS-RS, we propose to design the encoder as an orthogonal convolutional network Jia et al. [2019], Wen et al. [2020]. In other words, we design the encoder $f_1$ to be norm-preserving, so that $\|f_1(x) - f_1(x')\|_2 \le \|x - x'\|_2$ for any inputs $x$ and $x'$. By model design, the singular values of all spectral components are equal to 1. With this guaranteed “flatness” of the spectral components of the sub-network, we obtain the Lipschitz constant for free, without any additional computation, and our characterization of the Lipschitz constant is tight enough for a competitive certified radius in the input space, as verified in our experiments.

We established an equivalency between IS-RS and LS-RS in terms of certified accuracy under our framework.

Summary of contributions:

  1. We introduce a novel latent space randomized smoothing by augmenting the latent feature representation with i.i.d. Gaussian noise instead of the input images.

  2. Based on LS-RS, we significantly increase the efficiency of the randomized smoothing framework in several respects. For instance, the choice of the latent space is flexible, which can further reduce the forward/backward complexity. On the CIFAR10 dataset Krizhevsky et al. [2009], the average time used for certifying one single image is reduced from around 12 seconds to around 8 seconds, a 33.3% efficiency improvement, with only 5.93% degradation in average certified radius for $\sigma = 0.25$, 5.50% for $\sigma = 0.50$ and 1.50% for $\sigma = 1.00$. Similarly, on ImageNet Russakovsky et al. [2015], we also observe a significant improvement in test efficiency without sacrificing much accuracy.

  3. Finally, we point out a technical contribution: adopting orthogonal convolutional layers and the norm-preserving GroupSort non-linear activation to build the encoder sub-network. Without exhaustively and expensively computing the Lipschitz constant of the non-convex encoder, we are able to easily establish the equivalency between the certified radius in the input space and that in the latent space.

2 Methodology

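We start by reviewing the randomized smoothing framework. Consider a classification model $f: \mathbb{R}^d \to \mathcal{Y}$, mapping examples in an input space to a label in the label space. Then there exists a robustness guarantee for a “smoothed” version $g$ of the base classifier $f$. Formally, the smoothed classifier is defined as:

$$g(x) = \arg\max_{c \in \mathcal{Y}} \; \mathbb{P}_{\epsilon \sim \mathcal{N}(0, \sigma^2 I)}\big(f(x + \epsilon) = c\big). \tag{1}$$

In other words, given a test input $x$, a smoothed classifier augments it with isotropic Gaussian noise (parameterized by $\sigma$) and predicts the label that the majority of the augmented images receive after propagating through the base classifier $f$. If we denote the class output by the majority of the Gaussian-augmented images as

$$c_A = \arg\max_{c \in \mathcal{Y}} \mathbb{P}_{\epsilon}\big(f(x + \epsilon) = c\big), \qquad p_A = \mathbb{P}_{\epsilon}\big(f(x + \epsilon) = c_A\big),$$

and any other “runner-up” class as

$$c_B = \arg\max_{c \neq c_A} \mathbb{P}_{\epsilon}\big(f(x + \epsilon) = c\big), \qquad p_B = \mathbb{P}_{\epsilon}\big(f(x + \epsilon) = c_B\big),$$

then the result of the smoothed classifier under any perturbation $\delta$ of the input within a radius $R$ will be robust:

$$g(x + \delta) = c_A \quad \text{for all } \|\delta\|_2 < R, \tag{2}$$

where the certified radius $R$ depends on the base classifier $f$ and the Gaussian standard deviation $\sigma$:

$$R = \frac{\sigma}{2}\big(\Phi^{-1}(p_A) - \Phi^{-1}(p_B)\big), \tag{3}$$

where $\Phi^{-1}$ denotes the inverse of the standard Gaussian CDF.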
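The certified radius of randomized smoothing can be evaluated directly once the two class probabilities are known. The following is a minimal sketch, assuming the probabilities `p_a` and `p_b` have already been estimated (the function name and its arguments are illustrative and not part of the paper's code); it uses the Python standard library for the inverse Gaussian CDF.

```python
from statistics import NormalDist


def certified_radius(sigma: float, p_a: float, p_b: float) -> float:
    """Return sigma / 2 * (Phi^{-1}(p_a) - Phi^{-1}(p_b)).

    p_a is the probability of the majority class and p_b that of the
    runner-up class under Gaussian noise with standard deviation sigma.
    """
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    if not (0.0 < p_b <= p_a < 1.0):
        raise ValueError("need 0 < p_b <= p_a < 1")
    inv_cdf = NormalDist().inv_cdf  # inverse of the standard Gaussian CDF
    return 0.5 * sigma * (inv_cdf(p_a) - inv_cdf(p_b))
```

For example, `certified_radius(0.5, 0.9, 0.1)` evaluates the formula for a confident prediction, while equal probabilities `p_a == p_b` give a radius of zero. In practice the probabilities are only available as statistical bounds obtained from Monte Carlo samples, which is why certification is expensive.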

2.1 Latent Space Randomized Smoothing

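To implement LS-RS, we propose to split a neural network $f$ into two sub-networks $f_1$ and $f_2$, as shown in Figure 1. Sampling the parameterized Gaussian noise in the latent space to augment the latent feature representation requires only the second part of the network, $f_2$, to be “smoothed”. Formally, we define the “partially” smoothed classifier $\tilde{g}$ as follows:

$$\tilde{g}(x) = \arg\max_{c \in \mathcal{Y}} \; \mathbb{P}_{\epsilon \sim \mathcal{N}(0, \sigma^2 I)}\big(f_2(f_1(x) + \epsilon) = c\big). \tag{4}$$

For simplicity, we use $z$ to denote the unsmoothed representation in the latent space: $z = f_1(x)$. Therefore, Equation 4 can be re-formulated as

$$g_2(z) = \arg\max_{c \in \mathcal{Y}} \; \mathbb{P}_{\epsilon \sim \mathcal{N}(0, \sigma^2 I)}\big(f_2(z + \epsilon) = c\big). \tag{5}$$

Comparing Equation 5 with Equation 1, it is straightforward to derive the robustness guarantee: with

$$p_A = \mathbb{P}_{\epsilon}\big(f_2(z + \epsilon) = c_A\big), \qquad p_B = \max_{c \neq c_A} \mathbb{P}_{\epsilon}\big(f_2(z + \epsilon) = c\big), \tag{6}$$

it is true that

$$g_2(z + \delta_z) = c_A \quad \text{for all } \|\delta_z\|_2 < R, \tag{7}$$

$$R = \frac{\sigma}{2}\big(\Phi^{-1}(p_A) - \Phi^{-1}(p_B)\big). \tag{8}$$

Therefore, Equation 6 to Equation 8 provide a robustness guarantee for the latent representation $z$.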
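The practical consequence of smoothing only the second sub-network is that the encoder is evaluated once per input, while only the classifier is evaluated once per noise sample. The following is a minimal sketch of that prediction loop, assuming NumPy is available; `predict_latent_smoothed` and the two toy stand-ins (`toy_encoder`, `toy_classifier`) are hypothetical names used for illustration and are not the networks used in the paper.

```python
import numpy as np


def predict_latent_smoothed(encoder, classifier, x, sigma, n_samples, rng):
    """Majority vote of classifier(z + eps), eps ~ N(0, sigma^2 I), with z = encoder(x)."""
    z = encoder(x)  # the encoder runs once per input, not once per noise sample
    noise = rng.normal(0.0, sigma, size=(n_samples,) + z.shape)
    labels = np.array([classifier(z + eps) for eps in noise])
    counts = np.bincount(labels)  # votes per class label
    return int(np.argmax(counts)), counts


# Toy stand-ins (assumptions): a 2-D rotation as an orthogonal encoder and a
# sign test on the first latent coordinate as the classifier.
THETA = 0.3
ROTATION = np.array([[np.cos(THETA), -np.sin(THETA)],
                     [np.sin(THETA), np.cos(THETA)]])


def toy_encoder(x):
    return ROTATION @ x


def toy_classifier(z):
    return int(z[0] > 0.0)
```

Calling `predict_latent_smoothed(toy_encoder, toy_classifier, np.array([2.0, 0.0]), 0.25, 1000, np.random.default_rng(0))` returns the majority label together with the vote counts, from which the class probabilities used in the radius formula can be estimated.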

2.2 Lipschitz Preserving Layers

The core idea of our approach is to adopt Lipschitz-preserving layers to derive the certification in the input space from the latent space. Here, we introduce the Lipschitz-preserving layers that we will use to build our network, including orthogonal convolutional layers and non-linear activation.

Orthogonal convolutional layers. A circular convolutional layer $\mathrm{conv}_W$, parameterized by weights $W$, is orthogonal if its input $x$ and output $\mathrm{conv}_W(x)$ satisfy

$$\|\mathrm{conv}_W(x)\|_2 = \|x\|_2. \tag{9}$$

Existing works that implement orthogonal convolutional layers fall mainly into two categories: encouraging orthogonality through a penalization term in the objective function, and enforcing orthogonality through parameterization or model design Gouk et al. [2021], Tsuzuku et al. [2018], Li et al. [2019], Sedghi et al. [2018], Singla and Feizi [2019], Anil et al. [2019], Trockman and Kolter [2021], Jia et al. [2019], Wen et al. [2020], Jia et al. [2017], Cisse et al. [2017]. The former does not guarantee orthogonality (for instance, computing and regularizing the largest singular value of a convolutional layer), whereas the latter does. For our purpose of obtaining Lipschitz-preserving layers, we use the latter, i.e., parameterization or model design, to enforce orthogonality.

Since the Cayley transform of a skew-symmetric matrix is always orthogonal, Trockman and Kolter [2021] proposed to apply the Cayley transform to skew-symmetric convolutions in the Fourier domain to parameterize such convolutional layers to be orthogonal. Mathematically, for a skew-symmetric matrix $A$, satisfying $A = -A^\top$, the Cayley transform guarantees the matrix $Q$ to be orthogonal:

$$Q = (I - A)(I + A)^{-1}. \tag{10}$$
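The orthogonality of the Cayley transform is easy to check numerically. The following is a minimal sketch for dense matrices, assuming NumPy is available (the function name `cayley` is illustrative); it builds a skew-symmetric matrix from a random one and applies the transform with a linear solve instead of an explicit inverse.

```python
import numpy as np


def cayley(a: np.ndarray) -> np.ndarray:
    """Cayley transform of a skew-symmetric matrix a.

    (I + a)^{-1} (I - a) equals (I - a)(I + a)^{-1} because the two
    factors commute, so a single linear solve is enough.
    """
    identity = np.eye(a.shape[0])
    return np.linalg.solve(identity + a, identity - a)


rng = np.random.default_rng(0)
m = rng.normal(size=(4, 4))
skew = m - m.T   # skew-symmetric by construction: skew == -skew.T
q = cayley(skew)
```

Here `q.T @ q` agrees with the identity up to floating-point error, so multiplying by `q` preserves the ℓ_2 norm of any vector.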


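However, directly applying the Cayley transform to the convolutions of the neural network could be problematic: even if convolutions can easily be skew-symmetrized, it is rather inefficient to find their inverse. Technically, one can first perform the Fast Fourier Transform ($\mathrm{FFT}$) on the weights $W$ of the convolutional layer and the input tensor $X$, converting them into the spectral domain: $\tilde{W} = \mathrm{FFT}(W)$ and $\tilde{X} = \mathrm{FFT}(X)$. Correspondingly, in the Fourier domain, the resulting pixel $(i, j)$ after the convolution and the inverse convolution can be computed using

$$\mathrm{FFT}\big(\mathrm{conv}_W(X)\big)[:, i, j] = \tilde{W}[:, :, i, j]\,\tilde{X}[:, i, j], \qquad \mathrm{FFT}\big(\mathrm{conv}_W^{-1}(X)\big)[:, i, j] = \tilde{W}[:, :, i, j]^{-1}\,\tilde{X}[:, i, j].$$

Then, the Fourier-domain weights for the skew-symmetric convolution (using the conjugate transpose) and the matrices required for the inverse are computed:

$$\tilde{A}[:, :, i, j] = \tilde{W}[:, :, i, j] - \tilde{W}[:, :, i, j]^{*}, \qquad \tilde{B}[:, :, i, j] = \big(I + \tilde{A}[:, :, i, j]\big)^{-1}.$$

Next, based on these matrices, it is possible to compute the Cayley transform. Plugging in $\tilde{A}$ and $\tilde{B}$, the orthogonal convolution in the spectral domain is obtained according to Equation 10:

$$\tilde{Y}[:, i, j] = \tilde{B}[:, :, i, j]\,\big(I - \tilde{A}[:, :, i, j]\big)\,\tilde{X}[:, i, j] = \tilde{Q}[:, :, i, j]\,\tilde{X}[:, i, j],$$

where $\tilde{Q}[:, :, i, j]$ is orthogonal. Therefore, ultimately, the result in the spatial domain is obtained by applying the inverse $\mathrm{FFT}$ to $\tilde{Y}$:

$$Y = \mathrm{FFT}^{-1}(\tilde{Y}),$$

which is the output of the orthogonal convolutional layer.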
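The Fourier-domain procedure can be sketched in a few lines. This is an illustrative NumPy version under simplifying assumptions that are not taken from the paper: a single input without a batch dimension, equal input and output channel counts, and a kernel that has already been zero-padded to the spatial size of the input. The function name `cayley_conv` is hypothetical.

```python
import numpy as np


def cayley_conv(weight: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Orthogonal circular convolution via the Cayley transform in the Fourier domain.

    weight: real array of shape (c, c, n, n), kernel zero-padded to n x n.
    x:      real array of shape (c, n, n), a single input.
    """
    c = weight.shape[0]
    w_f = np.fft.fft2(weight)                        # FFT over the two spatial axes
    x_f = np.fft.fft2(x)
    w_f = np.transpose(w_f, (2, 3, 0, 1))            # (n, n, c, c): one matrix per frequency
    x_f = np.transpose(x_f, (1, 2, 0))[..., None]    # (n, n, c, 1)
    a = w_f - np.conj(np.swapaxes(w_f, -1, -2))      # skew-Hermitian part: W - W^*
    eye = np.eye(c)
    y_f = np.linalg.solve(eye + a, (eye - a) @ x_f)  # (I + A)^{-1} (I - A) x per frequency
    y_f = np.transpose(y_f[..., 0], (2, 0, 1))       # back to (c, n, n)
    return np.fft.ifft2(y_f).real                    # imaginary part vanishes for real inputs
```

Because each per-frequency matrix is unitary, the output has the same ℓ_2 norm as the input, which can be checked with `np.linalg.norm` on random data.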

GroupSort activation. We adopt an alternative activation function, called GroupSort Anil et al. [2019], to build our encoder. GroupSort separates the variables before the activation into groups, sorts each group into ascending order, and outputs the combined vector. Note that GroupSort is both Lipschitz and gradient norm-preserving. The Lipschitz-preserving property of GroupSort enables us to constrain our encoder to be orthogonal. The gradient norm-preserving property helps the training of the orthogonal encoder, since there is no vanishing-gradient problem (unlike ReLU, which is not norm-preserving and can lead to vanishing gradients).
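GroupSort is simple to state in code. The following is a minimal sketch for a single 1-D activation vector, assuming NumPy is available; the function name and the default group size of 2 are illustrative choices rather than the configuration used in the experiments.

```python
import numpy as np


def group_sort(x: np.ndarray, group_size: int = 2) -> np.ndarray:
    """Split a 1-D vector into consecutive groups and sort each group in ascending order."""
    if x.ndim != 1 or x.size % group_size != 0:
        raise ValueError("expected a 1-D vector whose length is divisible by group_size")
    return np.sort(x.reshape(-1, group_size), axis=1).reshape(-1)
```

Since the output is a permutation of the input entries, the ℓ_2 norm of the activation vector is unchanged, in contrast to ReLU, which zeroes out negative entries.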

The major advantage of using these modules is that the Lipschitz constant of each of these layers is 1. Therefore, their composition, including orthogonal fully connected layers, orthogonal convolutional layers and other Lipschitz-preserving non-linear functions, leads to an orthogonal network with a global Lipschitz constant of 1. This property enables us to easily establish the relationship between the certified radius computed in the latent space and that of the input space. More precisely, the certified radius in the input space is preserved after the Lipschitz-preserving layers.

3 Guarantees and Analysis

3.1 Robustness Guarantee

From Equation 6 to Equation 8, we know how the certified radius can be computed for the latent representation. In other words, we can only guarantee that perturbations in the latent space within the certified radius will not cause wrong predictions for the given input example. The next step is to understand the relationship between the perturbation in the latent space and the perturbation in the input space. A robustness guarantee in the latent space alone is not sufficient, since attackers will directly perturb the images in the input space. How can we find the certified radius in the input space when the certified radius in the latent space is available?

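Theorem 1.

Split a base classifier $f$ into an encoder $f_1$ and a classifier $f_2$, and let $\tilde{g}$ denote the classifier obtained by smoothing $f_2$ in the latent space, with certified radius $R$ at the latent representation $f_1(x)$. Let $f_1$ be $L$-Lipschitz. Then, within the certified radius $R/L$, it is guaranteed that all possible adversarial examples $x + \delta$ will be labeled by $\tilde{g}$ as $c_A$, which is the same as $\tilde{g}(x)$.

Proof. The $L$-Lipschitz property of $f_1$ provides

$$\|f_1(x + \delta) - f_1(x)\|_2 \le L\,\|\delta\|_2.$$

Given the certification in the latent space, any latent perturbation $\delta_z$ with $\|\delta_z\|_2 < R$ leaves the prediction $c_A$ unchanged. Therefore, if $\|\delta\|_2 < R/L$, then

$$\|f_1(x + \delta) - f_1(x)\|_2 < R \quad \Longrightarrow \quad \tilde{g}(x + \delta) = c_A = \tilde{g}(x).$$

Theorem 1 demonstrates how to establish the robustness certification in the input space using the certification in the latent space when the encoder is $L$-Lipschitz. Since the encoder is allowed to amplify the input signal by a factor of up to $L$, the certified radius computed in the latent space must be divided by $L$ to obtain the radius in the input space. Moreover, when we use norm-preserving layers to build the encoder, so that the Lipschitz constant of $f_1$ is 1, the certified radius computed in the latent space is directly a valid certified radius in the input space.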
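The conversion in Theorem 1 amounts to a single division. The following is a minimal sketch, where the function name `input_space_radius` and its argument names are illustrative and the Lipschitz constant is assumed to be known in advance (for example, 1 for an orthogonal encoder).

```python
def input_space_radius(latent_radius: float, lipschitz: float) -> float:
    """Map a certified radius from the latent space back to the input space.

    If the encoder is L-Lipschitz, an input perturbation of norm below
    latent_radius / L moves the latent code by less than latent_radius.
    """
    if latent_radius < 0.0:
        raise ValueError("latent_radius must be non-negative")
    if lipschitz <= 0.0:
        raise ValueError("lipschitz must be positive")
    return latent_radius / lipschitz
```

With an orthogonal encoder, `input_space_radius(r, 1.0)` returns `r` unchanged, whereas an encoder that may amplify distances by a factor of 2 halves the certified radius.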

3.2 Does Rescaling the Lipschitz Constant Matter?

Reviewing the proof of Theorem 1, it is easy to notice that, theoretically, orthogonality is not a necessary condition for deriving the robustness guarantee. That is, it is not required to use layers with Lipschitz constant 1 to build the encoder $f_1$. However, we argue that rescaling the Lipschitz constant will not affect the tightness of the bounds.

Consider an $L$-Lipschitz encoder $f_1$. We compute the certified radius $R$ in the latent space and apply Theorem 1, so the resulting certified radius in the input space is $R/L$. We can always split $f_1$ into two modules, a multiplication module with constant $L$ and an orthogonal network $f_1'$, i.e., $f_1(x) = f_1'(Lx)$. As a result, given two input images $x$ and $x'$, we have

$$\|f_1(x) - f_1(x')\|_2 = \|f_1'(Lx) - f_1'(Lx')\|_2 \le L\,\|x - x'\|_2.$$

Similarly, for a perturbation $\delta$ of the input, it is true that

$$\|f_1(x + \delta) - f_1(x)\|_2 \le L\,\|\delta\|_2 < R \quad \text{whenever} \quad \|\delta\|_2 < R/L.$$

Therefore, it is sufficiently representative to use orthogonal encoders to compute a tight certified radius. Using an encoder with a larger Lipschitz constant will not contribute to a better certified radius, since we can just multiply the input by this large constant and then use an orthogonal encoder to get the equivalent result.

4 Experiments

In this section, we evaluate our proposed latent space randomized smoothing on two benchmark datasets. To address the inefficiency of randomized smoothing pointed out in the first section, we compare the efficiency of our proposed LS-RS with the baseline IS-RS by reporting the average time needed to certify one test sample. Moreover, to evaluate the tightness of the robustness guarantee provided by LS-RS, we also report the average certified radius (ACR) Zhai et al. [2020] in the input space for both datasets. Finally, we present an ablation study on the depth of the latent space to show how LS-RS enables great flexibility in terms of controlling the depth of the classifier and the dimensionality of the latent space.

Baselines and experimental settings. We use Input Space Randomized Smoothing (IS-RS) Cohen et al. [2019] as the baseline algorithm. For the CIFAR10 dataset, we use ResNet18 as the baseline model and modify the architecture by converting its first convolutional layer to be orthogonal and replacing its first residual blocks with orthogonal residual blocks. For ImageNet, we use WideResNet34 as the baseline architecture and follow a similar routine to obtain its orthogonal version. We evaluate our models on a GeForce RTX 2080Ti with 11GB of memory.

Implementation of the networks. We show how the orthogonal version of ResNets can be implemented by first presenting the orthogonal skip connection and then the resulting network with substitutions. As argued by Trockman and Kolter [2021], the Lipschitz constant of a residual connection can be kept at 1 by making the main branch and the skip connection a convex combination with a new learnable parameter $\alpha \in [0, 1]$: $y = \alpha F(x) + (1 - \alpha)\,x$, where $F$ denotes the main branch. Therefore, we use the modified skip connection (shown in Figure 2(b)) to build our orthogonal encoder in all experiments. Moreover, as mentioned above, LS-RS provides great flexibility in tuning the depth of the latent space. As shown in Figure 3, we can control how many of the original free skip connections are replaced by orthogonal ones. This fraction determines the depth as well as the dimensionality of the latent space.
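The effect of the convex combination can be illustrated numerically. The sketch below assumes NumPy is available and uses a 2-D rotation as a stand-in for a 1-Lipschitz main branch; the names `convex_skip` and `rotate` and the fixed value of `alpha` are assumptions for illustration, whereas in the network the mixing parameter is learned.

```python
import numpy as np


def convex_skip(x, branch, alpha):
    """Residual block alpha * branch(x) + (1 - alpha) * x.

    If branch is 1-Lipschitz and 0 <= alpha <= 1, the triangle inequality
    gives a Lipschitz constant of at most alpha + (1 - alpha) = 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * branch(x) + (1.0 - alpha) * x


THETA = 0.3
ROTATION = np.array([[np.cos(THETA), -np.sin(THETA)],
                     [np.sin(THETA), np.cos(THETA)]])


def rotate(x):
    """Toy 1-Lipschitz main branch: an orthogonal linear map."""
    return ROTATION @ x
```

By contrast, the vanilla skip connection `x + rotate(x)` can nearly double distances between inputs, so its Lipschitz constant is not bounded by 1.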

(a) vanilla skip connection
(b) convex combination of two branches
Figure 2: Comparison between the original skip connection and the modified skip connection.
Figure 3: Replace the skip connections in a network with orthogonal skip connections. The first network is the baseline network without any orthogonal module. For the second network, three out of eight skip connections are substituted by orthogonal ones, whereas for the third network, six out of eight are replaced.

Experimental results and analysis. From the results shown in Table 1, we first conclude that certification with LS-RS is much faster than with IS-RS, while achieving a comparable robustness guarantee on the CIFAR10 dataset. Moreover, we also include experimental results on ImageNet in the Appendix, which also verify this statement. Note that in all tables, the hyperparameter FoR refers to the fraction of replacement with orthogonal skip connections. Regarding the flexibility provided by LS-RS, we also report Table 2 to show the effect of the depth of the latent space. The deeper the latent space is placed, the faster the certification proceeds, at the cost of both robustness and accuracy. According to concentration theory, one might expect that the number of Gaussian samples needed to approximate the expectation using the empirical mean is higher in a higher-dimensional space. This intuition is problematic in practice, since the latent space may well have a higher dimensionality than the input space. In the experiments on CIFAR10 reported in Table 1, the dimensionality of the input space is $3 \times 32 \times 32$, whereas the latent space has a higher dimensionality. As shown in the table, even if we raise the dimensionality in the latent space and do not draw more Gaussian samples, we still achieve much higher efficiency and a comparable certified radius. Finally, concerning the performance drop in Table 2, one possible reason is that training deep orthogonal layers can be challenging. Unlike in the training of a normal neural network, no batch normalization or similar training-stabilizing method is available for orthogonal networks. Moreover, since the Lipschitz constants of the orthogonal convolutional layers are restricted to be 1, the expressive power of the encoder may be limited. With more norm-preserving modules in the network, the expressive power of the entire network could be severely limited, leading to less satisfying results.

| Defense | σ | FoR | ACR | Accuracy (%) | Time (s/example) |
|---|---|---|---|---|---|
| IS-RS | 0.00 | 0/18 | 0.000 | 93.15 | - |
| IS-RS | 0.25 | 0/18 | 0.472 | 80.78 | 11.212 |
| IS-RS | 0.50 | 0/18 | 0.564 | 68.24 | 12.747 |
| IS-RS | 1.00 | 0/18 | 0.532 | 49.35 | 12.953 |
| LS-RS | 0.00 | 9/18 | 0.000 | 86.92 | - |
| LS-RS | 0.25 | 9/18 | 0.444 | 77.56 | 7.904 |
| LS-RS | 0.50 | 9/18 | 0.533 | 66.68 | 7.771 |
| LS-RS | 1.00 | 9/18 | 0.524 | 50.72 | 7.985 |

Table 1: Efficiency and robustness evaluation on CIFAR10.
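The relative differences between IS-RS and LS-RS can be recomputed from the rows of Table 1. The following small sketch copies the ACR and timing values for the three non-zero noise levels from the table; the variable and function names are illustrative.

```python
# Noise level sigma -> (ACR, seconds per example), copied from Table 1.
IS_RS = {0.25: (0.472, 11.212), 0.50: (0.564, 12.747), 1.00: (0.532, 12.953)}
LS_RS = {0.25: (0.444, 7.904), 0.50: (0.533, 7.771), 1.00: (0.524, 7.985)}


def relative_drop(baseline: float, value: float) -> float:
    """Relative decrease of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline


# Relative loss in average certified radius and relative saving in time.
acr_drop = {s: relative_drop(IS_RS[s][0], LS_RS[s][0]) for s in IS_RS}
time_saving = {s: relative_drop(IS_RS[s][1], LS_RS[s][1]) for s in IS_RS}
```

The ACR degradation comes out at about 5.93%, 5.50% and 1.50% for the three noise levels, and the per-level time savings lie between roughly 30% and 39%, in line with the rounded figure of 12 seconds versus 8 seconds per image.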
| Defense | σ | FoR | ACR | Accuracy (%) | Time (s/example) |
|---|---|---|---|---|---|
| LS-RS | 0.00 | 1/18 | 0.000 | 93.15 | - |
| LS-RS | 0.25 | 1/18 | 0.478 | 80.18 | 12.689 |
| LS-RS | 0.50 | 1/18 | 0.562 | 68.89 | 13.038 |
| LS-RS | 1.00 | 1/18 | 0.522 | 50.55 | 12.879 |
| LS-RS | 0.00 | 5/18 | 0.000 | 91.26 | - |
| LS-RS | 0.25 | 5/18 | 0.475 | 79.70 | 10.375 |
| LS-RS | 0.50 | 5/18 | 0.561 | 68.09 | 9.643 |
| LS-RS | 1.00 | 5/18 | 0.540 | 50.07 | 9.913 |
| LS-RS | 0.00 | 9/18 | 0.000 | 86.92 | - |
| LS-RS | 0.25 | 9/18 | 0.444 | 77.56 | 7.904 |
| LS-RS | 0.50 | 9/18 | 0.533 | 66.68 | 7.771 |
| LS-RS | 1.00 | 9/18 | 0.524 | 50.72 | 7.985 |
| LS-RS | 0.00 | 13/18 | 0.000 | 79.40 | - |
| LS-RS | 0.25 | 13/18 | 0.387 | 71.32 | 6.131 |
| LS-RS | 0.50 | 13/18 | 0.485 | 62.14 | 5.807 |
| LS-RS | 1.00 | 13/18 | 0.480 | 47.70 | 5.827 |

Table 2: Tuning the depth of the latent space for ResNet18 on CIFAR10.

5 Related Work

State-of-the-art adversarial defenses can be categorized into empirical defenses and certifiable defenses. Empirical defenses are robust to known adversarial attacks, but are still vulnerable to unknown stronger attacks. Adversarial training, one of the most powerful empirical defenses, has demonstrated its power in defending against strong attacks. By including adversarial examples during the training phase, the network can directly learn how to classify adversarial examples correctly [Madry et al., 2017, Shafahi et al., 2019, Zhang et al., 2019]. However, such defenses can be subverted by stronger attacks such as iterative attacks [Qian and Wegman, 2018] or non-uniform attacks [Zeng et al., 2020]. By contrast, certified defenses can provide provable robustness of the models against specific adversarial perturbations. They work by obtaining the perturbation $\delta$ with minimum norm such that $f(x + \delta) \neq f(x)$, where $f$ is a classifier and $x$ is the input data [Cheng et al., 2017, Lomuscio and Maganti, 2017, Dutta et al., 2018, Fischetti and Jo, 2017]. Since the problem is NP-hard, a relaxation of the non-linearities can be quite useful. Linear inequality constraints show better efficiency [Singh et al., 2018, Gehr et al., 2018, Zhang et al., 2018]. Some defenses integrate the verification methods into the training process, trying to minimize the robust loss directly. A bound derived with a semi-definite programming (SDP) relaxation was minimized as a regularizer [Raghunathan et al., 2018]. In addition, Wong and Kolter [2017] present a similar defense in which the upper bound is relaxed with an LP relaxation. Though such certified defenses give provable robustness, they cannot scale to large datasets and arbitrary model types.

Randomized smoothing. In a recent work [Cohen et al., 2019], randomized smoothing was formally proposed to provide a tight robustness guarantee for deep neural networks. By adding i.i.d. Gaussian noise to the inputs of a normally trained base classifier, a smoothed classifier is obtained that is provably robust against $\ell_2$-norm bounded perturbations with statistical confidence. Building on this baseline, follow-up works proposed approaches to improve the efficacy of randomized smoothing. Salman et al. [2019] combined it with adversarial training, substantially boosting the certifiable robustness of smoothed classifiers. In [Zhai et al., 2020], a modified version of randomized smoothing was proposed that injects the approximated average certified radius into the training objective, with which the ACR of test samples can be further improved. Randomized smoothing is powerful and efficient during training: since the training samples are simply augmented with Gaussian noise, the computational cost remains acceptable. In fact, randomized smoothing is one of the few certifiable defenses that can be applied to large datasets, for instance ImageNet. However, at test time, the efficiency of randomized smoothing is poor. In order to maintain the concentration bounds and obtain a robustness guarantee with high confidence, a large number of Gaussian noise samples must be drawn (usually over 100,000 for a single test sample), which can be extremely time-consuming.

Estimating the Lipschitz constant of neural networks. In order to map the certification from the latent space back to the input space, one possible solution is to evaluate the Lipschitz property of the encoder, which has never been an easy task due to the non-convexity of deep neural networks. Most works on estimating the Lipschitz constant of neural networks recommend first evaluating the Lipschitz constant of each layer and then multiplying them together to obtain the overall Lipschitz constant of the network. In [Gouk et al., 2021, Tsuzuku et al., 2018], it is recommended to approximate the operator norm as the bound of the Lipschitz constant for different layers using the power method. However, in practice, the power method is usually implemented with a limited number of iterations, which does not provide tight bounds and therefore leads to weak robustness guarantees. Sedghi et al. [2018] proposed to compute the largest singular values of convolutional layers using SVD in the Fourier domain, which is much faster than computing the SVD of weight matrices directly. Still, even in the Fourier domain, it can be rather computationally expensive to perform SVD for each layer. Singla and Feizi [2019] found upper bounds for the spectral norm of convolution kernels by appropriately reshaping the weight matrix, which is computationally efficient but sacrifices the tightness of the bounds.
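To make the limitation of a truncated power method concrete, the following is a minimal sketch for a dense weight matrix, assuming NumPy is available. The function name and the random initialization are illustrative, and the cited works apply the idea to convolutional layers rather than to a plain matrix.

```python
import numpy as np


def spectral_norm_power_iteration(w: np.ndarray, n_iters: int, seed: int = 0) -> float:
    """Estimate the largest singular value of w with n_iters power-method steps."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=w.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        u = w @ v
        u /= np.linalg.norm(u)
        v = w.T @ u
        v /= np.linalg.norm(v)
    # For a unit vector v, ||w v|| never exceeds the true spectral norm.
    return float(np.linalg.norm(w @ v))
```

Because the returned value is the norm of `w @ v` for a unit vector `v`, it can only approach the true spectral norm from below, so a run with few iterations may under-estimate the layer's Lipschitz constant. An orthogonal layer avoids this estimation step altogether.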

Orthogonal convolutional networks. To avoid loosely estimated Lipschitz bounds, we adopt orthogonal neural networks, whose Lipschitz bound is imposed by their architecture. One main challenge in designing these networks is to enforce orthogonality of the convolutional layers. Early works flatten the higher-order convolutional kernel into a matrix and enforce orthogonality of the resulting matrix [Jia et al., 2017, Cisse et al., 2017]. However, this approach does not lead to orthogonality of the original convolution. Projected gradient descent via singular value clipping is proposed in Sedghi et al. [2018], which is expensive in practice. Recent works adopt parameterization-based approaches, using either block convolutions [Li et al., 2019], the Cayley transform [Trockman and Kolter, 2021], or spectral factorization [Su et al., 2021]. After reviewing their flexibility and scalability, we adopt the algorithm and code of [Su et al., 2021] to build our orthogonal modules.

6 Conclusion

Aiming at increasing the efficiency of randomized smoothing, this work studies the potential of performing smoothing in the latent space. We propose Latent Space Randomized Smoothing, which samples Gaussian noise in the latent space. This yields a robustness guarantee for the part of the network after the latent space. In order to establish the equivalency between the robustness certification in the latent space and that in the input space, we adopt orthogonal convolutional layers and the norm-preserving GroupSort activation to build the encoder, the first sub-network. With a Lipschitz constant equal to 1, we proved that the certified radius computed in the latent space is identical to that in the input space, and that rescaling the Lipschitz constant of the encoder does not lead to tighter bounds.

Our method LS-RS improves the efficiency of randomized smoothing substantially compared to IS-RS, while achieving slightly worse but still comparable robustness guarantees. The use of norm-preserving layers motivates us to analyze adversarial perturbations from a different perspective, in the latent space. The core benefit of this approach is that the computational cost of sampling and of the forward pass can be substantially reduced.

7 Acknowledgement

This work is supported by National Science Foundation IIS-1850220 CRII Award 030742-00001 and DOD-DARPA-Defense Advanced Research Projects Agency Guaranteeing AI Robustness against Deception (GARD), and Adobe, Capital One and JP Morgan faculty fellowships.


  • C. Anil, J. Lucas, and R. Grosse (2019) Sorting out lipschitz function approximation. In

    International Conference on Machine Learning

    pp. 291–301. Cited by: §2.2, §2.2.
  • N. Carlini and D. Wagner (2017) Adversarial examples are not easily detected: bypassing ten detection methods. Cited by: §1.
  • C. Cheng, G. Nührenberg, and H. Ruess (2017) Maximum resilience of artificial neural networks. In International Symposium on Automated Technology for Verification and Analysis, pp. 251–268. Cited by: §5.
  • M. Cisse, P. Bojanowski, E. Grave, Y. Dauphin, and N. Usunier (2017) Parseval networks: improving robustness to adversarial examples. In International Conference on Machine Learning, pp. 854–863. Cited by: §2.2, §5.
  • J. M. Cohen, E. Rosenfeld, and J. Z. Kolter (2019) Certified adversarial robustness via randomized smoothing. arXiv preprint arXiv:1902.02918. Cited by: §1, §4, §5.
  • S. Dutta, S. Jha, S. Sankaranarayanan, and A. Tiwari (2018) Output range analysis for deep feedforward neural networks. In NASA Formal Methods Symposium, pp. 121–138. Cited by: §5.
  • M. Fischetti and J. Jo (2017)

    Deep neural networks as 0-1 mixed integer linear programs: a feasibility study

    arXiv preprint arXiv:1712.06174. Cited by: §5.
  • T. Gehr, M. Mirman, D. Drachsler-Cohen, P. Tsankov, S. Chaudhuri, and M. Vechev (2018) Ai2: safety and robustness certification of neural networks with abstract interpretation. In 2018 IEEE Symposium on Security and Privacy (SP), pp. 3–18. Cited by: §5.
  • H. Gouk, E. Frank, B. Pfahringer, and M. J. Cree (2021) Regularisation of neural networks by enforcing lipschitz continuity. Machine Learning 110 (2), pp. 393–416. Cited by: §2.2, §5.
  • K. Jia, S. Li, Y. Wen, T. Liu, and D. Tao (2019) Orthogonal deep neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence. Cited by: §1, §2.2.
  • K. Jia, D. Tao, S. Gao, and X. Xu (2017) Improving training of deep neural networks via singular value bounding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4344–4352. Cited by: §2.2, §5.
  • H. Kannan, A. Kurakin, and I. J. Goodfellow (2018) Adversarial logit pairing. arXiv preprint arXiv:1803.06373. Cited by: §1.
  • A. Krizhevsky, G. Hinton, et al. (2009) Learning multiple layers of features from tiny images. Cited by: item 2.
  • A. Kurakin, I. Goodfellow, and S. Bengio (2016) Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236. Cited by: §1.
  • Q. Li, S. Haque, C. Anil, J. Lucas, R. B. Grosse, and J. Jacobsen (2019) Preventing gradient attenuation in lipschitz constrained convolutional networks. In Advances in neural information processing systems, pp. 15390–15402. Cited by: Appendix A, §2.2, §5.
  • A. Lomuscio and L. Maganti (2017) An approach to reachability analysis for feed-forward relu neural networks. arXiv preprint arXiv:1706.07351. Cited by: §5.
  • A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083. Cited by: §5.
  • H. Qian and M. N. Wegman (2018) L2-nonexpansive neural networks. arXiv preprint arXiv:1802.07896. Cited by: §5.
  • A. Raghunathan, J. Steinhardt, and P. Liang (2018) Certified defenses against adversarial examples. arXiv preprint arXiv:1801.09344. Cited by: §1, §5.
  • O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. (2015) Imagenet large scale visual recognition challenge. International journal of computer vision 115 (3), pp. 211–252. Cited by: item 2.
  • H. Salman, G. Yang, J. Li, P. Zhang, H. Zhang, I. Razenshteyn, and S. Bubeck (2019) Provably robust deep learning via adversarially trained smoothed classifiers. arXiv preprint arXiv:1906.04584. Cited by: §5.
  • H. Sedghi, V. Gupta, and P. M. Long (2018) The singular values of convolutional layers. arXiv preprint arXiv:1805.10408. Cited by: §2.2, §5, §5.
  • A. Shafahi, M. Najibi, A. Ghiasi, Z. Xu, J. Dickerson, C. Studer, L. S. Davis, G. Taylor, and T. Goldstein (2019) Adversarial training for free!. arXiv preprint arXiv:1904.12843. Cited by: §5.
  • U. Shaham, Y. Yamada, and S. Negahban (2018) Understanding adversarial training: increasing local stability of supervised models through robust optimization. Neurocomputing 307, pp. 195–204. Cited by: §1.
  • G. Singh, T. Gehr, M. Mirman, M. Püschel, and M. Vechev (2018) Fast and effective robustness certification. In Advances in Neural Information Processing Systems, pp. 10802–10813. Cited by: §5.
  • S. Singla and S. Feizi (2019) Bounding singular values of convolution layers. arXiv preprint arXiv:1911.10258. Cited by: §2.2, §5.
  • J. Su, W. Byeon, and F. Huang (2021) Scaling-up diverse orthogonal convolutional networks with a paraunitary framework. arXiv preprint arXiv:2106.09121. Cited by: §5.
  • C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199. Cited by: §1.
  • A. Trockman and J. Z. Kolter (2021) Orthogonalizing convolutional layers with the cayley transform. arXiv preprint arXiv:2104.07167. Cited by: Appendix A, §2.2, §2.2, §4, §5.
  • Y. Tsuzuku, I. Sato, and M. Sugiyama (2018) Lipschitz-margin training: scalable certification of perturbation invariance for deep neural networks. arXiv preprint arXiv:1802.04034. Cited by: §2.2, §5.
  • Y. Wen, S. Li, and K. Jia (2020) Towards understanding the regularization of adversarial robustness on neural networks. In International Conference on Machine Learning, pp. 10225–10235. Cited by: §1, §2.2.
  • E. Wong and J. Z. Kolter (2017) Provable defenses against adversarial examples via the convex outer adversarial polytope. arXiv preprint arXiv:1711.00851. Cited by: §1, §5.
  • H. Zeng, C. Zhu, T. Goldstein, and F. Huang (2020) Are adversarial examples created equal? a learnable weighted minimax risk for robustness under non-uniform attacks. arXiv preprint arXiv:2010.12989. Cited by: §5.
  • R. Zhai, C. Dan, D. He, H. Zhang, B. Gong, P. Ravikumar, C. Hsieh, and L. Wang (2020) Macer: attack-free and scalable robust training via maximizing certified radius. arXiv preprint arXiv:2001.02378. Cited by: §4, §5.
  • D. Zhang, T. Zhang, Y. Lu, Z. Zhu, and B. Dong (2019) You only propagate once: painless adversarial training using maximal principle. arXiv preprint arXiv:1905.00877. Cited by: §5.
  • H. Zhang, T. Weng, P. Chen, C. Hsieh, and L. Daniel (2018) Efficient neural network robustness certification with general activation functions. In Advances in Neural Information Processing Systems, pp. 4939–4948. Cited by: §5.

Appendix A Experiments on ImageNet

In this section, we provide the experimental results on ImageNet. As described in the main text, we use WideResNet34 as the baseline neural network architecture and make its first two layers and the following seven consecutive residual blocks orthogonal. The widen factor for both the baseline model and the orthogonal model is set to two.

Defense   σ      FoR     ACR     Accuracy (%)   Time (s/example)
IS-RS     0.00   0/34    0.000   74.70          -
IS-RS     0.25   0/34    0.654   73.89          99.775
IS-RS     0.50   0/34    1.166   71.92          87.738
IS-RS     1.00   0/34    2.123   69.19          87.893
LS-RS     0.00   16/34   0.000   71.62          -
LS-RS     0.25   16/34   0.532   69.69          47.909
LS-RS     0.50   16/34   1.063   68.45          48.098
LS-RS     1.00   16/34   1.854   65.07          46.912
Table 3: Efficiency and robustness evaluation on ImageNet.

Table 3 shows that on ImageNet the average time needed to certify a single image drops from around 91 seconds to around 47 seconds, a 48.3% efficiency improvement, at the cost of a performance degradation of 14.07% for σ = 0.25, 8.83% for σ = 0.50 and 12.67% for σ = 1.00.
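The arithmetic behind these percentages can be reproduced with a short sketch. It assumes that the second numeric column of Table 3 is the noise level, that "performance" refers to the ACR column, and that the average is a plain mean over the three non-zero noise levels; the names `relative_drop`, `is_rs` and `ls_rs` are hypothetical. The values are computed from the table entries as printed and may differ from the rounded figures quoted in the text.

```python
# Rows of Table 3 as printed: noise level -> (ACR, seconds per example).
# Assumption: the unlabeled second column of the table is the noise level.
is_rs = {0.25: (0.654, 99.775), 0.50: (1.166, 87.738), 1.00: (2.123, 87.893)}
ls_rs = {0.25: (0.532, 47.909), 0.50: (1.063, 48.098), 1.00: (1.854, 46.912)}


def relative_drop(baseline, ours):
    """Relative reduction of `ours` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - ours) / baseline


def mean(values):
    """Plain arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Average certification time per example for each method.
mean_time_is = mean(t for _, t in is_rs.values())
mean_time_ls = mean(t for _, t in ls_rs.values())
time_saving = relative_drop(mean_time_is, mean_time_ls)

# Relative ACR degradation of LS-RS per noise level.
acr_drop = {s: relative_drop(is_rs[s][0], ls_rs[s][0]) for s in is_rs}

print(f"mean time IS-RS: {mean_time_is:.3f} s, LS-RS: {mean_time_ls:.3f} s")
print(f"time saving: {time_saving:.1f}%")
for s, drop in acr_drop.items():
    print(f"noise level {s:.2f}: ACR drop {drop:.2f}%")
```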

We point out that most current works that adopt orthogonal convolutions to build Lipschitz-bounded neural networks, such as Li et al. [2019] and Trockman and Kolter [2021], mainly focus on CIFAR10 and barely scale to ImageNet. To our knowledge, this is the first time that orthogonal convolutions are applied to the ImageNet dataset. As shown by the results in Table 3, LS-RS brings a significant improvement in efficiency with only a slight performance drop compared to IS-RS.

Appendix B Training Details

Experiments on CIFAR10. In all experiments, we used identical training hyperparameters, except for the noise level σ, to train the baseline models as well as our orthogonal models. The models were trained for 100 epochs. The learning rate was initialized to 0.01 and decayed by a factor of 0.1 every 30 epochs. The optimizer was SGD with momentum (momentum factor 0.9). When evaluating the robustness of the smoothed classifier, all test examples were used.

Experiments on ImageNet. In all experiments, we used identical training hyperparameters, except for the noise level σ, to train the baseline models as well as our orthogonal models. The models were trained for 90 epochs. The learning rate was initialized to 0.1 and decayed by a factor of 0.1 every 30 epochs. The optimizer was SGD with momentum (momentum factor 0.9). At test time, all examples of the validation set were used to compute the accuracy. However, for the sake of simplicity, only a subset of images was used to compute the certified radius. Specifically, we randomly sampled one image per class and evaluated its robustness certificate.
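The step schedule described for both datasets can be written out as a minimal sketch. It assumes epochs are counted from 0 and that the decay is applied at the start of epochs 30, 60 and 90; the function name `step_lr` and this indexing convention are assumptions, not taken from the training code.

```python
def step_lr(epoch, base_lr, step_size=30, gamma=0.1):
    """Learning rate at a 0-indexed epoch under a step decay schedule.

    The rate is multiplied by `gamma` once every `step_size` epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


# CIFAR10 setting from the text: 100 epochs, initial rate 0.01.
cifar_schedule = [step_lr(e, base_lr=0.01) for e in range(100)]

# ImageNet setting from the text: 90 epochs, initial rate 0.1.
imagenet_schedule = [step_lr(e, base_lr=0.1) for e in range(90)]

# Show the rate at the first epoch of each decay stage.
for e in (0, 30, 60, 90):
    print(f"CIFAR10 epoch {e}: lr = {cifar_schedule[e]:.0e}")
for e in (0, 30, 60):
    print(f"ImageNet epoch {e}: lr = {imagenet_schedule[e]:.0e}")
```

Under these assumptions the CIFAR10 run passes through four learning-rate stages and the ImageNet run through three.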

Appendix C Ethics Statement

Adversarial examples pose serious threats to modern machine learning systems. Deep networks have shown impressive performance on various tasks such as object detection, speech recognition, and game playing. However, they can still fail catastrophically in the presence of small, imperceptible adversarial perturbations. The existence of such adversarial examples exposes a severe vulnerability in current ML systems. Therefore, it is vital to develop reliable and efficient defense mechanisms that increase the robustness of machine learning models against adversarial attacks.

There are two streams of defense algorithm design. Empirical defense mechanisms can defend against existing attacks by including adversarial examples during training, but they fail when stronger attackers strike. In contrast, defenses with guarantees are much more reliable and can withstand arbitrary attacks within the certified perturbation budget. Randomized smoothing (RS) is one of the most potent certifiable defense algorithms. In the framework of RS, the robustness certification for the victim model is obtained by sampling Gaussian noise and using it to augment the test images.

However, despite its impressive defensive power, randomized smoothing requires sampling an enormous number of Gaussian noise vectors during the evaluation phase, which can be extremely time-consuming. Our work provides a new methodology for establishing robustness certification for randomized smoothing in a more controllable latent space of neural networks by utilizing orthogonal convolutions. Sampling Gaussian noise in the latent space reduces the computational cost and increases the efficiency of RS.
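The way a latent-space certificate carries over to the input space can be illustrated with a small sketch. It assumes the standard randomized smoothing radius R = σ · Φ⁻¹(p_lower) from Cohen et al. [2019], where p_lower is a lower confidence bound on the top-class probability under Gaussian noise, and it assumes that a radius certified in the latent space transfers to the input space after division by the encoder's Lipschitz constant (1 for an orthogonal encoder). All function names are hypothetical, p_lower is taken as given rather than estimated by sampling, and a 2-D rotation merely stands in for an orthogonal encoder.

```python
import math
from statistics import NormalDist


def certified_radius(p_lower, sigma):
    """l2 radius sigma * Phi^{-1}(p_lower); 0.0 signals abstention.

    `p_lower` must lie strictly between 0 and 1.
    """
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)


def input_space_radius(latent_radius, lipschitz=1.0):
    """Input radius implied by a latent radius for an L-Lipschitz encoder.

    If ||delta|| < R / L, then ||h(x + delta) - h(x)|| <= L * ||delta|| < R.
    """
    return latent_radius / lipschitz


def rotate(v, theta):
    """Toy orthogonal 'encoder': a rotation of the plane (norm-preserving)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def l2(v):
    """Euclidean norm of a 2-D vector."""
    return math.hypot(v[0], v[1])


# A perturbation keeps its l2 norm after passing through the rotation.
x, delta = (3.0, 4.0), (0.1, -0.2)
shifted = (x[0] + delta[0], x[1] + delta[1])
hx, hx_shifted = rotate(x, 0.7), rotate(shifted, 0.7)
latent_shift = l2((hx_shifted[0] - hx[0], hx_shifted[1] - hx[1]))
print(f"input shift {l2(delta):.6f}, latent shift {latent_shift:.6f}")

# With Lipschitz constant 1, the latent radius is also valid in input space.
r_latent = certified_radius(p_lower=0.9, sigma=0.5)
print(f"latent radius {r_latent:.4f}, input radius {input_space_radius(r_latent):.4f}")
```

For an encoder with Lipschitz constant larger than 1, the same division would shrink the input-space radius, which is why norm-preserving layers are attractive here.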

Moreover, our work offers a new perspective on analyzing adversarial signals in the latent space of a network, which may inspire new defense mechanisms against adversarial attacks.