. Specifically, CNNs and rectified linear units (ReLUs) have resulted in breakthroughs in image recognition [LeCun et al.1989, Nair and Hinton2010] and are de facto standards for image recognition and other applications [He et al.2016, Radford, Metz, and Chintala2016]. Though CNNs can classify image data as accurately as humans, they are sensitive to small perturbations of inputs, i.e., injecting imperceptible perturbations can make deep models misclassify image data. Such attacks are called adversarial attacks, and the perturbed inputs are called adversarial examples [Szegedy et al.2013].
We can roughly divide adversarial attacks into two types: white-box attacks, which use the information of target models [Goodfellow, Shlens, and Szegedy2014, Madry et al.2018, Moosavi-Dezfooli, Fawzi, and Frossard2016], and black-box attacks, which do not require the information of target models [Papernot, McDaniel, and Goodfellow2016, Chen et al.2017, Papernot et al.2017]. Black-box attacks, rather than white-box attacks, can threaten online deep-learning services since it is difficult to access the target models in online deep-learning applications [Papernot et al.2017, Yuan et al.2019].
Most black-box attacks are transferred attacks, which are generated as white-box attacks on substitute models instead of the target model [Papernot, McDaniel, and Goodfellow2016]. This implies that deep models share a common sensitivity to specific perturbations. In fact, Tsuzuku and Sato (2019) recently showed that CNNs have a structural sensitivity, based on the observation that convolution can be regarded as a product with a circulant matrix, and proposed the single Fourier attack (SFA). (Footnote 1: [Fp] concurrently proposed the same attack.)
Fourier basis functions create singular vectors of circulant matrices, and SFA uses these singular vectors since the dominant singular vector can be the worst noise for a matrix-vector product. Although SFA is a very simple attack composed of a single-frequency component, it is a universal adversarial perturbation for CNNs, i.e., it can decrease the classification accuracy of various CNN-based models without using information about the model parameters and without depending on input images. To the best of our knowledge, no effective defense method against SFA has been proposed. Therefore, such a method is necessary.
To defend CNNs against SFA, we first reveal that the spectral norm constraint [Sedghi, Gupta, and Long2019] (hereinafter, SNC) can reduce the structural sensitivity. While SNC was proposed to improve generalization performance, it can improve robustness in the Fourier domain since the singular values of convolution layers correspond to the magnitude of the frequency response. However, SNC is not very practical since computing the spectral norm (the largest singular value) has a high computational cost. We then develop Absum, an efficient regularization method for reducing the structural sensitivity of CNNs. Instead of the spectral norm, we use the induced $\infty$-norm ($\ell_\infty$ operator norm) since it is an upper bound of the spectral norm for convolution. However, a constraint on the induced $\infty$-norm, which is equivalent to $L_1$ regularization, must be tight to achieve robustness, which prevents minimization of the loss function. This is because the induced $\infty$-norm is a conservative measure; it accounts for the effects of negative inputs even though inputs are always non-negative after ReLU activations. To improve robustness without preventing the loss minimization, Absum relaxes the induced $\infty$-norm by penalizing the absolute values of the summations of weights instead of the absolute values of individual elements, on the basis that input vectors always have non-negative elements. Absum is as simple as standard regularization methods such as weight decay, but it can reduce sensitivity to SFA. We provide the proximal operator to minimize loss functions with Absum.
Image recognition experiments on MNIST, Fashion-MNIST (FMNIST), CIFAR10, CIFAR100, and SVHN demonstrate that Absum and SNC outperform $L_1$ and $L_2$ regularization methods in terms of improving robustness against SFA, and that the computation time of Absum is about one-tenth that of SNC. In an additional empirical evaluation, we reveal that CNNs that are robust against SFA can also be robust against transferred attacks generated by white-box attacks (projected gradient descent: PGD [Kurakin, Goodfellow, and Bengio2016, Madry et al.2018]). This implies that sensitivity to SFA is one of the causes of the transferability of adversarial attacks. As a further investigation of Absum and SNC, we reveal that adversarial perturbations for CNNs trained with Absum and SNC have few high-frequency components, i.e., these CNNs are robust against high-frequency noise. Furthermore, our experiments show that Absum is effective against PGD when used with adversarial training.
The following are the main contributions of this paper:
- We show that SNC improves robustness against SFA. SNC was proposed to improve generalization performance, but its effectiveness in improving robustness against SFA had not been evaluated.
- We propose Absum and its proximal operator. Absum improves robustness against SFA as well as SNC does, while its computational cost is lower than that of SNC.
- In a further empirical evaluation, we show that Absum and SNC can also improve robustness against other black-box attacks (transferred attacks and High-Frequency attacks [Wang et al.2019]). In addition, Absum can improve robustness against PGD when used with adversarial training.
CNNs, ReLUs and Circulant Matrix
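In this section, we outline CNNs, ReLUs, and a circulant matrix for the convolution operation. Let $X \in \mathbb{R}^{n \times n}$ be an input map, $Y \in \mathbb{R}^{n \times n}$ be an output map, and $K \in \mathbb{R}^{n \times n}$ be a filter matrix such that $K = [k_0, k_1, \dots, k_{n-1}]^T$, where $k_i = [k_{i,0}, k_{i,1}, \dots, k_{i,n-1}]^T \in \mathbb{R}^n$. The output of the convolution operation becomes
$$Y_{l,m} = \sum_{p=0}^{n-1} \sum_{q=0}^{n-1} k_{p,q} X_{l+p,\, m+q}, \tag{1}$$
where the indices of $X$ are taken modulo $n$. Note that when the filter size is $h \times h$ and $h < n$, we can embed it in the $n \times n$ matrix $K$ by padding with zeros [Sedghi, Gupta, and Long2019]. After the convolution, we usually use ReLU activations as the following function:
$$\mathrm{ReLU}(x) = \max(x, 0). \tag{2}$$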
Typical model architectures use a combination of convolution and ReLU. For example, a standard block of ResNet [He et al.2016] is composed as
where BN denotes batch normalization [Ioffe and Szegedy2015].
Since SFA and Absum are based on a circulant matrix for the convolution operation, we show that the convolution can be expressed as the product of a doubly block circulant matrix and a vector. Let $x$ and $y$ be vectors obtained by stacking the columns of $X$ and $Y$, respectively. Convolution can be written as
$$y = Ax,$$
where $A$ is a matrix composed of circulant blocks.
The filter coefficients are cyclically shifted within each block, and the block matrices are cyclically shifted in $A$. Therefore, $A$ is called a doubly block circulant matrix.
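The equivalence between circular convolution and a doubly block circulant matrix-vector product can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the indexing $Y_{l,m} = \sum_{p,q} k_{p,q} X_{l+p,m+q}$ with indices modulo $n$ is assumed, and maps are flattened row by row (an assumption made for readability; stacking columns instead only permutes rows and columns of the matrix).

```python
def embed_filter(small, n):
    """Zero-pad an h x h filter into the top-left corner of an n x n matrix."""
    K = [[0.0] * n for _ in range(n)]
    for p, row in enumerate(small):
        for q, value in enumerate(row):
            K[p][q] = float(value)
    return K


def circular_conv2d(X, K):
    """Y[l][m] = sum over p, q of K[p][q] * X[(l+p) % n][(m+q) % n]."""
    n = len(X)
    return [[sum(K[p][q] * X[(l + p) % n][(m + q) % n]
                 for p in range(n) for q in range(n))
             for m in range(n)]
            for l in range(n)]


def doubly_block_circulant(K):
    """Build the n^2 x n^2 matrix A such that A times flatten(X) equals flatten(Y)."""
    n = len(K)
    A = [[0.0] * (n * n) for _ in range(n * n)]
    for l in range(n):
        for m in range(n):
            for p in range(n):
                for q in range(n):
                    # Output pixel (l, m) reads input pixel (l+p, m+q) with weight K[p][q].
                    A[l * n + m][((l + p) % n) * n + (m + q) % n] = K[p][q]
    return A


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def flatten(M):
    """Row-major flattening (an assumption of this sketch)."""
    return [v for row in M for v in row]


# Illustrative data: a 4 x 4 input and a 3 x 3 filter embedded in a 4 x 4 matrix.
X = [[float(4 * a + b) for b in range(4)] for a in range(4)]
K = embed_filter([[1, 0, -1], [2, 0, -2], [1, 0, -1]], 4)
A = doubly_block_circulant(K)
y_matrix = matvec(A, flatten(X))
y_direct = flatten(circular_conv2d(X, K))
```

Here `y_matrix` and `y_direct` agree, and every row of `A` contains each filter coefficient exactly once, which is the property that the later norm arguments rely on.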
Single Fourier Attack
As mentioned above, convolution can be written using a doubly block circulant matrix. Such matrices always have the eigenvectors $Q = \frac{1}{n} F \otimes F$, where the elements of $F$ are composed of the Fourier basis $F_{l,m} = \exp\left(j \frac{2\pi}{n} l m\right)$ with $j = \sqrt{-1}$ [Jain1989, Sedghi, Gupta, and Long2019, Tsuzuku and Sato2019], and the singular vectors are also composed of the columns of $Q$ even if we stack convolution layers [Tsuzuku and Sato2019, Karner, Schneid, and Ueberhuber2003]. Based on these characteristics, Tsuzuku and Sato (2019) proposed SFA. The input image perturbed by SFA is
where $(Q)_i$ is the $i$-th column vector of $Q$, $x$ is an input image, and $\varepsilon$ is the magnitude of the attack. SFA is composed of a column of $Q$ and its complex conjugate so that the perturbation is real-valued, since inputs of CNNs are assumed to be real. The frequency indices that select the column are hyperparameters. Figure 1 shows examples of CIFAR10 images perturbed by SFA. We can see that these indices determine the spatial frequency of the noise. Note that stacked convolution layers without activation functions also have singular vectors composed of Fourier basis functions. Even though we use nonlinear activation functions, many model architectures (e.g., WideResNet, DenseNet-BC, and GoogLeNet) are sensitive to SFA [Tsuzuku and Sato2019].
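To make the construction concrete, the following sketch generates a real-valued single-frequency perturbation and adds it to a grayscale image. It is an illustration under stated assumptions rather than the reference implementation of SFA: the function names and the frequency arguments `l`, `m` are hypothetical, the pattern is taken as the real part of $(1 + j)$ times a 2-D Fourier basis element (a basis element plus its complex conjugate), the perturbation is rescaled so that its largest absolute value equals `eps`, and pixel values are assumed to lie in $[0, 1]$.

```python
import math


def sfa_perturbation(n, l, m, eps):
    """Real single-frequency pattern at frequency (l, m), scaled so that max |value| == eps."""
    raw = []
    for a in range(n):
        row = []
        for b in range(n):
            theta = 2.0 * math.pi * (l * a + m * b) / n
            # (1+j)e^{j theta} + (1-j)e^{-j theta} = 2 (cos(theta) - sin(theta)); the factor 2
            # disappears after rescaling.
            row.append(math.cos(theta) - math.sin(theta))
        raw.append(row)
    peak = max(abs(v) for row in raw for v in row)
    return [[eps * v / peak for v in row] for row in raw]


def apply_sfa(image, l, m, eps, lo=0.0, hi=1.0):
    """Add the pattern to a square grayscale image and clip to the valid pixel range."""
    n = len(image)
    delta = sfa_perturbation(n, l, m, eps)
    return [[min(max(image[a][b] + delta[a][b], lo), hi) for b in range(n)]
            for a in range(n)]


# Illustrative usage on a flat gray 8 x 8 image.
image = [[0.5] * 8 for _ in range(8)]
perturbed = apply_sfa(image, l=2, m=3, eps=0.1)
```

Note that the perturbation depends only on the image size and the chosen frequency, not on the image content or on any model parameter, which is why the attack is universal and black-box.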
Vulnerability of CNNs in Frequency Domain
Sensitivity to SFA can be regarded as sensitivity to single-frequency noise [Yin et al.2019]. To understand the vulnerability of CNNs, several studies focused on the sensitivity of CNNs in the frequency domain [Yin et al.2019, Wang et al.2019, Das et al.2018, Liu et al.2019]. These studies point out that sensitivity to high-frequency components in images is one of the causes of adversarial attacks since, unlike CNNs, human visual systems are not sensitive to high-frequency components. In fact, several studies show that CNNs are sensitive to high-frequency noise [Jo and Bengio2017, Wang et al.2019, Yin et al.2019, Das et al.2018]. Jo and Bengio (2017) and Wang et al. (2019) show that CNNs misclassify images processed by low-pass filters, and Wang et al. (2019) call this a High-Frequency attack, which is a simple black-box adversarial attack. There is a hypothesis that CNNs that are robust against high-frequency noise are also robust against adversarial attacks [Wang et al.2019, Yin et al.2019]. Note that Wang et al. (2019) claimed that sensitivity in the high-frequency domain contributes to high performance on clean data; thus, there is a trade-off.
Adversarial attacks can be transferred to other models, and transferred white-box attacks become adversarial black-box attacks [Papernot et al.2017]. These attacks can be defended against by adversarial training, which is a promising defense method [Papernot et al.2017, Madry et al.2018]. However, the computational cost of adversarial training is larger than that of naive training. Note that Absum can be used with adversarial training. Several studies proposed black-box attacks that query the target model for the predicted labels of given data, but these attacks might still be impractical since they require many queries [Chen et al.2017, Brendel, Rauber, and Bethge2018, Ilyas et al.2018]. On the other hand, SFA only uses the information that the target model is composed of CNNs and is thus more practical.
Our method simply penalizes parameters in a manner similar to standard regularization methods. Among standard regularization methods, $L_2$ regularization (weight decay) is commonly used for improving generalization performance due to its simplicity. $L_1$ regularization is also used since it induces sparsity [Goodfellow, Bengio, and Courville2016]. In addition, spectral norm (induced 2-norm) regularization can also improve generalization performance [Yoshida and Miyato2017, Sedghi, Gupta, and Long2019]. Due to space limitations, we outline other, less relevant studies in the appendix.
Defense Methods against SFA
In this section, we first show that SNC can improve robustness against SFA. Since SNC has a large time complexity, we next discuss whether standard regularizations can be alternatives. Finally, we discuss Absum and its proximal operator, which is an efficient defense method against SFA.
Spectral Norm Constraint
SFA is based on the following property of linear transforms:
$$\max_{\|x\|_2 \le 1} \|Ax\|_2 = \|A v_1\|_2 = \sigma_1, \tag{7}$$
where $\sigma_1$ is the largest singular value of $A$ (spectral norm or induced 2-norm), and $v_1$ is the right singular vector corresponding to $\sigma_1$. Equation (7) shows that the singular vector can be the worst noise for a linear transform, and SFA uses the singular vectors of convolutional layers. Since the spectral norm determines the impact of SFA, we can reduce sensitivity to SFA by constraining the spectral norm. The constraint of the spectral norm for CNNs (i.e., SNC) [Sedghi, Gupta, and Long2019, Gouk et al.2018] was proposed in the context of improving generalization performance. SNC clips $\sigma_1$ if it exceeds a preset threshold; thus, it can directly control sensitivity to a single-frequency perturbation. However, constraining the exact spectral norm of $A$ incurs a large computational cost (Footnote 2: The spectral norm in spectral norm regularization [Yoshida and Miyato2017] is often quite different from that of [Sedghi, Gupta, and Long2019, Gouk et al.2018].); it takes $O(n^2 c^2 (c + \log n))$ time for each convolution when the input size is $n \times n$ and the numbers of input and output channels are $c$, even if we use the efficient spectral norm constraint [Sedghi, Gupta, and Long2019]. SNC can be infeasible when the size of inputs increases.
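The link between singular values and the frequency response can be illustrated for the single-channel case. The sketch below is an assumption-laden illustration, not the SNC algorithm itself: function names are hypothetical, a single input and output channel with circular convolution is assumed, and clipping is not implemented. It computes the magnitudes of the 2-D discrete Fourier transform of the zero-padded filter, takes their maximum as the spectral norm, and checks that a Fourier-basis input is amplified by exactly the corresponding magnitude.

```python
import cmath


def circular_conv2d(X, K):
    """Y[l][m] = sum over p, q of K[p][q] * X[(l+p) % n][(m+q) % n]."""
    n = len(X)
    return [[sum(K[p][q] * X[(l + p) % n][(m + q) % n]
                 for p in range(n) for q in range(n))
             for m in range(n)]
            for l in range(n)]


def frequency_response(K):
    """|sum over p, q of K[p][q] * exp(2*pi*j*(u*p + v*q)/n)| for every frequency (u, v)."""
    n = len(K)
    return [[abs(sum(K[p][q] * cmath.exp(2j * cmath.pi * (u * p + v * q) / n)
                     for p in range(n) for q in range(n)))
             for v in range(n)]
            for u in range(n)]


def spectral_norm(K):
    """Largest singular value of the single-channel circular convolution with filter K."""
    return max(max(row) for row in frequency_response(K))


def fourier_input(n, u, v):
    """2-D Fourier basis image at frequency (u, v)."""
    return [[cmath.exp(2j * cmath.pi * (u * a + v * b) / n) for b in range(n)]
            for a in range(n)]


def gain(K, X):
    """Ratio of output to input Euclidean norm for the convolution with K."""
    def norm(Z):
        return sum(abs(z) ** 2 for row in Z for z in row) ** 0.5
    return norm(circular_conv2d(X, K)) / norm(X)


# Illustrative 3 x 3 filter zero-padded to 4 x 4.
K = [[1.0, 0.0, -1.0, 0.0],
     [2.0, 0.0, -2.0, 0.0],
     [1.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]]
response = frequency_response(K)
sigma_1 = spectral_norm(K)
```

The direct summation used here costs $O(n^4)$ and is meant only for small examples; a fast Fourier transform would be used in practice, and multi-channel layers additionally require a singular value decomposition per frequency.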
Standard Regularizations Fail to Defend
Instead of using the spectral norm, we can assess the effect of the perturbation on a linear transform by using
$$\max_{\|x\|_\infty \le 1} \|Ax\|_\infty. \tag{8}$$
Equation (8) is the induced $\infty$-norm $\|A\|_\infty$, and we have $\sigma_1 \le \|A\|_\infty$ for convolution (proved in the appendix). This norm is calculated as
$$\|A\|_\infty = \max_i \sum_j |A_{i,j}| = \sum_{l} \sum_{m} |k_{l,m}|. \tag{9}$$
Thus, penalizing the induced $\infty$-norm amounts to $L_1$ regularization [Gouk et al.2018]. Therefore, $L_1$ regularization can improve robustness. However, the induced $\infty$-norm is a conservative measure of robustness [Szegedy et al.2013]; the heavily weighted regularization required for robustness can prevent minimization of the loss function. Figure 2 plots the test accuracy on data perturbed by SFA of models trained with $L_1$ regularization against the regularization weight $\lambda$. In this figure, the robust accuracy against SFA increases along with the regularization weight, i.e., the robustness increases with the regularization weight. However, the accuracy significantly decreases when the weight exceeds a certain point. This is because training with heavily weighted $L_1$ regularization does not leave a sufficient search space to minimize the loss function. Note that weight decay can also penalize the spectral norm (see the appendix) and imposes tight regularization, as discussed in the experiments section. Therefore, we need a weak regularization method such that models become both highly robust and accurate.
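For intuition, the bound of the spectral norm by the induced $\infty$-norm can be sketched with a standard matrix-norm inequality. This is only an outline under the assumption that the zero-padded filter is $n \times n$ and $A$ is doubly block circulant, so that each row and each column of $A$ contains every coefficient $k_{p,q}$ exactly once; the proof in the paper's appendix may proceed differently.

```latex
\begin{align*}
\sigma_1(A)^2 = \|A\|_2^2 &\le \|A\|_1 \, \|A\|_\infty, \\
\|A\|_\infty = \max_i \sum_j |A_{i,j}| &= \sum_{p=0}^{n-1} \sum_{q=0}^{n-1} |k_{p,q}|, \\
\|A\|_1 = \max_j \sum_i |A_{i,j}| &= \sum_{p=0}^{n-1} \sum_{q=0}^{n-1} |k_{p,q}|, \\
\Rightarrow \quad \sigma_1(A) &\le \sum_{p=0}^{n-1} \sum_{q=0}^{n-1} |k_{p,q}| = \|A\|_\infty .
\end{align*}
```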
Absum: Simple and Weak Regularization
To develop a weak regularization method, we reconsider the optimization problem of eq. (8). The maximum (eq. (9)) is attained when each element of $x$ equals the sign of the filter coefficient it multiplies, i.e., $1$ if the coefficient is positive and $-1$ if it is negative. However, we should consider the sign of the input in practice because we usually use ReLUs as activation functions. As described in eq. (3), ReLUs are applied before convolution. Thus, $x$ cannot have negative elements, i.e., an element of $x$ cannot be $-1$ even when the corresponding coefficient is negative. Therefore, the induced $\infty$-norm can overestimate sensitivity to the perturbation. From this insight, we consider the norm of $Ax$ when $x = \mathbf{1}$ (the all-ones vector) instead of eq. (8):
$$\|A\mathbf{1}\|_\infty = \Bigl| \sum_{l} \sum_{m} k_{l,m} \Bigr|.$$
For robustness, we use this value as the regularization term. We call our method Absum since this value is the absolute value of the summation of the filter coefficients.
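The objective function of training with Absum is
$$\min_{\theta} \; \frac{1}{N} \sum_{p=1}^{N} f(x_p, y_p, \theta) + \lambda \sum_{i=1}^{c} \Bigl| \sum_{l} \sum_{m} k^{(i)}_{l,m} \Bigr|, \tag{12}$$
where $f$ is a loss function, $x_p$ and $y_p$ are the $p$-th training image and label, respectively, $N$ is the number of training samples, $\theta$ is the parameter vector including $K^{(i)}$ in the model, and $\lambda$ is a regularization weight. $K^{(i)}$ is the filter matrix of the $i$-th convolution, and $c$ is the number of convolution filters. (Footnote 3: We penalize the filter matrix for each channel. If one convolution layer has $c_{\mathrm{out}}$ output channels and $c_{\mathrm{in}}$ input channels, the regularization term is the sum of the Absum penalties over all $c_{\mathrm{out}} c_{\mathrm{in}}$ filter matrices.) Figure 3 shows the search spaces of Absum (blue) and $L_1$ regularization (red) when we have two parameters. The constraint of Absum is looser than that of $L_1$ regularization because a coefficient with a large magnitude is allowed as long as the other coefficient nearly cancels it in the sum. Even as the regularization weight grows without bound, the search space of Absum remains a hyperplane (the set of filters whose coefficients sum to zero), while that of $L_1$ regularization collapses to a single point (the zero filter). Note that the search space of weight decay also collapses to this point in the same limit. Therefore, the loss function with Absum can be lower than that with $L_1$ regularization and weight decay if we use a large $\lambda$.
Note that when the filter size is $h \times h$ and $h < n$, we only need to sum the $h \times h$ filter coefficients since the zeros padded in $K$ do not affect eq. (12) (hereafter, $K$ denotes the $h \times h$ filter instead of its zero-padded $n \times n$ version).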
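The difference between the Absum penalty and the $L_1$ penalty can be seen on a small weight tensor. The following sketch is illustrative only: the function names are hypothetical, and the layout `weight[o][i]` (one $h \times h$ filter per pair of output channel `o` and input channel `i`) is an assumption of this example. The penalty is computed per filter matrix, as described in Footnote 3.

```python
def absum_penalty(weight):
    """Sum over all (output, input) channel pairs of |sum of the h x h filter coefficients|."""
    return sum(abs(sum(v for row in filt for v in row))
               for out_filters in weight
               for filt in out_filters)


def l1_penalty(weight):
    """Sum of the absolute values of all coefficients."""
    return sum(abs(v)
               for out_filters in weight
               for filt in out_filters
               for row in filt
               for v in row)


# One output channel, two input channels, 2 x 2 filters (illustrative values).
weight = [
    [
        [[1.0, -1.0], [-1.0, 1.0]],  # coefficients cancel: Absum 0, L1 4
        [[0.5, 0.5], [0.5, 0.5]],    # all positive: Absum 2, L1 2
    ],
]
absum_value = absum_penalty(weight)  # 2.0
l1_value = l1_penalty(weight)        # 6.0
```

The first filter is an edge-like filter with large coefficients, yet it incurs no Absum penalty because its coefficients sum to zero; this is the sense in which the Absum constraint is looser than $L_1$ regularization.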
Proximal Operator for Absum
Since the Absum penalty is not differentiable where the filter coefficients sum to zero, the gradient method might not be effective for minimizing eq. (12). To minimize eq. (12), we use a proximal gradient method, which can minimize a differentiable loss function with a non-differentiable regularization term [Parikh, Boyd, and others2014]. We now introduce the proximal operator for Absum. For clarity, let be . The proximal operator for is
The following lemmas show that eq. (13) is the proximal operator for Absum:
If , is a convex function.
If , and , we have
The proofs of the lemmas are provided in the appendix. Lemma 1 shows that we can use the proximal gradient method, and Lemma 2 shows that the proximal operator of Absum can be obtained in the closed form of eq. (15). By using the proximal operator after stochastic gradient descent (SGD), we update the $i$-th convolution filter:
where $\eta$ is a learning rate, and $B$ is a minibatch size. We provide the pseudocode of the whole training procedure in the appendix. We can compute the proximal operator in $O(h^2)$ time for each convolution filter when the filter size is $h \times h$ because we only need to compute the summation of the parameters and elementwise operations. We can also compute weight decay and $L_1$ regularization in $O(h^2)$ time since the number of parameters in each convolution filter is $h^2$. Therefore, the order of the computational complexity of Absum is the same as those of weight decay and $L_1$ regularization. When we have $c_{\mathrm{in}}$ input channels and $c_{\mathrm{out}}$ output channels, the computational costs of Absum, weight decay, and $L_1$ regularization are $O(c_{\mathrm{in}} c_{\mathrm{out}} h^2)$, which is less than that of SNC, where $h \le n$.
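A proximal step for the Absum penalty can be sketched as follows. This closed form is derived here from the standard definition of a proximal operator, $\mathrm{prox}(\bar{k}) = \arg\min_k \frac{1}{2}\|k - \bar{k}\|_2^2 + t\,|\sum_i k_i|$ with $t = \eta\lambda$, and is offered as an illustration; the paper's own equations may use different notation. The function names, the flat-list representation of a filter, and the assumption that the minibatch-averaged gradient is already available are all hypothetical choices of this sketch.

```python
def prox_absum(coeffs, threshold):
    """Proximal operator of threshold * |sum(coeffs)| for a flattened filter."""
    count = len(coeffs)
    total = sum(coeffs)
    if total > count * threshold:
        shift = threshold           # sum stays positive: shrink every coefficient
    elif total < -count * threshold:
        shift = -threshold          # sum stays negative: grow every coefficient
    else:
        shift = total / count       # project onto the hyperplane sum(coeffs) == 0
    return [k - shift for k in coeffs]


def sgd_prox_step(coeffs, grad, lr, lam):
    """One SGD step on the loss followed by the Absum proximal step."""
    stepped = [k - lr * g for k, g in zip(coeffs, grad)]
    return prox_absum(stepped, lr * lam)


def prox_objective(k, k_bar, threshold):
    """Objective that prox_absum(k_bar, threshold) minimizes over k."""
    return (0.5 * sum((a - b) ** 2 for a, b in zip(k, k_bar))
            + threshold * abs(sum(k)))


# Illustrative usage on a flattened 2 x 2 filter.
k_bar = [0.9, -0.4, 0.2, 0.1]
k_new = prox_absum(k_bar, 0.1)
```

Every coefficient is shifted by the same amount, so only one summation and elementwise operations are needed, which is consistent with the $O(h^2)$ cost stated above. Unlike soft-thresholding for $L_1$ regularization, the step does not drive individual coefficients to zero.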
Note that the loss function for training deep neural networks is usually non-convex, while the Absum regularization term is convex. Several studies investigate the proximal gradient method when the loss function is non-convex [Li and Lin2015], and [NIPS2016_6504] uses the proximal gradient method for inducing sparse structures in deep learning. We observed that the algorithm of Absum can find a good parameter point during the experiments.
We first discuss the evaluation of the effectiveness of SNC and Absum in improving robustness against SFA. Next, we show that Absum is more efficient than SNC, especially when the input images and models are large. Finally, as a further investigation, we discuss the evaluation of the performance of Absum and SNC in terms of robustness against transferred attacks, vulnerability in the frequency domain, and robustness against PGD when used with adversarial training. To evaluate effectiveness, we conducted image recognition experiments on MNIST [LeCun et al.1998], FMNIST [Xiao, Rasul, and Vollgraf2017], CIFAR10, CIFAR100 [Krizhevsky and Hinton2009], and SVHN [Netzer et al.2011]. We compared Absum and SNC with standard regularizations (weight decay (WD) and $L_1$ regularization).
We provide details of the experimental conditions in the appendix. In all experiments, we selected the best regularization weight from among for Absum and standard regularization methods, and the best spectral norm from among for SNC. In SNC, we clipped once in 100 iterations due to the large computational cost. For MNIST and FMNIST, we stacked two convolutional layers and two fully connected layers and used ReLUs as activation functions. For CIFAR10, CIFAR100, and SVHN, the model architecture was ResNet-18 [He et al.2016]. We used SFA with and on MNIST and FMNIST, and and on CIFAR10, CIFAR100, and SVHN.
In addition, we used PGD to evaluate robustness against transferred attacks and white-box attacks since PGD is a sophisticated white-box attack. In addition to naive training, we evaluated robustness against PGD when we used adversarial training [Kurakin, Goodfellow, and Bengio2016, Madry et al.2018] with each method because Absum can be used with it due to its simplicity. Model architectures were the same as in the experiments involving SFA. The hyperparameter settings for PGD were based on [Madry et al.2018]. The norm of the perturbation was set to for MNIST and FMNIST and for CIFAR10, CIFAR100, and SVHN at training time. For PGD, we updated the perturbation for 40 iterations with a step size of 0.01 on MNIST and FMNIST at training and evaluation times, and on CIFAR10, CIFAR100, and SVHN, for 7 iterations with a step size of 2/255 at training time and 100 iterations with the same step size at evaluation time.
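The PGD update described above (a signed gradient step followed by projection onto the $\ell_\infty$ ball around the clean input) can be sketched on a toy model. This is a deliberately simplified illustration, not the experimental setup: the target is a hypothetical two-feature logistic-regression model with an analytic input gradient instead of a CNN, the random start used by Madry et al. is omitted, and the perturbation size `eps=0.1` is an arbitrary value chosen for the example. Only the step size 0.01 and the 40 iterations mirror the MNIST setting stated in the text.

```python
import math


def sign(v):
    return (v > 0) - (v < 0)


def logistic_loss(x, y, w, b):
    """Cross-entropy loss of a logistic-regression model for label y in {0, 1}."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def pgd_linf(x, y, w, b, eps, step, iters, lo=0.0, hi=1.0):
    """L-infinity PGD: ascend the loss, then project back into the eps-ball and pixel range."""
    x_adv = list(x)
    for _ in range(iters):
        z = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        # Gradient of the cross-entropy loss with respect to the input is (p - y) * w.
        grad = [(p - y) * wi for wi in w]
        for i, g in enumerate(grad):
            moved = x_adv[i] + step * sign(g)
            moved = min(max(moved, x[i] - eps), x[i] + eps)  # project onto the eps-ball
            x_adv[i] = min(max(moved, lo), hi)               # keep a valid pixel value
    return x_adv


# Illustrative toy problem (all values are assumptions of this sketch).
x, y, w, b = [0.5, 0.5], 1, [2.0, -3.0], 0.0
x_adv = pgd_linf(x, y, w, b, eps=0.1, step=0.01, iters=40)
```

In adversarial training, examples such as `x_adv` replace the clean inputs in each minibatch, so a regularizer such as Absum can be combined with it by applying the usual parameter update to the adversarial loss.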
Effectiveness and Efficiency
Robustness against SFA
Table 1 lists the accuracies of each method on test data perturbed by SFA, together with the selected regularization weight and spectral norm threshold. In this table, Avg. denotes robust accuracies against SFA averaged over the SFA frequency hyperparameters, and Min. denotes the minimum accuracies among these hyperparameters, i.e., robust accuracies against optimized SFA. CLN denotes accuracies on clean data. The regularization weight and the spectral norm threshold are selected so that Avg. becomes the highest. In Tab. 1, Absum and SNC are more robust against SFA than WD and $L_1$ regularization. Although SNC is more robust than Absum on CIFAR10 and CIFAR100, the clean accuracies of SNC are lower than those of Absum, and the computation time of SNC is larger than that of Absum, as discussed below. In the appendix, we provide accuracies for each frequency setting and the results in which the hyperparameters are selected so that each of CLN and Min. becomes the highest.
Figure 4 shows the test accuracies of the methods on MNIST and CIFAR10 perturbed by SFA against regularization weights. In this figure, min and max denote the minimum and maximum test accuracies among the SFA frequency hyperparameters, respectively, and avg. denotes test accuracies averaged over them. All methods tend to increase their minimum accuracy (results of SFA with optimized frequency hyperparameters) as the regularization weight increases. However, $L_1$ regularization and WD significantly decrease in accuracy once the regularization weight exceeds a certain value. On the other hand, Absum with a high regularization weight does not decrease in accuracy. Figure 5 shows the lowest training loss in training on CIFAR10 against the regularization weight. WD and $L_1$ regularization with a large regularization weight prevent minimization of the training loss. On the other hand, Absum with a large regularization weight can still decrease the training loss because its search space remains a hyperplane rather than a single point even for a very large weight. In conclusion, standard regularization methods might not be effective in improving robustness against SFA because a high regularization weight imposes constraints that are too tight to minimize the loss function. On the other hand, Absum imposes looser constraints; thus, we can improve robustness while maintaining classification performance. The results on the other datasets are almost the same as those in Fig. 4 (included in the appendix). We also provide figures showing the accuracy and the training loss of SNC against the spectral norm threshold in the appendix.
To confirm the efficiency of Absum, we evaluated the runtime for one epoch. We also evaluated the runtime of the forward and backward processes of ResNet-18 for one image as the input size increases by using random synthetic three-channel images whose sizes were 32×32, 64×64, 128×128, 256×256, 512×512, and 1024×1024 with 10 random labels. The results are shown in Fig. 6. As shown in Fig. 6 (a), Absum is about ten times faster than SNC on image datasets with ResNet-18. The runtime of SNC is comparable to those of other methods on MNIST and FMNIST because we use only two convolution layers, and the image sizes of these datasets are smaller than those of the other datasets. In Fig. 6 (b), the runtime of Absum does not increase significantly compared with that of SNC, and the increase in the runtime of Absum is similar to those of standard regularization methods. This is because the computational cost of Absum does not depend on the size of input images. Since SNC incurs a large computational cost that depends on the input size, we could not evaluate its runtime when the image width is larger than 256.
Extensive Empirical Investigation
Robustness against Transferred Attacks
Sensitivity to SFA is caused by the convolution operation and is universal for CNNs. This sensitivity might be a cause of the transferability of adversarial attacks, and CNNs that are robust against SFA can be robust against transferred attacks. To confirm this hypothesis, we investigate sensitivity to transferred PGD. We generate adversarial examples by using substitute models that were trained under the same setting as that presented in the previous section but with different random initializations. We used these substitute models rather than completely different models because they can be regarded as one of the worst-case instances for transferred attacks [Madry et al.2018]. The accuracies on these adversarial examples are listed in Tab. 2. Absum and SNC improve robustness compared with WD and $L_1$ regularization. Tables 1 and 2 imply that a method that improves robustness against SFA can also improve robustness against transferred attacks. This is the first study that shows the relation between robustness against SFA and robustness against transferred white-box attacks.
Sensitivity in Frequency Domain
Several studies show that, unlike human visual systems, CNNs are sensitive to high-frequency noise since CNNs are biased towards high-frequency information [Wang et al.2019, Yin et al.2019]. Given their robustness against SFA, which is regarded as single-frequency noise, Absum and SNC can be expected not to bias CNNs towards high-frequency information. To confirm this hypothesis, we first investigate the power spectra of adversarial perturbations of models trained using each method. Next, we investigate robustness against High-Frequency attacks, which remove high-frequency components of image data. High-Frequency attacks have a radius hyperparameter that determines the cutoff frequency, and we set it to half the image width. In these experiments, the regularization weights and spectral norm thresholds are the same as those in Tab. 1.
Figure 7 shows the power spectra of PGD perturbations on CIFAR10, and Tab. 3 lists the accuracies on the test data processed by High-Frequency attacks. In Fig. 7, we shift low-frequency components to the center of the spectrum, and the power spectra are averaged over test data and RGB channels. This figure shows that the vulnerabilities of WD and $L_1$ regularization are biased towards the high-frequency domain, while the vulnerability of SNC is highly biased towards the low-frequency domain. The power spectrum of Absum is not biased towards a specific frequency domain. Due to these characteristics, SNC and Absum are more robust against High-Frequency attacks than WD and $L_1$ regularization (Tab. 3). Since human visual systems can perceive low-frequency noise better than high-frequency noise, attacks on models trained with Absum and SNC might be more perceptible than attacks on models trained with WD and $L_1$ regularization. Note that Absum is more robust against high-pass filtering than SNC, as presented in the appendix. This result supports the view that Absum does not bias CNNs towards a specific frequency domain while SNC biases CNNs towards the low-frequency domain.
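The low-pass filtering behind a High-Frequency attack can be sketched as follows. This is an illustration under stated assumptions and not the procedure of Wang et al.: the function names are hypothetical, the image is a small square grayscale array, the radius is measured as the Euclidean distance from the zero frequency in signed frequency coordinates, and a direct $O(n^4)$ discrete Fourier transform is used so that the example depends only on the standard library.

```python
import cmath


def dft2(img, inverse=False):
    """Direct 2-D DFT of a square array (inverse includes the 1/n^2 factor)."""
    n = len(img)
    sgn = 1 if inverse else -1
    out = [[sum(img[a][b] * cmath.exp(sgn * 2j * cmath.pi * (u * a + v * b) / n)
                for a in range(n) for b in range(n))
            for v in range(n)]
           for u in range(n)]
    if inverse:
        out = [[c / (n * n) for c in row] for row in out]
    return out


def low_pass(img, radius):
    """Zero every frequency component farther than `radius` from the zero frequency."""
    n = len(img)
    spec = dft2(img)
    for u in range(n):
        for v in range(n):
            # Indices above n/2 represent negative frequencies.
            fu = u if u <= n // 2 else u - n
            fv = v if v <= n // 2 else v - n
            if (fu * fu + fv * fv) ** 0.5 > radius:
                spec[u][v] = 0
    return [[c.real for c in row] for row in dft2(spec, inverse=True)]


# Illustrative inputs: a flat image and a checkerboard (the highest spatial frequency).
flat = [[0.3] * 8 for _ in range(8)]
checker = [[(-1.0) ** (a + b) for b in range(8)] for a in range(8)]
flat_filtered = low_pass(flat, radius=4)
checker_filtered = low_pass(checker, radius=4)
```

With the radius set to half the image width, the flat image passes through unchanged while the checkerboard pattern is removed entirely, which shows the kind of information a model must not rely on to remain accurate under this attack.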
Robustness against PGD with Adversarial Training
Table 4 lists the accuracies of models trained by adversarial training on data perturbed by PGD. When using adversarial training, Absum achieves the highest robustness against PGD among the regularization methods on almost all datasets. This implies that sensitivity to SFA is one of the causes of the vulnerabilities of CNNs. The regularization weight selected for Absum tends to be higher than those for WD and $L_1$ regularization; thus, owing to its looseness, Absum can also improve robustness against PGD without deteriorating classification performance. Note that Absum does not improve robustness against PGD without adversarial training since the structural sensitivity of CNNs does not necessarily cause all vulnerabilities of CNN-based models (we discuss this in the appendix). Even so, Absum is more effective than other standard regularizations since it can efficiently improve robustness against black-box attacks (SFA, transferred attacks, and High-Frequency attacks) and enhance adversarial training, as mentioned above.
We proposed Absum, an efficient defense method against SFA that can reduce the structural sensitivity of CNNs with ReLUs while its computational cost remains comparable to those of standard regularizations. By reducing the structural sensitivity, Absum can improve robustness against not only SFA but also transferred PGD and High-Frequency attacks. Due to its simplicity, Absum can be used with other methods, and it can enhance adversarial training with PGD.
- [Athalye, Carlini, and Wagner2018] Athalye, A.; Carlini, N.; and Wagner, D. 2018. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In Proc. ICML, 274–283.
- [Brendel, Rauber, and Bethge2018] Brendel, W.; Rauber, J.; and Bethge, M. 2018. Decision-based adversarial attacks: Reliable attacks against black-box machine learning models. In Proc. ICLR.
- [Carlini and Wagner2017] Carlini, N., and Wagner, D. 2017. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), 39–57. IEEE.
- [Chen et al.2017] Chen, P.-Y.; Zhang, H.; Sharma, Y.; Yi, J.; and Hsieh, C.-J. 2017. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 15–26. ACM.
- [Cisse et al.2017] Cisse, M.; Bojanowski, P.; Grave, E.; Dauphin, Y.; and Usunier, N. 2017. Parseval networks: Improving robustness to adversarial examples. In Proc. ICML, 854–863.
- [Das et al.2018] Das, N.; Shanbhogue, M.; Chen, S.-T.; Hohman, F.; Li, S.; Chen, L.; Kounavis, M. E.; and Chau, D. H. 2018. Shield: Fast, practical defense and vaccination for deep learning using jpeg compression. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 196–204. ACM.
- [Dhillon et al.2018] Dhillon, G. S.; Azizzadenesheli, K.; Lipton, Z. C.; Bernstein, J.; Kossaifi, J.; Khanna, A.; and Anandkumar, A. 2018. Stochastic activation pruning for robust adversarial defense. Proc. ICLR.
- [Ding, Wang, and Jin2019] Ding, G. W.; Wang, L.; and Jin, X. 2019. AdverTorch v0.1: An adversarial robustness toolbox based on pytorch. arXiv preprint arXiv:1902.07623.
- [Goodfellow, Bengio, and Courville2016] Goodfellow, I.; Bengio, Y.; and Courville, A. 2016. Deep learning. MIT press.
- [Goodfellow, Shlens, and Szegedy2014] Goodfellow, I.; Shlens, J.; and Szegedy, C. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
- [Gouk et al.2018] Gouk, H.; Frank, E.; Pfahringer, B.; and Cree, M. 2018. Regularisation of neural networks by enforcing lipschitz continuity. arXiv preprint arXiv:1804.04368.
- [He et al.2016] He, K.; Zhang, X.; Ren, S.; and Sun, J. 2016. Deep residual learning for image recognition. In Proc. CVPR, 770–778.
- [Ilyas et al.2018] Ilyas, A.; Engstrom, L.; Athalye, A.; and Lin, J. 2018. Black-box adversarial attacks with limited queries and information. In Proc. ICML, 2137–2146.
- [Ioffe and Szegedy2015] Ioffe, S., and Szegedy, C. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proc. ICML, 448–456.
- [Jain1989] Jain, A. K. 1989. Fundamentals of Digital Image Processing. Prentice-Hall.
- [Jo and Bengio2017] Jo, J., and Bengio, Y. 2017. Measuring the tendency of cnns to learn surface statistical regularities. arXiv preprint arXiv:1711.11561.
- [Karner, Schneid, and Ueberhuber2003] Karner, H.; Schneid, J.; and Ueberhuber, C. W. 2003. Spectral decomposition of real circulant matrices. Linear Algebra and Its Applications 367:301–311.
- [Krizhevsky and Hinton2009] Krizhevsky, A., and Hinton, G. 2009. Learning multiple layers of features from tiny images. Technical report.
- [Kurakin, Goodfellow, and Bengio2016] Kurakin, A.; Goodfellow, I.; and Bengio, S. 2016. Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236.
- [LeCun et al.1989] LeCun, Y.; Boser, B.; Denker, J. S.; Henderson, D.; Howard, R. E.; Hubbard, W.; and Jackel, L. D. 1989. Backpropagation applied to handwritten zip code recognition. Neural computation 1(4):541–551.
- [LeCun et al.1998] LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P.; et al. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11):2278–2324.
- [Li and Lin2015] Li, H., and Lin, Z. 2015. Accelerated proximal gradient methods for nonconvex programming. In Proc. NIPS, 379–387.
- [Liu et al.2019] Liu, Z.; Liu, Q.; Liu, T.; Xu, N.; Lin, X.; Wang, Y.; and Wen, W. 2019. Feature distillation: Dnn-oriented jpeg compression against adversarial examples. In Proc. CVPR, 860–868.
- [Madry et al.2018] Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; and Vladu, A. 2018. Towards deep learning models resistant to adversarial attacks. In Proc. ICLR.
- [Moosavi-Dezfooli, Fawzi, and Frossard2016] Moosavi-Dezfooli, S.-M.; Fawzi, A.; and Frossard, P. 2016. Deepfool: a simple and accurate method to fool deep neural networks. In Proc. CVPR, 2574–2582.
- [Nair and Hinton2010] Nair, V., and Hinton, G. E. 2010. Rectified linear units improve restricted boltzmann machines. In Proc. ICML, 807–814. Omnipress.
- [Netzer et al.2011] Netzer, Y.; Wang, T.; Coates, A.; Bissacco, A.; Wu, B.; and Ng, A. Y. 2011. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning.
- [Papernot et al.2016] Papernot, N.; McDaniel, P.; Wu, X.; Jha, S.; and Swami, A. 2016. Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP), 582–597. IEEE.
- [Papernot et al.2017] Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z. B.; and Swami, A. 2017. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, 506–519. ACM.
- [Papernot, McDaniel, and Goodfellow2016] Papernot, N.; McDaniel, P.; and Goodfellow, I. 2016. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277.
- [Parikh, Boyd, and others2014] Parikh, N.; Boyd, S.; et al. 2014. Proximal algorithms. Foundations and Trends® in Optimization 1(3):127–239.
- [Radford, Metz, and Chintala2016] Radford, A.; Metz, L.; and Chintala, S. 2016. Unsupervised representation learning with deep convolutional generative adversarial networks. Proc. ICLR.
- [Sedghi, Gupta, and Long2019] Sedghi, H.; Gupta, V.; and Long, P. M. 2019. The singular values of convolutional layers. In Proc. ICLR.
- [Srivastava et al.2014] Srivastava, N.; Hinton, G. E.; Krizhevsky, A.; Sutskever, I.; and Salakhutdinov, R. 2014. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15(1):1929–1958.
- [Szegedy et al.2013] Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; and Fergus, R. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
- [Tsuzuku and Sato2019] Tsuzuku, Y., and Sato, I. 2019. On the structural sensitivity of deep convolutional networks to the directions of fourier basis functions. Proc. CVPR.
- [Tsuzuku, Sato, and Sugiyama2018] Tsuzuku, Y.; Sato, I.; and Sugiyama, M. 2018. Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks. In Proc. NIPS, 6542–6551.
- [Vaswani et al.2017] Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. Attention is all you need. In Proc. NIPS, 5998–6008.
- [Wang et al.2019] Wang, H.; Wu, X.; Yin, P.; and Xing, E. P. 2019. High frequency component helps explain the generalization of convolutional neural networks. arXiv preprint arXiv:1905.13545.
- [Wen et al.2016] Wen, W.; Wu, C.; Wang, Y.; Chen, Y.; and Li, H. 2016. Learning structured sparsity in deep neural networks. In Proc. NIPS, 2074–2082.
- [Xiao, Rasul, and Vollgraf2017] Xiao, H.; Rasul, K.; and Vollgraf, R. 2017. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747.
- [Yin et al.2019] Yin, D.; Lopes, R. G.; Shlens, J.; Cubuk, E. D.; and Gilmer, J. 2019. A fourier perspective on model robustness in computer vision. ICML2019 Workshop (accepted in NeurIPS2019).
- [Yoshida and Miyato2017] Yoshida, Y., and Miyato, T. 2017. Spectral norm regularization for improving the generalizability of deep learning. arXiv preprint arXiv:1705.10941.
- [Yuan et al.2019] Yuan, X.; He, P.; Zhu, Q.; and Li, X. 2019. Adversarial examples: Attacks and defenses for deep learning. IEEE transactions on neural networks and learning systems.
Appendix A Proofs of Lemmas
In this section, we provide the proofs of the lemmas.
If , is a convex function.
If is a convex function, we have , where and . Therefore, we investigate , and if , we prove the lemma. We have
since and . Let and ; thus, we have
From the triangle inequality, we have ; thus, this completes the proof. ∎
If , and , we have
For clarity, let . We have three cases; (a) , (b) , and (c) . In (a), we have , and at the optimal point. Therefore, , and the solution becomes . The condition is , i.e., . In (b), we have , and we can optimize in the same manner as (a). As a result, if . In (c), is non-differentiable, but we can use subgradient such as . Let be where small and be the standard basis; thus, we have when . As a result, is bounded as . We then have ; thus, . Since , we have . Thus, the condition becomes since . By substituting into , we have subject to and . Thus, the minimum point is , i.e., . Therefore, is the minimum point when . This completes the proof. ∎
Appendix B Inequality of Induced Norms for Convolution
The -th singular value of a doubly circulant matrix can be written as (not arranged in descending order), and we have . Therefore, the spectral norm of is bounded above by the induced -norm as .
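The bound above can be checked numerically. The following is a minimal sketch, not the authors' code: it relies on the standard fact that the singular values of a doubly circulant matrix are the magnitudes of the 2D DFT of the zero-padded filter, and compares the largest one with the sum of absolute filter values. The function names, filter size, and input size `n` are hypothetical choices for illustration.

```python
import numpy as np

def circulant_conv_singular_values(kernel, n):
    """Singular values (unsorted) of the doubly circulant matrix of `kernel`
    acting on n x n inputs, computed as |2D DFT| of the zero-padded kernel."""
    padded = np.zeros((n, n))
    h, w = kernel.shape
    padded[:h, :w] = kernel
    return np.abs(np.fft.fft2(padded)).ravel()

def abs_sum_bound(kernel):
    """Sum of absolute filter values, an upper bound on the spectral norm."""
    return float(np.abs(kernel).sum())

rng = np.random.default_rng(0)
kernel = rng.standard_normal((3, 3))       # illustrative random filter
sv = circulant_conv_singular_values(kernel, n=8)
spectral_norm = float(sv.max())            # largest singular value
bound = abs_sum_bound(kernel)
```

In this sketch the first entry of `sv` (the zero-frequency component) equals the absolute value of the sum of the filter elements, and `spectral_norm` never exceeds `bound` by the triangle inequality.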
Appendix C Regularization and Induced Norm
In this section, we explain that regularization (weight decay: WD) can constrain the induced norm of a convolutional layer. The regularization term of the convolution filter is
On the other hand, the square of the Frobenius norm of becomes
Therefore, if we use the regularization, we constrain the Frobenius norm of . In addition, let be matrices, we have the following inequalities:
where is the induced 2-norm, which is the largest singular value. From the above inequalities, we have , and thus, if we decrease the Frobenius norm of , the induced 2-norm and -norms are also decreased.
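As a sanity check on this kind of argument, the sketch below numerically verifies two standard matrix norm inequalities for a random matrix: the induced 2-norm is at most the Frobenius norm, and the induced infinity-norm is at most the square root of the number of columns times the induced 2-norm. These are textbook inequalities chosen for illustration; they are not necessarily the exact ones stated in this appendix, and the function name is hypothetical.

```python
import numpy as np

def matrix_norms(A):
    """Return (Frobenius norm, induced 2-norm, induced infinity-norm) of A."""
    fro = np.linalg.norm(A, "fro")
    two = np.linalg.norm(A, 2)          # largest singular value
    inf = np.linalg.norm(A, np.inf)     # maximum absolute row sum
    return fro, two, inf

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 9))         # illustrative matrix, 6 rows, 9 columns
fro, two, inf = matrix_norms(A)
n_cols = A.shape[1]
two_le_fro = two <= fro + 1e-12
inf_le_scaled_two = inf <= np.sqrt(n_cols) * two + 1e-12
```

Both flags are true for any real matrix, so shrinking the Frobenius norm shrinks an upper bound on both induced norms.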
Appendix D Algorithm of Absum
Algorithm 1 shows the whole training algorithm of Absum. First, we update parameters by SGD (lines 3 and 4). Next, we apply the proximal operator to each convolution filter (lines 5-13). These processes are iteratively performed.
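The two-stage update can be sketched as follows. This is a hedged illustration rather than Algorithm 1 itself: it assumes the regularizer of each filter is the absolute value of the sum of its elements (as the method name and the three cases in Appendix A suggest), and the function names, the plain SGD step, and the argument layout are hypothetical.

```python
import numpy as np

def absum_prox(kernel, threshold):
    """Proximal operator of threshold * |sum(kernel)|.

    Every element is shifted by the same amount: the mean of the kernel,
    clipped to [-threshold, threshold]. This covers the cases sum > 0,
    sum < 0, and sum driven exactly to 0.
    """
    shift = np.clip(kernel.sum() / kernel.size, -threshold, threshold)
    return kernel - shift

def proximal_sgd_step(kernels, grads, lr, reg_weight):
    """One iteration: gradient step on the loss, then prox on each filter."""
    updated = []
    for k, g in zip(kernels, grads):
        k = k - lr * g                                    # SGD update
        updated.append(absum_prox(k, lr * reg_weight))    # proximal update
    return updated
```

With a large enough threshold the filter sum becomes exactly zero, which is the non-differentiable case handled by the subgradient argument in the proof.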
Appendix E Related Work
Adversarial attacks are divided into two types: white-box and black-box attacks. The fast gradient sign method (FGSM) and PGD are popular as simple and sophisticated white-box attacks, respectively [Goodfellow, Shlens, and Szegedy2014, Kurakin, Goodfellow, and Bengio2016, Madry et al.2018]. Though many defense methods against white-box attacks have been proposed, e.g., defensive distillation [Papernot et al.2016] and stochastic defense [Dhillon et al.2018], several of them have been broken by stronger attacks [Athalye, Carlini, and Wagner2018, Carlini and Wagner2017]. A promising method is adversarial training [Goodfellow, Shlens, and Szegedy2014, Kurakin, Goodfellow, and Bengio2016, Madry et al.2018], which uses adversarial examples as training data. However, its computational cost is larger than that of naive training. Note that Absum can be used with adversarial training and enhances it, as discussed in the experiments. Black-box attacks are more practical than white-box attacks since it is difficult to access the target models in online applications [Papernot et al.2017, Yuan et al.2019]. Most black-box attacks are transferred white-box attacks and can be defended against by adversarial training [Papernot et al.2017]. Several black-box attacks query the target model for the predicted labels of given input data, but these attacks might still be impractical since they require a large number of queries [Brendel, Rauber, and Bethge2018, Chen et al.2017, Ilyas et al.2018]. On the other hand, SFA only uses the information that the target model is composed of CNNs and is thus more practical.
An early study [Szegedy et al.2013] showed that the induced norm can be a measure of robustness, and Parseval networks constrain the induced norm of linear layers to improve robustness [Cisse et al.2017]. Parseval networks are more robust against FGSM than naive models and can enhance adversarial training. However, the computational cost of Parseval networks is larger than that of standard regularization methods. In addition, their robustness might be lower than that of spectral norm regularization [Tsuzuku, Sato, and Sugiyama2018], even though Parseval networks penalize the spectral norm like the spectral norm constraint. Spectral norm regularization can also improve generalization performance [Yoshida and Miyato2017]. However, the spectral norm used in spectral norm regularization is often quite different from that of [Gouk et al.2018, Sedghi, Gupta, and Long2019] for convolution.
As a simple regularization method, Srivastava et al. [Srivastava et al.2014] show that maxnorm regularization can improve the generalization performance of deep learning. The maxnorm regularization in [Srivastava et al.2014] restricts the norm of weight vectors to be less than or equal to a threshold as where is the -th row vector of in eq. (5). Therefore, the maxnorm regularization on convolution is and is similar to regularization. In fact, we observed that the effectiveness of maxnorm regularization is similar to that of weight decay.
Appendix F Experimental Conditions
We used roughly two experimental conditions, depending on the dataset. In all experiments, we selected the best regularization weight from among for Absum and standard regularization methods and selected the best spectral norm from among for spectral norm constraint (SNC) [Sedghi, Gupta, and Long2019]. In SNC, we clipped singular values once in 100 iterations due to the large computational cost. Our experiments ran once for each hyperparameter. We assumed that all images were divided by 255 so that pixel values lay between 0 and 1. In addition, as preprocessing, MNIST, CIFAR10, and CIFAR100 were standardized as (mean, standard deviation)=(0,1) before the images were fed to the models. In the evaluation of robustness, we standardized input images by using the means and standard deviations of clean data after adversarial perturbation. The computation graph of the standardization was preserved in gradient-based attacks; thus, perturbations of PGD were optimized while taking this preprocessing into account.
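The order of operations at evaluation time can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, and clipping the perturbed image back to [0, 1] before standardization is an assumption added for the example.

```python
import numpy as np

def standardize(x, mean, std):
    """Standardize with statistics computed on clean data."""
    return (x - mean) / std

def prepare_attacked_input(x_uint8, delta, clean_mean, clean_std):
    """Scale to [0, 1], add the perturbation, then standardize."""
    x = x_uint8.astype(np.float64) / 255.0      # pixel values in [0, 1]
    x_adv = np.clip(x + delta, 0.0, 1.0)        # perturb in pixel space (clipping assumed)
    return standardize(x_adv, clean_mean, clean_std)
```

Because the perturbation is added before standardization, a gradient-based attack has to differentiate through `standardize` to optimize `delta` in pixel space.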
MNIST and Fashion-MNIST
For MNIST and Fashion-MNIST (FMNIST), we stacked two convolutional layers and two fully connected layers; the first convolutional layer had 10 output channels and the second had 20 output channels. The kernel sizes of the convolutional layers were 5, their strides were 1, and we did not use zero-padding in these layers. After each convolutional layer, we applied max pooling (the stride was 2) and ReLU activation. The output of the second convolutional layer was fed to the first fully connected layer (the size was), and we used the ReLU activation after the first fully connected layer. The size of the second fully connected layer was , and we used softmax as the output function. After the second convolutional layer and before the second fully connected layer, we applied 50% dropout. We trained the model for 100 epochs by using Momentum SGD (learning rate of 0.01 and momentum of 0.5). We set the minibatch size to 64.
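The feature-map sizes implied by this architecture can be worked out with a few lines of arithmetic. The sketch below assumes 28 × 28 inputs (the standard MNIST and FMNIST image size) and a pooling window of 2, which the text does not state explicitly (only the pooling stride is given); the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Spatial output size of max pooling (window of 2 is an assumption)."""
    return (size - kernel) // stride + 1

def flattened_features(input_size=28, channels=(10, 20), kernel=5):
    """Number of features entering the first fully connected layer."""
    size = input_size
    for _ in channels:
        size = pool_out(conv_out(size, kernel))   # conv (no padding), then pooling
    return channels[-1] * size * size
```

Under these assumptions the spatial size goes 28 → 24 → 12 → 8 → 4, so `flattened_features()` returns 20 × 4 × 4 = 320.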
We changed to , , , , in SFA since the size of the images was and evaluated the accuracy of the model on the test data perturbed by SFA. The norm of the perturbation of SFA was set to . The perturbed inputs were clipped so that each element would be included in . For fair comparison, all regularization methods were applied to only convolution filter parameters.
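To make the evaluation procedure concrete, the following sketch builds a perturbation made of a single spatial frequency and applies it with clipping. It is an illustration only: the exact SFA formula of Tsuzuku and Sato (phase and scaling) is not reproduced in this section, so the code uses a real cosine wave at frequency `(i, j)` scaled to a given maximum absolute value `eps`, and it assumes square images with values in [0, 1]. All names are hypothetical.

```python
import numpy as np

def single_frequency_perturbation(n, i, j, eps):
    """Real n x n wave containing only frequency (i, j) and its conjugate."""
    spectrum = np.zeros((n, n), dtype=complex)
    spectrum[i % n, j % n] += 1.0
    spectrum[-i % n, -j % n] += 1.0       # conjugate partner keeps the wave real
    wave = np.real(np.fft.ifft2(spectrum))
    return eps * wave / np.abs(wave).max()   # scale so that max |value| = eps

def apply_perturbation(image, i, j, eps):
    """Add the wave to an image (last two axes n x n) and clip to [0, 1]."""
    n = image.shape[-1]
    noise = single_frequency_perturbation(n, i, j, eps)
    return np.clip(image + noise, 0.0, 1.0)
```

Sweeping `(i, j)` over the chosen frequency grid and recording the test accuracy for each pair gives one accuracy value per frequency.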
CIFAR10, CIFAR100, and SVHN
CIFAR10 and CIFAR100 contain 50,000 training images and 10,000 test images [Krizhevsky and Hinton2009]. SVHN contains 73,257 images for training and 26,032 images for testing [Netzer et al.2011]. For SVHN, we used cropped digits, which were cropped as . The model architecture was ResNet-18 for CIFAR10, CIFAR100, and SVHN [He et al.2016] (our training settings are based on the open-source code at https://github.com/kuangliu/pytorch-cifar). As the preprocessing for training, given images were randomly cropped as after padding four pixels on each border of the images. Horizontal flip was randomly applied to images with a probability of 0.5. We trained the model for 350 epochs with Momentum SGD (momentum 0.9). The initial learning rate was set to 0.1, and after 150 and 250 epochs, we divided the learning rate by 10. We set the minibatch size to 128.
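The training-time augmentation described above can be sketched in NumPy as follows. This is not the referenced repository's code: zero-valued padding and cropping back to the original image size are assumptions, and the function name and the channel-first layout are hypothetical.

```python
import numpy as np

def random_crop_flip(image, rng, pad=4, flip_prob=0.5):
    """Pad each border, take a random crop of the original size, maybe flip.

    image: array of shape (channels, height, width).
    """
    _, h, w = image.shape
    padded = np.pad(image, ((0, 0), (pad, pad), (pad, pad)), mode="constant")
    top = rng.integers(0, 2 * pad + 1)      # random vertical offset
    left = rng.integers(0, 2 * pad + 1)     # random horizontal offset
    out = padded[:, top:top + h, left:left + w]
    if rng.random() < flip_prob:            # horizontal flip
        out = out[:, :, ::-1]
    return out
```

A typical call would be `random_crop_flip(img, np.random.default_rng(0))`, applied independently to every training image in each epoch.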
We changed in SFA to since the size of the images was and evaluated the accuracy of the model on the test data perturbed by SFA. The norm of the perturbation of SFA was set to . The perturbed inputs were clipped so that each element would be included in . For fair comparison, all regularization methods were applied to only convolution filter parameters.
Note that about 20% of the SVHN training and test images have the class label ‘1’. Due to this class imbalance, models output class ‘1’ regardless of the input image under some hyperparameter settings. In this case, the robust accuracy is always about 20%; thus, these models sometimes outperform properly trained models with naive training in terms of robust accuracy. However, these results are not meaningful, and we do not list them in the tables. For the other datasets, we also do not list the results of models that output one class regardless of the input image.
To evaluate robustness in the frequency domain, we used High-Frequency attacks [Wang et al.2019]. High-Frequency attacks can be regarded as low-pass filters, which remove high-frequency components. In High-Frequency attacks, we first apply the discrete Fourier transform (DFT) to data as
Next, we decompose the low- and high-frequency components as
where and are elements of low- and high-frequency components in the frequency domain, respectively, is a centroid of the image, is the Euclidean distance, and is a radius that determines the cutoff frequency. Finally, we apply the inverse DFT to as
and is an input image attacked by High-Frequency attacks. While is gradually reduced and accuracies are iteratively evaluated for each in [Wang et al.2019], we used fixed as half of the image width since we just focus on comparing Absum with other methods.
In addition to High-Frequency attacks, we evaluated robust accuracies against high-pass filter . Note that images processed by the high-pass filter are not adversarial examples since it is difficult for humans to accurately classify these images. Even so, this experiment reveals how the model trained using each method is biased towards the low-frequency components.
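The low-pass and high-pass filtering described above can be sketched with NumPy's FFT routines. This is a minimal illustration under stated assumptions, not the implementation of [Wang et al.2019]: it handles a single-channel 2D image, uses the shifted spectrum so that the zero frequency sits at the center, and the function name is hypothetical.

```python
import numpy as np

def frequency_split(image, radius):
    """Split a 2D image into low- and high-frequency parts.

    Frequencies whose Euclidean distance from the spectrum center is at most
    `radius` go to the low-frequency part; the rest go to the high-frequency part.
    """
    h, w = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))     # zero frequency at the center
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius
    low = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)))
    high = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)))
    return low, high
```

Here `low` plays the role of an image attacked by a High-Frequency attack, `high` is the output of the high-pass filter, and the two parts sum back to the original image.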
We evaluated the computation time of Absum. We used one NVIDIA Tesla V100 GPU and 32 Intel(R) Xeon(R) Silver 4110 CPUs, and our implementation used Python 3.6.8, PyTorch 0.4.1, CUDA 9.0, and NumPy 1.11.3 in this experiment. Note that, in SNC, we used NumPy to compute the FFT and singular value decomposition, which is difficult to parallelize. We clipped singular values once in 100 iterations due to the large computational cost. The model architectures and training process were the same as those of the experiments involving SFA. We used and . We also conducted an experiment to evaluate the computation time when the input size increases. We generated random images whose sizes were , , , , , and with ten random labels, and evaluated the computation time of the forward and backward processes of ResNet-18 for one image.
Robustness against PGD
We also evaluated the effectiveness of Absum against PGD. We evaluated Absum with adversarial training [Madry et al.2018] in addition to naive training because Absum and other regularization methods can be used with adversarial training. In these experiments, we used advertorch [Ding, Wang, and Jin2019] to generate adversarial examples of PGD.
Model architectures and training conditions were almost the same as those in the SFA experiments. The number of epochs for MNIST and FMNIST was set to 100. On the other hand, we observed overfitting in the adversarial training on CIFAR10, CIFAR100, and SVHN. Therefore, we trained the model for 150 epochs with Momentum SGD (momentum 0.9). The initial learning rate was set to 0.1, and after 50 and 100 epochs, we divided the learning rate by 10. We also applied weight decay of to all parameters on CIFAR10 and CIFAR100 in the adversarial training of PGD since overfitting easily occurred in adversarial training on these datasets.
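The step-wise learning-rate schedules used in this appendix can be written as a small helper. The sketch below is illustrative: the function name is hypothetical, and epochs are assumed to be counted from 0, so that "after 50 epochs" means from epoch index 50 onward.

```python
def step_lr(epoch, base_lr=0.1, milestones=(50, 100), factor=0.1):
    """Learning rate at a given 0-based epoch; divided by 10 at each milestone."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= factor
    return lr
```

With the defaults this reproduces the 150-epoch adversarial-training schedule, and `step_lr(epoch, milestones=(150, 250))` reproduces the 350-epoch schedule used in the SFA experiments.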
In PGD, the norm of the perturbation was set to for MNIST and FMNIST, and for CIFAR10 at evaluation time. For PGD, we updated the perturbation for 40 iterations with a step size of 0.01 on MNIST and FMNIST at training and evaluation times. On CIFAR10, CIFAR100, and SVHN, we updated the perturbation for 7 iterations with a step size of 2/255 at training time and 100 iterations at evaluation time. The starting points of PGD were randomly initialized from a uniform distribution over [-2/255, 2/255]. For adversarial training, we used training data perturbed by PGD with on MNIST and on CIFAR10, CIFAR100, and SVHN. In adversarial training, we only used adversarial examples of training data. The above conditions are based on [Madry et al.2018].
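The PGD loop with these settings can be sketched as follows. To stay self-contained, the example swaps the CNN for a logistic-regression model with an analytic input gradient, so it illustrates only the attack loop, not the experiments; it assumes an infinity-norm constraint and inputs in [0, 1]. In the experiments themselves, adversarial examples were generated with advertorch. All function and parameter names are hypothetical.

```python
import numpy as np

def loss_and_grad(x, y, w, b):
    """Logistic loss of a linear model and its gradient with respect to x."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return loss, (p - y) * w

def pgd_linf(x, y, w, b, eps, step_size, n_iter, init_radius, rng):
    """Projected gradient ascent on the loss inside an infinity-norm ball."""
    delta = rng.uniform(-init_radius, init_radius, size=x.shape)  # random start
    x_adv = np.clip(np.clip(x + delta, x - eps, x + eps), 0.0, 1.0)
    for _ in range(n_iter):
        _, grad = loss_and_grad(x_adv, y, w, b)
        x_adv = x_adv + step_size * np.sign(grad)   # signed gradient step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # project onto the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)            # keep valid pixel values
    return x_adv
```

For example, `n_iter=40` with `step_size=0.01` mirrors the MNIST setting above, while `n_iter=7` with `step_size=2/255` mirrors the training-time setting on CIFAR10, CIFAR100, and SVHN.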
Appendix G Additional Experimental Results
Robustness against SFA
Figure 8 shows the accuracies on datasets perturbed by SFA against hyperparameters of SFA. As shown in Fig. 8, the models trained with WD and regularization are sensitive to certain frequency noise (e.g., in Figs. 8 (j) and (k)). Table 5 lists the average, minimum, and clean accuracies on datasets perturbed by SFA. In this table, Avg. denotes accuracies on data perturbed by SFA averaged over hyperparameters , and Min. denotes minimum accuracies on data perturbed by SFA among . CLN denotes accuracies on clean data. The and are selected for each of Avg., Min., and CLN so that each of them would become the highest.
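The Avg. and Min. statistics, and the per-metric hyperparameter selection, can be sketched in plain Python. The data layout (a dictionary from SFA frequency to accuracy, and a dictionary from hyperparameter value to a metrics dictionary) and the function names are assumptions made for illustration.

```python
def summarize_sfa_accuracy(acc_by_frequency):
    """Average and minimum accuracy over SFA frequencies.

    acc_by_frequency: dict mapping a frequency pair (i, j) to the accuracy
    measured on data perturbed at that frequency.
    """
    values = list(acc_by_frequency.values())
    return {"avg": sum(values) / len(values), "min": min(values)}

def select_best(results_by_hyperparameter, metric):
    """Hyperparameter value whose result is highest for the given metric."""
    return max(results_by_hyperparameter,
               key=lambda hp: results_by_hyperparameter[hp][metric])
```

Calling `select_best` separately with `"avg"`, `"min"`, and a clean-accuracy key reflects that the hyperparameters are chosen independently for each reported column.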
Figure 9 shows the accuracies of the methods on FMNIST, CIFAR100, and SVHN perturbed by SFA against regularization weights. These results are almost the same as those on MNIST and CIFAR10. On all the datasets, Absum improves the Avg. and Min. accuracies as the regularization weight increases, whereas the other methods decrease them.
Figure 10 shows the accuracies of SNC on all datasets perturbed by SFA against the threshold of the spectral norm . We can see that, on MNIST and FMNIST, accuracies increase along with . This is because, when the spectral norm is small, gradients vanish in the stacked convolutional layers. On the other hand, on CIFAR10, CIFAR100, and SVHN, where we used ResNet, the minimum accuracy decreases when the spectral norm becomes larger than a certain point, while the maximum accuracy increases along with the spectral norm. Figure 11 shows the lowest training loss in training with SNC on CIFAR10 against . We can see that SNC with low prevents the loss function from being minimized.
Robustness against Transferred PGD
Table 6 lists robust accuracies against transferred PGD for various . We can see that Absum and SNC can improve robustness against transferred PGD better than WD and .