AdvRush: Searching for Adversarially Robust Neural Architectures

08/03/2021 · Jisoo Mok, et al. · Seoul National University

Deep neural networks continue to awe the world with their remarkable performance. Their predictions, however, are prone to be corrupted by adversarial examples that are imperceptible to humans. Current efforts to improve the robustness of neural networks against adversarial examples are focused on developing robust training methods, which update the weights of a neural network in a more robust direction. In this work, we take a step beyond training of the weight parameters and consider the problem of designing an adversarially robust neural architecture with high intrinsic robustness. We propose AdvRush, a novel adversarial robustness-aware neural architecture search algorithm, based upon a finding that, independent of the training method, the intrinsic robustness of a neural network can be represented by the smoothness of its input loss landscape. Through a regularizer that favors a candidate architecture with a smoother input loss landscape, AdvRush successfully discovers an adversarially robust neural architecture. Along with a comprehensive theoretical motivation for AdvRush, we conduct extensive experiments to demonstrate the efficacy of AdvRush on various benchmark datasets. Notably, on CIFAR-10, AdvRush achieves 55.91% robust accuracy under FGSM attack after standard training and 50.04% robust accuracy under AutoAttack after 7-step PGD adversarial training.


1 Introduction

The rapid growth and integration of deep neural networks in everyday applications have led researchers to explore the susceptibility of their predictions to malicious external attacks. Among proposed attack mechanisms, adversarial examples [65], in particular, raise serious security concerns because they can cause neural networks to make erroneous predictions with the slightest perturbations in the input data that are indistinguishable to the human eye. This interesting property of adversarial examples has been drawing much attention from the deep learning community, and ever since their introduction, a plethora of defense methods have been proposed to improve the robustness of neural networks against adversarial examples [75, 69, 35]. However, there remains one important question that is yet to be explored extensively: Can the adversarial robustness of a neural network be improved by utilizing an architecture with high intrinsic robustness? And if so, is it possible to automatically search for a robust neural architecture?

Figure 1: Standard accuracy vs. robust accuracy evaluation results on CIFAR-10 for various neural architectures and the neural architecture searched by AdvRush. All architectures are adversarially trained using 7-step PGD and evaluated under AutoAttack. The AdvRush architecture lies on the optimal frontier in terms of both standard and robust accuracy.

We tackle the problem of searching for a robust neural architecture by employing Neural Architecture Search (NAS) [79]. Due to the heuristic nature of designing a neural architecture, it used to take machine learning engineers with years of experience and expertise to fully exploit the power of neural networks. NAS, a newly budding branch of automated machine learning, aims to automate this labor-intensive architecture search process. As the neural architectures discovered automatically through NAS begin to outperform hand-crafted architectures across various domains [8, 48, 77, 66, 43], more emphasis is being placed on the proper choice of an architecture to improve the performance of a neural network on a target task.

The primary objective of existing NAS algorithms is to improve standard accuracy, and thus, they do not consider the robustness of the searched architecture during the search process. Consequently, they provide no guarantee of robustness for the searched architecture, since the "no free lunch" theorem for adversarial robustness prevents neural networks from obtaining sufficient robustness without additional effort [14, 4, 78]. In addition, the trade-off between standard accuracy and adversarial robustness indicates that the two cannot be maximized simultaneously [67], further necessitating a NAS algorithm designed specifically for adversarial robustness.

In this work, we propose a novel adversarial robustness-aware NAS algorithm, named AdvRush, which is a shorthand for “Adversarially Robust Architecture Rush.” AdvRush is inspired by a finding that the degree of curvature in the neural network’s input loss landscape is highly correlated with intrinsic robustness, regardless of how its weights are trained [78]. Therefore, by favoring a candidate architecture with a smoother input loss landscape during the search phase, AdvRush discovers a neural architecture with high intrinsic robustness. As shown in Figure 1, after undergoing an identical adversarial training procedure, the searched architecture of AdvRush simultaneously achieves the best standard and robust accuracies on CIFAR-10.

We provide comprehensive experimental results to demonstrate that the architecture searched by AdvRush is indeed equipped with high intrinsic robustness. On CIFAR-10, standard-trained AdvRush achieves 55.91% robust accuracy (a 2.50% improvement over PDARTS [8]) under FGSM, and adversarially-trained AdvRush achieves 50.04% robust accuracy (a 3.04% improvement over RobNet-free [23]) under AutoAttack. Furthermore, we evaluate the robust accuracy of AdvRush on CIFAR-100, SVHN, and Tiny-ImageNet to investigate its transferability; across all datasets, AdvRush consistently shows a substantial increase in robust accuracy compared to other architectures. For the sake of reproducibility, we have included the code and representative model files of AdvRush in the supplementary materials.

Our contributions can be summarized as follows:

  • We propose AdvRush, a novel NAS algorithm for discovering a robust neural architecture. Because AdvRush does not require independent adversarial training of candidate architectures for evaluation, its search process is highly efficient.

  • The effectiveness of AdvRush is demonstrated through comprehensive evaluation under a number of adversarial attacks. Furthermore, we validate the transferability of AdvRush to various benchmark datasets.

  • We provide extensive theoretical justification for AdvRush and complement it with the visual analysis of the discovered architecture. In addition, we provide a meaningful insight into what makes a neural architecture more robust against adversarial perturbations.

2 Related Works

2.1 Adversarial Attacks and Defenses

Adversarial attack methods can be divided into white-box and black-box attacks. Under the white-box setting [52, 46, 34, 45, 68], the attacker has full access to the target model, including its architecture and weights. FGSM [19], PGD [39], and CW [5] are well-known white-box attacks, commonly used to evaluate the robustness of a neural network. On the contrary, under the black-box setting, the attacker has limited to no access to the target model. Thus, black-box attacks rely on a substitute model [51] or the target model's prediction scores [6, 3, 64, 21, 26, 44] to construct adversarial examples.

In response, numerous adversarial defense methods have been proposed to alleviate the vulnerability of neural networks to adversarial examples. Adversarial training [19] is known to be the most effective defense method. By utilizing adversarial examples as training data, adversarial training plays a min-max game: the inner maximization produces stronger adversarial examples to maximize the cross-entropy loss, while the outer minimization updates the model parameters to minimize it. A large number of defenses now adopt some form of regularization or adversarial training to improve robustness [30, 42, 59, 74, 75, 47, 55]. Our work is closely related to the defense approaches that utilize a regularization term derived from the curvature information of the neural network's loss landscape to mimic the effect of adversarial training [47, 55].
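As a concrete illustration of the inner maximization, the following is a minimal single-step (FGSM-style) sketch in PyTorch; the model, loss, and perturbation budget are placeholders rather than the settings of any particular cited defense.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=8 / 255):
    """One-step inner maximization: perturb x in the direction that
    increases the cross-entropy loss (a sketch, not a full defense)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Take a single signed-gradient step and stay within the image range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

Multi-step attacks such as PGD simply repeat this signed-gradient step with projection back onto the epsilon-ball, which is how the adversarial training used later in the paper generates its training examples.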

Apart from adversarial training, feature denoising [41, 40, 60, 69, 71] has also been proven to be effective at improving the robustness of a neural network. Although gradient masking [62, 53] may appear to be a viable defense method, it can easily be circumvented by attacks based on approximate gradients [50, 11].

2.2 Neural Architecture Search

Early NAS algorithms based on evolutionary algorithms (EA) [58, 57] or reinforcement learning (RL) [79, 80, 1] often required thousands of GPU hours to search for an architecture, making their immediate application difficult. The majority of the computational overhead in these algorithms was caused by the need to train each candidate architecture to convergence and evaluate it. By using performance approximation techniques, recent NAS works were able to significantly expedite the architecture search process. Examples of commonly-adopted performance approximation techniques include cell-based micro search space design [80] and parameter sharing [54]. Modern gradient-based algorithms that exploit such performance approximation techniques can be categorized largely into sampling-based NAS [70, 2, 16, 76] and continuous relaxation-based NAS [38, 10, 8, 7, 72], according to their candidate architecture evaluation method.

Following the successful acceleration of NAS, its application has become prevalent in various domains. In computer vision, neural architectures discovered by modern NAS algorithms continue to produce impressive results in a variety of applications: image classification [10, 8], object detection [66, 9, 28, 22], and semantic segmentation [37, 48, 77]. NAS is also being applied to domains outside computer vision, such as natural language processing [29] and speech processing [43, 31, 56].

Despite the proliferation of NAS research, only a limited amount of literature pertaining to the subject of robust neural architectures exists. RobNet [23] is the first work to empirically reveal the existence of robust neural architectures. They randomly sample architectures from a search space and adversarially train each one of them to evaluate their robustness. Because this procedure incurs a huge computational burden, RobNet uses a narrow search space with only three possible operations. RAS [32], an EA-based method, uses adversarial examples from a separate victim model to measure the robustness of candidate architectures, but its approach and objective are restricted to improving robustness under black-box attacks. RACL [15], a gradient-based method, suggests using the Lipschitz characteristics of the architecture parameters to achieve a target Lipschitz constant.

3 Theoretical Motivation

Analyzing the topological characteristics of a neural network's loss landscape is an important tool for understanding its defining properties. This section introduces the theoretical background on the relationship between loss landscape characteristics and adversarial robustness that inspired our method. Section 3.1 provides definitions for the parameter and input loss landscapes. Using the provided definitions, Section 3.2 shows how the degree of curvature in the input loss landscape of a neural architecture relates to its intrinsic adversarial robustness.

3.1 Parameter and Input Loss Landscapes

We define a neural network as a function $f(x; w, \mathcal{A})$, where $\mathcal{A}$ is its architecture and $w$ is the set of trainable weight parameters. Then, a loss function of the neural network can be expressed as $\mathcal{L}(f(x; w, \mathcal{A}))$, where $x$ is the input data.

Since both $w$ and $x$ lie in high-dimensional spaces, a direct visual analysis of $\mathcal{L}$ is impossible. Therefore, the high-dimensional loss surface of $\mathcal{L}$ is projected onto an arbitrary low-dimensional space, namely a 2-dimensional hyperplane [20, 36]. Given two normalized projection vectors $d_1$ and $d_2$ of the 2-dimensional hyperplane and a starting point $p$, the points around $p$ are interpolated as follows:

$p(c_1, c_2) = p + c_1 d_1 + c_2 d_2,$ (1)

where $c_1$ and $c_2$ are the degrees of perturbation in the $d_1$ and $d_2$ directions, respectively.

Depending on the choice of the starting point $p$, the loss landscape can be visualized in either the parameter space ($p = w$) or the input space ($p = x$). For $p = w$, the loss values are computed as $\mathcal{L}(f(x; w + c_1 d_1 + c_2 d_2, \mathcal{A}))$, which corresponds to the parameter loss landscape. Similarly, for $p = x$, the loss values are computed as $\mathcal{L}(f(x + c_1 d_1 + c_2 d_2; w, \mathcal{A}))$, which corresponds to the input loss landscape. In this paper, we primarily focus on the input loss landscape.
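As a concrete illustration of Eq. (1) with $p = x$, the sketch below evaluates the loss on a 2-D grid around a clean input. It is a minimal PyTorch sketch; the direction vectors, grid range, and function name are our own illustrative choices, not the exact settings used to produce the figures in this paper.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def input_loss_surface(model, x, y, d1, d2, radius=1.0, steps=21):
    """Evaluate L(f(x + c1*d1 + c2*d2)) on a (steps x steps) grid,
    i.e., the input loss landscape of Eq. (1) with starting point p = x."""
    grid = torch.linspace(-radius, radius, steps)
    surface = torch.zeros(steps, steps)
    for i, c1 in enumerate(grid):
        for j, c2 in enumerate(grid):
            x_perturbed = x + c1 * d1 + c2 * d2
            surface[i, j] = F.cross_entropy(model(x_perturbed), y)
    return surface
```

In such visualizations, one direction is typically the normalized input-gradient (normal) direction and the other a random direction, matching the axis labels used for the loss landscape figures later in the paper.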

3.2 Intrinsic Robustness of a Neural Architecture

Consider two types of loss functions for updating $w$ of an arbitrary neural network $f(x; w, \mathcal{A})$: a standard loss and an adversarial loss. On one hand, standard training uses clean input data to update $w$, such that the standard loss is minimized. On the other hand, adversarial training uses adversarially perturbed input data to update $w$, such that the adversarial loss is minimized. From here on, we refer to $w$ after standard training as $w_{std}$ and to $w$ after adversarial training as $w_{adv}$.

Searching for a robust neural architecture is equivalent to finding an architecture with a small adversarial loss $\mathcal{L}_{adv}$, regardless of the training method. Interestingly enough, the degree of curvature in the input loss landscape of $f$ is highly correlated with $\mathcal{L}_{adv}$. One way of quantifying the degree of curvature in the input loss landscape is through the eigenspectrum of $H(x) = \nabla_x^2 \mathcal{L}$, the Hessian matrix of the loss computed with respect to the input data. From here on, we use the largest eigenvalue of the Hessian matrix $H(x)$ to quantify the degree of curvature in the input loss landscape under a second-order approximation and denote it as $\lambda_{max}$ [73].
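Because $H(x)$ is never formed explicitly in practice, $\lambda_{max}$ is typically estimated with Hessian-vector products. The following is a minimal power-iteration sketch under that assumption; it is our own illustration, not the measurement code of [73] or [78].

```python
import torch
import torch.nn.functional as F

def input_hessian_lambda_max(model, x, y, iters=20):
    """Estimate the largest eigenvalue of the Hessian of the loss w.r.t.
    the input (a single example or a batch treated jointly) via power
    iteration on Hessian-vector products."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x, create_graph=True)[0]

    v = torch.randn_like(x)
    v = v / v.norm()
    eig = torch.tensor(0.0)
    for _ in range(iters):
        # Hessian-vector product H v via a second backward pass.
        hv = torch.autograd.grad(grad, x, grad_outputs=v, retain_graph=True)[0]
        # Rayleigh quotient with the unit-norm iterate v.
        eig = torch.dot(hv.flatten(), v.flatten())
        v = hv / (hv.norm() + 1e-12)
    return eig.item()
```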

Consider $f(x; w_1, \mathcal{A})$ and $f(x; w_2, \mathcal{A})$, two independently trained neural networks with an identical neural architecture $\mathcal{A}$. We define $w(t)$, $t \in [0, 1]$, to be a set of weights which interpolates between $w_1$ and $w_2$ along some parametric curve. For instance, a quadratic Bezier curve [18] with endpoints $w_1$ and $w_2$, connected by a trainable control point $\theta$, can be expressed as follows:

$w(t) = (1 - t)^2\, w_1 + 2t(1 - t)\, \theta + t^2\, w_2, \quad t \in [0, 1].$ (2)

The number of control points $\theta$ determines the order of the curve (i.e., the number of bends in the curve). For every $w(t)$ on the path, a high correlation between its $\mathcal{L}_{adv}$ and $\lambda_{max}$ is observed [78]. The following theorem provides theoretical evidence for the empirically observed correlation between the two.

Theorem 1

(Zhao et al. [78]) Consider the maximum adversarial loss $\max_{\|\delta\| \le \epsilon} \mathcal{L}(f(x + \delta; w(t), \mathcal{A}))$ of any $w(t)$ on the path, where $x + \delta$ represents the clean input $x$ perturbed by $\delta$ confined to an $\epsilon$-ball. Assume:
(a) the standard loss $\mathcal{L}(f(x; w(t), \mathcal{A}))$ on the path is a constant $c$ for all $t \in [0, 1]$.
(b) $\mathcal{L}(f(x + \delta; w(t), \mathcal{A})) \approx \mathcal{L}(f(x; w(t), \mathcal{A})) + \delta^{\top} \nabla_x \mathcal{L} + \tfrac{1}{2}\, \delta^{\top} H(t)\, \delta$ for small $\epsilon$, where $\nabla_x \mathcal{L}$ and $H(t)$ denote the gradient and the Hessian of $\mathcal{L}$ at the clean input $x$. Let $a(t)$ denote the normalized inner product in absolute value between $\delta$ and the largest eigenvector of $H(t)$. Then, we have

$\max_{\|\delta\| \le \epsilon} \mathcal{L}(f(x + \delta; w(t), \mathcal{A})) \approx c + \epsilon\, \|\nabla_x \mathcal{L}\| + \tfrac{1}{2}\, \epsilon^2\, a(t)\, \lambda_{max}(t).$ (3)

Please refer to Zhao et al. [78] for the proof. The left-hand side of Eq. (3) corresponds to $\mathcal{L}_{adv}$ of every $w(t)$ on the path. For $w_{std}$ specifically, Eq. (3) can be re-written as follows:

$\mathcal{L}_{adv}(w_{std}) \approx c + \epsilon\, \|\nabla_x \mathcal{L}(f(x; w_{std}, \mathcal{A}))\| + \tfrac{1}{2}\, \epsilon^2\, a\, \lambda_{max}(w_{std}).$ (4)

Geometrically speaking, adversarial attack methods perturb $x$ in a direction that maximizes the change in $\mathcal{L}$. The resulting adversarial examples fool the network by targeting steep trajectories on the input loss landscape, crossing the decision boundary of the neural network with as little effort as possible [73]. Therefore, the more curved the input loss landscape of $f$ is, the more likely its predictions are to be corrupted by adversarial examples.

4 Methodology

Based on the findings in Section 3.2, the problem of searching for an adversarially robust neural architecture can be re-formulated into the problem of searching for a neural architecture with a smooth input loss landscape. Since it is computationally infeasible to calculate the curvature of the input loss landscape for every $w$, we opt to evaluate candidate architectures under $w_{std}$ after standard training:

$\mathcal{A}^{*} = \arg\min_{\mathcal{A}}\; \lambda_{max}(H_{std}),$ (5)

where $H_{std}$ refers to the Hessian of $\mathcal{L}$ of $f(x; w_{std}, \mathcal{A})$ at the clean input $x$, and $\lambda_{max}(H_{std})$ refers to the largest eigenvalue of $H_{std}$.

Therefore, during the search process, AdvRush penalizes candidate architectures with large $\lambda_{max}(H_{std})$. By favoring a candidate neural architecture with a smoother loss landscape, AdvRush effectively searches for a robust neural architecture. In this section, we show how the objective of AdvRush in Eq. (5) can be incorporated into the bi-level optimization problem of NAS [38] and provide a mathematical derivation for approximating the Hessian matrix.

4.1 AdvRush Framework

Standard training of each candidate architecture and evaluating its robustness against adversarial examples would incur tremendous computational overhead when deriving $\mathcal{A}^{*}$ in Eq. (5). Thus, AdvRush employs differentiable architecture search [38] to allow for a simultaneous evaluation of all candidate architectures. AdvRush starts by constructing a weight-sharing supernet [38] from which candidate architectures inherit weight parameters $w$. Following the convention in differentiable architecture search [38], we represent the supernet in the form of a directed acyclic graph (DAG) with $N$ nodes. Each node $x^{(i)}$ of this DAG corresponds to a feature map, and each edge $(i, j)$ corresponds to a candidate operation $o^{(i,j)}$ that transforms $x^{(i)}$. Each intermediate node of the graph is computed based on all of its predecessors:

$x^{(j)} = \sum_{i < j} o^{(i,j)}\big(x^{(i)}\big).$ (6)

To make the search space continuous for gradient-based optimization, the categorical choice of a particular operation is continuously relaxed by applying a softmax function over all the possible operations:

$\bar{o}^{(i,j)}(x) = \sum_{o \in \mathcal{O}} \frac{\exp\big(\alpha_o^{(i,j)}\big)}{\sum_{o' \in \mathcal{O}} \exp\big(\alpha_{o'}^{(i,j)}\big)}\; o(x),$ (7)

where $\alpha^{(i,j)}$ is a set of operation mixing weights (i.e., architecture parameters), and $\mathcal{O}$ is the pre-defined set of operations that are used to construct the supernet. By definition, the size of $\alpha^{(i,j)}$ must be equal to $|\mathcal{O}|$.
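A minimal PyTorch sketch of the mixed operation in Eq. (7), in the style of DARTS; the abbreviated candidate list and the class name are our own stand-ins (the full operation set of the AdvRush search space is listed in Appendix A1).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Weighted sum of candidate operations on one edge, as in Eq. (7)."""
    def __init__(self, channels):
        super().__init__()
        # Abbreviated candidate set for illustration only.
        self.ops = nn.ModuleList([
            nn.Identity(),                                             # skip connect
            nn.AvgPool2d(3, stride=1, padding=1),                      # 3x3 avg pooling
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),   # stand-in for a conv op
        ])

    def forward(self, x, alpha_edge):
        # alpha_edge: architecture parameters for this edge, one entry per operation.
        weights = F.softmax(alpha_edge, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

Because the softmax weights are differentiable, the architecture parameters can be trained by gradient descent together with the supernet weights, which is what makes the curvature regularizer introduced below applicable to the search itself.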

Through continuous relaxation, both the architecture parameters $\alpha$ and the weight parameters $w$ in the supernet can be updated via gradient descent. Once the supernet converges, a single neural architecture can be obtained through the discretization step: $o^{(i,j)} = \arg\max_{o \in \mathcal{O}} \alpha_o^{(i,j)}$. The objective of AdvRush now becomes to update $\alpha$ to induce smoothness in the input loss landscape of the standard-trained supernet, such that the final discretization step will yield $\mathcal{A}^{*}$ in Eq. (5).

AdvRush accomplishes the above objective by driving the eigenvalues of the input Hessian $H(x)$ of the supernet to be small. Consequently, their maximum, $\lambda_{max}$, will also be small. Let $\lambda_1, \ldots, \lambda_n$ denote the eigenvalues of $H(x)$. Our ideal loss term can be defined as the Frobenius norm of $H(x)$: $\mathcal{L}_{\lambda} = \|H(x)\|_F = \sqrt{\sum_i \lambda_i^2}$. The resulting bi-level optimization problem of AdvRush can be expressed as follows:

$\min_{\alpha}\;\; \mathcal{L}_{val}\big(w^{*}(\alpha), \alpha\big) + \gamma\, \mathcal{L}_{\lambda}\big(w^{*}(\alpha), \alpha\big)$
$\text{s.t.}\;\; w^{*}(\alpha) = \arg\min_{w}\; \mathcal{L}_{train}(w, \alpha),$ (8)

where $\mathcal{L}_{train}$ refers to the loss on the training data $\mathcal{D}_{train}$, $\mathcal{L}_{val}$ and $\mathcal{L}_{\lambda}$ to the losses on the validation data $\mathcal{D}_{val}$, and $\gamma$ is the regularization strength. In the following section, we show how $\mathcal{L}_{\lambda}$ can be computed without a significant increase in the search cost of AdvRush.

4.2 Approximation of $\mathcal{L}_{\lambda}$

$\mathcal{L}_{\lambda}$ can be expressed in terms of a Hessian-vector product norm: $\|H(x)\|_F^2 = \mathbb{E}_{z}\, \|H(x)\, z\|^2$, where the expectation is taken over random directions $z \sim \mathcal{N}(0, I)$. Because the direct computation of $H(x)\, z$ is expensive, we linearly approximate it through the finite difference approximation of the Hessian:

$H(x)\, z \approx \frac{\nabla_x \mathcal{L}(x + h z) - \nabla_x \mathcal{L}(x)}{h},$ (9)

where $h$ controls the scale of the loss landscape on which we induce smoothness. However, computing multiple $H(x)\, z$ terms in random directions $z$ and taking their average would be computationally inefficient because each computation of $H(x)\, z$ requires an additional gradient calculation. Therefore, we minimize the input loss landscape along the high-curvature direction $z$ to maximize the effect of $\mathcal{L}_{\lambda}$ [47, 27, 17].

With the approximated $\mathcal{L}_{\lambda}$, the bi-level optimization problem of AdvRush can be expressed as:

$\min_{\alpha}\;\; \mathcal{L}_{val}\big(w^{*}(\alpha), \alpha\big) + \gamma\, \big\|\nabla_x \mathcal{L}_{val}(x + h z) - \nabla_x \mathcal{L}_{val}(x)\big\|$
$\text{s.t.}\;\; w^{*}(\alpha) = \arg\min_{w}\; \mathcal{L}_{train}(w, \alpha),$ (10)

where $x$ is the clean input data from $\mathcal{D}_{val}$. The value of $h$ in the denominator of Eq. (9) is absorbed by the regularization strength $\gamma$. The remaining $h$ in Eq. (10) is treated as a hyperparameter of AdvRush.
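The sketch below shows one way the finite-difference curvature term in Eqs. (9)-(10) could be computed for a batch of images. The choice of the perturbation direction $z$ (a normalized gradient-sign direction, as in curvature-regularization defenses such as [47]) and all names are assumptions on our part, not the released implementation.

```python
import torch
import torch.nn.functional as F

def curvature_regularizer(model, x, y, h=1.5):
    """Finite-difference approximation of ||H(x) z|| for a 4-D image batch;
    the 1/h factor is assumed to be absorbed into the regularization strength."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x, create_graph=True)[0]

    # High-curvature direction: per-sample normalized gradient sign (an assumption).
    z = grad.detach().sign()
    z = z / (z.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)

    x_shift = (x.detach() + h * z).requires_grad_(True)
    loss_shift = F.cross_entropy(model(x_shift), y)
    grad_shift = torch.autograd.grad(loss_shift, x_shift, create_graph=True)[0]

    # || grad L(x + h z) - grad L(x) ||, averaged over the batch.
    return (grad_shift - grad).flatten(1).norm(dim=1).mean()
```

Because both gradient terms are built with `create_graph=True`, the returned scalar remains differentiable with respect to the supernet parameters, so it can be added directly to the architecture-parameter objective in Eq. (10).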

Because the loss landscape of a randomly initialized supernet is void of useful information, in AdvRush, we warm up the architecture parameters $\alpha$ and the weight parameters $w$ of the supernet without $\mathcal{L}_{\lambda}$. Upon the completion of the warm-up process, we introduce $\mathcal{L}_{\lambda}$ for the remaining epochs. Please refer to the Appendix for the comprehensive pseudo code of the AdvRush search process.

5 Experiments

In the following sections, we present extensive experimental results to demonstrate the effectiveness of AdvRush. Notably, we observe that the neural architecture discovered by AdvRush consistently exhibits superior robust accuracy across various benchmark datasets: CIFAR-10 [33], CIFAR-100 [33], SVHN [49], and Tiny-ImageNet [12]. NVIDIA V100 and NVIDIA GeForce RTX 2080 Ti GPUs are used for our experiments.

5.1 Experimental Settings

AdvRush Following the convention in the NAS literature [38, 10], the CIFAR-10 dataset is used to execute AdvRush. We use DARTS [38] as the backbone differentiable architecture search algorithm because it is one of the most widely-benchmarked algorithms in NAS. For the search phase, the training set of CIFAR-10 is evenly split into two: one for updating $w$ and the other for updating $\alpha$. We generally follow the hyperparameter settings of DARTS [38], with a few modifications. We run AdvRush for a total of 60 epochs, 50 of which are allocated for the warm-up process. The value of $h$ in Eq. (10) is set to 1.5. We use a batch size of 32 for all epochs. To update $w$, we use momentum SGD with an initial learning rate of 0.025, momentum of 0.9, and weight decay factor of 3e-4. To update $\alpha$, we use Adam with an initial learning rate of 3e-4, momentum of (0.5, 0.999), and weight decay factor of 1e-3.

Standard & Adversarial Training Based on the model size measured in the number of parameters, the following hand-crafted and NAS-based architectures are used for comparison: ResNet-18 [24], DenseNet-121 [25], DARTS [38], PDARTS [8], RobNet-free [23], and RACL [15]. For a fair evaluation of each architecture's intrinsic robustness, all the tested architectures are trained using identical training settings. 1) Standard training: We train all architectures for 600 epochs. We use an SGD optimizer with an initial learning rate of 0.025, which is annealed to zero through cosine scheduling, and apply weight decay. 2) Adversarial training: We use 7-step PGD training [39] with a step size of 0.01 and a total $\ell_\infty$ perturbation scale of 0.031 (8/255) to train all architectures. In addition to the evaluation on CIFAR-10, we evaluate the transferability of AdvRush by adversarially training the searched architecture on the following datasets: CIFAR-100, SVHN, and Tiny-ImageNet. Because a different set of hyperparameters is used for each dataset, the dataset configurations and the summary of hyperparameters for adversarial training are provided in the Appendix.
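For concreteness, here is a minimal sketch of one 7-step PGD adversarial training step under the stated settings (step size 0.01, ε = 0.031, inputs in [0, 1]); the model, optimizer, and data pipeline are placeholders, not the exact training script behind the reported numbers.

```python
import torch
import torch.nn.functional as F

def pgd_adv_train_step(model, optimizer, x, y, eps=0.031, step_size=0.01, steps=7):
    """One adversarial training step with an l_inf-bounded 7-step PGD attack."""
    model.eval()
    # Random start inside the epsilon-ball.
    x_adv = (x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Signed-gradient ascent step, then project back onto the epsilon-ball.
        x_adv = x_adv.detach() + step_size * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)

    model.train()
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    optimizer.step()
```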

Model Params | Adversarially Trained: Clean FGSM PGD20 PGD100 APGD-CE AA | Standard Trained: Clean FGSM
ResNet-18 11.2M 84.09% 54.64% 45.86% 45.53% 44.54% 43.22% 95.84% 50.71%
DenseNet-121 7.0M 85.95% 58.46% 50.49% 49.92% 49.11% 47.46% 95.97% 45.51%
DARTS 3.3M 85.17% 58.74% 50.45% 49.28% 48.32% 46.79% 97.46% 50.56%
PDARTS 3.4M 85.37% 59.12% 51.32% 50.91% 49.96% 48.52% 97.49% 54.51%
RobNet-free 5.6M 85.00% 59.22% 52.09% 51.14% 50.41% 48.56% 96.40% 36.99%
RACL 3.6M 84.63% 58.57% 50.62% 50.47% 49.42% 47.64% 96.76% 52.38%
AdvRush 4.2M 87.30% 60.87% 53.07% 52.80% 51.83% 50.05% 97.58% 55.91%
Table 1: Evaluation of robust accuracy on CIFAR-10 under white-box attacks. The best result in each column is in bold, and the second best result is underlined. PGD20 and PGD100 refer to PGD attacks with 20 and 100 iterations, respectively. AA refers to the final evaluation result after completing the standard group of AutoAttack methods. All attacks are $\ell_\infty$-bounded with a total perturbation scale of 0.031.
Source \ Target ResNet-18 DenseNet-121 DARTS PDARTS RobNet-free RACL AdvRush
ResNet-18 45.86% 66.31% 66.46% 67.54% 67.89% 66.20% 68.52%
DenseNet-121 63.14% 50.49% 65.58% 66.60% 66.58% 65.17% 67.18%
DARTS 64.40% 60.84% 50.45% 65.90% 65.54% 64.55% 66.89%
PDARTS 64.46% 60.44% 64.73% 51.32% 65.61% 64.23% 66.71%
RobNet-free 64.13% 61.03% 64.32% 65.30% 52.09% 63.72% 65.40%
RACL 64.49% 60.46% 64.67% 65.70% 65.73% 50.62% 66.58%
AdvRush 64.43% 60.98% 64.78% 65.24% 64.55% 64.23% 53.07%
Table 2: Evaluation of robust accuracy on CIFAR-10 under black-box attacks. Adversarial examples from the source model are generated with PGD20. The best result in each row is in bold. The diagonal entries correspond to the robust accuracy of each architecture under the white-box attack.

5.2 White-box Attacks

We evaluate the adversarial robustness of architectures trained on CIFAR-10, with both standard and adversarial training, using various white-box attacks. Standard-trained architectures are evaluated under the FGSM attack, while adversarially-trained architectures are evaluated under FGSM [19], PGD20, PGD100 [39], and the standard group of AutoAttack (APGD-CE, APGD-T, FAB-T, and Square) [11]. White-box attack evaluation results are presented in Table 1. After both training schemes, AdvRush achieves the highest standard and robust accuracies, indicating that the AdvRush architecture is in fact equipped with higher intrinsic robustness. Even with the cell-based constraint, the robust accuracy of AdvRush under AutoAttack is higher than that of RobNet-free, which removes the cell-based constraint [23]. In addition, the high robust accuracy of AdvRush under AutoAttack implies that the AdvRush architecture does not unfairly benefit from obfuscated gradients.

5.3 Black-box Attacks

To evaluate the robustness of the searched architecture under a black-box setting, we conduct transfer-based black-box attacks among adversarially-trained architectures. Black-box evaluation results are presented in Table 2. Clearly, regardless of the source model used to synthesize adversarial examples, AdvRush is the most resilient against transfer-based black-box attacks. When considering each model pair, AdvRush generates stronger adversarial examples than its counterpart; for instance, AdvRush → DARTS achieves an attack success rate (i.e., 100% − robust accuracy) of 35.22%, while DARTS → AdvRush achieves an attack success rate of 33.11%.

5.4 Transferability to Other Datasets

We transfer the architecture searched on CIFAR-10 to other datasets to evaluate its general applicability. The standard and robust accuracy evaluation results on CIFAR-100, SVHN, and Tiny-ImageNet are presented in Table 3, which shows that the AdvRush architecture is highly transferable. Notice that on the SVHN dataset, AdvRush experiences a remarkably small drop in accuracy under both FGSM and PGD attacks. This result may imply that on easier datasets, such as SVHN, having a robust architecture could be sufficient for achieving high robustness, even without advanced adversarial training methods. Please refer to the Appendix for the full evaluation results.

Dataset Model Clean FGSM PGD
CIFAR-100 ResNet-18 55.57% 26.03% 21.44%
DenseNet-121 62.33% 34.68% 28.67%
PDARTS 58.41% 30.35% 25.83%
AdvRush 58.73% 39.51% 30.15%
SVHN ResNet-18 92.06% 88.73% 69.51%
DenseNet-121 93.72% 91.78% 76.51%
PDARTS 95.10% 93.01% 89.58%
AdvRush 96.53% 94.95% 91.14%
Tiny-ImageNet ResNet-18 36.26% 16.08% 13.94%
DenseNet-121 47.56% 22.98% 18.06%
PDARTS 45.94% 24.36% 22.74%
AdvRush 45.42% 25.20% 23.58%
Table 3: Evaluation of robust accuracy on various datasets under white-box attacks. We transfer the AdvRush architecture searched on CIFAR-10 to CIFAR-100, SVHN, and Tiny-ImageNet and evaluate its robustness against FGSM and PGD attacks. All attacks are $\ell_\infty$-bounded with a total perturbation scale of 0.031.

6 Discussion

6.1 Effect of Regularization Strength

The regularization strength $\gamma$ is empirically set to 0.01 to match the scale of $\mathcal{L}_{val}$ and $\mathcal{L}_{\lambda}$. The search results for other values of $\gamma$ are presented in Table 4. Regardless of the change in $\gamma$, the robust accuracy of the searched architecture is higher than that of the other tested architectures (Table 1). For large $\gamma$, however, the searched architecture experiences a significant drop in standard accuracy. Therefore, we conclude that AdvRush is not unduly sensitive to the tuning of $\gamma$, as long as it is sufficiently small. We track the change in $\mathcal{L}_{val}$ and $\mathcal{L}_{\lambda}$ for different values of $\gamma$ and plot the result in Figure 2; clearly, large $\gamma$ causes $\mathcal{L}_{\lambda}$ to explode, thereby disrupting the search process. Please refer to the Appendix for the full ablation results.

$\gamma$ Clean FGSM PGD20 PGD100
0.001 (x 0.1) 85.65% 60.04% 52.70% 52.39%
0.005 (x 0.5) 85.68% 60.31% 52.93% 52.61%
0.01 (baseline) 87.30% 60.87% 53.07% 52.80%
0.02 (x 2) 83.15% 59.34% 53.42% 53.19%
0.1 (x 10) 83.03% 59.69% 53.67% 52.20%
Table 4: Effect of the change in the magnitude of $\gamma$. Baseline refers to AdvRush with the default $\gamma$ of 0.01. The best result in each column is in bold, and the second best result is underlined.

6.2 Comparison against Supernet Adv. Training

Introducing a curvature regularizer into the update rule of the architecture parameters can be considered analogous to adversarial training of the architecture parameters. Therefore, we compare AdvRush against adversarial training of the architecture parameters using two adversarial losses: 7-step PGD and FGSM. Since adversarial training can be introduced with or without warming up $\alpha$ and $w$ of the supernet, we test both scenarios. The search results can be found in Table 5. It appears that neither one of the adversarial losses is effective at inducing additional robustness in the searched architecture. We conjecture that the inner and the outer objectives of the bi-level optimization collide with each other when trying to fit clean and perturbed data alternately, thereby disrupting the search process. In the following section, we show in detail why the architectures searched through adversarial training of the supernet are significantly less robust than the family of AdvRush architectures. The failure of adversarial losses underscores the particular adequacy of the curvature regularizer in AdvRush for discovering a robust neural architecture.

Figure 2: Search epoch vs. the two loss terms tracked during the search: the validation loss (left, linear scale) and $\mathcal{L}_{\lambda}$ (right, logarithmic scale). It is clear that large $\gamma$ causes an explosion in $\mathcal{L}_{\lambda}$.
Search E_n E_a Clean FGSM PGD20 PGD100
FGSM 0 50 83.04% 56.30% 48.76% 48.47%
0 60 82.82% 55.55% 48.41% 48.04%
50 10 82.78% 54.17% 46.48% 46.03%
PGD 0 50 81.13% 53.59% 46.37% 45.95%
0 60 81.87% 53.46% 46.22% 45.82%
50 10 82.26% 54.87% 46.81% 46.63%
AdvRush 87.30% 60.87% 53.07% 52.80%
Table 5: AdvRush compared to various standard adversarial training methods. E_a and E_n denote the number of epochs with and without the adversarial loss term, respectively.
Figure 3: Loss landscape visualization for (a) standard-trained architectures and (b) adversarially-trained architectures. The two axes denote perturbations in the normal (gradient) direction and a random direction, respectively. The loss landscapes of AdvRush are visibly smoother than those of PDARTS.
Search | Arch | Params | # of Operations {M.P.; A.P.; S.; S.C.; D.C.} | Width {N, R} | Depth {N, R} | HRS: Std. Tr. | HRS: Adv. Tr.
PDARTS 3.4M {0; 1; 3; 9; 2} {2.5c, 3c} {4, 3} 69.92 64.10
AdvRush Arch 0 4.2M {0; 2; 4; 9; 1} {4c; 3c} {2, 3} 71.09 66.01
Arch 1 4.2M {0; 0; 5; 8; 3} {4c, 3c} {2, 3} 63.50 65.25
Arch 2 4.2M {0; 0; 4; 10; 2} {3.5c, 3.5c} {3, 3} 71.06 65.44
Arch 3 3.8M {0; 0; 6; 7; 3} {4c, 3c} {2, 3} 64.24 64.05
Arch 4 3.7M {0; 0; 5; 8; 3} {4c, 4c} {2, 2} 66.27 65.20
Arch 5 2.9M {4; 0; 3; 2; 7} {4c, 3.5c} {2, 3} 61.57 61.44
Arch 6 3.0M {4; 0; 3; 1; 8} {4c, 3c} {2, 3} 67.36 61.10
Supernet Arch 7 2.3M {5; 1; 6; 1; 3} {4c, 3c} {2, 3} 44.55 59.53
Adv. Tr. Arch 8 2.3M {1; 0; 4; 0; 11} {3c, 3c} {4, 3} 38.87 59.01
Arch 9 2.1M {1; 0; 6; 0; 9} {3.5c, 3c} {4, 3} 34.52 59.08
Arch 10 2.3M {0; 5; 7; 1; 3} {4c, 2.5c} {2, 3} 38.15 59.66
Table 6: Architecture analysis. {M.P.; A.P.; S.; S.C.; D.C.} refer to max pooling, average pooling, skip connect, separable convolution, and dilated convolution, respectively. The width and the depth of an architecture are measured following Shu et al. [61], and N, R denote the normal and the reduction cell. Std. Tr. and Adv. Tr. refer to the HRS score of an architecture after standard and adversarial training.

6.3 Examination of the Searched Architecture

To show that the AdvRush architecture indeed has a relatively smooth input loss landscape, we compare the input loss landscapes of the AdvRush and PDARTS architectures after standard training and adversarial training in Figure 3. The loss landscapes are visualized using the same technique as Moosavi-Dezfooli et al. [47]. Degrees of perturbation in the input data for standard-trained and adversarially-trained architectures are set differently to account for the discrepancy in their sensitivity to perturbation. Independent of the training method, the input loss landscape of the AdvRush architecture is visibly smoother. The visualization results provide strong empirical support for AdvRush by demonstrating that the search result is aligned with our theoretical motivation.

We analyze the architectures used in our experiments to provide a meaningful insight into what makes an architecture robust. To begin with, we find that architectures that are based on the DARTS search space are generally more robust than hand-crafted architectures. The DARTS search space is inherently designed to yield an architecture with dense connectivity and complicated feature reuse. We believe that the complex wiring pattern of the DARTS search space allows derived architectures to be more robust than others, as observed by Guo et al. [23].

Furthermore, we compare the twelve architectures searched from the DARTS search space, through Harmonic Robustness Score (HRS) [13], a recently-introduced metric for measuring the trade-off between standard and robust accuracies. Details regarding the calculation of HRS can be found in the Appendix. In Table 6, we report the HRS of each architecture, along with the summary of its architectural details. Please refer to the Appendix for the visualization of each architecture. In general, robust architectures with high HRS have fewer parameter-free operations (pooling and skip connect) and more separable convolution operations. As a result, they tend to have more parameters than non-robust ones; this result coincides with the observations in Madry et al. [39] and Su et al. [63].

Also, the fact that Arch 5 & 6, despite having fewer parameters, have HRS comparable to some of the larger architectures leads us to believe that the diversification of operations contributes to improving robustness. The operational diversity is once again observed in PDARTS and Arch 0, both of which have high HRS. Lastly, no clear relationship between the width and the depth of an architecture and its robustness can be found, indicating that these two factors may have less influence over robustness.

7 Conclusion

In this work, AdvRush, a novel adversarial robustness-aware NAS algorithm, is proposed. The objective function of AdvRush is designed to prefer a candidate architecture with a smooth input loss landscape. The theoretical motivation behind our approach is validated by strong empirical results. Possible future works include the study of a robust neural architecture for multimodal datasets and expansion of the search space to include more diversified operations.

References

  • [1] B. Baker, O. Gupta, N. Naik, and R. Raskar (2017) Designing neural network architectures using reinforcement learning. In International Conference on Learning Representations, Cited by: §2.2.
  • [2] G. Bender, P. Kindermans, B. Zoph, V. Vasudevan, and Q. V. Le (2018) Understanding and simplifying one-shot architecture search. In Proceedings of the 35th International Conference on Machine Learning, pp. 550–559. Cited by: §2.2.
  • [3] W. Brendel, J. Rauber, and M. Bethge (2018) Decision-based adversarial attacks: reliable attacks against black-box machine learning models. Cited by: §2.1.
  • [4] S. Bubeck, Y. T. Lee, E. Price, and I. Razenshteyn (2019) Adversarial examples from computational constraints. In Proceedings of the 36th International Conference on Machine Learning, pp. 831–840. Cited by: §1.
  • [5] N. Carlini and D. Wagner (2017) Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (S&P), pp. 39–57. Cited by: §2.1.
  • [6] P. Chen, H. Zhang, Y. Sharma, J. Yi, and C. Hsieh (2017) Zoo: zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pp. 15–26. Cited by: §2.1.
  • [7] X. Chen and C. Hsieh (2020) Stabilizing differentiable architecture search via perturbation-based regularization. Cited by: §2.2.
  • [8] X. Chen, L. Xie, J. Wu, and Q. Tian (2019) Progressive differentiable architecture search: bridging the depth gap between search and evaluation. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1294–1303. Cited by: §1, §1, §2.2, §2.2, §5.1.
  • [9] Y. Chen, T. Yang, X. Zhang, G. Meng, X. Xiao, and J. Sun (2019) DetNAS: backbone search for object detection. In Advances in Neural Information Processing Systems, pp. 6638–6648. Cited by: §2.2.
  • [10] X. Chu, T. Zhou, B. Zhang, and J. Li (2020) Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search. In European Conference On Computer Vision, Cited by: §2.2, §2.2, §5.1.
  • [11] F. Croce and M. Hein (2020) Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In Proceedings of the 37th International Conference on Machine Learning, Cited by: §2.1, §5.2.
  • [12] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei (2009) Imagenet: a large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pp. 248–255. Cited by: §5.
  • [13] C. Devaguptapu, D. Agarwal, G. Mittal, and V. N. Balasubramanian (2020) An empirical study on the robustness of nas based architectures. arXiv preprint arXiv:2007.08428. Cited by: §6.3.
  • [14] E. Dohmatob (2019) Generalized no free lunch theorem for adversarial robustness. In Proceedings of the 36th International Conference on Machine Learning, pp. 1646–1654. Cited by: §1.
  • [15] M. Dong, Y. Li, Y. Wang, and C. Xu (2020) Adversarially robust neural architectures. arXiv preprint arXiv:2009.00902. Cited by: §2.2, §5.1.
  • [16] X. Dong and Y. Yang (2019) Searching for a robust neural architecture in four gpu hours. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1761–1770. Cited by: §2.2.
  • [17] A. Fawzi, S. Moosavi-Dezfooli, P. Frossard, and S. Soatto (2018) Empirical study of the topology and geometry of deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3762–3770. Cited by: §4.2.
  • [18] T. Garipov, P. Izmailov, D. Podoprikhin, D. P. Vetrov, and A. G. Wilson (2018) Loss surfaces, mode connectivity, and fast ensembling of dnns. In Advances in Neural Information Processing Systems, pp. 8789–8798. Cited by: §3.2.
  • [19] I. J. Goodfellow, J. Shlens, and C. Szegedy (2015) Explaining and harnessing adversarial examples. Cited by: §2.1, §2.1, §5.2.
  • [20] I. J. Goodfellow, O. Vinyals, and A. M. Saxe (2015) Qualitatively characterizing neural network optimization problems. In International Conference on Learning Representations, Cited by: §3.1.
  • [21] C. Guo, J. R. Gardner, Y. You, A. G. Wilson, and K. Q. Weinberger (2019) Simple black-box adversarial attacks. pp. 2484–2493. Cited by: §2.1.
  • [22] J. Guo, K. Han, Y. Wang, C. Zhang, Z. Yang, H. Wu, X. Chen, and C. Xu (2020) Hit-detector: hierarchical trinity architecture search for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11405–11414. Cited by: §2.2.
  • [23] M. Guo, Y. Yang, R. Xu, Z. Liu, and D. Lin (2020) When nas meets robustness: in search of robust architectures against adversarial attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 631–640. Cited by: §1, §2.2, §5.1, §5.2, §6.3.
  • [24] K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778. Cited by: §5.1.
  • [25] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger (2017) Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708. Cited by: §5.1.
  • [26] A. Ilyas, L. Engstrom, A. Athalye, and J. Lin (2018) Black-box adversarial attacks with limited queries and information. In Proceedings of the 35th International Conference on Machine Learning, pp. 2137–2146. Cited by: §2.1.
  • [27] S. Jetley, N. Lord, and P. Torr (2018) With friends like these, who needs adversaries?. In Advances in Neural Information Processing Systems, pp. 10749–10759. Cited by: §4.2.
  • [28] C. Jiang, H. Xu, W. Zhang, X. Liang, and Z. Li (2020) SP-nas: serial-to-parallel backbone search for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11863–11872. Cited by: §2.2.
  • [29] Y. Jiang, C. Hu, T. Xiao, C. Zhang, and J. Zhu (2019) Improved differentiable architecture search for language modeling and named entity recognition. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3576–3581. Cited by: §2.2.
  • [30] H. Kannan, A. Kurakin, and I. Goodfellow (2018) Adversarial logit pairing. arXiv preprint arXiv:1803.06373. Cited by: §2.1.
  • [31] J. Kim, J. Wang, S. Kim, and Y. Lee (2020) Evolved speech-transformer: applying neural architecture search to end-to-end automatic speech recognition. Proc. Interspeech 2020, pp. 1788–1792. Cited by: §2.2.
  • [32] S. Kotyan and D. V. Vargas (2020) Towards evolving robust neural architectures to defend from adversarial attacks. In Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion, pp. 135–136. Cited by: §2.2.
  • [33] A. Krizhevsky, G. Hinton, et al. (2009) Learning multiple layers of features from tiny images. Cited by: §5.
  • [34] A. Kurakin, I. Goodfellow, and S. Bengio (2017) Adversarial examples in the physical world. Cited by: §2.1.
  • [35] S. Lee, H. Lee, and S. Yoon (2020) Adversarial vertex mixup: toward better adversarially robust generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 272–281. Cited by: §1.
  • [36] H. Li, Z. Xu, G. Taylor, C. Studer, and T. Goldstein (2018) Visualizing the loss landscape of neural nets. In Advances in Neural Information Processing Systems, pp. 6389–6399. Cited by: §3.1.
  • [37] C. Liu, L. Chen, F. Schroff, H. Adam, W. Hua, A. L. Yuille, and L. Fei-Fei (2019) Auto-deeplab: hierarchical neural architecture search for semantic image segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 82–92. Cited by: §2.2.
  • [38] H. Liu, K. Simonyan, and Y. Yang (2019) DARTS: differentiable architecture search. In International Conference on Learning Representations, Cited by: §2.2, §4.1, §4, §5.1, §5.1.
  • [39] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu (2018) Towards deep learning models resistant to adversarial attacks. Cited by: §2.1, §5.1, §5.2, §6.3.
  • [40] D. Meng and H. Chen (2017) Magnet: a two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC conference on computer and communications security, pp. 135–147. Cited by: §2.1.
  • [41] J. H. Metzen, T. Genewein, V. Fischer, and B. Bischoff (2017) On detecting adversarial perturbations. In International Conference on Learning Representations, Cited by: §2.1.
  • [42] T. Miyato, S. Maeda, M. Koyama, and S. Ishii (2018) Virtual adversarial training: a regularization method for supervised and semi-supervised learning. Vol. 41, pp. 1979–1993. Cited by: §2.1.
  • [43] T. Mo, Y. Yu, M. Salameh, D. Niu, and S. Jui (2020) Neural architecture search for keyword spotting. arXiv preprint arXiv:2009.00165. Cited by: §1, §2.2.
  • [44] S. Moon, G. An, and H. O. Song (2019) Parsimonious black-box adversarial attacks via efficient combinatorial optimization. In International Conference on Machine Learning, pp. 4636–4645. Cited by: §2.1.
  • [45] S. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard (2017) Universal adversarial perturbations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1765–1773. Cited by: §2.1.
  • [46] S. Moosavi-Dezfooli, A. Fawzi, and P. Frossard (2016) Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2574–2582. Cited by: §2.1.
  • [47] S. Moosavi-Dezfooli, A. Fawzi, J. Uesato, and P. Frossard (2019) Robustness via curvature regularization, and vice versa. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9078–9086. Cited by: §2.1, §4.2, §6.3.
  • [48] V. Nekrasov, H. Chen, C. Shen, and I. Reid (2019) Fast neural architecture search of compact semantic segmentation models via auxiliary cells. In Proceedings of the IEEE Conference on computer vision and pattern recognition, pp. 9126–9135. Cited by: §1, §2.2.
  • [49] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng (2011) Reading digits in natural images with unsupervised feature learning. Cited by: §5.
  • [50] A. Athalye, N. Carlini, and D. Wagner (2018) Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. In Proceedings of the 35th International Conference on Machine Learning, pp. 274–283. Cited by: §2.1.
  • [51] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami (2017) Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pp. 506–519. Cited by: §2.1.
  • [52] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami (2016) The limitations of deep learning in adversarial settings. In 2016 IEEE European Symposium on Security and Privacy (EuroS&P), pp. 372–387. Cited by: §2.1.
  • [53] N. Papernot and P. McDaniel (2017) Extending defensive distillation. arXiv preprint arXiv:1705.05264. Cited by: §2.1.
  • [54] H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean (2018) Efficient neural architecture search via parameter sharing. In Proceedings of the 35th International Conference on Machine Learning, pp. 4095–4104. Cited by: §2.2.
  • [55] C. Qin, J. Martens, S. Gowal, D. Krishnan, K. Dvijotham, A. Fawzi, S. De, R. Stanforth, and P. Kohli (2019) Adversarial robustness through local linearization. In Advances in Neural Information Processing Systems, pp. 13847–13856. Cited by: §2.1.
  • [56] X. Qu, J. Wang, and J. Xiao (2020) Evolutionary algorithm enhanced neural architecture search for text-independent speaker verification. Proc. Interspeech 2020, pp. 961–965. Cited by: §2.2.
  • [57] E. Real, A. Aggarwal, Y. Huang, and Q. V. Le (2019) Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33, pp. 4780–4789. Cited by: §2.2.
  • [58] E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. V. Le, and A. Kurakin (2017) Large-scale evolution of image classifiers. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pp. 2902–2911. Cited by: §2.2.
  • [59] A. S. Ross and F. Doshi-Velez (2018) Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. In Association for the Advancement of Artificial Intelligence, Cited by: §2.1.
  • [60] P. Samangouei, M. Kabkab, and R. Chellappa (2018) Defense-gan: protecting classifiers against adversarial attacks using generative models. In International Conference on Learning Representations, Cited by: §2.1.
  • [61] Y. Shu, W. Wang, and S. Cai (2019) Understanding architectures learnt by cell-based neural architecture search. In International Conference on Learning Representations, Cited by: Table 6.
  • [62] Y. Song, T. Kim, S. Nowozin, S. Ermon, and N. Kushman (2018) Pixeldefend: leveraging generative models to understand and defend against adversarial examples. In International Conference on Learning Representations, Cited by: §2.1.
  • [63] D. Su, H. Zhang, H. Chen, J. Yi, P. Chen, and Y. Gao (2018) Is robustness the cost of accuracy?–a comprehensive study on the robustness of 18 deep image classification models. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 631–648. Cited by: §6.3.
  • [64] J. Su, D. V. Vargas, and K. Sakurai (2019) One pixel attack for fooling deep neural networks. Vol. 23, pp. 828–841. Cited by: §2.1.
  • [65] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199. Cited by: §1.
  • [66] M. Tan, R. Pang, and Q. V. Le (2020) Efficientdet: scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781–10790. Cited by: §1, §2.2.
  • [67] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, and A. Madry (2019) Robustness may be at odds with accuracy. In International Conference on Learning Representations, Cited by: §1.
  • [68] C. Xiao, B. Li, J. Zhu, W. He, M. Liu, and D. Song (2018) Generating adversarial examples with adversarial networks. arXiv preprint arXiv:1801.02610. Cited by: §2.1.
  • [69] C. Xie, Y. Wu, L. v. d. Maaten, A. L. Yuille, and K. He (2019) Feature denoising for improving adversarial robustness. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 501–509. Cited by: §1, §2.1.
  • [70] S. Xie, H. Zheng, C. Liu, and L. Lin (2018) SNAS: stochastic neural architecture search. In International Conference on Learning Representations, Cited by: §2.2.
  • [71] W. Xu, D. Evans, and Y. Qi (2018) Feature squeezing: detecting adversarial examples in deep neural networks. In Network and Distributed Systems Security Symposium, Cited by: §2.1.
  • [72] Y. Xu, L. Xie, X. Zhang, X. Chen, G. Qi, Q. Tian, and H. Xiong (2019) Pc-darts: partial channel connections for memory-efficient architecture search. In International Conference on Learning Representations, Cited by: §2.2.
  • [73] F. Yu, Z. Qin, C. Liu, L. Zhao, Y. Wang, and X. Chen (2019) Interpreting and evaluating neural network robustness. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 4199–4205. Cited by: §3.2, §3.2.
  • [74] H. Zhang and J. Wang (2019) Defense against adversarial attacks using feature scattering-based adversarial training. In Advances in Neural Information Processing Systems, pp. 1831–1841. Cited by: §2.1.
  • [75] H. Zhang, Y. Yu, J. Jiao, E. P. Xing, L. E. Ghaoui, and M. I. Jordan (2019) Theoretically principled trade-off between robustness and accuracy. In Proceedings of the 36th International Conference on Machine Learning, pp. 7472–7482. Cited by: §1, §2.1.
  • [76] M. Zhang, H. Li, S. Pan, X. Chang, and S. Su (2020) Overcoming multi-model forgetting in one-shot nas with diversity maximization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7809–7818. Cited by: §2.2.
  • [77] Y. Zhang, Z. Qiu, J. Liu, T. Yao, D. Liu, and T. Mei (2019) Customizable architecture search for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11641–11650. Cited by: §1, §2.2.
  • [78] P. Zhao, P. Chen, P. Das, K. N. Ramamurthy, and X. Lin (2020) Bridging mode connectivity in loss landscapes and adversarial robustness. In International Conference on Learning Representations, Cited by: §1, §1, §3.2, §3.2, Theorem 1.
  • [79] B. Zoph and Q. V. Le (2017) Neural architecture search with reinforcement learning. In International Conference on Learning Representations, Cited by: §1, §2.2.
  • [80] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le (2018) Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 8697–8710. Cited by: §2.2.

Appendices

A1 AdvRush Implementation Details

Following DARTS, the search space of AdvRush includes the following operations:

  • Zero Operation

  • Skip Connect

  • 3x3 Average Pooling

  • 3x3 Max Pooling

  • 3x3 Separable Conv

  • 5x5 Separable Conv

  • 3x3 Dilated Conv

  • 5x5 Dilated Conv

The pseudo code is presented in Algorithm 1.

Input: $E$ = total number of epochs for search; $E_{warm}$ = number of epochs to warm up; $\gamma$ = regularization strength
Output: $\mathcal{A}^{*}$ = final architecture
Initialize $w$ and $\alpha$ of the supernet
For $e = 1$ to $E$:
   If $e \le E_{warm}$:
      Update $w$ using $\mathcal{L}_{train}$ (SGD)
      Update $\alpha$ using $\mathcal{L}_{val}$ (Adam)
   Else:
      Update $w$ using $\mathcal{L}_{train}$ (SGD)
      Update $\alpha$ using $\mathcal{L}_{val} + \gamma\,\mathcal{L}_{\lambda}$ (Adam)
   End
End
Derive $\mathcal{A}^{*}$ from $\alpha$ through the discretization rule of DARTS
Algorithm 1 AdvRush (Search)
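A Python-style rendering of the loop above, assuming a DARTS-like supernet; the `curvature_regularizer` callable (e.g., the finite-difference sketch from Section 4.2), the data iterators, and the `derive_architecture` method are placeholders rather than the released implementation.

```python
import torch.nn.functional as F

def advrush_search(supernet, w_optimizer, alpha_optimizer,
                   train_batches, val_batches, curvature_regularizer,
                   total_epochs=60, warmup_epochs=50, gamma=0.01):
    """Sketch of the AdvRush search loop in Algorithm 1 (not the released code)."""
    for epoch in range(total_epochs):
        for (x_tr, y_tr), (x_val, y_val) in zip(train_batches, val_batches):
            # Architecture-parameter step on validation data (Adam).
            alpha_optimizer.zero_grad()
            loss_alpha = F.cross_entropy(supernet(x_val), y_val)
            if epoch >= warmup_epochs:
                # After the warm-up epochs, add the curvature regularizer L_lambda.
                loss_alpha = loss_alpha + gamma * curvature_regularizer(supernet, x_val, y_val)
            loss_alpha.backward()
            alpha_optimizer.step()

            # Weight-parameter step on training data (SGD).
            w_optimizer.zero_grad()
            F.cross_entropy(supernet(x_tr), y_tr).backward()
            w_optimizer.step()

    # Discretization: keep the operation with the largest alpha on each edge.
    return supernet.derive_architecture()
```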

A2 Hyperparameters and Datasets

For adversarial training, a different set of hyperparameters is used for each dataset. Hyperparameter settings for the datasets used in our experiments are provided in Table A1. Details regarding the datasets are provided in Table A4.

CIFAR SVHN Tiny-ImageNet
Optimizer SGD SGD SGD
Momentum 0.9 0.9 0.9
Weight decay 1e-4 1e-4 1e-4
Epochs 200 200 90
LR 0.1 0.01 0.1
LR decay (100, 150) (100, 150) (30, 60)
Table A1: Details of hyperparameters used for adversarial training on different datasets. The learning rate is decayed by a factor of 0.1 at the selected epochs. CIFAR refers to both CIFAR-10 and -100.

A3 Difference in $\mathcal{L}_{\lambda}$

In Figure A1, we visualize how the difference in the update rule between DARTS and AdvRush affects the value of $\mathcal{L}_{\lambda}$. We use a $\gamma$ of 0.01 for AdvRush, and $\mathcal{L}_{\lambda}$ is introduced at the 50th epoch. Notice that AdvRush experiences a steeper drop in $\mathcal{L}_{\lambda}$; this phenomenon implies that the final supernet of AdvRush is indeed smoother than that of DARTS. Since the only difference between the two compared search algorithms is the learning objective of $\alpha$, it is safe to assume that this extra smoothness is induced solely by the difference in the update rule.

Figure A1: The difference in the value of $\mathcal{L}_{\lambda}$ between DARTS and AdvRush. $\mathcal{L}_{\lambda}$ is introduced at epoch 50, marked with a gray dotted line. The red arrow points out the gap in $\mathcal{L}_{\lambda}$ between the two algorithms, caused by the different update rules.

A4 More Input Loss Landscapes

We provide additional input loss landscapes of the other tested architectures after standard training (Figure A4) and adversarial training (Figure A5). All loss landscapes are drawn using the CIFAR-10 dataset. In both figures, the two axes denote perturbations in the normal (gradient) direction and a random direction, respectively. Degrees of perturbation in the input data for standard-trained and adversarially-trained architectures are set differently to account for the discrepancy in their sensitivity to perturbation.

A5 Architecture Visualization

The architectures used for the analysis in Section 6.3 of the main text are visualized in Figure A2 and A3.

A6 HRS Calculation

HRS is calculated as follows:

$\mathrm{HRS} = \frac{2\, C\, R}{C + R},$ (A1)

where C denotes the clean accuracy, and R denotes the robust accuracy. For standard-trained architectures, we use the robust accuracy under the FGSM attack for R because the robust accuracy of standard-trained architectures under the PGD attack reaches near-zero. For adversarially-trained architectures, we use the robust accuracy under the PGD attack for R. C, R, and HRS values for all architectures used in Section 6.3 of the main text can be found in Table A2.
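As a quick sanity check, the harmonic-mean form of Eq. (A1) reproduces the reported values in Table A2; the snippet below is a trivial sketch.

```python
def hrs(clean_acc, robust_acc):
    """Harmonic Robustness Score: harmonic mean of clean (C) and robust (R) accuracy."""
    return 2 * clean_acc * robust_acc / (clean_acc + robust_acc)

# Example: PDARTS after standard training (Table A2): C = 97.49, R = 54.51.
print(round(hrs(97.49, 54.51), 2))  # ~69.92, matching the reported HRS
```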

A7 Extended Results on Other Datasets

In addition to the results in the main text, we conduct PGD attacks with different numbers of iterations and evaluate the robust accuracy on CIFAR-100, SVHN, and Tiny-ImageNet. The results are visualized in Figure A6. AdvRush consistently outperforms the other architectures, regardless of the strength of the adversary.

A8 Full Ablation Results

In Table A3, the full ablation analysis of AdvRush, including the robust accuracies under AutoAttack, is presented. No matter the value of $\gamma$ used, architectures searched with AdvRush achieve high robust accuracies under AutoAttack. Such results suggest that AdvRush does not require excessive tuning of $\gamma$ to search for a robust neural architecture.

Arch | Std. Tr.: Clean FGSM HRS | Adv. Tr.: Clean PGD HRS
PDARTS 97.49% 54.51% 69.92 85.37% 51.32% 64.10
Arch 0 97.58% 55.91% 71.09 87.30% 53.07% 66.01
Arch 1 95.59% 47.54% 63.50 85.65% 52.70% 65.25
Arch 2 95.55% 56.57% 71.06 85.68% 52.93% 65.44
Arch 3 95.24% 48.46% 64.24 83.15% 53.42% 64.05
Arch 4 95.45% 50.75% 66.27 83.03% 53.67% 65.20
Arch 5 95.05% 45.53% 61.57 83.04% 48.76% 61.44
Arch 6 94.93% 52.20% 67.36 82.82% 48.41% 61.10
Arch 7 94.01% 29.19% 44.55 82.78% 46.48% 59.53
Arch 8 93.24% 24.55% 38.87 81.13% 46.37% 59.01
Arch 9 93.08% 21.19% 34.52 81.87% 46.22% 59.08
Arch 10 93.88% 23.94% 38.15 82.26% 46.81% 59.66
Table A2: Clean accuracy (C), robust accuracy (R), and HRS of the architectures analyzed in Section 6.3, after standard training (Std. Tr.) and adversarial training (Adv. Tr.).
Figure A2: Architectures searched by PDARTS and AdvRush. A normal cell maintains the dimension of input feature maps, while a reduction cell reduces it. The entire neural network is constructed by stacking 18 normal cells and 2 reduction cells following the DARTS convention. The reduction cells are placed at the 1/3 and 2/3 points of the network.
Figure A3: Architectures searched by adversarial training of the supernet. A normal cell maintains the dimension of input feature maps, while a reduction cell reduces it. The entire neural network is constructed by stacking 18 normal cells and 2 reduction cells following the DARTS convention. The reduction cells are placed at the 1/3 and 2/3 points of the network.
Figure A4: Input loss landscapes after standard training on CIFAR-10.
Figure A5: Input loss landscapes after adversarial training on CIFAR-10.
Figure A6: Robust accuracies on (a) CIFAR-100, (b) SVHN, and (c) Tiny-ImageNet under different iterations of PGD attack.
$\gamma$-value Clean FGSM PGD20 PGD100 APGD-CE APGD-T FAB-T Square AA
0.001 (x 0.1) 85.65% 60.04% 52.70% 52.39% 51.39% 49.16% 49.13% 49.13% 49.13%
0.005 (x 0.5) 85.68% 60.31% 52.93% 52.61% 51.77% 49.63% 49.62% 49.62% 49.62%
0.01 (baseline) 87.30% 60.87% 53.07% 52.80% 50.05% 50.04% 50.04% 50.04% 50.04%
0.02 (x 2) 83.15% 59.34% 53.42% 53.19% 52.22% 49.60% 49.60% 49.60% 49.60%
0.1 (x 10) 83.03% 59.69% 53.67% 52.20% 51.22% 49.12% 49.11% 49.11% 49.11%
Table A3: Effect of the change in the magnitude of $\gamma$. Baseline refers to AdvRush with the default $\gamma$ of 0.01. The best result in each column is in bold, and the second best result is underlined.
Dataset # of Train Data # of Validation Data # of Test Data # of Classes Image Size
CIFAR-10 50,000 - 10,000 10 32×32
CIFAR-100 50,000 - 10,000 100 32×32
SVHN 73,257 - 26,032 10 32×32
Tiny-ImageNet 100,000 5,000 5,000 200 64×64
Table A4: Summary of datasets used for the experiments. CIFAR-10, CIFAR-100, and SVHN, which are available directly from torchvision, are split only into train and test sets by default.