1 Introduction
In recent years, Convolutional Neural Networks (CNNs) have greatly advanced performance in various vision tasks, including image recognition [11, 14, 26], object detection [25, 7], and semantic segmentation [4, 19]. However, it has been observed [31, 9] that adding human-imperceptible perturbations to an input image can cause a CNN to make incorrect predictions, even if the original image is correctly classified. These intentionally generated images are usually called adversarial examples
[9, 15, 31]. A recent study [18] has demonstrated that adversarial examples generated from certain models (especially those learned by iteration-based methods [15, 6]) are less transferable to other models. In other words, those adversarial examples easily overfit to a specific network and achieve much lower attack rates in black-box settings (i.e., when attackers have no knowledge of the models they may attack). To remedy this, an ensemble of multiple networks [18] has been suggested to improve transferability.
However, ensemble-based attacks suffer from expensive computational overhead, making it difficult to efficiently learn transferable adversarial examples. First, in order to acquire good (i.e., low test error) and diverse (i.e., converging at different local minima) models, one usually has to train them independently from scratch. Second, to leverage their complementarity, existing methods adopt an intensive aggregation scheme that fuses the outputs (e.g., logits) of those networks. Consequently, the attacking methods in [17] ensemble at most 10 networks, restricted by the computational complexity.
In this work, we propose a highly efficient alternative called Ghost Networks to address this issue. The basic principle is to generate a huge number of virtual models built on a network trained from scratch (a base network or base model). The word "virtual" means that the ghost networks are neither stored nor trained, which would otherwise incur extra time and space costs. Instead, they are generated by imposing erosion on certain intermediate structures of the base network and then used on the fly. In this case, as the number of models grows, a standard ensemble [18] becomes problematic owing to its complexity. Accordingly, we propose the longitudinal ensemble, a specialized fusion method for ghost networks which conducts an implicit ensemble during the attack iterations. Consequently, transferable adversarial examples can be easily generated without sacrificing computational efficiency.
To summarize, the contributions of our work are threefold: 1) Our work is the first to explore network erosion to learn transferable adversarial examples, rather than relying solely on a multi-network ensemble. 2) We observe that the number of different networks actually used for the ensemble (intrinsic networks) is essential for transferability; however, it is less necessary to train those models independently. Instead, ghost networks can be a competitive alternative with extremely low complexity. 3) Our method is generic. Although it appears to be an ensemble-based method for multi-model attacks, it can also be applied to single-model attacks where only one trained model is accessible. Furthermore, it is compatible with different network structures, attack methods, and adversarial settings.
Extensive experimental results demonstrate that our method is a computationally cheap plug-in which improves the transferability of adversarial examples. In particular, by reproducing the NIPS 2017 adversarial competition [17], our method outperforms the No. 1 attack submission by a large margin, which demonstrates its effectiveness and efficiency.
2 Related Work
Deep networks have been shown to be vulnerable to adversarial examples, i.e., maliciously perturbed inputs designed to mislead a model [2, 31, 9, 22].
The transferability of adversarial examples refers to the property that the same adversarial input is misclassified by different models. This was first investigated by Szegedy et al. [31] on MNIST, and later led to the development of black-box attacks. Afterward, Liu et al. [18] proposed ensemble-based approaches which demonstrate transferability on large-scale datasets like ImageNet [5]. Optimization-based methods (e.g., the Carlini-Wagner attack [3]) and iteration-based methods (e.g., I-FGSM [15]) tend to overfit a specific network structure, thus leading to weak transferability [16]. On the other hand, single-step gradient-based methods, such as the Fast Gradient Sign Method (FGSM) [9], are better at learning transferable adversarial examples but are less successful in white-box attacks. Taking advantage of both [15] and [9], Dong et al. [6] proposed a momentum iterative method to generate adversarial examples with stronger transferability. To further avoid overfitting to specific models, it uses an ensemble of trained-from-scratch models.
However, how to efficiently learn transferable adversarial examples remains a challenging task. Some works [1, 24, 35] suggest that retraining neural networks, e.g., via generative adversarial learning [8], can achieve high transferability. Moreover, Papernot et al. [23] proposed a query-based method to improve black-box attack performance; however, it requires massive amounts of information from the target model. In conclusion, acquiring and integrating information from various models to approximate the target model is the key to achieving better transferability. Yet most existing works are inefficient and inadequate for learning adversarial examples with strong transferability. Our work addresses this issue with high efficiency.
3 Ghost Networks
The goal of this work is to learn adversarial examples, with particular attention to transferability. Given a clean image $x$, we want to find an adversarial example $x^{adv}$ that remains visually similar to $x$ after adding adversarial noise but fools the classifier. In order to improve the transferability of $x^{adv}$, we choose to simultaneously attack multiple models. However, unlike existing works [6, 18], we propose Ghost Networks, a highly efficient algorithm that both generates and fuses an ensemble of diverse models to learn transferable adversarial examples.
We introduce two strategies for generating ghost networks in Sec. 3.1 and Sec. 3.2, respectively, and then present a customized fusion method, the longitudinal ensemble, in Sec. 3.3.
3.1 Dropout Erosion
Revisiting Dropout. Dropout [27] has been one of the most popular techniques in deep learning. By randomly dropping units from the model during the training phase, dropout effectively prevents deep neural networks from overfitting. Some recent works [26, 28, 30, 29] achieve state-of-the-art performance on benchmark datasets by applying dropout to a layer of high-level features.

Let $x_l$ be the activation in the $l$-th layer. At training time, the output after the dropout layer can be mathematically defined as

$\tilde{x}_l = r_l \odot x_l, \quad r_l \sim \mathrm{Bernoulli}(p)$,  (1)

where $\odot$ denotes the element-wise product and $\mathrm{Bernoulli}(p)$ denotes the Bernoulli distribution with probability $p$ of being $1$. At test time, units in $x_l$ are always present; thus, to keep the output the same as the expected output at training time, $r_l$ is set to be $p$.

Perturbing Dropout. Dropout provides an efficient way of approximately combining different neural network architectures and thereby prevents overfitting. Inspired by this, we propose to generate ghost networks by inserting dropout layers. To make the ghost networks as diverse as possible, we densely apply dropout to every block throughout the base network, so that diversity is not limited to high-level features but extends to all feature levels.
Let $f_l$ be the function between the $l$-th and $(l+1)$-th layer, i.e., $x_{l+1} = f_l(x_l)$. The output after applying dropout erosion is

$x_{l+1} = f_l\left(\frac{1}{p}\, r_l \odot x_l\right), \quad r_l \sim \mathrm{Bernoulli}(p)$,  (2)

where $p$ has the same meaning as in Eq. (1), indicating the probability that an element of $x_l$ is preserved. To keep the expected input of $f_l$ consistent after erosion, the activation $x_l$ is divided by $p$.
During inference, the output feature after the $L$-th dropout layer ($L \geq 1$) is

$x_{L+1} = F_{1 \to L}(x_1)$,  (3)

where $F_{1 \to L}$ denotes the composite of the eroded layers, more specifically, $F_{1 \to L}(x_1) = f_L\left(\frac{1}{p}\, r_L \odot f_{L-1}\left(\cdots f_1\left(\frac{1}{p}\, r_1 \odot x_1\right)\right)\right)$.
By combining Eq. (2) and Eq. (3), we observe that when $\Lambda = 0$ (meaning $p = 1$), all elements in $r_l$ are equal to $1$; in this case, we do not impose any perturbation on the base network. When $\Lambda$ gradually increases to $1$ (meaning $p$ decreases to $0$), the ratio of elements that are dropped out is $\Lambda$. In other words, a fraction $1 - \Lambda$ of the elements can be successfully back-propagated. Hence, a larger $\Lambda$ implies a heavier erosion of the base network. Therefore, we define $\Lambda = 1 - p$ to be the magnitude of erosion.
When perturbing the dropout layer, the gradient in back-propagation can be written as

$\frac{\partial x_{L+1}}{\partial x_1} = \prod_{l=1}^{L} \frac{\partial x_{l+1}}{\partial x_l}, \quad \frac{\partial x_{l+1}}{\partial x_l} = \frac{1}{p}\, f_l'\!\left(\frac{1}{p}\, r_l \odot x_l\right) \mathrm{diag}(r_l)$.  (4)

As shown in Eq. (4), deeper networks (with larger $L$) are influenced more easily according to the product rule. Sec. 4.2 will experimentally analyze the impact of $\Lambda$.
Generating Ghost Networks. The generation of ghost networks via perturbing the dropout layer proceeds in three steps: 1) randomly sample a parameter set $\{r_l\}$ from the Bernoulli distribution; 2) apply Eq. (2) to the base network with the sampled parameters to obtain a perturbed network; 3) repeat the sampling independently to obtain a pool of ghost networks which can be used for adversarial attacks.
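As a minimal sketch of one such sampling step (in numpy; the function name and the toy activation are our own illustration, not the paper's code):

```python
import numpy as np

def dropout_erosion(x, erosion, rng):
    """Sample one dropout erosion of activation x, as in Eq. (2).

    `erosion` is the magnitude Lambda; the keep probability is
    p = 1 - Lambda, and kept activations are divided by p so the
    expected input of the next layer stays unchanged.
    """
    p = 1.0 - erosion
    r = rng.binomial(1, p, size=x.shape)  # r ~ Bernoulli(p)
    return (r * x) / p

rng = np.random.default_rng(0)
x = np.ones((4, 8))  # a toy activation map
y = dropout_erosion(x, erosion=0.25, rng=rng)
# Each entry of y is either 0 (dropped) or x / p (kept and rescaled),
# so E[y] = x; every fresh sample of r defines a new ghost network.
```

Sampling `r` independently several times yields the pool of ghost networks described in the three steps above.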
3.2 Skip Connection Erosion
Revisiting Skip Connection. He et al. [10, 11] proposed skip connections in CNNs, which make it feasible to train very deep neural networks.
The standard residual block in [11] is defined by

$x_{l+1} = x_l + \mathcal{F}(x_l, \mathcal{W}_l)$,  (5)

where $x_l$ and $x_{l+1}$ are the input and output of the $l$-th residual block with weights $\mathcal{W}_l$, and $\mathcal{F}$ denotes the residual function. As suggested in [11], it is crucial to use the identity skip connection, i.e., the identity term $x_l$ in Eq. (5), to facilitate the residual learning process; otherwise the network may not converge to a good local minimum.
Perturbing Skip Connection. Following the principle of skip connection, we propose to perturb skip connections to generate ghost networks. More specifically, the network weights are first learned using identity skip connections and then switched to randomized skip connections (see Fig. 2). To this end, we apply a randomized modulating scalar $\lambda_l$ to the $l$-th residual block:

$x_{l+1} = \lambda_l x_l + \mathcal{F}(x_l, \mathcal{W}_l)$,  (6)

where $\lambda_l$ is drawn from the uniform distribution $U[1 - \Lambda,\, 1 + \Lambda]$. One may have observed several similar formulations on skip connections aimed at improving classification performance, e.g., gated inference in ConvNet-AIG [33] and the lesion study in [34]. However, our work focuses on attacking the model with a randomized perturbation on skip connections, i.e., the model is not actually trained via Eq. (6). During inference, the output feature after the $L$-th layer ($L \geq 1$) is
$x_{L+1} = \left(\prod_{l=1}^{L} \lambda_l\right) x_1 + \sum_{l=1}^{L} \left(\prod_{i=l+1}^{L} \lambda_i\right) \mathcal{F}(x_l, \mathcal{W}_l)$.  (7)
The gradient in back-propagation is then written as

$\frac{\partial x_{L+1}}{\partial x_1} = \prod_{l=1}^{L} \left(\lambda_l + \frac{\partial \mathcal{F}(x_l, \mathcal{W}_l)}{\partial x_l}\right)$.  (8)
Similar to the analysis in Sec. 3.1, we conclude from Eq. (7) and Eq. (8) that a larger $\Lambda$ has a greater influence on the base network, and that deeper networks are more easily influenced.
Generating Ghost Networks. The generation of ghost networks via perturbing the skip connections is similar to the procedure for perturbing the dropout layer. The only difference is that in the first step, we sample a set of modulating scalars $\{\lambda_l\}$ from the uniform distribution $U[1 - \Lambda, 1 + \Lambda]$, one for each skip connection.
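A minimal numpy sketch of one eroded residual block (names and the stand-in residual function are ours, not from any released code):

```python
import numpy as np

def eroded_residual_block(x, residual_fn, erosion, rng):
    """One residual block with a randomized skip connection, as in Eq. (6).

    The modulating scalar lambda_l is drawn from U(1 - Lambda, 1 + Lambda);
    Lambda = 0 recovers the identity skip connection of the base network.
    """
    lam = rng.uniform(1.0 - erosion, 1.0 + erosion)
    return lam * x + residual_fn(x)

rng = np.random.default_rng(1)
x = np.ones(5)
residual = lambda v: 0.1 * v  # a stand-in residual function F
out = eroded_residual_block(x, residual, erosion=0.0, rng=rng)
# With Lambda = 0 the block reduces to the ordinary x + F(x).
out2 = eroded_residual_block(x, residual, erosion=0.2, rng=rng)
# With Lambda = 0.2, the skip branch is scaled by some lambda in [0.8, 1.2].
```

A fresh draw of $\{\lambda_l\}$ for every block yields one ghost network, exactly as in the sampling step described above.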
3.3 Longitudinal Ensemble
Existing iteration-based ensemble-attack approaches [6, 18] average the outputs (e.g., logits, classification probabilities, or losses) of different networks. However, such a standard ensemble [18] is too costly and inefficient in our case, because Ghost Networks readily provide a huge candidate pool of qualified models.
To remedy this, we propose the longitudinal ensemble, a specialized fusion method for ghost networks which constructs an implicit ensemble by randomizing the perturbations across the iterations of an adversarial attack (e.g., I-FGSM [15] or MI-FGSM [6]). Suppose we have a base model from which we generate a pool of $n$ ghost networks. The key step of the longitudinal ensemble is that at the $t$-th iteration, we attack the $t$-th ghost network only. In comparison, at each iteration the standard ensemble fuses the gradients of all the models in the pool, which requires far more computation. We illustrate the difference between the standard ensemble and the longitudinal ensemble in Fig. 3.
The longitudinal ensemble shares the same prior as [6, 18]: if an adversarial example is generated by attacking multiple networks, it is more likely to transfer to other networks. However, the longitudinal ensemble removes duplicated computation by sampling only one model from the pool at each iteration rather than using all of them.
Three comments are noteworthy here. First, the ghost networks are never stored or trained; as a result, they incur neither additional time nor additional space complexity. Second, as is clear from Fig. 3, attackers can combine the standard ensemble and the longitudinal ensemble of ghost networks. Finally, it is easy to extend the longitudinal ensemble to multi-model attacks by treating each base model as a branch (see the experimental evaluations for details).
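To make the procedure concrete, here is a numpy sketch of an I-FGSM-style longitudinal ensemble; the helper names and the stand-in gradient function are our own assumptions, not the paper's implementation. Each iteration queries the gradient of a freshly sampled ghost network instead of averaging over the whole pool:

```python
import numpy as np

def longitudinal_ensemble_attack(x, y, sample_ghost_grad, steps, alpha, eps):
    """I-FGSM where iteration t attacks the t-th sampled ghost network.

    `sample_ghost_grad(x_adv, y)` is assumed to draw a random erosion and
    return the loss gradient of the resulting ghost network, so the
    ensemble over the ghost-network pool is implicit in the iterations.
    """
    x_adv = x.copy()
    for _ in range(steps):
        g = sample_ghost_grad(x_adv, y)           # one ghost network per step
        x_adv = x_adv + alpha * np.sign(g)        # I-FGSM update
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay inside the eps-ball
    return x_adv

# Toy usage: a fixed gradient direction perturbed once per "ghost network".
rng = np.random.default_rng(2)
x = np.zeros(4)
stand_in_grad = lambda xa, ya: np.ones(4) + 0.1 * rng.standard_normal(4)
adv = longitudinal_ensemble_attack(x, y=0, sample_ghost_grad=stand_in_grad,
                                   steps=10, alpha=0.01, eps=0.05)
```

Note that the cost per iteration is a single backward pass, the same as single-model I-FGSM, which is the point of the design.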
4 Experiments
In this section, we give a comprehensive experimental evaluation of the proposed Ghost Networks. To distinguish models trained from scratch from the ghost networks we generate, we call the former base networks or base models in the rest of this paper.
Due to space limitations, we will give a more detailed evaluation in the supplementary material.
4.1 Experimental Setup
Base Networks. Nine base models are used in our experiments, including six normally trained models^{1}^{1}1Available at https://github.com/tensorflow/models/tree/master/research/slim, i.e., ResNet v2-50 (Res50) [11], ResNet v2-101 (Res101) [11], ResNet v2-152 (Res152) [11], Inception v3 (Incv3) [30], Inception v4 (Incv4) [28] and Inception-ResNet v2 (IncResv2) [28], and three adversarially trained models [32]^{2}^{2}2Available at https://github.com/tensorflow/models/tree/master/research/adv_imagenet_models, i.e., Incv3_{ens3}, Incv3_{ens4} and IncResv2_{ens}.
Datasets. Because it is less meaningful to attack images that are originally misclassified, we follow [36] and select images from the ILSVRC 2012 validation set that are correctly classified by all the base models.
Attacking Methods. We employ two iteration-based attack methods to evaluate adversarial robustness, i.e., the Iterative Fast Gradient Sign Method (I-FGSM) and the Momentum Iterative Fast Gradient Sign Method (MI-FGSM). Both are variants of the Fast Gradient Sign Method (FGSM) [9] and are available in the cleverhans library [21].
I-FGSM was proposed by Kurakin et al. [15], and learns the adversarial example by

$x^{adv}_{t+1} = \mathrm{Clip}^{\epsilon}_{x}\left\{ x^{adv}_{t} + \alpha \cdot \mathrm{sign}\left( \nabla_x J(x^{adv}_{t}, y; \theta) \right) \right\}$,  (9)

where $J$ is the loss function of a network with parameters $\theta$, and $\mathrm{Clip}^{\epsilon}_{x}$ is the clip function which ensures that the generated adversarial example stays within the $\epsilon$-ball of the original image $x$ with ground-truth label $y$. $T$ is the number of iterations and $\alpha$ is the step size. MI-FGSM was proposed by Dong et al. [6], and integrates a momentum term into the attack process to stabilize the update direction and escape from poor local maxima. At the $t$-th iteration, the accumulated gradient is calculated by

$g_{t+1} = \mu \cdot g_{t} + \frac{\nabla_x J(x^{adv}_{t}, y; \theta)}{\lVert \nabla_x J(x^{adv}_{t}, y; \theta) \rVert_1}$,  (10)

where $\mu$ is the decay factor of the momentum term. The sign of the accumulated gradient is then used to generate the adversarial example:

$x^{adv}_{t+1} = \mathrm{Clip}^{\epsilon}_{x}\left\{ x^{adv}_{t} + \alpha \cdot \mathrm{sign}(g_{t+1}) \right\}$.  (11)
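A compact numpy sketch of the MI-FGSM update of Eqs. (10)-(11), with a toy gradient function standing in for the network (our own illustration, not the cleverhans implementation):

```python
import numpy as np

def mi_fgsm(x, y, grad_fn, steps, alpha, eps, mu=1.0):
    """MI-FGSM sketch: accumulate the L1-normalized gradient with momentum
    mu (Eq. (10)), then step along its sign and clip to the eps-ball
    (Eq. (11))."""
    x_adv = x.copy()
    g = np.zeros_like(x)
    for _ in range(steps):
        j = grad_fn(x_adv, y)                           # loss gradient
        g = mu * g + j / max(np.sum(np.abs(j)), 1e-12)  # Eq. (10)
        x_adv = np.clip(x_adv + alpha * np.sign(g),
                        x - eps, x + eps)               # Eq. (11)
    return x_adv

# Toy gradient with a fixed direction: x_adv moves alpha per step
# along the sign of the accumulated gradient.
x0 = np.zeros(2)
adv = mi_fgsm(x0, y=0, grad_fn=lambda xa, ya: np.array([1.0, -1.0]),
              steps=3, alpha=0.01, eps=0.1)
```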
4.2 Analysis of Ghost Networks
As analyzed above, to generate adversarial examples with good transferability, the intrinsic models should generally meet two requirements. First, each individual model should have a low test error. Second, different models should be as diverse as possible (i.e., converge at different local minima). To show that the generated ghost networks are qualified for adversarial attack, we conduct an experiment on the whole ILSVRC 2012 validation set [5].
Descriptive Capacity. In order to quantitatively measure the descriptive capacity of the generated ghost networks, we plot the relationship between the magnitude of erosion $\Lambda$ and the top-1 classification accuracy.
We apply the dropout erosion of Sec. 3.1 to non-residual networks (Incv3 and Incv4) and the skip connection erosion of Sec. 3.2 to residual networks (Res50, Res101, Res152 and IncResv2). Fig. 4 presents the resulting curves for dropout erosion and skip connection erosion, respectively.
It is not surprising to observe that the classification accuracies of the different models are negatively correlated with the magnitude of erosion $\Lambda$. By choosing a performance drop of approximately 10% as the threshold, we determine the value of $\Lambda$ individually for each network. Although the performance of a ghost network is slightly worse than that of an independently trained base network, ghost networks still preserve low error rates. Moreover, as emphasized throughout this paper, it is extremely cheap to generate a huge number of them.
In the following experiments, we use the value of $\Lambda$ chosen in this way for each of Incv3, Incv4, Res50, Res101, Res152 and IncResv2, unless otherwise specified.
Model Diversity. To measure diversity, we use Res50 as the backbone model. We denote the base Res50 described in Sec. 4.1 as Res50A, and independently train two additional models with the same architecture, denoted Res50B and Res50C. Meanwhile, we apply skip connection erosion to Res50A three times, obtaining three ghost networks denoted Res50SA, Res50SB and Res50SC.
We employ the Jensen-Shannon divergence (JSD) as the evaluation metric for model diversity. Concretely, we compute the pairwise similarity of the output probability distributions (i.e., the predictions after the softmax layer) for each pair of networks, as in [12]. Given an image, let $p_a$ and $p_b$ denote the softmax outputs of two networks; the JSD is then defined as

$\mathrm{JSD}(p_a \,\|\, p_b) = \frac{1}{2}\, \mathrm{KL}(p_a \,\|\, M) + \frac{1}{2}\, \mathrm{KL}(p_b \,\|\, M)$,  (12)

where $M$ is the average of $p_a$ and $p_b$, i.e., $M = \frac{1}{2}(p_a + p_b)$, and $\mathrm{KL}(\cdot \,\|\, \cdot)$ is the Kullback-Leibler divergence.
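As a small sanity-check sketch of Eq. (12) (numpy; the helper names and toy distributions are ours):

```python
import numpy as np

def kl(p, q):
    """KL divergence between two strictly positive discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

def jsd(pa, pb):
    """Jensen-Shannon divergence: average KL to the midpoint M = (pa+pb)/2."""
    m = 0.5 * (pa + pb)
    return 0.5 * kl(pa, m) + 0.5 * kl(pb, m)

pa = np.array([0.7, 0.2, 0.1])  # toy softmax outputs of two networks
pb = np.array([0.4, 0.4, 0.2])
d = jsd(pa, pb)
# d is symmetric, non-negative, and zero iff the two outputs coincide,
# which is what makes it a usable pairwise diversity score.
```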
In Fig. 5, we report the averaged JSD over the ILSVRC 2012 validation set for all pairs of networks. As can be seen, the diversity between ghost networks is comparable to, or even larger than, that between independently trained networks.
Based on the analysis of descriptive capacity and model diversity, we can see that the generated ghost networks provide sufficiently accurate yet diverse descriptions of the data manifold, which benefits the learning of transferable adversarial examples, as we experimentally show below.
4.3 Singlemodel Attack
Table 1: Attack rates (%) of single-model attacks ("W": white-box; "B": average black-box).
Attack  Methods  Res50  Res101  Res152  IncResv2  Incv3  Incv4  

W  B  W  B  W  B  W  B  W  B  W  B  
IFGSM [15]  Exp. S1  99.5  16.3  99.4  17.8  98.4  16.7  94.8  8.3  99.8  5.3  99.5  7.3 
Exp. S2  98.7  8.4  78.8  6.1  92.4  6.4  95.9  5.7  67.6  1.7  39.6  1.9  
Exp. S3 (ours)  99.7  23.4  99.6  23.7  99.4  21.1  96.5  11.2  97.0  6.3  86.8  10.0  
Exp. S4  99.6  28.8  99.7  29.9  99.6  25.6  98.7  13.1  98.9  6.3  96.2  9.3  
Exp. S5 (ours)  99.6  35.9  99.7  35.9  99.6  60.1  98.7  14.6  99.9  12.3  98.5  19.4  
MIFGSM [6]  Exp. S1  99.4  29.4  99.2  31.3  98.3  29.6  94.0  20.0  99.8  13.7  99.5  18.4 
Exp. S2  99.4  17.4  99.2  19.9  98.6  17.9  94.1  15.2  85.9  5.6  76.5  7.2  
Exp. S3 (ours)  99.7  39.4  99.8  40.1  99.5  38.0  95.9  26.8  98.0  17.6  90.6  22.4  
Exp. S4  99.6  44.5  99.7  43.2  99.3  41.9  98.5  30.4  99.7  17.9  98.3  25.6  
Exp. S5 (ours)  99.6  50.6  99.7  51.4  98.6  64.9  98.3  33.3  99.8  28.3  97.8  37.4 
First, we evaluate ghost networks in the single-model attack, where the attacker can access only one base model trained from real data. To demonstrate the effectiveness of our method, we design five experimental settings:
Exp. S1: We attack the base model with the two attack methods (I-FGSM [15] or MI-FGSM [6]) as baselines.
Exp. S2: We apply erosion to the base model to obtain one ghost network, then attack that ghost network to generate the adversarial examples.
Exp. S3: We independently apply erosion multiple times to get a pool of ghost networks, then use the proposed longitudinal ensemble to efficiently fuse them during the attack.
Exp. S4: Similar to Exp. S3; the only difference is that we use the standard ensemble method of [18] to fuse the ghost networks.
Exp. S5: A larger set of ghost networks is generated and fused in a hybrid manner: at each attack iteration we take a standard ensemble of several ghost networks, combined with a longitudinal ensemble across the iterations.
We attack the six normally-trained networks and test on all nine models (the three adversarially-trained networks included). The attack rates are shown in Table 1. Due to space limitations, we report the average performance for black-box attacks rather than the individual performance on each testing model (the individual cases are shown in Fig. 6 and the supplementary material).
As can be seen from Table 1, a single ghost network is worse than the base network (Exp. S2 vs. Exp. S1), since the descriptive power of a ghost network is inferior to that of the base network. However, by leveraging the longitudinal ensemble, our method achieves a much higher attack rate in most settings, especially for the black-box attack. For example, when attacking Res50 in the black-box setting, Exp. S3 outperforms Exp. S1 by 7.1% with I-FGSM and by 10.0% with MI-FGSM. This observation firmly demonstrates the effectiveness of ghost networks in learning transferable adversarial examples. It should be mentioned that the computational cost remains almost the same as in Exp. S1, for two reasons. First, the ghost networks used in Exp. S3 are not trained but eroded from the base model and used on the fly. Second, the multiple ghost networks are fused via the longitudinal ensemble instead of the standard ensemble method of [18].
In fact, the proposed ghost networks can also be fused via the standard ensemble method, as shown in Exp. S4. In this case, we obtain a higher attack rate at the sacrifice of computational efficiency. For instance, Exp. S4 reports an attack rate of 28.8% when attacking Res50 with I-FGSM in the black-box setting, an improvement of 5.4% over Exp. S3.
This observation, from another point of view, inspires us to combine the standard ensemble and the longitudinal ensemble, as in Exp. S5. As we can see, Exp. S5 consistently beats all the compared methods in all the black-box settings. Of course, Exp. S5 is as computationally expensive as Exp. S4; however, the additional overhead stems from the standard ensemble, not from the longitudinal ensemble proposed in this work.
Note that in all the experiments presented in Table 1, we use only one individual base model. Even in the case of Exp. S3, all the to-be-fused models are ghost networks. Since the generated ghost networks are never stored or trained, no extra space complexity is incurred. Therefore, one can clearly observe the benefit of ghost networks; especially when comparing Exp. S5 with Exp. S1, ghost networks achieve a dramatic improvement in the black-box attack.
Based on the experimental results above, we arrive at a conclusion similar to [18]: the number of intrinsic models is essential to improving the transferability of adversarial examples. A different conclusion, however, is that it is less necessary to independently train different models; instead, ghost networks are a computationally cheap alternative that still enables good performance. As the number of intrinsic models increases, the attack rate increases as well. We further exploit this in the multi-model attack.
In Fig. 6, we select two base models, i.e., Res50 and Incv3, to attack, and present their individual performances when testing on all the base models. One can easily observe the positive effect of ghost networks on the transferability of adversarial examples.
4.4 Multimodel Attack
In this subsection, we evaluate ghost networks in the multi-model setting, where attackers have access to multiple independently trained networks.
4.4.1 Same Architecture and Different Parameters
We first evaluate a simple multi-model setting, where the base models share the same network architecture but have different weights. The same three Res50 models as in Sec. 4.2 are used, i.e., Res50A, Res50B and Res50C, with ghost networks generated upon them as before.
Exp. M1: A standard ensemble of the base model Res50A repeated three times. This is simply equivalent to the single-model attack and serves as a weak baseline.
Exp. M2: A standard ensemble of the base models Res50A, Res50B and Res50C, which serves as a strong baseline.
Exp. M3: A standard ensemble of three ghost networks generated upon Res50A, which simply replaces the base model in Exp. M1 with three ghost networks associated with it.
Exp. M4: A standard ensemble of three ghost networks, one generated upon each of Res50A, Res50B and Res50C, which replaces the base networks used in Exp. M2 with ghost networks.
Exp. M5: 30 ghost networks are generated upon the base model Res50A. They are fused in a hybrid manner: at each attack iteration we take a standard ensemble of several ghost networks, combined with a longitudinal ensemble across the iterations.
Exp. M6: 30 ghost networks are generated in total, evenly upon the three base models. At the $t$-th attack iteration, we take a standard ensemble across the three branches and a longitudinal ensemble within each branch.
The adversarial examples generated by each method are used to test all the models; we report the average attack rates in Table 2. It is easy to understand that Exp. M2 performs better than Exp. M1, Exp. M3 and Exp. M4, as it ensembles three independently trained models. However, comparing Exp. M5 with Exp. M2, we observe a significant improvement in attack rate: with MI-FGSM as the attack method, Exp. M5 beats Exp. M2 by 5.70%. Although Exp. M5 has only one base model while Exp. M2 has three, Exp. M5 actually fuses 30 intrinsic models. This result further supports our previous claim that the number of intrinsic models is essential, but that it is less necessary to obtain them by independently training from scratch. Similarly, Exp. M6 yields the best performance, as it has three independently trained models and 30 intrinsic models.
Table 2: Average attack rates (%) and model counts for the multi-model attack with the same architecture.
Methods  Attack Rate  Model Number  

IFGSM  MIFGSM  #Base  #Intrinsic  
Exp. M1  25.51  37.22  1  1 
Exp. M2  33.63  46.83  3  3 
Exp. M3  28.88  37.23  1  3 
Exp. M4  26.28  40.79  3  3 
Exp. M5  38.29  52.53  1  30 
Exp. M6  41.14  54.29  3  30 
Table 3: Attack rates (%) in the multi-model setting with different architectures. Each column denotes the network held out of the ensemble.
Settings  Methods  Res50  Res101  Res152  IncResv2  Incv3  Incv4 

Ensemble  IFGSM  98.08  98.06  98.46  99.22  98.78  99.02 
IFGSM + ours  92.86  93.04  92.62  96.02  95.46  96.82  
MIFGSM  97.62  99.46  97.86  98.98  98.32  98.84  
MIFGSM + ours  93.98  93.88  93.66  96.96  95.92  97.08  
Holdout  IFGSM  71.08  71.16  67.92  46.60  59.98  50.86 
IFGSM + ours  80.22  79.80  77.02  60.20  73.18  67.84  
MIFGSM  79.32  79.14  77.26  64.24  72.22  66.64  
MIFGSM + ours  87.14  86.14  84.64  74.18  82.06  79.18  
Incv3_{ens3}  IFGSM  13.34  13.40  13.46  13.36  15.42  14.06 
IFGSM + ours  21.38  22.00  21.78  20.98  24.06  21.36  
MIFGSM  26.32  25.74  26.56  25.48  29.72  27.36  
MIFGSM + ours  34.10  34.50  35.00  33.78  39.78  36.64  
Incv3_{ens4}  IFGSM  7.10  6.96  6.92  6.54  8.22  7.30 
IFGSM + ours  11.30  11.74  11.56  10.10  12.98  10.98  
MIFGSM  13.96  13.52  13.68  12.72  16.50  14.80  
MIFGSM + ours  17.82  17.68  17.78  16.06  22.16  18.82  
IncResv2_{ens}  IFGSM  11.36  10.92  11.34  10.94  12.40  11.52 
IFGSM + ours  18.42  18.26  18.66  17.94  20.08  17.40  
MIFGSM  22.40  22.06  22.58  22.40  25.12  23.02  
MIFGSM + ours  29.32  28.98  29.58  29.00  32.60  30.48 
Table 4: Attack rates (%) on the NIPS 2017 adversarial challenge.
Methods  Black-box Attack  White-box Attack  

TsAIL  iyswim  Anil Thomas  Average  Incv3_adv  IncResv2_ens  Incv3  Average  
No.1 Submission  13.60  43.20  43.90  33.57  94.40  93.00  97.30  94.90 
No.1 Submission+ours  14.80  52.28  51.68  39.59  97.62  96.00  95.48  96.37 
4.4.2 Different Architectures
Besides the baseline comparison above, we evaluate ghost networks in the multi-model setting following [18]. In this experiment, we attack an ensemble of five out of the six normally-trained models, then test on the ensembled networks (white-box setting) and the held-out network (black-box setting). We also test on the adversarially-trained networks to evaluate the transferability of the generated adversarial examples in the black-box attack.
The results are summarized in Table 3. Our method achieves attack rates on the ensembled networks (white-box setting) comparable to those of I-FGSM and MI-FGSM. However, the performance in black-box attacks is significantly improved. For example, when holding out Res50, our method improves the performance of I-FGSM from 71.08% to 80.22%, and that of MI-FGSM from 79.32% to 87.14%. When testing on the three adversarially-trained networks, the improvement is even more notable. These results further testify to the ability of ghost networks to learn transferable adversarial examples.
4.5 NIPS 2017 Adversarial Challenge
Finally, we evaluate our method on the benchmark of the NIPS 2017 Adversarial Challenge [17]. For performance evaluation, we use the top-3 defense submissions (black-box models), i.e., TsAIL^{3}^{3}3https://github.com/lfz/GuidedDenoise, iyswim^{4}^{4}4https://github.com/cihangxie/NIPS2017_adv_challenge_defense and Anil Thomas^{5}^{5}5https://github.com/anlthms/nips2017/tree/master/mmd, and three official baselines (white-box models), i.e., Incv3_{adv}, IncResv2_{ens} and Incv3. The test dataset contains images with the same 1000 class labels as ImageNet [5].
Following the experimental setting of the No.1 attack submission [6], we attack an ensemble of Incv3, IncResv2, Incv4, Res152, Incv3_{ens3}, Incv3_{ens4}, IncResv2_{ens} and Incv3_{adv} [16]. The ensemble weights are set equally for the first seven networks, with a smaller weight for Incv3_{adv}. The total iteration number, the maximum perturbation (randomly selected per image, as in the competition), and the step size follow [6].
The results are summarized in Table 4. Consistent with the previous experiments, we observe that by applying ghost networks, the performance of the No. 1 submission is significantly improved, especially for the black-box attack. For example, the average black-box performance rises from 33.57% to 39.59%, an improvement of 6.02%. The most remarkable improvement is achieved when testing on iyswim, where ghost networks lead to a gain of 9.08%. This suggests that our proposed method generalizes well to other defense mechanisms.
5 Conclusion
This work focuses on learning transferable adversarial examples for adversarial attack. We propose, for the first time, to exploit network erosion to generate a kind of virtual models called ghost networks. Ghost networks, together with the coupled longitudinal ensemble strategy, require almost no additional time and space consumption, and can therefore serve as a highly efficient tool to improve existing methods for learning transferable adversarial examples. Extensive experiments (more in the supplementary material) firmly demonstrate the efficacy of ghost networks. Note that the ghost networks in our work are generated by perturbing the dropout layer and the skip connection; it would be interesting to see the effect of perturbing other typical layers in neural networks, e.g., batch normalization [13] or ReLU [20]. We leave these issues as future work.

References

[1] S. Baluja and I. Fischer. Learning to attack: Adversarial transformation networks. In AAAI, 2018.
[2] B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Šrndić, P. Laskov, G. Giacinto, and F. Roli. Evasion attacks against machine learning at test time. In ECML-PKDD, 2013.
[3] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In IEEE S&P, 2017.
[4] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. TPAMI, 40(4):834–848, 2018.
[5] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
[6] Y. Dong, F. Liao, T. Pang, H. Su, X. Hu, J. Li, and J. Zhu. Boosting adversarial attacks with momentum. In CVPR, 2018.
[7] R. Girshick. Fast R-CNN. In ICCV, 2015.
[8] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In NIPS, 2014.
[9] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. In ICLR, 2015.
[10] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[11] K. He, X. Zhang, S. Ren, and J. Sun. Identity mappings in deep residual networks. In ECCV, 2016.
[12] G. Huang, Y. Li, G. Pleiss, Z. Liu, J. E. Hopcroft, and K. Q. Weinberger. Snapshot ensembles: Train 1, get M for free. In ICLR, 2017.
[13] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.
[15] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial examples in the physical world. In ICLR Workshop, 2017.
[16] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial machine learning at scale. In ICLR, 2017.
[17] A. Kurakin, I. Goodfellow, S. Bengio, Y. Dong, F. Liao, M. Liang, T. Pang, J. Zhu, X. Hu, C. Xie, et al. Adversarial attacks and defences competition. arXiv preprint arXiv:1804.00097, 2018.
[18] Y. Liu, X. Chen, C. Liu, and D. Song. Delving into transferable adversarial examples and black-box attacks. In ICLR, 2017.
[19] J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In CVPR, 2015.
[20] V. Nair and G. E. Hinton. Rectified linear units improve restricted Boltzmann machines. In ICML, 2010.
[21] N. Papernot, F. Faghri, N. Carlini, I. Goodfellow, R. Feinman, A. Kurakin, C. Xie, Y. Sharma, T. Brown, A. Roy, A. Matyasko, V. Behzadan, K. Hambardzumyan, Z. Zhang, Y.-L. Juang, Z. Li, R. Sheatsley, A. Garg, J. Uesato, W. Gierke, Y. Dong, D. Berthelot, P. Hendricks, J. Rauber, R. Long, and P. McDaniel. cleverhans v2.1.0: an adversarial machine learning library. arXiv preprint arXiv:1610.00768, 2018.
[22] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami. The limitations of deep learning in adversarial settings. In EuroS&P, 2016.
[23] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In IEEE S&P, 2016.
[24] O. Poursaeed, I. Katsman, B. Gao, and S. Belongie. Generative adversarial perturbations. In CVPR, 2017.
[25] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, 2015.
[26] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
[27] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. JMLR, 15(1):1929–1958, 2014.

[28] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In AAAI, 2017.
[29] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
[30] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the inception architecture for computer vision. In CVPR, 2016.
[31] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In ICLR, 2014.
[32] F. Tramèr, A. Kurakin, N. Papernot, D. Boneh, and P. McDaniel. Ensemble adversarial training: Attacks and defenses. In ICLR, 2018.
[33] A. Veit and S. Belongie. Convolutional networks with adaptive inference graphs. In ECCV, 2018.
[34] A. Veit, M. J. Wilber, and S. Belongie. Residual networks behave like ensembles of relatively shallow networks. In NIPS, 2016.
[35] C. Xiao, B. Li, J.-Y. Zhu, W. He, M. Liu, and D. Song. Generating adversarial examples with adversarial networks. In IJCAI, 2018.
[36] C. Xie, Z. Zhang, J. Wang, Y. Zhou, Z. Ren, and A. Yuille. Improving transferability of adversarial examples with input diversity. arXiv preprint arXiv:1803.06978, 2018.