ABBA: Saliency-Regularized Motion-Based Adversarial Blur Attack

02/10/2020 ∙ by Qing Guo, et al.

Deep neural networks are vulnerable to noise-based adversarial examples, which can mislead the networks by adding random-like noise. However, such examples are hardly found in the real world and are easily perceived when heavy noise is used to maintain high transferability across different models. In this paper, we propose a new attack method termed the motion-based adversarial blur attack (ABBA), which can generate visually natural motion-blurred adversarial examples even with relatively large perturbations, allowing much better transferability than noise-based methods. To this end, we first formulate the kernel-prediction-based attack, where an input image is convolved with kernels in a pixel-wise way and the misclassification capability is achieved by tuning the kernel weights. To generate visually more natural and plausible examples, we further propose the saliency-regularized adversarial kernel prediction, where the salient region serves as a moving object and the predicted kernel is regularized to achieve natural visual effects. Besides, the attack can be further enhanced by adaptively tuning the translations of the object and background. Extensive experimental results on the NeurIPS'17 adversarial competition dataset validate the effectiveness of ABBA under various kernel sizes, translations, and regions. Furthermore, we study the effects of state-of-the-art GAN-based deblurring mechanisms on our method.


1 Introduction

Figure 1: Two adversarial examples of MIFGSM [Dong et al.2018], Gaussian blur [Rauber, Brendel, and Bethge2017], and our ABBA. MIFGSM produces apparent noise in all cases. The Gaussian blur-based method loses most of the image details while carrying out the attack. Our ABBA generates visually natural motion blur with a high attack success rate.

To date, deep learning empowered applications have permeated all aspects of our daily life, from autonomous driving / ADAS, which utilizes computer vision capabilities such as object recognition / segmentation, scene understanding, and 3D point cloud processing, to our everyday interactions with smart devices using speech recognition, natural language understanding, and face / gesture recognition. Together with increasingly accessible labeled training data, faster parallel computing devices, and more sophisticated deep learning model designs by virtue of AutoML and neural architecture search (NAS), deep learning based prediction models are improving at an unprecedented pace.

However, perhaps one of the newly identified ‘Achilles’ heels’ of deep learning is the theoretically obtainable adversarial example. Take the object recognition task as an example: when having access to the network parameters and architecture, the perpetrator can perform a white-box adversarial attack by adding a small noise perturbation to the input image. Such an additive noise perturbation can be optimally obtained by satisfying the adversarial objective, i.e., leading towards an erroneous network output decision. The most commonly used white-box adversarial attack methods include the basic iterative method (BIM) [Kurakin, Goodfellow, and Bengio2017], the C&W method [Carlini and Wagner2017], the fast gradient sign method (FGSM) [Goodfellow, Shlens, and Szegedy2014], and the momentum iterative fast gradient sign method (MI-FGSM) [Dong et al.2018].

If the attacker can drastically alter the content of the image, it is quite obvious that the network will easily output erroneous results. For a successful adversarial attack, the perturbation noise has to be small in magnitude as measured by the $L_p$-norm of the noise, and ideally imperceptible to humans. Therefore, the catch is to maximize the attack success rate while maintaining the stealthy nature of the attack noise. For this reason, current state-of-the-art adversarial noise can only be obtained theoretically and rarely occurs in natural environments, so it does not yet pose an imminent threat to camera systems. There have been some attempts [Kurakin, Goodfellow, and Bengio2017] to physically fashion adversarial examples, for example by putting stickers or printed patterns on a stop sign. Again, such artifacts are unlikely to be found ‘naturally’ in the real-world environment or during the natural image capturing and processing steps.

In this work, we propose a new type of adversarial attack, termed the motion-based adversarial blur attack (ABBA), which can generate visually natural and plausible motion-blurred adversarial examples.

More specifically, we first formulate the kernel-prediction-based attack, where an input image is convolved with kernels in a pixel-wise way and the misclassification ability is achieved by tuning the kernel weights. To generate more natural and imperceptible examples, we further propose the saliency-regularized adversarial kernel prediction, where the salient region serves as a moving object and the predicted kernel is regularized to achieve natural visual effects. Besides, the attack can be further enhanced by adaptively tuning the translations of the object and background. Such saliency-regularized motion blur is applied locally and consistently; although it has a larger perturbation magnitude within the boundary of the object, the blur effect keeps the overall image visually realistic and indistinguishable from an actual moving object captured with a slightly longer exposure, a scenario that can actually happen in the real world.

We showcase the effectiveness of the proposed ABBA through extensive experiments, benchmarking it against various noise-based attacks on both attack success rate and transferability. The contributions of this work can be briefly summarized as follows. (1) To the best of our knowledge, this is the first attempt to investigate kernel-based adversarial attacks. (2) We propose a motion-based adversarial blur attack as a new attack mode to be added to the adversarial attack family. (3) To produce a more visually plausible blur attack, we introduce a saliency regularizer that forces consistent blur patterns within the boundary of the object (or background in some cases). (4) Under established optimal settings of the various baseline methods and ours, the proposed method achieves a better attack success rate and transferability. (5) Furthermore, the proposed method shows higher robustness against GAN-based deblurring mechanisms compared with standard image motion blur.

Related Work: Since the inception of using adversarial examples to attack deep neural networks both theoretically [Goodfellow, Shlens, and Szegedy2014] and physically [Kurakin, Goodfellow, and Bengio2017], there has been a huge amount of research on developing adversarial attack and defense mechanisms. The basic iterative method (BIM) [Kurakin, Goodfellow, and Bengio2017], the C&W method [Carlini and Wagner2017], the fast gradient sign method (FGSM) [Goodfellow, Shlens, and Szegedy2014], and the momentum iterative fast gradient sign method (MI-FGSM) [Dong et al.2018] are a few popular ones among early adopters in the research community. Building upon these ideas, researchers have continuously pushed the envelope in many ways. For example, serious attempts have been made to integrate a momentum term into the iterative attack process [Dong et al.2018]; the momentum helps stabilize the update directions, which begets more transferable adversarial examples and poses more of a threat to adversarially trained defense mechanisms. More recently, [Dong et al.2019] proposed to optimize the noise perturbation over an ensemble of translated images, making the generated adversarial examples more robust against the whitebox models being attacked while achieving better transferability.

Mainstream adversarial attacks use an additive noise pattern that is learnable given the model parameters under a whitebox setting. Perhaps this prevalence is partially due to the fact that adversarial noise with the ‘addition’ operation is relatively straightforward to optimize. Of course, there are many other ways to alter a clean image beyond the addition operation, all of which are potential candidates for new types of adversarial attack modes. One caveat of additive noise attacks is the lack of balance between being visually plausible and imperceptible and having a high attack success rate; usually one has to be sacrificed for the other. Researchers are therefore looking beyond additive noise attacks for novel attack modes that strike a better balance between visual plausibility and performance, for example multiplicative attacks [Ye et al.2019], deformation attacks [Xiao et al.2018, Alaifari, Alberti, and Gauksson2019, Wang et al.2019], and semantic manipulation attacks [Bhattad et al.2020].

We propose a new type of motion-based adversarial blur attack that can generate visually natural and plausible motion-blurred adversarial examples. We draw inspiration from kernel prediction works such as [Mildenhall et al.2018, Niklaus, Mai, and Liu2017a, Niklaus, Mai, and Liu2017b] as well as motion blur generation [Brooks and Barron2019]; technical details are presented in the next section. One desired property of the proposed method is immunity or robustness against SOTA deblurring techniques such as [Kupyn et al.2018, Kupyn et al.2019]. In an effort to better understand blackbox networks, an image saliency paradigm was proposed [Fong and Vedaldi2017] to learn where an algorithm looks by discovering which parts of an image most affect the output score when perturbed via Gaussian blur, replacement with a constant value, or noise injection. In that work, the blur region is localized through adaptive iteration, whereas ours is saliency-regularized, which leads to visually plausible motion-blurred images. Perhaps the major difference is that their Gaussian blur kernel is fixed, while ours is learnable to maximally jeopardize the image recognition network.

2 Methodology

2.1 Additive-Perturbation-Based Attack

Let $X$ be a real example, e.g., an image from the ImageNet dataset, and let $y$ denote its ground-truth label. A classifier $f(\cdot)$ predicts the label of $X$. An attack method aims to generate an adversarial example $X^{\mathrm{adv}}$ that fools the classifier into predicting an incorrect label while keeping the perturbation imperceptible. Existing attack methods mainly focus on an additive adversarial perturbation $E$ that is added to the real example:

$X^{\mathrm{adv}} = X + E, \quad (1)$

where $E$ is generated by maximizing a loss function $J(\cdot,\cdot)$ under a norm constraint:

$\arg\max_{E} J(X + E, y), \ \ \text{s.t.}\ \|E\|_p \le \epsilon, \quad (2)$

where $\|\cdot\|_p$ is the $L_p$ norm and the $L_\infty$ norm is the most widely used. As an optimization problem, it is commonly solved with gradient-based methods, which have inspired several successful attacks, e.g., FGSM, BIM, MIFGSM, DIM, and TIMIFGSM.
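To make the additive formulation concrete, here is a minimal sketch (not the authors' code) of a one-step attack following Eq. (1)-(2) with an FGSM-style sign update; `model`, `image`, `label`, and the bound `eps` are assumed placeholders.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, image, label, eps=0.03):
        """One-step additive attack: ascend the classification loss J
        under an L_inf bound, as in Eq. (2). `image` is a (1,3,H,W)
        tensor in [0,1]; `label` is a (1,) tensor with the ground truth."""
        image = image.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(image), label)   # J(X + E, y)
        loss.backward()
        # E = eps * sign(dJ/dX) satisfies ||E||_inf <= eps
        adv = image + eps * image.grad.sign()
        return adv.clamp(0, 1).detach()               # keep a valid pixel range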

2.2 Kernel-Prediction-Based Attack

Figure 2: From left to right: original image, adversarial examples generated by our kernel-prediction-based attack and motion-based adversarial blur attack (ABBA).

Besides the addition in Eq. (1), there are various techniques that can process images for different objectives, e.g., the Gaussian filter for image denoising, the Laplacian filter for image sharpening, and the guided filter for edge-preserving smoothing [He, Sun, and Tang2013]. These are all kernel-based techniques that process each pixel of the image with a hand-crafted or guided kernel. In general, compared with addition, kernel-based operations can realize more image processing tasks via different kinds of kernels.

More recently, several works [Bako et al.2017, Niklaus, Mai, and Liu2017a, Niklaus, Mai, and Liu2017b] found that kernel weights can be carefully predicted to achieve high performance on more advanced tasks, e.g., high-quality noise-free rendering and video frame interpolation. Inspired by these works, we propose the kernel-prediction-based attack. Specifically, we process each pixel $p$ with a kernel $K_p$:

$X^{\mathrm{adv}}(p) = \sum_{q \in \mathcal{N}(p)} K_p(q)\, X(q), \quad (3)$

where $X(p)$ and $X^{\mathrm{adv}}(p)$ denote the $p$-th pixel in $X$ and $X^{\mathrm{adv}}$, respectively, and $\mathcal{N}(p)$ is the set of pixels in the neighborhood of $p$ in $X$. The kernel $K_p$ has the same size as $\mathcal{N}(p)$ and determines the weights of the pixels in $\mathcal{N}(p)$. In general, we require $\sum_{q \in \mathcal{N}(p)} K_p(q) = 1$ to ensure the generated image lies within the respective neighborhood of the input image, and a softmax activation function is usually adopted to satisfy this requirement [Bako et al.2017].

To better understand Eq. (3), we discuss two simple cases: 1) when we let $\mathcal{N}(p)$ be a square neighborhood of the pixel $p$ and the kernel of each pixel is a fixed Gaussian kernel, $X^{\mathrm{adv}}$ is the Gaussian-blurred $X$; similarly, we can obtain a defocus-blurred image with a disk kernel. 2) when we shrink $\mathcal{N}(p)$ towards the single pixel $p$ with $K_p(p) = 1$, $X^{\mathrm{adv}}$ approaches $X$; in general, the perturbation of $X^{\mathrm{adv}}$ becomes more imperceptible as the size of $\mathcal{N}(p)$ decreases.
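A minimal sketch of how Eq. (3) can be implemented with per-pixel kernels in PyTorch, assuming a square N x N neighborhood (N odd) and kernels parameterized as logits normalized by a softmax as described above; the function name and tensor layout are our own.

    import torch
    import torch.nn.functional as F

    def apply_pixelwise_kernels(x, kernel_logits):
        """Apply a per-pixel N x N kernel to image x, as in Eq. (3).
        x: (B, C, H, W); kernel_logits: (B, N*N, H, W), one kernel per pixel.
        The softmax over the N*N dimension makes every kernel sum to 1."""
        b, c, h, w = x.shape
        n2 = kernel_logits.shape[1]
        n = int(n2 ** 0.5)                                        # neighborhood width (N odd)
        kernels = torch.softmax(kernel_logits, dim=1)             # (B, N*N, H, W)
        patches = F.unfold(x, kernel_size=n, padding=n // 2)      # (B, C*N*N, H*W)
        patches = patches.view(b, c, n2, h, w)
        # Weighted sum over the neighborhood of each pixel.
        return (patches * kernels.unsqueeze(1)).sum(dim=2)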

To achieve a high attack success rate, we need to optimize all kernels, i.e., $\{K_p\}$, according to the loss function of image classification under the above constraint:

$\arg\max_{\{K_p\}} J(X^{\mathrm{adv}}, y), \ \ \text{s.t.}\ \forall p:\ \sum_{q \in \mathcal{N}(p)} K_p(q) = 1,\ K_p(q) \ge 0. \quad (4)$

We can calculate the gradient of the loss function with respect to all kernels and realize a gradient-based whitebox attack. As a result, the attack can be integrated into any gradient-based additive-perturbation attack method, e.g., FGSM, BIM, MIFGSM, etc.
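Building on the `apply_pixelwise_kernels` sketch above, a single whitebox step for Eq. (4) could look as follows; the sign update and the step size are illustrative, not the paper's exact procedure.

    import torch
    import torch.nn.functional as F

    def kernel_attack_step(model, x, y, kernel_logits, step=0.1):
        """One gradient-ascent step of the kernel-prediction-based attack (Eq. (4)).
        The softmax inside apply_pixelwise_kernels keeps every kernel summing to one,
        so we can update the unconstrained logits directly."""
        kernel_logits = kernel_logits.clone().detach().requires_grad_(True)
        x_adv = apply_pixelwise_kernels(x, kernel_logits)   # Eq. (3)
        loss = F.cross_entropy(model(x_adv), y)             # J(X^adv, y)
        loss.backward()
        return (kernel_logits + step * kernel_logits.grad.sign()).detach()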

The kernel-prediction-based attack can obtain a significantly high success rate on blackbox attacks; however, it leads to easily perceptible, unnatural images, since the kernel of each pixel is tuned independently and generates noise-like perturbations. To strike a balance between a high attack success rate and a natural visual effect, we propose to regularize the kernels to produce visually natural motion blur via the guidance of a visual saliency map.

2.3 Motion-Based Adversarial Blur Attack

Motion blur is a frequently occurring effect during image capture. We will first introduce how to generate visually natural motion-blurred adversarial examples by regularizing the kernel-prediction-based attack. Then we further show the workflow of our attack in Fig. 3 for better understanding.

2.3.1 Saliency-Regularized Adversarial Kernel Prediction

Figure 3: Pipeline of our motion-based adversarial blur attack, i.e., Eq. (5). First, we use PoolNet [Liu et al.2019] to extract the salient object in the image and obtain the object and background regions. Then, the translation parameters $\theta_o$ and $\theta_b$ are divided into $N$ parts to simulate the motion process and generate $N$ translated images with the spatial transformer network [Jaderberg et al.2015]. Finally, we obtain the adversarial example by summing these images, weighted by the adversarial kernels at each pixel. The adversarial kernels and translation parameters are tuned to realize an effective whitebox attack by optimizing Eq. (6). $\odot$ denotes the element-wise product.

Motion blur is generated during the exposure time by integrating the light from a moving object. To synthesize motion blur, we need to know where the object is and specify how it moves.

To this end, given an image $X$, we first use the state-of-the-art saliency detection method PoolNet [Liu et al.2019] to extract the salient object from $X$ and assume it is moving at the time the image is captured. The saliency map $S$ is a binary image indicating the region of the salient object, as shown in Fig. 3. We then specify translation transformations for the object and the background, denoted as $T_{\theta_o}(\cdot)$ and $T_{\theta_b}(\cdot)$, where $\theta_o$ and $\theta_b$ are the translation parameters. Since motion blur is the integration of all light during the object's motion, we divide the motion represented by $\theta_o$ and $\theta_b$ into $N$ sub-motions to simulate the blur generation. The sub-motions are represented by $\{\theta_o^i\}_{i=1}^{N}$ and $\{\theta_b^i\}_{i=1}^{N}$. Then, we redefine Eq. (3) as

$X^{\mathrm{adv}}(p) = \sum_{i=1}^{N} K_p(i)\, X_i(p), \ \ \text{with}\ \ X_i = S \odot T_{\theta_o^i}(X) + (1 - S) \odot T_{\theta_b^i}(X), \quad (5)$

where the neighborhood $\mathcal{N}(p)$ now contains the $p$-th pixel of all $N$ translated examples $\{X_i\}$ and $\odot$ denotes the element-wise product. Compared with the attack in Section 2.2, the perturbation amplitude is affected by both the kernels and the translation parameters. We define the objective function as

$\arg\max_{\{K_p\},\, \theta_o,\, \theta_b} J(X^{\mathrm{adv}}, y), \ \ \text{s.t.}\ \|\theta_o\|_\infty \le \epsilon,\ \|\theta_b\|_\infty \le \epsilon,\ \sum_{i=1}^{N} K_p(i) = 1,\ K_p = K_q\ \text{if}\ S(p) = S(q), \quad (6)$

where $\epsilon$ controls the maximum translations of the object / background and is upper bounded by the spatial size of the input image. Here, we use the spatial transformer network [Jaderberg et al.2015] to realize the translations according to $\theta_o$ and $\theta_b$, enabling the gradient to propagate to all kernels and translation parameters. There are two main differences in the constraint terms compared with Eq. (4): 1) the translation parameters are added to guide the generation of the adversarial example; 2) the kernels are set to be the same within the same region, which is needed to generate visually natural motion blur, since pixels in the object region usually share the same motion.
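The following sketch illustrates one way to synthesize the adversarial example of Eq. (5), with saliency-shared kernels and sub-translations realized through the affine-grid formulation of the spatial transformer network; the helper names, the pixel-unit translation convention, and the linear sub-motion schedule are our assumptions, not the authors' implementation.

    import torch
    import torch.nn.functional as F

    def translate(x, t):
        """Translate image batch x by t = (tx, ty) pixels using affine_grid /
        grid_sample (the spatial transformer formulation). t: (2,) tensor."""
        b, c, h, w = x.shape
        tx = -2.0 * t[0] / w                      # normalized offsets for the sampling grid
        ty = -2.0 * t[1] / h
        one, zero = torch.ones_like(tx), torch.zeros_like(tx)
        theta = torch.stack([torch.stack([one, zero, tx]),
                             torch.stack([zero, one, ty])])
        theta = theta.unsqueeze(0).expand(b, -1, -1)
        grid = F.affine_grid(theta, list(x.shape), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

    def blur_adv_example(x, sal, theta_obj, theta_bg, kernel_logits_obj, kernel_logits_bg, n=15):
        """Synthesize the motion-blurred adversarial example of Eq. (5).
        x: (1,3,H,W); sal: (1,1,H,W) binary saliency map (e.g., from PoolNet);
        theta_obj, theta_bg: (2,) translation parameters for object / background;
        kernel_logits_*: (n,) logits shared by all pixels of the region (Eq. (6))."""
        k_obj = torch.softmax(kernel_logits_obj, dim=0)   # region-shared weights, sum to 1
        k_bg = torch.softmax(kernel_logits_bg, dim=0)
        x_adv = torch.zeros_like(x)
        for i in range(n):
            a = i / max(n - 1, 1)                         # i-th sub-motion, a fraction of the full motion
            xi = sal * translate(x, a * theta_obj) + (1 - sal) * translate(x, a * theta_bg)
            x_adv = x_adv + k_obj[i] * sal * xi + k_bg[i] * (1 - sal) * xi
        return x_adv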

2.3.2 Attacking Algorithm

In this section, we summarize the workflow of the attacking algorithm shown in Fig. 3. Given an image, we first calculate its saliency map via PoolNet and obtain $S$. Then, we initialize the translation parameters $\theta_o$ and $\theta_b$ to zero and set all kernels to uniform weights, i.e., $K_p(i) = 1/N$, which makes $X^{\mathrm{adv}} = X$ at the start.

The attack method can be integrated into any gradient-based additive-perturbation attack method, e.g., FGSM, BIM, MIFGSM, etc. In this work, we not only show the effectiveness of the attack but also evaluate its transferability across models. Hence, we equip our method with MIFGSM, which is more effective in achieving high transferability across models. The gradient is calculated from the current image $X^{\mathrm{adv}}$. Then, we propagate the gradient through the spatial transformer network and obtain the gradients with respect to all kernels $\{K_p\}$ and the translation parameters $\theta_o$ and $\theta_b$. The kernels and translation parameters are updated iteratively to realize the attack. In the following experiments, we run the attack for 10 iterations with a fixed step size and fix the kernel size $N$ to 15. We will discuss the effect of these parameters in the next section.
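A hedged sketch of the overall attacking loop, integrating the blur synthesis above into an MIFGSM-style momentum update of the kernels and translations under the $\epsilon$ constraint of Eq. (6); the step sizes, `eps` (in pixels), and the momentum decay `mu` are illustrative values, not the paper's settings.

    import torch
    import torch.nn.functional as F

    def abba_attack(model, x, y, sal, eps=10.0, n=15, iters=10, step_t=1.0, step_k=0.1, mu=1.0):
        """MIFGSM-style loop for the blur attack (built on blur_adv_example above)."""
        theta_obj = torch.zeros(2, device=x.device, requires_grad=True)
        theta_bg = torch.zeros(2, device=x.device, requires_grad=True)
        k_obj = torch.zeros(n, device=x.device, requires_grad=True)   # uniform kernels after softmax
        k_bg = torch.zeros(n, device=x.device, requires_grad=True)
        params = (theta_obj, theta_bg, k_obj, k_bg)
        g = [torch.zeros_like(p) for p in params]                      # momentum buffers
        for _ in range(iters):
            x_adv = blur_adv_example(x, sal, theta_obj, theta_bg, k_obj, k_bg, n)
            loss = F.cross_entropy(model(x_adv), y)
            grads = torch.autograd.grad(loss, params)
            for j, (p, gr) in enumerate(zip(params, grads)):
                g[j] = mu * g[j] + gr / (gr.abs().sum() + 1e-12)       # momentum accumulation
                step = step_t if j < 2 else step_k
                with torch.no_grad():
                    p += step * g[j].sign()
            with torch.no_grad():                                       # ||theta||_inf <= eps, Eq. (6)
                theta_obj.clamp_(-eps, eps)
                theta_bg.clamp_(-eps, eps)
        return blur_adv_example(x, sal, theta_obj, theta_bg, k_obj, k_bg, n).detach()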

3 Experimental Results

In this section, we conduct extensive experiments to demonstrate the effectiveness of the proposed method, covering the following research questions. RQ1: How do the kernel sizes, translations, motion directions, and blur regions affect the attack success rate? RQ2: Is the transferability of the proposed method across models better than that of state-of-the-art attack methods, and why? RQ3: As a blur-based attack, can it be easily defended by state-of-the-art deblurring methods?

Figure 4: The success rate of ABBA w.r.t. the variation of both the kernel size $N$ and the translation bound $\epsilon$ in Eq. (6).

3.1 Experimental Settings

Dataset and Models: We use the NeurIPS’17 adversarial competition dataset [Kurakin et al.2018], which is compatible with ImageNet, to conduct all experiments. To validate the transferability of the proposed method, we consider four normally trained models, i.e., Inception v3 (Inc-v3) [Szegedy et al.2016], Inception v4 (Inc-v4), Inception ResNet v2 (IncRes-v2) [Szegedy et al.2017], and Xception [Chollet2017]. In addition, we also evaluate four defense models: the ensemble adversarially trained Inc-v3_ens3, Inc-v3_ens4, and IncRes-v2_ens from [Tramèr et al.2018], and the high-level representation guided denoiser (HGD) [Liao et al.2018], which is the rank-1 submission in the NeurIPS’17 defense competition. All adversarial examples are generated on Inc-v3, treating it as the whitebox model.

Baselines: We consider two kinds of baselines. The first kind is additive-perturbation-based attacks, e.g., FGSM [Goodfellow, Shlens, and Szegedy2014], MIFGSM [Dong et al.2018], DIM [Xie et al.2019a], TIMIFGSM [Dong et al.2019], and the interpretation-based noise [Fong and Vedaldi2017]. The results of DIM and TIMIFGSM are cited from [Dong et al.2019], since we use the same dataset as they do. The second kind contains three blur-based attacks: the interpretation-based blur [Fong and Vedaldi2017], Gaussian blur [Rauber, Brendel, and Bethge2017], and defocus blur.

For the hyperparameters of the first group of attacks, we set the maximum perturbation following the default setup of foolbox [Rauber, Brendel, and Bethge2017] for all experiments, with pixel values in [0, 1]. For iterative attack methods, we set the number of iterations to 10 and the step size to 0.03. The decay factor for MIFGSM is set to 1.0. For the blur-based baselines, we set the standard deviation of the Gaussian blur and the kernel size of the defocus blur to 15.0, the same as our method for a fair comparison.

3.2 RQ1: Effect of Kernel Size, Translations, Motion Direction and Blur Regions

Effect of blur kernel size and translations: we calculate the success rate of our method with different kernel sizes $N$ and translation bounds $\epsilon$ in Eq. (6), varying both over a range of values. As shown in Fig. 4, the success rates for the whitebox and blackbox attacks gradually increase as $N$ and $\epsilon$ grow. The highest success rates on Inc-v3, Inc-v4, IncRes-v2, and Xception are reached at the largest $N$ and $\epsilon$, while the success rate drops to zero at the smallest settings. We also visualize adversarial examples of an image that is successfully attacked at all $N$ and $\epsilon$. Clearly, as $N$ and $\epsilon$ increase, the visual quality of the adversarial examples gradually degrades and the perturbations become easier to perceive. Based on attacks on numerous images, we choose moderate values of $N$ and $\epsilon$ for the following experiments to balance the success rate and the visual effect.

Figure 5: Top: two examples of ABBA_pixel, ABBA_obj, ABBA_bg, ABBA_whole, and ABBA. Bottom: success rates of our method with respect to the object motion directions.

Effect of motion directions: we fix $N$ and $\epsilon$ and tune the motion directions of the object and background by setting different x-axis and y-axis translations. For each object motion direction, we calculate the mean and standard deviation of the success rates over different background moving directions. As shown in Fig. 5 (bottom), the success rate gradually increases as the object motion direction approaches 45° and decreases as it moves away from 45°, with a symmetrical trend over the full range of directions. Such results are caused by the bound $\epsilon$ used for constraining the translation, since the motion direction is directly related to the translation. For example, with a motion direction of 45° and the x-axis and y-axis translations each constrained by $\epsilon$, the allowed translation of the object is $\sqrt{2}\,\epsilon$, which is the maximum allowed translation among all motion directions. As a result, the success rate reaches its highest value around 45°.

Effect of blurred regions and importance of adaptive translations: we construct five variants of our method. 1) ABBA_pixel allows the kernel of each pixel to be tuned independently and is a kind of kernel-prediction-based attack as defined in Eq. (3) and (4). 2) ABBA_obj only adds motion blur to the object region by fixing the kernels of background pixels to identity kernels. 3) ABBA_bg only adds motion blur to the background region by fixing the kernels of object pixels to identity kernels. 4) ABBA_whole adds motion blur to the whole image while forcing the object and background to share the same kernels and translations. 5) ABBA is our final version, where the translations, i.e., $\theta_o$ and $\theta_b$, and the kernels are jointly tuned by optimizing Eq. (6).

As reported in Tab. 1 and the cases shown in Fig. 5 (top), ABBA_pixel achieves the highest whitebox and blackbox success rates among all variants, but it changes the original image noticeably and looks unnatural. ABBA_obj and ABBA_bg have the worst success rates on all models, although they tend to generate visually natural motion blur. ABBA_whole and ABBA strike a good balance between attack success rate and visual effect. In particular, ABBA, which jointly tunes the object and background translations, obtains much better transferability across normally trained and defense models. Note that, compared with the results using fixed motion directions in Fig. 5 (bottom), ABBA achieves a higher success rate than any fixed motion direction, which further demonstrates that adaptive translations help achieve an effective attack.

3.3 RQ2: Comparison with Baselines

Attack results: The comparison results over baselines are shown in Tab. 1. We discuss them in two aspects: 1) the comparison with additive-perturbation-based attacks. 2) the advantages over existing blur-based attacks.

For the first aspect, compared with early additive-perturbation-based attacks, e.g., FGSM, MIFGSM, and the interpretation-based noise, our method ABBA achieves the highest success rate on all blackbox models and defense models, which demonstrates its higher transferability. We also compare the methods when only the object or background region is attacked. Clearly, the success rates of all methods decrease significantly, and the decrease for our method is larger than for the others, which means our method relies on more of the image information to realize an effective attack. Compared with the two state-of-the-art methods DIM and TIMIFGSM, ABBA has lower transferability when attacking Inc-v4 and IncRes-v2, while it achieves a higher success rate than DIM on all defense models and than TIMIFGSM on Inc-v3_ens4 and IncRes-v2_ens. Note that, if we allow our method to tune the kernel of each pixel independently, i.e., ABBA_pixel, it outperforms DIM and TIMIFGSM on all defense models.

For the second aspect, ABBA achieves a higher success rate than GaussBlur, DefocusBlur, and the interpretation-based blur on all models. The higher success rate is achieved by optimizing the pixel-wise kernels with Eq. (6), while the kernels of the three baselines are shared by all pixels and cannot be tuned.

Method                       | Attack results for adv. examples from Inc-v3 | Defense results for adv. examples from Inc-v3
                             | Inc-v3  Inc-v4  IncRes-v2  Xception          | Inc-v3_ens3  Inc-v3_ens4  IncRes-v2_ens  HGD
FGSM                         | 90.1    20.7    13.8       26.6              | 18.3         16.1         11.5           8.2
FGSM (object only)           | 58.6    11.9    7.0        12.6              | 10.9         12.0         7.3            3.5
FGSM (background only)       | 63.7    11.4    6.9        13.5              | 12.2         12.9         7.6            2.7
MIFGSM                       | 95.8    23.9    20.6       28.2              | 17.7         16.2         9.6            2.9
MIFGSM (object only)         | 81.3    11.8    7.4        11.5              | 11.1         12.3         6.1            1.8
MIFGSM (background only)     | 95.5    13.6    10.3       16.7              | 13.9         13.3         8.8            1.9
GaussBlur                    | 34.7    22.7    18.4       26.1              | 23.6         23.8         19.3           16.9
GaussBlur (object only)      | 13.6    6.0     5.2        7.1               | 8.6          7.8          6.3            4.6
GaussBlur (background only)  | 18.8    10.8    9.2        12.0              | 13.0         13.1         10.9           8.7
DefocusBlur                  | 30.0    16.8    11.1       18.8              | 17.5         18.3         15.0           12.9
DefocusBlur (object only)    | 10.0    3.0     2.9        3.6               | 5.2          4.6          3.8            2.7
DefocusBlur (background only)| 16.9    9.2     7.0        10.5              | 10.1         10.3         9.2            7.8
ABBA_pixel                   | 89.2    65.5    65.8       71.2              | 69.8         72.5         68.0           63.1
ABBA_obj                     | 21.0    4.9     4.2        7.0               | 10.1         10.5         8.3            4.9
ABBA_bg                      | 30.9    11.6    10.1       12.9              | 1.2          0.8          1.2            0.5
ABBA_whole                   | 62.4    29.8    28.8       34.1              | 43.2         43.8         38.9           28.4
ABBA                         | 65.6    31.2    29.7       33.5              | 46.6         48.7         41.2           31.0
Interp (blur)                | 34.7    3.6     0.5        3.4               | 7.1          7.1          4.3            1.4
Interp (noise)               | 95.8    20.5    15.6       22.9              | 16.8         16.1         9.4            3.3
DIM                          | 98.3    73.8    67.8       -                 | 35.8         35.1         25.8           25.7
TIMIFGSM                     | 97.8    47.1    46.4       -                 | 46.9         47.1         37.4           38.3
Table 1: Adversarial comparison results on the NeurIPS’17 adversarial competition dataset in terms of success rate (%). There are two comparison groups. For the first, we compare the effect of attacking different regions of the input, i.e., the whole image, the object region only, or the background region only, for FGSM, MIFGSM, GaussBlur, DefocusBlur, and our ABBA. In addition to the above methods, the second group contains the interpretation-based blur and interpretation-based noise [Fong and Vedaldi2017], DIM [Xie et al.2019a], and TIMIFGSM [Dong et al.2019]. The results of DIM and TIMIFGSM are cited from [Dong et al.2019], where the Xception model is not included.

Interpretable explanation of high transferability:

Figure 6: The left subfigure shows the interpretable maps of six adversarial examples generated by FGSM, MIFGSM, and ABBA, respectively, with four models. The right subfigure shows the transferability & consistency distributions of adversarial examples generated by the three attacks.

We modify the method in [Fong and Vedaldi2017], which generates an interpretable map for a classification model given a perturbation. We then observe that the transferability of an adversarial example generated by an attack correlates with the consistency of its interpretable maps across different models. Specifically, given an adversarial example $X^{\mathrm{adv}}$ generated by an attack and the original image $X$, we calculate an interpretable map $M$ for $X^{\mathrm{adv}}$ by optimizing

$\arg\min_{M}\; f_y\big((1 - M) \odot X + M \odot X^{\mathrm{adv}}\big) + \lambda_1 \|M\|_1 + \lambda_2\,\mathrm{TV}(M), \quad (7)$

where $f_y(\cdot)$ denotes the score at the label $y$, i.e., the ground-truth label of $X$, $\mathrm{TV}(\cdot)$ is the total-variation norm, and $\odot$ is the element-wise product. Intuitively, optimizing Eq. (7) finds the region that causes the misclassification. We optimize Eq. (7) via gradient descent for 150 iterations with fixed weights $\lambda_1$ and $\lambda_2$.
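A sketch of how the interpretable map could be optimized under the reconstruction of Eq. (7) above; the sigmoid mask parameterization, the optimizer, and the weights `lam1`, `lam2`, `lr` are our assumptions, not the paper's settings.

    import torch

    def interpretable_map(model, x, x_adv, y, lam1=0.01, lam2=0.2, iters=150, lr=0.1):
        """Optimize a soft mask M: find the region where applying the adversarial
        pixels lowers the ground-truth score, with an L1 term keeping the region
        small and a TV term keeping it smooth."""
        m = torch.zeros(1, 1, *x.shape[-2:], device=x.device, requires_grad=True)
        opt = torch.optim.SGD([m], lr=lr)
        for _ in range(iters):
            mask = torch.sigmoid(m)                                  # keep M in [0, 1]
            blended = (1 - mask) * x + mask * x_adv
            score_y = torch.softmax(model(blended), dim=1)[0, y]     # ground-truth score f_y
            tv = (mask[..., 1:, :] - mask[..., :-1, :]).abs().mean() + \
                 (mask[..., :, 1:] - mask[..., :, :-1]).abs().mean()
            loss = score_y + lam1 * mask.abs().mean() + lam2 * tv
            opt.zero_grad()
            loss.backward()
            opt.step()
        return torch.sigmoid(m).detach()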

We calculate four interpretable maps for each adversarial example based on the four models, i.e., Inc-v3, Inc-v4, IncRes-v2, and Xception, as shown in Fig. 6(L). We observe that the interpretable maps of our method have similar distributions across the four models, while the maps of FGSM and MIFGSM do not exhibit this phenomenon. To further validate this observation, we calculate the standard deviation across the four maps at each pixel and obtain a single value by mean pooling. We normalize this value and regard it as the consistency measure for the four maps. As shown in Fig. 6(R), the consistency of the adversarial examples of our method is generally higher than that of FGSM and MIFGSM.
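A small sketch of the consistency measure described above: the pixel-wise standard deviation across the interpretable maps of the four models, mean-pooled to a scalar; the final normalization that maps it to "higher = more consistent" is our assumption.

    import torch

    def consistency(maps):
        """maps: list of K interpretable maps (same shape, values roughly in [0,1])."""
        stack = torch.stack(maps, dim=0)              # (K, ...) maps from K models
        pixel_std = stack.std(dim=0)                  # std across the K maps at each pixel
        score = pixel_std.mean()                      # mean pooling to a scalar
        # Assumed normalization: divide by the mean map value so that larger = more consistent.
        return 1.0 - score / (stack.mean() + 1e-12)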

We further study the transferability of an adversarial example across models. Given an adversarial example $X^{\mathrm{adv}}$ generated from Inc-v3 and a model $f(\cdot)$, we calculate a score to measure the transferability under this model: $s = f_{y^*}(X^{\mathrm{adv}}) - f_y(X^{\mathrm{adv}})$, where $y^*$ is the label having the maximum score among the non-ground-truth labels. If $s > 0$, the adversarial example fools the model successfully, and vice versa. As shown in Fig. 6(R), the transferability of the adversarial examples of our method is generally higher than that of FGSM and MIFGSM.
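The transferability score can be computed directly from the model outputs, e.g., as in this sketch (the function name and the use of raw logits are our choice):

    import torch

    def transfer_score(model, x_adv, y):
        """s = f_{y*}(X^adv) - f_y(X^adv), with y* the highest-scoring
        non-ground-truth label; s > 0 means the model is fooled."""
        logits = model(x_adv)[0]
        non_gt = logits.clone()
        non_gt[y] = float('-inf')                 # exclude the ground-truth label
        return (non_gt.max() - logits[y]).item()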

Based on the observations in Fig. 6(R), we conjecture that higher consistency across models leads to higher transferability. This is reasonable, since high consistency of the interpretable maps means the adversarial example has the same effect on different models, so the perturbation optimized on the whitebox model can easily transfer to other models.

3.4 RQ3: Effect of Deblurring Methods

Figure 7: The relative decrease of the attack success rate before and after deblurring. Two state-of-the-art deblurring methods, i.e., DeblurGAN [Kupyn et al.2018] and DeblurGANv2 [Kupyn et al.2019], are used to deblur our adversarially blurred examples and normal motion-blurred images.

Here, we discuss the effect of SOTA deblurring methods on our adversarial examples and on ‘normal’ motion-blurred images. The ‘normal’ motion blur is usually synthesized by averaging neighbouring video frames [Nah, Hyun Kim, and Mu Lee2017], which is equivalent to setting the kernel weights to $1/N$ to average the translated examples in Eq. (5). We can regard such normal motion blur as an attack, apply it to all images in the testing data, and calculate the corresponding success rate. We use DeblurGAN [Kupyn et al.2018] and DeblurGANv2 [Kupyn et al.2019] to process our adversarial examples and the normal motion-blurred images, and calculate the relative decrease of the success rate, i.e., $\Delta = (s_{\text{before}} - s_{\text{after}}) / s_{\text{before}}$, where $s_{\text{before}}$ and $s_{\text{after}}$ represent the success rates before and after deblurring.
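For reference, a sketch of the ‘normal’ motion blur (uniform $1/N$ kernel weights over the translated examples, reusing the `translate` helper above) and of the relative decrease $\Delta$; both are illustrative implementations of the definitions in this paragraph.

    import torch

    def normal_motion_blur(x, sal, theta_obj, theta_bg, n=15):
        """'Normal' motion blur: average of the N translated examples of Eq. (5)."""
        frames = []
        for i in range(n):
            a = i / max(n - 1, 1)
            xi = sal * translate(x, a * theta_obj) + (1 - sal) * translate(x, a * theta_bg)
            frames.append(xi)
        return torch.stack(frames, dim=0).mean(dim=0)   # uniform 1/N weights

    def relative_decrease(rate_before, rate_after):
        """Relative decrease of the attack success rate after deblurring."""
        return (rate_before - rate_after) / rate_before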

A smaller $\Delta$ means the attack is more robust to deblurring methods. As shown in Fig. 7, compared with the normal motion-blurred images, our adversarial examples are generally harder for the state-of-the-art deblurring methods to handle. For DeblurGANv2, the relative decrease for normal motion blur is usually larger than 0.5 when the kernel size is in [15, 35], which means DeblurGANv2 can effectively remove the normal motion blur and improve the classification accuracy. In contrast, the $\Delta$ of our method is always smaller than that of normal motion blur and gradually decreases over the range [15, 35], which means it becomes more difficult to use DeblurGANv2 to defend against our attack as the kernel size grows. We find similar results on DeblurGAN, where the difference in $\Delta$ between our method and normal motion blur is much smaller than on DeblurGANv2. When the kernel size equals 5 or 10, which leads to light motion blur, we usually have $\Delta < 0$ for the normal motion blur, which means deblurring can improve its attack success rate and harm the image classification accuracy. Similar results are also observed in [Guo et al.2019a].

4 Conclusions

We have identified and investigated a new type of adversarial attack mode based on motion blur, termed the motion-based adversarial blur attack (ABBA). We first propose the kernel-prediction-based attack that processes each pixel with a kernel optimized under the supervision of the misclassification objective. Based on this, we further propose the saliency-regularized adversarial kernel prediction to make the motion-blurred image visually more natural and plausible. Our method achieves a high success rate and transferability on blackbox and defense models with visually natural motion blur on the NeurIPS’17 adversarial competition dataset.

In the future, we can adapt our method to other tasks, e.g., visual object tracking [Guo et al.2019b, Guo et al.2020, Feng et al.2019, Guo et al.2017b, Guo et al.2017a], or other learning techniques, e.g., deep reinforcement learning [Sun et al.2020]. In addition, this work may also help uncover the disagreements or uncertainty of deep neural networks under different motion blur levels [Xie et al.2019b, Zhang et al.2020].

References

  • [Alaifari, Alberti, and Gauksson2019] Alaifari, R.; Alberti, G. S.; and Gauksson, T. 2019. ADef: an iterative algorithm to construct adversarial deformations. In ICLR.
  • [Bako et al.2017] Bako, S.; Vogels, T.; McWilliams, B.; Meyer, M.; Novák, J.; Harvill, A.; Sen, P.; DeRose, T.; and Rousselle, F. 2017. Kernel-predicting convolutional networks for denoising monte carlo renderings. ACM Trans. Graph. 36(4):97:1–97:14.
  • [Bhattad et al.2020] Bhattad, A.; Chong, M. J.; Liang, K.; Li, B.; and Forsyth, D. 2020. Unrestricted adversarial examples via semantic manipulation. In ICLR.
  • [Brooks and Barron2019] Brooks, T., and Barron, J. T. 2019. Learning to synthesize motion blur. In CVPR, 6833–6841.
  • [Carlini and Wagner2017] Carlini, N., and Wagner, D. 2017. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), 39–57.
  • [Chollet2017] Chollet, F. 2017. Xception: Deep learning with depthwise separable convolutions. In CVPR.
  • [Dong et al.2018] Dong, Y.; Liao, F.; Pang, T.; Su, H.; Zhu, J.; Hu, X.; and Li, J. 2018. Boosting adversarial attacks with momentum. In CVPR, 9185–9193.
  • [Dong et al.2019] Dong, Y.; Pang, T.; Su, H.; and Zhu, J. 2019. Evading defenses to transferable adversarial examples by translation-invariant attacks. In CVPR, 4307–4316.
  • [Feng et al.2019] Feng, W.; Han, R.; Guo, Q.; Zhu, J. K.; and Wang, S. 2019. Dynamic saliency-aware regularization for correlation filter based object tracking. IEEE TIP.
  • [Fong and Vedaldi2017] Fong, R. C., and Vedaldi, A. 2017. Interpretable explanations of black boxes by meaningful perturbation. In ICCV, 3449–3457.
  • [Goodfellow, Shlens, and Szegedy2014] Goodfellow, I. J.; Shlens, J.; and Szegedy, C. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
  • [Guo et al.2017a] Guo, Q.; Feng, W.; Zhou, C.; Huang, R.; Wan, L.; and Wang, S. 2017a. Learning dynamic Siamese network for visual object tracking. In ICCV.
  • [Guo et al.2017b] Guo, Q.; Feng, W.; Zhou, C.; Pun, C.; and Wu, B. 2017b. Structure-regularized compressive tracking with online data-driven sampling. IEEE TIP 26(12):5692–5705.
  • [Guo et al.2019a] Guo, Q.; Feng, W.; Chen, Z.; Gao, R.; Wan, L.; and Wang, S. 2019a. Effects of blur and deblurring to visual object tracking. arXiv:1908.07904.
  • [Guo et al.2019b] Guo, Q.; Xie, X.; Ma, L.; Li, Z.; Xue, W.; Feng, W.; and Liu, Y. 2019b. Spark: Spatial-aware online incremental attack against visual tracking. arXiv:1910.08681.
  • [Guo et al.2020] Guo, Q.; Han, R.; Feng, W.; Chen, Z.; and Wan, L. 2020. Selective spatial regularization by reinforcement learned decision making for object tracking. IEEE TIP 29:2999–3013.
  • [He, Sun, and Tang2013] He, K.; Sun, J.; and Tang, X. 2013. Guided image filtering. TPAMI 35(6):1397–1409.
  • [Jaderberg et al.2015] Jaderberg, M.; Simonyan, K.; Zisserman, A.; and kavukcuoglu, k. 2015. Spatial transformer networks. In NIPS.
  • [Kupyn et al.2018] Kupyn, O.; Budzan, V.; Mykhailych, M.; Mishkin, D.; and Matas, J. 2018. DeblurGAN: Blind motion deblurring using conditional adversarial networks. In CVPR.
  • [Kupyn et al.2019] Kupyn, O.; Martyniuk, T.; Wu, J.; and Wang, Z. 2019. Deblurgan-v2: Deblurring (orders-of-magnitude) faster and better. In ICCV.
  • [Kurakin et al.2018] Kurakin, A.; Goodfellow, I.; Bengio, S.; Dong, Y.; Liao, F.; Liang, M.; Pang, T.; Zhu, J.; Hu, X.; Xie, C.; et al. 2018. Adversarial attacks and defences competition. CoRR.
  • [Kurakin, Goodfellow, and Bengio2017] Kurakin, A.; Goodfellow, I.; and Bengio, S. 2017. Adversarial examples in the physical world. arXiv:1607.02533.
  • [Liao et al.2018] Liao, F.; Liang, M.; Dong, Y.; Pang, T.; Hu, X.; and Zhu, J. 2018. Defense against adversarial attacks using high-level representation guided denoiser. In CVPR.
  • [Liu et al.2019] Liu, J.-J.; Hou, Q.; Cheng, M.-M.; Feng, J.; and Jiang, J. 2019. A simple pooling-based design for real-time salient object detection. In CVPR.
  • [Mildenhall et al.2018] Mildenhall, B.; Barron, J. T.; Chen, J.; Sharlet, D.; Ng, R.; and Carroll, R. 2018. Burst denoising with kernel prediction networks. In CVPR, 2502–2510.
  • [Nah, Hyun Kim, and Mu Lee2017] Nah, S.; Hyun Kim, T.; and Mu Lee, K. 2017. Deep multi-scale convolutional neural network for dynamic scene deblurring. In CVPR, 3883–3891.
  • [Niklaus, Mai, and Liu2017a] Niklaus, S.; Mai, L.; and Liu, F. 2017a. Video frame interpolation via adaptive convolution. In CVPR, 2270–2279.
  • [Niklaus, Mai, and Liu2017b] Niklaus, S.; Mai, L.; and Liu, F. 2017b. Video frame interpolation via adaptive separable convolution. In ICCV, 261–270.
  • [Rauber, Brendel, and Bethge2017] Rauber, J.; Brendel, W.; and Bethge, M. 2017. Foolbox: A Python toolbox to benchmark the robustness of machine learning models.
  • [Sun et al.2020] Sun, J.; Zhang, T.; Xie, X.; Ma, L.; Zheng, Y.; Chen, K.; and Liu, Y. 2020. Stealthy and efficient adversarial attacks against deep reinforcement learning. In AAAI.
  • [Szegedy et al.2016] Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; and Wojna, Z. 2016. Rethinking the inception architecture for computer vision. In CVPR.
  • [Szegedy et al.2017] Szegedy, C.; Ioffe, S.; Vanhoucke, V.; and Alemi, A. A. 2017. Inception-v4, Inception-ResNet and the impact of residual connections on learning. In AAAI.
  • [Tramèr et al.2018] Tramèr, F.; Kurakin, A.; Papernot, N.; Boneh, D.; and McDaniel, P. D. 2018. Ensemble adversarial training: Attacks and defenses. In ICLR.
  • [Wang et al.2019] Wang, R.; Juefei-Xu, F.; Xie, X.; Ma, L.; Huang, Y.; and Liu, Y. 2019. Amora: Black-box adversarial morphing attack. arXiv preprint arXiv:1912.03829.
  • [Xiao et al.2018] Xiao, C.; Zhu, J.-Y.; Li, B.; He, W.; Liu, M.; and Song, D. 2018. Spatially transformed adversarial examples. In ICLR.
  • [Xie et al.2019a] Xie, C.; Zhang, Z.; Zhou, Y.; Bai, S.; Wang, J.; Ren, Z.; and Yuille, A. L. 2019a. Improving transferability of adversarial examples with input diversity. In CVPR.
  • [Xie et al.2019b] Xie, X.; Ma, L.; Wang, H.; Li, Y.; Liu, Y.; and Li, X. 2019b. Diffchaser: Detecting disagreements for deep neural networks. In IJCAI.
  • [Ye et al.2019] Ye, S.; Tan, S. H.; Xu, K.; Wang, Y.; Bao, C.; and Ma, K. 2019. Brain-inspired reverse adversarial examples. arXiv:1905.12171.
  • [Zhang et al.2020] Zhang, X.; Xie, X.; Ma, L.; Du, X.; Hu, Q.; Liu, Y.; Zhao, J.; and Sun, M. 2020. Towards characterizing adversarial defects of deep learning software from the lens of uncertainty. In ICSE.