Region-Wise Attack: On Efficient Generation of Robust Physical Adversarial Examples

12/05/2019 ∙ Bo Luo et al. ∙ The Chinese University of Hong Kong

Deep neural networks (DNNs) have been shown to be susceptible to adversarial example attacks. Most existing works achieve this malicious objective by crafting subtle pixel-wise perturbations, which are difficult to launch in the physical world due to inevitable transformations (e.g., different photographic distances and angles). Recently, a few research works have generated physical adversarial examples, but they generally require the details of the model a priori, which is often impractical. In this work, we propose a novel physical adversarial attack for arbitrary black-box DNN models, namely Region-Wise Attack. Specifically, we present how to efficiently search for region-wise perturbations to the inputs and determine their shapes, locations and colors via both top-down and bottom-up techniques. In addition, we introduce two fine-tuning techniques to further improve the robustness of our attack. Experimental results demonstrate the efficacy and robustness of the proposed Region-Wise Attack in the real world.


1 Introduction

Deep neural networks (DNNs) have achieved state-of-the-art performance in many areas, such as image classification [He et al.2016, Zoph et al.2018] and autonomous driving [Chen et al.2016, Yang et al.2018]. However, they have been found to be vulnerable to adversarial example attacks that fool them into making adversarial decisions by slightly manipulating their inputs [Papernot et al.2016, Carlini and Wagner2017, Xie et al.2019, Yuan et al.2019], posing serious threats to safety-critical systems.

Instead of attacking the digital inputs to the DNNs, physical adversarial attacks directly manipulate objects in the real world to achieve the malicious objective. The first such attack was proposed in [Sharif et al.2016], in which attackers wear maliciously crafted eyeglasses to fool a face recognition system into misclassification. In [Eykholt et al.2018], a physical adversarial attack was implemented against road sign recognition systems by generating sticker perturbations and attaching them onto traffic signs. These attacks are launched under white-box settings, in which attackers need to know the details of the attacked DNN model (e.g., architectures and trained parameters). This is often impractical because it is difficult to obtain the parameters used in real-world systems.

In this work, we propose to generate physical adversarial examples by searching for effective perturbations in continuous image regions with simple queries of the targeted DNN model, without knowing its details. The proposed attack, namely Region-Wise Attack, is able to efficiently (i.e., with a limited number of queries) determine the locations, shapes and colors of the perturbations required to launch the attack. To further increase the robustness of Region-Wise Attack, physical misplacement and photography-independent fine-tuning mechanisms are introduced to tolerate possible variations in the real world. By sticking the generated perturbations onto targeted objects, we perform experiments on the CIFAR-10 and GTSRB data sets and on real-world road signs, and our results demonstrate the efficacy and robustness of the proposed attack.

The main contributions of this paper include:

  • We propose a novel physical adversarial attack for arbitrary black-box DNNs by generating region-wise perturbations.

  • We present how to efficiently find the shapes, locations and colors of the region-wise perturbations via both top-down and bottom-up methods.

  • To increase the attack robustness, we introduce two fine-tuning mechanisms to tolerate possible physical misplacement and photographic variations in the real world.

The remainder of this paper is organized as follows. Section 2 introduces related works and motivates this work. Then, we detail the proposed Region-Wise Attack in Section 3. Next, we present experimental results in Section 4. Finally, Section 5 concludes this paper.

2 Related Work and Motivation

Figure 1: The overview of our proposed attack method.

In many machine learning systems, attackers cannot directly modify the digital inputs; they can only manipulate physical objects so that the captured images are perturbed as intended. Such attacks are called physical attacks.

[Sharif et al.2016] attacks face recognition systems by generating adversarial eyeglasses; when people wear them, the system makes misclassifications. [Eykholt et al.2018] generates adversarial stickers that mimic city graffiti and attaches them onto road signs to fool self-driving systems. [Komkov and Petiushko2019] crafts a rectangular adversarial sticker placed on a hat to mislead the ArcFace face ID system. Recently, [Thys, Van Ranst, and Goedemé2019] generates adversarial patches to fool person detection in automated surveillance systems.

All the above physical attacks assume white-box settings, which is often impractical, as it is difficult to obtain the DNN parameters of real-world systems. There are also many black-box attacks proposed in the literature. [Papernot et al.2017] first trains a substitute network with similar functionality to the targeted DNN, and then crafts adversarial examples against the substitute network under white-box settings; the generated adversarial examples are used to attack the targeted DNN based on their transferability. The ZOO attack [Chen et al.2017] crafts adversarial examples by approximating the model gradients, observing how the outputs change when the inputs are varied; it uses zeroth-order optimization to approximate the gradients rather than training a substitute model. Recently, [Alzantot et al.2018] proposed GenAttack, which finds adversarial perturbations with genetic algorithms. These attacks craft pixel-wise perturbations and generally require extensive computational resources to achieve good transferability. More importantly, they are less effective for launching physical attacks due to inevitable physical misplacement and photographic variations in the real world.

Motivated by the above, we propose novel techniques to efficiently craft region-wise perturbations that launch robust physical adversarial example attacks against arbitrary black-box DNN systems, which we treat as oracles that can only be queried for the outputs of specific inputs.

3 Crafting Region-wise Perturbations

Crafting region-wise perturbations under black-box settings is very difficult, as we have to determine the perturbation color, shape and location, and the search space is extremely large. The perturbation color refers to how the values of the selected regions are changed. For example, we can set the selected perturbation region of the input image to black and then attach a black sticker of that shape onto the object to launch the attack. Perturbations with different colors, shapes or locations will certainly result in different attack effects.

The overview of our proposed attack method is shown in Figure 1. We first generate coarse-grained region-wise perturbations iteratively with our top-down or bottom-up methods under the black-box setting. Then, to improve the attack robustness in the physical world, we apply two fine-tuning techniques to tolerate possible misplacement. After that, the crafted region-wise perturbations are attached onto the physical objects so that the captured images mislead the attacked DNNs. The key steps in our method are the top-down and bottom-up techniques and the fine-tuning mechanisms. Next, we first formulate the attack as an optimization problem and then introduce these key steps in more detail.

Problem Formulation

As discussed earlier, we need to determine three factors when generating region-wise perturbations: the perturbation colors, shapes and locations. Besides, we should consider their perceptual imperceptibility, because the attacks will fail if users notice large perturbations and suspect that their systems are under attack. Previous black-box attacks usually use the $\ell_p$-norm to evaluate the perceptual similarity between the perturbed image and the original one. However, this is not appropriate, as it assumes the perceptual effect of each pixel is equally important. In the real world, when seeing an image, human eyes only pay attention to some specific regions and ignore the unimportant ones [Legge and Foley1980]. In this paper, we propose to use a saliency model to evaluate the perceptual effects of region-wise perturbations. Such models mimic the human perception mechanism by outputting a saliency map indicating the attractiveness of each pixel [Cornia et al.2018, Bylinskii et al.2018]: the larger the value, the more sensitive human eyes are to that pixel.

Based on the saliency model, we design a metric to evaluate the perceptual effects of region-wise perturbations, called perturbation attention ($PA$), which is calculated by the following equation:

$PA(\delta) = \frac{\sum_{(i,j) \in R} M(x')_{i,j}}{\sum_{(i,j)} M(x')_{i,j}}$   (1)

where $M(\cdot)$ is the saliency map given by the saliency model in [Cornia et al.2018], $R$ denotes the perturbation region and $x'$ represents the perturbed image. The numerator is the total saliency of all pixels in the perturbation region and the denominator is the total saliency of all pixels in the whole perturbed image. Therefore, the smaller the value of $PA$, the more imperceptible the attack.
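To make the metric concrete, the following minimal sketch computes the perturbation attention of Equation (1) from a pre-computed saliency map; the NumPy representation, the function name and the example values are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def perturbation_attention(saliency_map, region_mask):
    """Perturbation attention (Eq. 1): the saliency mass inside the
    perturbed region divided by the saliency mass of the whole image."""
    total = saliency_map.sum()
    if total == 0:
        return 0.0
    return float(saliency_map[region_mask].sum() / total)

# Usage: a perturbation covering a low-saliency corner yields a small PA.
saliency = np.random.rand(32, 32)     # stand-in for a saliency model output
mask = np.zeros((32, 32), dtype=bool)
mask[24:, 24:] = True                 # 8x8 region in the bottom-right corner
print(perturbation_attention(saliency, mask))
```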

We formulate the attack as an optimization problem: finding the optimal color, shape and location of the region-wise perturbation to fool DNNs, under the constraint that the perturbation attention given by Equation (1) is bounded. For targeted attacks, the goal is to fool DNNs into misclassifying the input to a targeted malicious label. Our formulation for targeted attacks maximizes the prediction probability of the targeted label, as shown below:

$\max_{c \in \mathcal{C},\, s \in \mathcal{S},\, l \in \mathcal{L}} \; P_t\big(x \oplus \delta(c, s, l)\big) \quad \text{s.t.} \quad PA(\delta) \le \epsilon$   (2)

Here $P_t(x \oplus \delta)$ is the prediction probability of the targeted malicious label $t$ when the input image $x$ is perturbed with $\delta$. $\mathcal{C}$, $\mathcal{S}$ and $\mathcal{L}$ are the candidate sets for perturbation color, shape and location, respectively. $\epsilon$ is the maximum perturbation attention allowed without being suspected by users.

Similarly, for un-targeted attacks, the goal is to fool DNNs into making misclassifications. Our formulation minimizes the prediction probability of the true label so that the DNN makes a mistake once the probability of another label becomes the largest. It is formulated as follows:

$\min_{c \in \mathcal{C},\, s \in \mathcal{S},\, l \in \mathcal{L}} \; P_y\big(x \oplus \delta(c, s, l)\big) \quad \text{s.t.} \quad PA(\delta) \le \epsilon$   (3)

where $P_y(x \oplus \delta)$ is the prediction probability of the true label $y$ given the perturbed image $x \oplus \delta$.

However, solving the above optimization problems is very challenging for three reasons. Firstly, we can only craft the region-wise perturbations under black-box settings, understanding the behavior of the targeted DNN solely by querying it. Secondly, we must respect the perturbation attention constraint so that the perturbations do not arouse users' suspicion. Thirdly, the search space over the colors, shapes and locations of the region-wise perturbations is extremely large.

The Top-Down and Bottom-up Methods

Considering the challenges in solving the above optimization problems, in this section, we introduce two methods to efficiently determine the shapes, locations and colors of region-wise perturbations.

Top-down Method for Un-targeted Attacks

For un-targeted adversarial example attacks, the attack goal is to fool the model into making misclassifications. The simplest way to achieve this objective is to perturb the whole object region in the image so that the model cannot recognize it. However, this is not viable, as perturbing the whole object region will certainly violate the perturbation attention constraint and arouse users' suspicion. Therefore, to successfully launch the attack, we need to shrink the perturbation region until the perturbation attention constraint is satisfied. This is the key idea of our top-down method, which iteratively shrinks the perturbation regions. The attack fails if no region-wise perturbation that attacks successfully can be found under the constraint.

In our top-down method, the perturbation region is first initialized as the whole input image, which violates the perturbation attention constraint. Secondly, we shrink the perturbation region into smaller candidate regions. Thirdly, we select the most adversarial candidates by querying the attacked DNN. The second and third steps are repeated until the perturbation attention constraint is satisfied. In this way, we can efficiently find the regions to perturb within a limited number of queries. The key steps in this process are how to shrink the perturbation regions and how to select the most adversarial ones, which are detailed in the following sections.

Region Shrinking

In each iteration, we generate some smaller candidate perturbations based on the current region-wise perturbation, and then evaluate their individual attack capability by querying the targeted DNN. Two principles should be considered in the region shrinking process. Firstly, the union of all the generated smaller candidate regions should cover the whole original region; otherwise, some parts of the original region will never be explored and we may miss the optimal regions. Secondly, we cannot generate too many candidates, otherwise the number of queries to the targeted DNN will be large. Considering these two principles, we propose to divide the original perturbation region horizontally and vertically in each iteration to get four candidate regions: the left-half, right-half, top-half and bottom-half parts. In each iteration, the region shrinks to half its size, and no part of the original region is missed.

Figure 2: The continuous and discontinuous shrinking.

Although this shrinking method is efficient, requiring only four queries per iteration for each color, its limitation is obvious: the generated region is continuous, i.e., it only generates one region for an input. However, the important features for classifying an object may be located in separate regions. Therefore, to increase the attack success rate, it is necessary to generate discontinuous candidate regions. Instead of keeping only the four continuous candidates, we also keep the top-left plus bottom-right quadrants and the top-right plus bottom-left quadrants. Figure 2 shows an example of the continuous and discontinuous region shrinking strategies, in which the black region will be perturbed.
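The candidate generation described above can be sketched as follows; the boolean-mask representation and the helper name `shrink_candidates` are our own assumptions, and the quadrant bookkeeping for already-discontinuous masks may differ from the authors' implementation.

```python
import numpy as np

def shrink_candidates(mask, discontinuous=True):
    """Return the candidate masks for the next top-down iteration: the
    left/right/top/bottom halves of the current region and, optionally,
    the two diagonal (discontinuous) quadrant pairs. Each candidate keeps
    roughly half of the currently perturbed pixels."""
    ys, xs = np.where(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    ym, xm = (y0 + y1) // 2, (x0 + x1) // 2

    def crop(r0, r1, c0, c1):
        # Keep only the perturbed pixels that fall inside the given box.
        m = np.zeros_like(mask)
        m[r0:r1, c0:c1] = mask[r0:r1, c0:c1]
        return m

    candidates = [
        crop(y0, y1, x0, xm),   # left half
        crop(y0, y1, xm, x1),   # right half
        crop(y0, ym, x0, x1),   # top half
        crop(ym, y1, x0, x1),   # bottom half
    ]
    if discontinuous:
        candidates.append(crop(y0, ym, x0, xm) | crop(ym, y1, xm, x1))  # TL + BR
        candidates.append(crop(y0, ym, xm, x1) | crop(ym, y1, x0, xm))  # TR + BL
    return candidates
```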

Region Selection

After generating the candidate perturbation regions, we need to evaluate their attack capability and select the most adversarial ones as the search space for the next iteration. According to our optimization goal for the un-targeted attack, the region that achieves the minimum prediction probability for the true label should be selected. It can be formulated as follows:

$\delta^{k*} = \arg\min_{\delta^{k}_{i}} \; P_y\big(x \oplus \delta^{k}_{i}\big)$   (4)

where $\delta^{k}_{i}$ is the $i$-th candidate region-wise perturbation in iteration $k$ and $x \oplus \delta^{k}_{i}$ is the image with perturbation $\delta^{k}_{i}$.

It should be noted that the region selection step only searches for the region shape and location. For the perturbation color, exploring more color candidates in the region selection step gives a higher chance of finding a better perturbation, but it also increases the attack cost, so it is necessary to obtain a good tradeoff between attack cost and performance.
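Putting the pieces together, the sketch below shows one possible top-down loop for the un-targeted case (Equations (3) and (4)): shrink, query the black-box model for every candidate region and color, keep the pair that minimizes the true-label probability, and stop once the perturbation attention constraint is met. The oracle `query_model(image)` returning a probability vector is an assumption, and the helpers are the `shrink_candidates` and `perturbation_attention` sketches above.

```python
import numpy as np

def top_down_attack(image, true_label, query_model, saliency_map,
                    colors, epsilon=0.05, discontinuous=True):
    """Minimal top-down search sketch for un-targeted attacks."""
    mask = np.ones(image.shape[:2], dtype=bool)   # start from the whole image
    color = colors[0]
    while mask.any() and perturbation_attention(saliency_map, mask) > epsilon:
        best = None
        for cand in shrink_candidates(mask, discontinuous):
            for c in colors:                      # small color set keeps queries cheap
                perturbed = image.copy()
                perturbed[cand] = c
                p_true = query_model(perturbed)[true_label]
                if best is None or p_true < best[0]:
                    best = (p_true, cand, c)
        _, mask, color = best                     # Eq. (4): keep the most adversarial candidate
    perturbed = image.copy()
    perturbed[mask] = color
    success = np.argmax(query_model(perturbed)) != true_label
    return perturbed, mask, color, success
```

Consistent with the stopping rule above, the attack is declared failed if the final constrained perturbation no longer flips the prediction.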

Bottom-up Method for Targeted Attacks

Although the top-down method is effective for un-targeted attacks, it is ineffective for targeted attacks. The reason is that large perturbation regions can fool DNNs into making mistakes but do not necessarily make them misclassify to the targeted malicious label. As a result, the top-down method, which searches from large regions towards smaller ones, is not efficient in this case. To solve this problem, we propose a bottom-up technique that starts from small perturbation regions and then extends them iteratively based on their adversarial capabilities.

In the bottom-up method, we first search the whole input image in a fine-grained manner for effective initial perturbation regions. Secondly, we expand these fine-grained perturbation regions in different directions to generate several candidate regions. Thirdly, we evaluate the expanded regions based on their attack capabilities and select the most adversarial ones for the next iteration. The second and third steps are repeated until the perturbation attention constraint is violated. Next, we introduce these three steps in detail.

Perturbation Region Initialization

The goal of the perturbation region initialization is to find small appropriate regions in the input image as the initial locations for perturbations. We select the $N$ initial perturbation regions that achieve the top prediction probabilities of the targeted label. If $N = 1$, we only select the region with the largest prediction probability of the targeted label and the perturbation is continuous. However, the important features for achieving the attack goal may be located in separate regions. To increase the attack capability, it is necessary to select several initial regions to generate discontinuous region-wise perturbations, which increases the attack overhead. As a result, attackers can choose a larger $N$ whenever possible based on their attack cost budgets.

We assume that the initial perturbation region has a size of $m \times m$. An exhaustive search with stride $s$ ($s \le m$) is used to find the most adversarial regions. For example, if the original search space is $9 \times 9$ and the initial perturbation region is $3 \times 3$ with stride 3, we have to evaluate 9 different perturbation regions, and no two search regions overlap.
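A minimal sketch of this initialization, assuming a black-box oracle `query_model(image)` that returns class probabilities and a boolean-mask representation; the patch size, stride, fill color and function name are illustrative choices.

```python
import numpy as np

def init_regions(image, target_label, query_model, m=3, stride=3,
                 color=0, top_n=1):
    """Slide an m x m patch over the image with the given stride, query the
    black-box model, and keep the top-N locations that maximize the
    probability of the targeted label (N > 1 gives a discontinuous start)."""
    h, w = image.shape[:2]
    scored = []
    for y in range(0, h - m + 1, stride):
        for x in range(0, w - m + 1, stride):
            perturbed = image.copy()
            perturbed[y:y + m, x:x + m] = color
            scored.append((query_model(perturbed)[target_label], (y, x)))
    scored.sort(key=lambda t: t[0], reverse=True)
    masks = []
    for _, (y, x) in scored[:top_n]:
        mask = np.zeros((h, w), dtype=bool)
        mask[y:y + m, x:x + m] = True
        masks.append(mask)
    return masks
```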

Region Expansion

In each iteration, we expand the perturbation region, analogous to the shrinking step in the top-down method. We expand the current perturbation regions along four directions (top, bottom, left and right). After each iteration, the original region is expanded to twice its size, yielding four candidate regions for the next region selection step.
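The expansion step can be sketched as the mirror image of the shrinking helper; again, the boolean-mask representation and the helper name are assumptions, and clipping at the image border is our own simplification.

```python
import numpy as np

def expand_candidates(mask):
    """Return four candidate masks, each expanding the bounding box of the
    current region along one direction (top, bottom, left, right) so that
    the region roughly doubles in size."""
    ys, xs = np.where(mask)
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    h, w = y1 - y0, x1 - x0
    H, W = mask.shape

    def grow(r0, r1, c0, c1):
        m = mask.copy()
        m[max(r0, 0):min(r1, H), max(c0, 0):min(c1, W)] = True
        return m

    return [
        grow(y0 - h, y1, x0, x1),   # expand upwards
        grow(y0, y1 + h, x0, x1),   # expand downwards
        grow(y0, y1, x0 - w, x1),   # expand leftwards
        grow(y0, y1, x0, x1 + w),   # expand rightwards
    ]
```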

Region Selection

In each iteration, the region that achieves the maximum prediction probability of the targeted label is selected. It can be formulated as follows:

$\delta^{k*} = \arg\max_{\delta^{k}_{i}} \; P_t\big(x \oplus \delta^{k}_{i}\big)$   (5)

where $x \oplus \delta^{k}_{i}$ is the image with region-wise perturbation $\delta^{k}_{i}$ and $P_t(\cdot)$ denotes the classification probability of the targeted label given the perturbed input.

It should be noted that the bottom-up method can also be used for un-targeted attacks, but according to our experiments it is not as efficient as the top-down method; hence, in this paper we only adopt it for targeted attacks.

Fine-tuning Mechanisms to Increase Attack Robustness

After generating the region-wise perturbations, we launch the physical attack by sticking them onto the real objects. Because there are many unavoidable variations in the physical world, adversarial examples generated from images taken under specific photographic conditions may fail to attack after sticking. In this section, we introduce two fine-tuning techniques to increase the attack robustness in the physical world.

Physical Misplacement Fine-tuning

In practical situations, it is very difficult, if not impossible, to stick the perturbations onto the real objects precisely as optimized; for example, the sticker may be translated by several pixels. As a result, to improve the attack robustness, it is necessary to fine-tune the perturbation locations to tolerate possible physical misplacement.

Our key idea is to explore the neighborhood of the generated perturbation to find the best location that can tolerate the majority of possible physical misplacements. If we move the perturbation by $m$ pixels each time within the neighborhood, then for each $m$ there are at most eight misplacement directions: left, right, top, bottom, top-left, bottom-left, top-right and bottom-right. If the perturbation is still adversarial when suffering from a physical misplacement, it can tolerate that misplacement; the more misplacements a perturbation can tolerate, the more robust it is. Based on this idea, we fine-tune the perturbation location to tolerate possible physical misplacement according to the following equation:

$(m^{*}, d^{*}) = \arg\max_{m \in \mathcal{M},\, d \in \mathcal{D}} \; \frac{\sum_{x \in X} \mathbb{1}\big[\text{the attack on } x \text{ succeeds with } \delta_{m,d}\big]}{|X|}$   (6)

where $\delta_{m,d}$ is the generated perturbation moved by $m$ pixels along direction $d$, $\mathcal{M}$ is the set of values $m$ can take in the neighborhood, and $\mathcal{D}$ is the set of eight directions along which the perturbation can move. The numerator counts the successful attacks over all inputs in the image set $X$ when suffering from the physical misplacement, and the denominator is the total number of images in $X$. According to this equation, the perturbation achieves the best misplacement tolerance when moved by $m^{*}$ pixels along direction $d^{*}$, which gives the final perturbation location after fine-tuning.
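The sketch below gives one possible reading of this fine-tuning step for the un-targeted case: simulate shifts of up to a few pixels along the eight directions, score each candidate placement by the fraction of simulated misplacements it tolerates, and keep the most tolerant placement. The oracle `query_model`, the shift range and all function names are assumptions.

```python
import numpy as np

DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1),
              (-1, -1), (-1, 1), (1, -1), (1, 1)]   # eight misplacement directions

def shift_mask(mask, shift, direction):
    """Translate the sticker mask by `shift` pixels along `direction`."""
    dy, dx = direction[0] * shift, direction[1] * shift
    return np.roll(mask, (dy, dx), axis=(0, 1))

def tolerance(image, true_label, query_model, mask, color, max_shift=3):
    """Fraction of simulated misplacements under which the sticker placed
    at `mask` still causes a misclassification."""
    trials = [(s, d) for s in range(1, max_shift + 1) for d in DIRECTIONS]
    hits = 0
    for s, d in trials:
        perturbed = image.copy()
        perturbed[shift_mask(mask, s, d)] = color
        if np.argmax(query_model(perturbed)) != true_label:
            hits += 1
    return hits / len(trials)

def fine_tune_location(image, true_label, query_model, mask, color,
                       max_shift=3):
    """Among the original placement and its shifted neighbors, keep the
    placement with the highest misplacement tolerance (one reading of Eq. 6)."""
    candidates = [mask] + [shift_mask(mask, s, d)
                           for s in range(1, max_shift + 1)
                           for d in DIRECTIONS]
    return max(candidates, key=lambda m: tolerance(
        image, true_label, query_model, m, color, max_shift))
```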

Photography Independent Fine-tuning

In the physical world, many inevitable photographic variations, such as photographic angles and distances, may degrade the attack robustness, as the captured images differ under these variations. In this section, we propose a photography-independent fine-tuning mechanism to improve the attack robustness. The key idea is to generate input images under varying photographic conditions and craft a perturbation that attacks successfully under most of them. To achieve this goal, we modify the region selection criterion in the top-down and bottom-up methods: instead of evaluating the adversarial capability of a region on one input image, we evaluate it on a set of inputs. In this way, the crafted region-wise perturbations largely increase the attack robustness in the physical world.

The new region selection criterion for un-targeted attacks is reformulated as follows:

$\delta^{k*} = \arg\min_{\delta^{k}_{i}} \; \frac{1}{|X|} \sum_{x \in X} P_y\big(x \oplus \delta^{k}_{i}\big)$   (7)

where $x$ is an image of the object under one specific photographic condition and $X$ is a set of images taken under varying conditions. We select the perturbation $\delta^{k}_{i}$ that most degrades the average classification probability of the true label over all inputs in $X$.

Similarly, for targeted attacks, the perturbation selection criterion is reformulated as follows:

$\delta^{k*} = \arg\max_{\delta^{k}_{i}} \; \frac{1}{|X|} \sum_{x \in X} P_t\big(x \oplus \delta^{k}_{i}\big)$   (8)

The perturbation that most increases the average classification probability of the targeted malicious label over all inputs in $X$ is selected.

In our method, the input image set $X$ is obtained through data augmentation with both physical and synthetic transformations. For example, if we target a stop sign, we generate $X$ by taking photos of the stop sign under various conditions, such as different photographic angles, lighting or object distances. For the synthetic transformations, we apply common digital image transformations, such as changing the contrast or rotating the image by small angles.
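A minimal sketch of this averaged selection criterion (Equations (7) and (8)) together with a toy synthetic augmentation; the use of scipy.ndimage.rotate, the jitter ranges and all function names are illustrative assumptions, and real photos taken under varying conditions would be added to the set in practice.

```python
import numpy as np
from scipy.ndimage import rotate   # assumed helper for the synthetic rotations

def augment(image, n=20, seed=0):
    """Build a toy evaluation set X with contrast jitter and small rotations."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n):
        img = image * rng.uniform(0.7, 1.3)                      # contrast jitter
        img = rotate(img, rng.uniform(-10, 10), reshape=False)   # small rotation
        out.append(np.clip(img, 0, 255))
    return out

def select_region(image_set, candidates, colors, query_model,
                  true_label=None, target_label=None):
    """Photography-independent selection: minimize the mean true-label
    probability over X (Eq. 7) or maximize the mean target-label
    probability over X (Eq. 8)."""
    def score(mask, color):
        probs = []
        for img in image_set:
            perturbed = img.copy()
            perturbed[mask] = color
            probs.append(query_model(perturbed))
        mean_probs = np.mean(probs, axis=0)
        return (mean_probs[target_label] if target_label is not None
                else -mean_probs[true_label])
    best = max(((score(m, c), m, c) for m in candidates for c in colors),
               key=lambda t: t[0])
    return best[1], best[2]   # (best_mask, best_color)
```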

4 Experimental Results

Our experiments are performed on the CIFAR-10 [Krizhevsky, Nair, and Hinton2014] and GTSRB [Stallkamp et al.2012] (real-world road signs) data sets. CIFAR-10 contains 60,000 images of 10 natural object classes, and GTSRB contains 50,000 images of 43 kinds of road signs. The models for these data sets are CNNs: we use a VGG-10 model for CIFAR-10 with an accuracy of 92.8%, while the GTSRB and real-world signs are classified by an eight-layer CNN with an accuracy of 98.8% on the GTSRB test set. All experiments are conducted on a platform with an AMD Threadripper 1950X CPU and four NVIDIA GTX 1080 GPUs.

Top-down and Bottom-up Results

In this section, we evaluate our top-down and bottom-up methods with digital attacks on CIFAR-10 and GTSRB data sets.

Attack Success Rate vs. Perturbation Attention

Figure 3: Attack success rate vs. the maximum allowed perturbation attention $\epsilon$ on the CIFAR-10 data set.

Figure 3 shows the attack success rate (ASR) achieved with our top-down and bottom-up methods under different perturbation attention constraints on the CIFAR-10 data set. The perturbation color set contains 20 colors randomly selected from the 216 "web-safe" colors. For the top-down method in Figure 3(a), the ASR with continuous and discontinuous region shrinking is 95% and 100%, respectively, when the maximum allowed perturbation attention $\epsilon$ is 0.2. When $\epsilon$ decreases, the attack success rate also decreases, because it is harder to fool the DNN with smaller perturbations.

For the bottom-up results for targeted attacks in Figure 3(b), when the maximum allowed perturbation attention $\epsilon$ is 0.01, the ASR for the continuous and discontinuous region expansion techniques is low, about 38% and 52%, respectively; since the perturbations are small in early iterations, it is difficult to attack successfully. When $\epsilon$ increases from 0.01 to 0.05, the ASR increases quickly. However, the ASR decreases when $\epsilon$ is larger than 0.05. This is because overly large perturbations can fool DNNs into making mistakes but do not necessarily make them misclassify to the targeted malicious labels, which justifies the use of the bottom-up method for targeted attacks. In both Figure 3(a) and (b), discontinuous region shrinking or expansion performs better than the continuous counterpart, achieving higher ASR under the same $\epsilon$. Therefore, generating discontinuous regions is essential for improving the ASR.

Figure 4: ASR vs. the size of the color set $\mathcal{C}$.
Figure 5: Perturbation attention of crafted adversarial examples.

Attack Success Rate vs. Color Set Size

Figure 4 shows the ASR with different color set sizes for the GTSRB and CIFAR-10 data sets, in which the colors are randomly selected from the total 216 web-safe colors. We fix the perturbation attention constraint and generate discontinuous region-wise perturbations. The ASR increases as the color set size increases: exploring more colors incurs more attack overhead but gives a higher chance of finding a better perturbation color. However, the ASR increases very slowly once the color set size exceeds 40 for GTSRB and 80 for CIFAR-10; that is, exploring more colors brings diminishing benefit. Since the images in CIFAR-10 are more colorful, more colors must be explored to find the best color for CIFAR-10. Attacking the GTSRB model is more difficult than attacking the CIFAR-10 model; we attribute this to the fact that the images in CIFAR-10 are harder to classify than those in GTSRB (classification accuracy of 92.8% versus 98.8%), so it is easier for attackers to fool the CIFAR-10 model.

In Figure 5, we present successfully crafted adversarial examples from the GTSRB data set for un-targeted attacks. When the perturbation attention is 0.2, the perturbation region is relatively large, which may cause users to suspect that the system has been attacked. As the perturbation attention decreases, the perceptual effect on human eyes also decreases; for example, when the perturbation attention is 0.03, users are unlikely to suspect an attack.

Comparison to the Baseline Method

As there is no existing work that generates region-wise perturbations under the black-box setting, we design a simple baseline that fixes the perturbation region to a square of size $m \times m$ and performs an exhaustive search over all locations to find the best one. The initial perturbation size is the same as in our top-down and bottom-up methods. In each iteration, the baseline searches all locations of the input image and then increases $m$ by 1 for targeted attacks or decreases $m$ by 1 for un-targeted attacks. The exhaustive search is skipped when the perturbation attention is much larger than the constraint. We set the same perturbation attention constraint and explore 40 random colors.

             Exhaustive Search     Top-down (Continuous)   Top-down (Discontinuous)
             ASR      Runtime      ASR      Runtime        ASR      Runtime
CIFAR-10     93.0%    178.6 Secs   92.6%    3.2 Secs       97.4%    8.1 Secs
GTSRB        92.0%    282.8 Secs   91.7%    4.6 Secs       96.5%    10.3 Secs
Table 1: The performance of our top-down method compared to the exhaustive search method.
             Exhaustive Search     Bottom-up (Continuous)  Bottom-up (Discontinuous)
             ASR      Runtime      ASR      Runtime        ASR      Runtime
CIFAR-10     83.0%    252.4 Secs   82.3%    5.2 Secs       89.4%    11.3 Secs
GTSRB        82.0%    336.3 Secs   81.2%    8.7 Secs       87.2%    16.8 Secs
Table 2: The performance of our bottom-up method compared to the exhaustive search method.
                       "Stop Sign"            "Straight Drive"       "Speed Limited"
                       Bottom-up  Top-down    Bottom-up  Top-down    Bottom-up  Top-down
Without Fine-tuning    73.4%      82.6%       75.3%      83.2%       74.2%      82.6%
With Fine-tuning       87.7%      95.4%       89.2%      95.8%       87.9%      94.9%
Table 3: The ASR of physical attacks with and without the fine-tuning mechanisms.

Tables 1 and 2 show the ASR and the average runtime for attacking one image with the different methods. Our continuous region shrinking and expansion achieve similar ASR with much less runtime, which justifies the efficiency of the proposed method. For each region size $m$, the exhaustive search has to evaluate on the order of $n^2$ locations, where $n \times n$ is the size of the image, and since it also has to explore different values of $m$, the runtime is further amplified. In contrast, our method halves or doubles the region in each iteration and therefore only needs a number of queries roughly logarithmic in the image size, regardless of $m$, which largely reduces the cost of finding region-wise perturbations. The discontinuous techniques obtain higher ASR than the exhaustive baseline; for example, the ASR for CIFAR-10 reaches 97.4%, while the baseline only achieves 93%. We attribute this to the fact that, although the baseline performs an exhaustive search, it fixes the shape of the region-wise perturbation, whereas our method optimizes the locations and shapes simultaneously and may therefore find better perturbations. Moreover, generating more than one perturbation region helps further improve the ASR.

Robustness of Physical Attacks

In this section, we conduct physical attacks on three real-world road signs: "Stop", "Keep Straight" and "60 Speed Limited". We first take images of these road signs and then generate region-wise perturbations with the top-down and bottom-up methods, with and without the fine-tuning techniques. The perturbations are attached to the road signs, and we then take 100 images from different distances and angles. For each image, we apply synthetic transformations to obtain 1000 images in total for evaluation.

The ASR results for attacking the three real-world road signs are shown in Table 3, where the ASR with fine-tuning is much higher than that without fine-tuning. For example, the ASR improves from 82.6% to 95.4% when attacking the "Stop" sign with the top-down method and fine-tuning, indicating that the fine-tuning mechanisms are necessary for improving the robustness of physical attacks. Moreover, the ASR of the bottom-up method for targeted attacks is lower than that of the top-down method for un-targeted attacks, which is consistent with the finding in previous work that targeted attacks are more difficult to launch than un-targeted ones.

Figure 6 shows images taken under different conditions. The values below the images denote the prediction probability of the true label given by the GTSRB model. Our region-wise perturbations efficiently fool the model into misclassification with small perturbations, decreasing the prediction probability from about 0.98 to 0.06 for the "Stop" sign.

Figure 6: Crafted region-wise adversarial examples.

5 Conclusion

In this paper, we propose Region-Wise Attack, a novel technique to efficiently generate robust physical adversarial examples with region-wise perturbations under the black-box setting. Top-down and bottom-up methods are presented to efficiently find appropriate colors, locations and shapes for the perturbations. To further increase the attack robustness, we introduce two fine-tuning techniques to tolerate possible variations in the real world. Experimental results on CIFAR-10, GTSRB and real-world attacks show that our proposed method achieves a very high attack success rate with much less runtime than baseline solutions.

References

  • [Alzantot et al.2018] Alzantot, M.; Sharma, Y.; Chakraborty, S.; and Srivastava, M. 2018. Genattack: Practical black-box attacks with gradient-free optimization. arXiv preprint arXiv:1805.11090.
  • [Bylinskii et al.2018] Bylinskii, Z.; Judd, T.; Oliva, A.; Torralba, A.; and Durand, F. 2018. What do different evaluation metrics tell us about saliency models? IEEE Transactions on Pattern Analysis and Machine Intelligence 41(3):740–757.
  • [Carlini and Wagner2017] Carlini, N., and Wagner, D. 2017. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), 39–57. IEEE.
  • [Chen et al.2016] Chen, X.; Kundu, K.; Zhang, Z.; Ma, H.; Fidler, S.; and Urtasun, R. 2016. Monocular 3d object detection for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2147–2156.
  • [Chen et al.2017] Chen, P.-Y.; Zhang, H.; Sharma, Y.; Yi, J.; and Hsieh, C.-J. 2017. Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 15–26. ACM.
  • [Cornia et al.2018] Cornia, M.; Baraldi, L.; Serra, G.; and Cucchiara, R. 2018. Predicting human eye fixations via an lstm-based saliency attentive model. IEEE Transactions on Image Processing 27(10):5142–5154.
  • [Eykholt et al.2018] Eykholt, K.; Evtimov, I.; Fernandes, E.; Li, B.; Rahmati, A.; Xiao, C.; Prakash, A.; Kohno, T.; and Song, D. 2018. Robust physical-world attacks on deep learning visual classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1625–1634.
  • [He et al.2016] He, K.; Zhang, X.; Ren, S.; and Sun, J. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 770–778.
  • [Komkov and Petiushko2019] Komkov, S., and Petiushko, A. 2019. Advhat: Real-world adversarial attack on arcface face id system. arXiv preprint arXiv:1908.08705.
  • [Krizhevsky, Nair, and Hinton2014] Krizhevsky, A.; Nair, V.; and Hinton, G. 2014. The CIFAR-10 dataset. Online: http://www.cs.toronto.edu/kriz/cifar.html.
  • [Legge and Foley1980] Legge, G. E., and Foley, J. M. 1980. Contrast masking in human vision. JOSA 70(12):1458–1471.
  • [Papernot et al.2016] Papernot, N.; McDaniel, P.; Jha, S.; Fredrikson, M.; Celik, Z. B.; and Swami, A. 2016. The limitations of deep learning in adversarial settings. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on, 372–387. IEEE.
  • [Papernot et al.2017] Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z. B.; and Swami, A. 2017. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia conference on computer and communications security, 506–519. ACM.
  • [Sharif et al.2016] Sharif, M.; Bhagavatula, S.; Bauer, L.; and Reiter, M. K. 2016. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 1528–1540. ACM.
  • [Stallkamp et al.2012] Stallkamp, J.; Schlipsing, M.; Salmen, J.; and Igel, C. 2012. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural networks 32:323–332.
  • [Thys, Van Ranst, and Goedemé2019] Thys, S.; Van Ranst, W.; and Goedemé, T. 2019. Fooling automated surveillance cameras: adversarial patches to attack person detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 0–0.
  • [Xie et al.2019] Xie, C.; Zhang, Z.; Zhou, Y.; Bai, S.; Wang, J.; Ren, Z.; and Yuille, A. L. 2019. Improving transferability of adversarial examples with input diversity. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2730–2739.
  • [Yang et al.2018] Yang, Z.; Zhang, Y.; Yu, J.; Cai, J.; and Luo, J. 2018. End-to-end multi-modal multi-task vehicle control for self-driving cars with visual perceptions. In 2018 24th International Conference on Pattern Recognition (ICPR), 2289–2294. IEEE.
  • [Yuan et al.2019] Yuan, X.; He, P.; Zhu, Q.; and Li, X. 2019. Adversarial examples: Attacks and defenses for deep learning. IEEE transactions on neural networks and learning systems.
  • [Zoph et al.2018] Zoph, B.; Vasudevan, V.; Shlens, J.; and Le, Q. V. 2018. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 8697–8710.