Patch Attack for Automatic Check-out

05/19/2020, by Aishan Liu, et al.

Adversarial examples are inputs with imperceptible perturbations that easily mislead deep neural networks (DNNs). Recently, the adversarial patch, with noise confined to a small and localized region, has emerged for its easy feasibility in real-world scenarios. However, existing strategies fail to generate adversarial patches with strong generalization ability: the patches are input-specific and fail to attack images from all classes, especially ones unseen during training. To address the problem, this paper proposes a bias-based framework to generate class-agnostic universal adversarial patches with strong generalization ability, which exploits both the perceptual and semantic biases of models. Regarding the perceptual bias, since DNNs are strongly biased towards textures, we exploit hard examples, which convey strong model uncertainty, and extract a textural patch prior from them by adopting style similarities. The patch prior is closer to the decision boundaries and promotes attacks. To further alleviate the heavy dependency on large amounts of data in training universal attacks, we also exploit the semantic bias: as the class-wise preference, prototypes are introduced and pursued by maximizing the multi-class margin to help universal training. Taking Automatic Check-out (ACO) as the typical scenario, extensive experiments including white-box and black-box settings are conducted in both the digital world (RPC, the largest ACO-related dataset) and the physical world (Taobao and JD, among the world's largest online shopping platforms). Experimental results demonstrate that our proposed framework outperforms state-of-the-art adversarial patch attack methods.


1 Introduction

Figure 1: In real-world scenarios such as Automatic Check-Out, items (e.g., fruits and chocolates) often carry patch-like stickers or tags.

Deep learning has demonstrated remarkable performance in a wide spectrum of areas, including computer vision [11], speech recognition [16] and natural language processing [23]. Recently, deep learning strategies have been introduced into the check-out scenario in supermarkets and grocery stores to revolutionize the way people shop (e.g., Amazon Go). Automatic Check-Out (ACO) [26, 13, 3] is a visual item counting system that takes images of shopping items as input and outputs a tally of different categories. Customers are no longer required to put items on the conveyor belt and wait for salesclerks to scan them. Instead, they can simply collect the chosen items, and a deep learning based visual recognition system will classify them and automatically process the purchase.

Though deep learning has shown significant achievements in our daily lives, it is unfortunately vulnerable to adversarial examples [8, 24]. These small perturbations are imperceptible to humans but easily mislead DNNs, which creates potential security threats to practical deep learning applications, e.g., auto-driving and face recognition systems [14]. In the past years, different types of techniques have been developed to attack deep learning systems [8, 24, 27]. Though challenging deep learning, adversarial examples are also valuable for understanding the behaviors of DNNs, which could provide insights into their blind-spots and help to build robust models [29, 28].

Besides well-designed perturbations, the adversarial patch serves as an alternative way to generate adversarial examples: noise confined to a small region that can be placed directly on the input instance [1, 9, 14]. In contrast to pixel-wise perturbations, adversarial patches enjoy the advantages of being input-independent and scene-independent. Patches are also commonly observed in real-world scenes and are quasi-imperceptible to humans, e.g., the tags and brand marks on supermarket items shown in Fig.1. Thus, it is convenient for an adversary to attack a real-world deep learning system by simply generating and sticking adversarial patches on the items. However, existing strategies [1, 5] generate adversarial patches with weak generalization abilities and are unable to perform universal attacks [17]. In other words, these adversarial patches are input-specific and fail to attack images from all classes, especially ones unseen during training.

To address this problem, this paper proposes a bias-based framework to generate class-agnostic universal adversarial patches with strong generalization ability, which exploits both the perceptual and semantic biases of models. Regarding the perceptual bias, since DNNs are strongly biased towards textural representations and local patches [29, 7], we exploit hard examples, which convey strong model uncertainty, and extract a textural patch prior from them by adopting style similarities. We believe this textural prior is closer to the decision boundaries, which promotes universal attacks across different classes. To further alleviate the heavy dependency on large amounts of data in training universal attacks [19], we also exploit the semantic bias. As models have preferences and biases towards different features for different classes, prototypes, which contain strong class-wise semantics, are introduced as the class-wise preference and pursued by maximizing the multi-class margin. We generate prototypes to aid universal training and reduce the amount of training data required. Extensive experiments in both white-box and black-box settings are conducted in both the digital world (RPC, the largest ACO-related dataset) and physical-world scenarios (Taobao and JD, among the world's largest online shopping platforms). Experimental results demonstrate that our proposed framework outperforms state-of-the-art adversarial patch attack methods.

To the best of our knowledge, we are the first to generate class-agnostic universal adversarial patches by exploiting the perceptual and semantic biases of models. With strong generalization ability, our adversarial patches can attack images from classes unseen during adversarial patch training or unseen by the target models. To validate the effectiveness, we choose the automatic check-out scenario and successfully attack the Taobao and JD platforms, which are among the world's largest e-commerce platforms and operate ACO-like scenarios.

2 Related work

2.1 Adversarial Attacks

Adversarial examples, which are inputs intentionally designed to mislead deep neural networks, have recently attracted research attention [8, 24, 12]. Szegedy et al. [24] first introduced adversarial examples and used the L-BFGS method to generate them. By leveraging the gradients of the target model, Goodfellow et al. [8] proposed the Fast Gradient Sign Method (FGSM), which can generate adversarial examples quickly.

To improve the generalization ability across different classes, Moosavi-Dezfooli et al. [17] first proposed an algorithm to compute universal adversarial perturbations for DNNs on object recognition tasks. Mopuri et al. [18] proposed a data-free objective to generate universal adversarial perturbations by maximizing neuron activations. Further, Reddy et al. [19] generated data-free universal adversarial perturbations using class impressions.

Besides, the adversarial patch [1], with noise confined to a small and localized patch, emerged for its easy accessibility in real-world scenarios. Karmon et al. [9] created adversarial patches using an optimization-based approach with a modified loss function. In contrast to prior research, they concentrated on investigating the blind-spots of state-of-the-art image classifiers. Eykholt et al. [5] adopted traditional perturbation techniques to generate attacking noise that can be mixed into black and white stickers to attack the recognition of stop signs. To improve visual fidelity, Liu et al. [14] proposed the PS-GAN framework to generate scrawl-like adversarial patches to fool autonomous-driving systems. Recently, adversarial patches have also been used to attack person detection systems and fool automated surveillance cameras [25].

2.2 Automatic Check-out

The bedrock of an Automatic Check-Out system is visual item counting, which takes images of shopping items as input and outputs a tally of different categories [26]. However, different from other computer vision tasks such as object detection and recognition, training deep neural networks for visual item counting faces the special challenge of domain shift. Wei et al. [26] first tried to solve the problem using a data augmentation strategy: to improve the realism of the target images, images of collections of objects are generated by randomly overlaying images of individual objects and refining them with a CycleGAN framework [30]. Recently, Li et al. [13] developed a data priming network via collaborative learning to determine the reliability of testing data, which is then used to guide the training of the visual item tallying network.

3 Proposed Framework

In this section, we will first give the definition of the problem and then elaborate on our proposed framework.

3.1 Problem Definition

Assume $\mathcal{X} \subseteq \mathbb{R}^{n}$ is the feature space, with $n$ the number of features. Suppose $(\mathbf{x}_i, y_i)$ is the $i$-th instance in the data, with feature vector $\mathbf{x}_i \in \mathcal{X}$ and corresponding class label $y_i$. The deep learning classifier attempts to learn a classification function $F: \mathcal{X} \rightarrow \mathcal{Y}$. Specifically, in this paper we consider the visual recognition problem.

An adversarial patch $\boldsymbol{\delta}$ is a localized patch that is trained to fool the target model into wrong predictions. Given a benign image $\mathbf{x}$ with ground truth label $y$, we form an adversarial example $\hat{\mathbf{x}}$, which is composed of the original image $\mathbf{x}$, an additive adversarial patch $\boldsymbol{\delta}$, and a location mask $\mathbf{m} \in \{0, 1\}^{n}$:

$$\hat{\mathbf{x}} = (1 - \mathbf{m}) \odot \mathbf{x} + \mathbf{m} \odot \boldsymbol{\delta}, \qquad (1)$$

where $\odot$ is the element-wise multiplication. For simplicity, we will use the shorthand:

$$\hat{\mathbf{x}} = \mathbf{x} + \boldsymbol{\delta}. \qquad (2)$$

The prediction result of $\hat{\mathbf{x}}$ by model $F$ is $\hat{y} = F(\hat{\mathbf{x}})$. The adversarial patch makes the model predict an incorrect label, namely $F(\hat{\mathbf{x}}) \neq y$.

To perform universal attacks, we generate a universal adversarial patch $\boldsymbol{\delta}$ that fools the classifier on items sampled from the distribution $\mathcal{X}$ for almost all classes:

$$F(\mathbf{x} + \boldsymbol{\delta}) \neq y \quad \text{for almost all } \mathbf{x} \sim \mathcal{X}. \qquad (3)$$
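To make Eqn (1) concrete, here is a minimal PyTorch-style sketch of how a patch and its location mask could be applied to a batch of images; the tensor shapes and the fixed patch location are illustrative assumptions rather than the paper's actual implementation.

```python
import torch

def apply_patch(x, patch, top, left):
    """Form x_hat = (1 - m) * x + m * delta as in Eqn (1).

    x:         benign images, shape (B, C, H, W)
    patch:     adversarial patch, shape (C, P, P)
    top, left: upper-left corner of the patch location (illustrative choice)
    """
    _, c, h, w = x.shape
    p = patch.shape[-1]
    # Location mask m: 1 inside the patch region, 0 elsewhere.
    m = torch.zeros(1, 1, h, w, device=x.device)
    m[..., top:top + p, left:left + p] = 1.0
    # Put the patch onto a full-size canvas delta.
    delta = torch.zeros(1, c, h, w, device=x.device)
    delta[..., top:top + p, left:left + p] = patch
    return (1.0 - m) * x + m * delta
```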

3.2 The Framework

We propose a bias-based attack framework to generate universal adversarial patches with strong generalization ability. The overall framework can be found in Fig.2.

Figure 2: Our bias-based framework to generate universal adversarial patches. We first generate a perceptually biased prior by fusing textural features from multiple hard examples. Then, we generate semantically biased prototypes to help training universal adversarial patches with a target model

Recent studies have revealed that DNNs are strongly biased towards texture features when making predictions [7]; deep models still perform well on patch-shuffled images, where local object textures are not drastically destroyed [29]. Thus, we first exploit the perceptual bias of deep models by generating a perceptually biased prior from a hard example set $\mathcal{H} = \{h_1, h_2, \dots, h_N\}$. Textural features are extracted by an attention module to fuse a more powerful prior $\boldsymbol{\delta}_0$. We believe the fused prior is closer to the decision boundaries of different classes and would boost universal attacks.

Meanwhile, as models have preferences and impressions for different classes, we further exploit the semantic bias of models for each class. To alleviate the heavy dependency on large amounts of data when training universal attacks, we generate semantically biased prototypes to help training. As the class-wise preference, prototypes contain rich semantics for each class; they are generated by maximizing the multi-class margin and used to represent instances from each class. Training with prototypes reduces the amount of training data required. Thus, we generate prototypes $\{P_1, P_2, \dots, P_C\}$ and use them as training data to learn our final adversarial patch $\boldsymbol{\delta}$ from the prior $\boldsymbol{\delta}_0$.

3.3 Perceptually Biased Prior Generation

Motivated by the fact that deep learning models are strongly biased towards textural features, we first propose to extract textural features as priors. To fully exploit the prediction uncertainty of models, we borrow textural features from hard examples.

Hard examples are instances that are difficult for models to classify correctly. Techniques like hard example mining are used to improve training [6], in which “hard”, hence informative, examples are spotted and mined. Given a hard example $h$ with ground truth label $y$, assume that $\hat{y}$ is the prediction of the model $F$. The hard example satisfies the constraint that $\hat{y} \neq y$, or $\hat{y} = y$ with relatively low classification confidence. Obviously, a hard example is an instance lying close to the model's decision boundaries and is more likely to cross the prediction surfaces. Thus, using the features from hard examples to train adversarial patches is like “standing on the shoulders of giants”, which helps to overcome local optima and gain strong attacking ability.

To further strengthen universal attacks, we extract textural features from multiple hard examples with different labels and fuse them into a stronger prior. Intuitively, by studying features from multiple hard examples with different labels, our prior contains more uncertainty for different classes. However, simply learning at the pixel level makes it difficult to extract and fuse textural features. Thus, we introduce the style loss, which specifically measures style differences and encourages the reproduction of texture details:

$$\mathcal{L}_{style} = \sum_{k=1}^{N} \sum_{l} \big\| G^{l}(\tilde{x}) - G^{l}(h_k) \big\|_{2}^{2}, \qquad (4)$$

where $G^{l}(\cdot)$ is the Gram matrix of the features extracted from layer $l$ of the network, i.e., $G^{l}_{ij} = \sum_{p} F^{l}_{ip} F^{l}_{jp}$, with $F^{l}_{ip}$ the activation of the $i$-th filter at position $p$ in layer $l$; $\tilde{x}$ is the fused example, and $h_k$ is the $k$-th hard example, $k = 1, 2, \dots, N$.

Besides, entropy has been widely used to depict the uncertainty of a system or distribution. To further strengthen universal attacks across different classes, we introduce a class-wise uncertainty loss: we increase the model's prediction uncertainty by minimizing the negative entropy, so that the fused example moves closer to the decision boundaries and obtains low confidence for every class. It can be written as:

$$\mathcal{L}_{uncertainty} = \sum_{c=1}^{C} p_{c}(\tilde{x}) \log p_{c}(\tilde{x}), \qquad (5)$$

where $p_{c}(\tilde{x})$ denotes the model confidence for the $c$-th class given the fused input $\tilde{x}$.

Thus, to fully exploit the perceptual bias, we optimize the fusion loss:

$$\mathcal{L}_{fusion} = \mathcal{L}_{style} + \lambda\, \mathcal{L}_{uncertainty}, \qquad (6)$$

where $\lambda$ controls the balance between the two terms.
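As an illustration, a minimal PyTorch sketch of the fusion objective in Eqns (4)-(6) is given below. The feature extractor, the single-layer Gram matrix, and the weight `lam` are assumptions made for the sketch rather than the authors' exact configuration.

```python
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    """Gram matrix of a feature map, (B, C, H, W) -> (B, C, C), normalized by size."""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def fusion_loss(fused, hard_examples, feature_extractor, classifier, lam=1.0):
    """Style loss against several hard examples plus a negative-entropy term.

    fused:             the fused example being optimized, shape (1, 3, H, W)
    hard_examples:     tensor of hard examples, shape (N, 3, H, W)
    feature_extractor: maps images to an intermediate feature map (assumed layer)
    classifier:        full target model returning logits
    """
    g_fused = gram_matrix(feature_extractor(fused))
    style = torch.stack([
        F.mse_loss(g_fused, gram_matrix(feature_extractor(h.unsqueeze(0))))
        for h in hard_examples
    ]).sum()                                                  # Eqn (4)

    probs = torch.softmax(classifier(fused), dim=1)
    neg_entropy = (probs * torch.log(probs + 1e-12)).sum()    # Eqn (5)

    return style + lam * neg_entropy                          # Eqn (6)
```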

However, the fused example has a different size from our patches. Thus, an attention module is introduced to eliminate redundant pixels and generate a textural prior $\boldsymbol{\delta}_0$ from the fused example $\tilde{x}$:

$$\boldsymbol{\delta}_0 = A(\tilde{x}), \qquad (7)$$

where $A(\cdot)$ is a visual attention module that selects a set of suitable pixels from the fused example. These pixels provide the highest stimuli towards the model predictions and are used as the textural prior.

Inspired by [20], given a hard example $h$, we compute the gradients of the normalized feature maps of a specific hidden layer of the model with respect to $h$. These gradients are global-average-pooled to obtain per-channel weights $\alpha_k$, which yield the weight matrix $W$ as a weighted combination of the feature maps of $h$:

$$W_{ij} = \sum_{k=1}^{K} \alpha_k\, A^{k}_{ij}, \qquad (8)$$

where $W_{ij}$ represents the weight at position $(i, j)$, $A^{k}_{ij}$ is the value at position $(i, j)$ of the $k$-th feature map, and $K$ is the total number of feature maps. Note that $i \in \{1, \dots, w\}$ and $j \in \{1, \dots, h\}$, where $w$ and $h$ are the width and height of the feature maps, respectively. Then, we combine the pixels with the highest weights to obtain our textural prior $\boldsymbol{\delta}_0$.
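One plausible realization of this attention-based prior extraction, written against the descriptions around Eqns (7)-(8), is sketched below; the use of the feature-map norm as the backpropagated quantity and the sliding-window selection of a patch-sized region are our own assumptions.

```python
import torch
import torch.nn.functional as F

def textural_prior(fused, feature_extractor, patch_size):
    """Crop the patch-sized window of the fused example with the highest attention weight.

    fused:             fused example, shape (1, 3, H, W)
    feature_extractor: maps the image to feature maps of shape (1, K, h, w)
    patch_size:        side length P of the square prior
    """
    x = fused.detach().clone().requires_grad_(True)
    feats = feature_extractor(x)                                   # (1, K, h, w)
    # Gradients of the feature-map energy w.r.t. the feature maps, pooled per channel.
    grads = torch.autograd.grad(feats.norm(), feats)[0]
    alpha = grads.mean(dim=(2, 3), keepdim=True)                   # per-channel weights
    weight_map = (alpha * feats).sum(dim=1, keepdim=True).relu()   # W_ij as in Eqn (8)
    # Upsample to image resolution and score every P x P window.
    weight_map = F.interpolate(weight_map, size=x.shape[-2:],
                               mode="bilinear", align_corners=False)
    scores = F.avg_pool2d(weight_map, patch_size, stride=1)
    idx = int(scores.flatten().argmax())
    top, left = idx // scores.shape[-1], idx % scores.shape[-1]
    return fused[..., top:top + patch_size, left:left + patch_size].detach()
```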

3.4 Training with Semantically Biased Prototypes

With the textural prior generated in the previous stage, we continue to optimize and generate our adversarial patch. Most strategies for generating universal adversarial perturbations require a lot of training data [19]. To alleviate this heavy dependency on large amounts of training data, we further exploit the semantic bias.

A prototype is a kind of “quintessential” observation that best represents and contains the strongest semantics for a cluster or class in a dataset [10]. Prototypes have provided quantitative benefits in interpreting and improving deep learning models. For example, “black people” is a representative feature for the class basketball, and most prototypes (images) in this class contain at least one black person [22]. Thus, we further exploit the semantic bias of models (i.e., prototypes) for each class. In this stage, we generate class prototypes and use them during training to effectively reduce the amount of training data required.

Thus, inspired by [21], to generate prototypes representing the semantic preference of a model for each class, we maximize the logits of one specific class. Formally, let $l_c(x)$ denote the logit of class $c$, computed by the classification layer of the target model. By optimizing the multi-class margin loss (MultiMarginLoss), the prototype $P_c$ of class $c$ is obtained:

$$P_c = \arg\min_{x} \frac{1}{C} \sum_{j \neq c} \max\big(0,\, \Delta - l_c(x) + l_j(x)\big), \qquad (9)$$

where $x$ is the input and satisfies the constraints of an RGB image, $C$ denotes the total number of classes, and $\Delta$ is a threshold that controls the multi-class margin. In practice, the Adam optimizer is applied to find the optimal prototype of class $c$, with the two hyper-parameters of the loss set to 1 and 10, respectively.
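A possible PyTorch sketch of this prototype search, using the built-in multi-class margin loss, is given below; the input resolution, step count, learning rate, and margin value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def generate_prototype(model, target_class, steps=200, lr=0.01, margin=1.0):
    """Optimize an RGB image whose logits maximize the multi-class margin for one class (Eqn 9)."""
    proto = torch.rand(1, 3, 224, 224, requires_grad=True)  # assumed input resolution
    target = torch.tensor([target_class])
    optimizer = torch.optim.Adam([proto], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(proto)
        # multi_margin_loss penalizes classes whose logit comes within `margin` of the target's.
        loss = F.multi_margin_loss(logits, target, margin=margin)
        loss.backward()
        optimizer.step()
        proto.data.clamp_(0.0, 1.0)  # keep it a valid RGB image
    return proto.detach()
```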

To generate adversarial patches that mislead deep models, we introduce the adversarial attack loss. Specifically, we push the prediction of the adversarial example (i.e., a clean prototype appended with the adversarial patch $\boldsymbol{\delta}$) away from its original prediction label $y_c$. The adversarial attack loss can therefore be defined as:

$$\mathcal{L}_{adv} = \mathbb{E}_{t \sim T}\Big[\log p_{y_c}\big(t(\hat{P}_c)\big)\Big], \qquad \hat{P}_c = P_c + \boldsymbol{\delta}, \qquad (10)$$

where $\boldsymbol{\delta}$ is the adversarial patch, initialized as the textural prior $\boldsymbol{\delta}_0$; $p(\cdot)$ denotes the prediction confidence of the target model for its input; $\hat{P}_c$ is the adversarial example composed of the prototype $P_c$ and the adversarial patch $\boldsymbol{\delta}$; $c$ indexes the class; and $y_c$ denotes the class label of $P_c$.

Moreover, recent studies have shown that adversarial examples can be rendered ineffective by environmental conditions, e.g., different views, illuminations, etc. In the ACO scenario, items are often scanned from different views under different lighting conditions, which would impair the attack ability of our patches. Thus, we further introduce the idea of expectation over transformations to enhance the attack success rate of our adversarial patches under different conditions, as reflected by the expectation over conditions $t \sim T$ in Eqn (10).
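Putting Eqn (10) together with the expectation over transformations, one training step of the patch might look like the sketch below; `apply_patch` and `sample_transform` refer to the earlier illustrative helpers, and the fixed patch location and optimizer setup are assumptions.

```python
import torch

def patch_training_step(model, prototypes, labels, patch,
                        apply_patch, sample_transform, optimizer):
    """One optimization step of the adversarial patch over a minibatch of prototypes (Eqn 10).

    prototypes:       tensor (B, 3, H, W) of class prototypes
    labels:           tensor (B,) with the model's original labels for the prototypes
    patch:            the adversarial patch being optimized (requires_grad=True)
    apply_patch:      callable placing the patch onto images (see earlier sketch)
    sample_transform: callable returning a random transform t ~ T
    """
    optimizer.zero_grad()
    adv = apply_patch(prototypes, patch, top=0, left=0)   # illustrative fixed location
    t = sample_transform()
    log_probs = torch.log_softmax(model(t(adv)), dim=1)
    # Minimize the log-confidence of the original labels, pushing predictions away from them.
    loss = log_probs.gather(1, labels.unsqueeze(1)).mean()
    loss.backward()
    optimizer.step()
    patch.data.clamp_(0.0, 1.0)
    return loss.item()
```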

In conclusion, we first exploit the perceptual bias of models and extract a textural prior from hard examples by adopting style similarities. To further alleviate the heavy dependency on large amounts of data in training universal attacks, we then exploit the semantic bias: as the class-wise preference, prototypes are introduced and pursued by maximizing the multi-class margin. Using the textural prior as initialization, we train our adversarial patch with the prototypes as training data. An illustration of our two-stage adversarial patch attack algorithm can be found in Algorithm 1.

Input: hard example set $\mathcal{H}$, target model $F$
Output: bias-based patch $\boldsymbol{\delta}$
Stage 1: Perceptually Biased Prior Generation
  initialize $\tilde{x}$ by randomly selecting a hard example from $\mathcal{H}$;
  for the number of fusion epochs do
    for a number of steps do
      sample a minibatch of hard examples from $\mathcal{H}$;
      optimize $\tilde{x}$ to minimize $\mathcal{L}_{fusion}$;
  obtain the prior patch $\boldsymbol{\delta}_0$ through attention by Eqn (8);
Stage 2: Training with Semantically Biased Prototypes
  get the class prototype set $\mathcal{P}$ by Eqn (9);
  for the number of training epochs do
    for a number of steps do
      sample a minibatch of prototypes from $\mathcal{P}$;
      optimize the adversarial patch $\boldsymbol{\delta}$ to minimize $\mathcal{L}_{adv}$ with the prototypes;
Algorithm 1: Bias-based Universal Adversarial Patch Attack

4 Experiments

In this section, we will illustrate the attack effectiveness of our proposed method in different settings in the ACO scenario.

4.1 Dataset and Evaluation Metrics

As for the dataset, we use RPC [26], which is the largest grocery product dataset to date for the retail ACO task. It contains 200 product categories and 83,739 images, including 53,739 single-product exemplar images. Each single-product image shows a particular instance of a product, and the categories are grouped into 17 sub-categories (e.g., puffed food, instant drink, dessert, gum, milk, personal hygiene, etc.). Note that the single-product images are taken in an isolated environment, and each item is photographed from four directions to capture multiple views. Fig.3 shows some images from the RPC dataset.

To evaluate our proposed method, we use classification accuracy as the metric; specifically, we report top-1, top-3, and top-5 accuracy in our experiments. Note that the goal of adversarial attacks is to compromise the performance of the model, i.e., to drive these evaluation metrics lower.

Figure 3: We show 4 images from the category instant noodles in the RPC dataset.

4.2 Experimental Settings

The input image is resized and the patch size is fixed; the patch accounts for only 0.38% of the image area. To optimize the loss, we use the Adam optimizer with a learning rate of 0.01, a weight decay of 10, and a maximum of 50 epochs. We use 200 hard examples to optimize our fused prior. All of our code is implemented in PyTorch. Training and inference are performed on an NVIDIA Tesla K80 GPU cluster.

As for the compared methods, we choose state-of-the-art adversarial patch attack methods including AdvPatch [1], RP [4], and PSGAN [14], and follow their implementations and parameter settings. Similar to [17], we use 50 item samples per class (10,000 in total) as the training data for the compared methods. We extract 15 prototypes per class (3,000 in total) as the training data for our method. With respect to the models, we follow [13] for the ACO task and use ResNet-152 as the backbone. To further improve the attack success rate of adversarial patches under different environments, we introduce the following transformations (a minimal sketch of such a transformation module is given after the list):

- Rotation. The rotation angle is limited within a fixed range.

- Distortion. The distortion rate, i.e., the control argument, varies within a fixed range.

- Affine Transformation. The affine rate varies within a fixed range.
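For reference, the sketch below shows one way such random transformations could be composed with torchvision tensor-image transforms; the specific parameter ranges are placeholders, not the paper's values.

```python
import torch
from torchvision import transforms

# Placeholder ranges; the paper's exact limits for rotation, distortion, and
# affine transformations are not reproduced here.
random_conditions = transforms.Compose([
    transforms.RandomRotation(degrees=30),                      # rotation
    transforms.RandomPerspective(distortion_scale=0.2, p=1.0),  # distortion
    transforms.RandomAffine(degrees=0, scale=(0.9, 1.1)),       # affine
])

def sample_transform():
    """Return a callable t ~ T applied to a batch of (already patched) images."""
    return lambda imgs: torch.stack([random_conditions(img) for img in imgs])
```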

4.3 Digital-world Attack

In this section, we evaluate the performance of our generated adversarial patches on the ACO task in the digital world, in both white-box and black-box settings. We also use a plain white patch as a baseline to contextualize the effectiveness of the adversarial attacks (denoted as “White”).

(a) White-box Attack
(b) Training Process
Figure 4: (a) shows the White-box attack experiment in the digital-world with ResNet-152. Our method generates the strongest adversarial patches with the lowest classification accuracy. (b) denotes the training process of different methods

As for the white-box attack, we generate adversarial patches based on a ResNet-152 model and then attack it. As shown in Fig.4(a), our method outperforms the compared strategies by large margins. In other words, our adversarial patches obtain stronger attacking ability, with lower classification accuracy than the others, i.e., 5.42% vs. 21.10%, 19.78%, and 38.89% top-1 accuracy.

As for the black-box attack, we generate adversarial patches based on ResNet-152, then use them to attack other models with different architectures and unknown parameters (i.e., VGG-16, AlexNet, and ResNet-101). As illustrated in Table 1, our generated adversarial patches enjoy stronger attacking abilities in the black-box setting with lower classification accuracy for different models.

Besides the classification accuracy, Fig.4(b) shows the training process of adversarial patches using different methods. After several training epochs, the attacking performance of our generated patches becomes stable and remains the best among all methods. In contrast, the performance of the other methods still fluctuates sharply, mainly due to their weak generalization ability: they achieve different accuracy when attacking different classes, which shows up as sharp fluctuations.

Model Method Top-1 Top-3 Top-5
VGG-16 AdvPatch 73.82 90.73 94.99
VGG-16 RP 81.25 94.65 97.10
VGG-16 PSGAN 74.69 91.25 96.12
VGG-16 Ours 73.72 91.53 95.43
AlexNet AdvPatch 51.11 72.37 79.79
AlexNet RP 68.27 86.49 91.08
AlexNet PSGAN 49.39 72.85 82.94
AlexNet Ours 31.68 50.92 60.19
ResNet-101 AdvPatch 56.19 80.99 91.52
ResNet-101 RP 73.52 93.75 98.13
ResNet-101 PSGAN 51.26 79.22 90.47
ResNet-101 Ours 22.24 51.32 60.28
Table 1: Black-box attack experiment in the digital-world with VGG-16, AlexNet, and ResNet-101. Our method generates adversarial patches with strong transferability among different models

4.4 Real-world Attack

To further validate the practical effectiveness of our generated adversarial patches, a real-world attack experiment is conducted on several online shopping platforms to simulate the ACO scenario. We use Taobao and JD, which are among the biggest e-commerce platforms in the world. We take 80 pictures of 4 different real-world products under different environmental conditions (i.e., angles {-30°, -15°, 0°, 15°, 30°} and distances {0.3m, 0.5m, 0.7m, 1m}). The top-1 classification accuracy on these images is 100% on Taobao and 95.00% on JD, respectively. Then, we print our adversarial patches with an HP Color LaserJet Pro MFP M281fdw printer, stick them on the products, and take photos with combinations of the different distances and angles using a Canon D60 camera. A significant drop in accuracy is observed on both platforms (i.e., to 56.25% on Taobao and 54.68% on JD). The results demonstrate the strong attacking ability of our adversarial patches against practical applications in real-world scenarios. Visualization results can be found in Fig.5.

(a) Taobao
(b) JD
Figure 5: Attacking the Taobao and JD platforms with our adversarial patches. The milk in (a) and the plastic cup in (b) are recognized as decorations and aluminum foil, respectively, once we stick our adversarial patches on them

4.5 Generalization Ability

In this section, we further evaluate the generalization ability of adversarial patches on unseen classes. We perform two experiments using the backbone model (ResNet-152): attacking item classes unseen during adversarial patch training, and attacking classes unseen by the target model. For attacking classes unseen during patch training, we first train patches on a subset of the dataset, i.e., only images from 100 of the 200 classes are used (prototypes for our method and item images for the compared methods). According to the results in Table 2, our framework generates adversarial patches with strong generalization ability and outperforms the compared methods by huge margins (i.e., 7.23% vs. 40.28%, 31.62%, and 60.87% top-1 accuracy). Meanwhile, we also test the generalization ability on classes that have never been “seen” by the target model. Specifically, we train our patches on the RPC dataset and test them on the Taobao platform. We select 4 items whose categories are unseen by the target model (not among the 200 classes for ResNet-152) but known to the Taobao platform, stick adversarial patches on them, and take 64 pictures. Interestingly, after the attack, the top-1 accuracy on Taobao is 64.52%. Though our patches are not trained to attack these specific classes, they still generalize well to them. Thus, we conclude that our framework generates universal adversarial patches with strong generalization ability, even to unseen classes.

Method AdvPatch RP PSGAN Ours
Top-1 40.28 60.87 31.62 7.23
Table 2: Attack on unseen classes. Our method generates adversarial patches with the strongest generalization ability, showing the lowest accuracy compared with other methods

4.6 Analysis of Textural Priors

Since textural priors improve universal attacks, a question emerges: “Why and how are textural priors beneficial to universal adversarial attacks?” In this section, we therefore study the effectiveness of our textural priors.

4.6.1 Training from Different Priors

To demonstrate the effectiveness of our textural priors, we start by initializing patches with different priors, e.g., a white patch, Gaussian noise, a hard example, PD-UA [15], simple fusion, and our textural prior (denoted as “Ours”). In contrast to our textural prior, we use the same number of simple examples to generate a simple version of the fused prior (denoted as “SimpleFusion”). The other experimental settings are the same as in the digital-world attack. Visualizations are shown in Fig.6(a). We train 6 adversarial patches with identical experimental settings except for the initialization. The corresponding accuracies after attacking are 17.67%, 18.96%, 16.11%, 21.10%, 24.09%, and 5.42%, respectively. This shows that our fused prior gives adversarial patches the strongest attacking ability.

(a) Different Priors
(b) Boundary Distance
Figure 6: (a) shows the different priors we used to generate adversarial patches: a white patch, Gaussian noise, a hard example, PD-UA, simple fusion, and our textural prior, respectively, from top to bottom and left to right. (b) is the decision boundary distance analysis, where our fused prior achieves the smallest decision boundary distance for each class

4.6.2 Decision Boundary Distance Analysis

The minimum distance to the decision boundary among data points reflects the model's robustness to small noises [2]. Similarly, the distance of an instance to the decision boundaries characterizes how easily an attack can be launched from it. Due to the difficulty of computing decision boundary distances for deep models, we instead calculate the distance of an instance to specified classes w.r.t. the model prediction. Given a learned model $F$ and a point $x$ with class label $y$ ($F(x) = y$), for each direction (i.e., towards each class $c \neq y$) we estimate the smallest number of steps moved as the distance. We use norm-bounded Projected Gradient Descent (PGD) until the model's prediction changes, i.e., $F(x') \neq y$.
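A simple sketch of this step-counting procedure (shown here in an untargeted form) is given below; the step size, norm bound, and iteration cap are illustrative assumptions.

```python
import torch

def boundary_distance(model, x, y, step_size=0.005, eps=0.1, max_steps=500):
    """Count PGD steps until the model's prediction changes (a proxy for boundary distance).

    x: input image, shape (1, 3, H, W); y: its current predicted label (int).
    Returns the number of steps taken, or max_steps if the prediction never flips.
    """
    x_orig = x.detach().clone()
    x_adv = x.detach().clone()
    label = torch.tensor([y])
    for step in range(1, max_steps + 1):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), label)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()             # ascend the loss
            x_adv = x_orig + (x_adv - x_orig).clamp(-eps, eps)  # project into the norm ball
            x_adv = x_adv.clamp(0.0, 1.0)
        if model(x_adv).argmax(dim=1).item() != y:              # prediction flipped
            return step
    return max_steps
```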

As shown in Fig.6(b), our textural prior obtains the minimum distance in each direction compared to the other initialization strategies. This explains why our textural prior enables stronger adversarial attacks: it lies closer to the decision boundaries.

4.7 Ablation Study

In this section, we investigate our method through an ablation study.

4.7.1 The Effectiveness of Class Prototypes

In this section, we evaluate the effectiveness of the prototypes using ResNet-152. We first study this by using different proportions of prototypes. Specifically, we mix class prototypes and item images from the RPC dataset in different ratios, use them to train different adversarial patches, and assess their attacking ability. Note that the total number of training samples is fixed at 1000. As shown in Table 4, as the number of prototypes increases, the adversarial patches become stronger (i.e., accuracy decreases).

Besides, we further investigate the amount of data required by our framework to train adversarial patches using solely prototypes or solely item images. Specifically, we first train adversarial patches with 1000 prototypes (denoted Ours-proto). Then, we randomly select 1000, 2000, 4000, and 10000 item images from the RPC dataset to train adversarial patches, respectively (denoted Ours-1k, Ours-2k, Ours-4k, and Ours-10k). The results in Table 3 show that, to achieve attacking ability comparable to the Ours-proto setting, far more item images are required. This indicates the representative ability of class prototypes for different classes. Thus, we conclude that our class prototypes help improve the attack ability and reduce the amount of training data required.

Settings Ours-proto Ours-1k Ours-2k Ours-4k Ours-10k
Top-1 6.51 12.43 6.57 6.10 5.40
Table 3: The top-1 accuracy of adversarial patches obtained using different amounts of training data (1000 prototypes vs. 1000, 2000, 4000, and 10000 item images). To achieve comparable attacking ability, far more item images are required than prototypes.
Mixture settings (#Prototypes : #Item images) Top-1
1000 : 0 6.51
750 : 250 7.81
500 : 500 10.03
250 : 750 11.55
0 : 1000 12.43
Table 4: Training with different mixtures of class prototypes and original item images using the same amount of training data. Training with class prototypes gives adversarial patches the strongest attack ability.
Transformation Setting Top-1
Rotation w/ 11.98
Rotation w/o 13.01
Distortion w/ 22.26
Distortion w/o 30.70
Affine w/ 16.38
Affine w/o 21.10
Table 5: Ablation study on the transformation module. The setting w/ means the patch is generated with the transformation during training, and w/o is the opposite. All generated patches are tested in the digital world.

4.7.2 Transformation Module

Studies have shown that adversarial examples can be rendered ineffective by environmental conditions, e.g., different rotations, illuminations, etc. In the ACO scenario, items are often scanned from different views under different lighting conditions. Thus, we introduce a transformation module to reduce the impact of environmental conditions on the attack ability. Here, we study the effectiveness of the different transformation types used in the module. Specifically, we employ ResNet-152 as the target model and enable only one kind of transformation in each experiment. As shown in Table 5, enabling transformations increases the attacking ability in the ACO scenario, yielding lower accuracy (i.e., 11.98% vs. 13.01% in the rotation setting, 22.26% vs. 30.70% in the distortion setting, and 16.38% vs. 21.10% in the affine setting).

5 Conclusions

In this paper, we proposed a bias-based attack framework to generate class-agnostic universal adversarial patches, which exploits both the perceptual and semantic biases of models. Regarding the perceptual bias, since DNNs are strongly biased towards textures, we exploit hard examples, which convey strong model uncertainty, and extract a textural patch prior from them by adopting style similarities. The patch prior is closer to the decision boundaries and promotes attacks. To further alleviate the heavy dependency on large amounts of data in training universal attacks, we also exploit the semantic bias: as the class-wise preference, prototypes are introduced and pursued by maximizing the multi-class margin to help universal training. Taking ACO as the typical scenario, extensive experiments are conducted, demonstrating that our proposed framework outperforms state-of-the-art adversarial patch attack methods.

Since our adversarial patches can attack ACO systems, it is meaningful to study how and why DNNs make wrong predictions. Our framework provides an effective path to investigate model blind-spots. Beyond that, it could also help improve the robustness of ACO systems in practice.

Model biases, especially texture-based features, have been used here to perform adversarial attacks. Conversely, we are also interested in improving model robustness from the perspective of model bias. Can we improve model robustness by eliminating textural features from the training data? We leave this as future work.

References

  • [1] T. B. Brown, D. Mané, A. Roy, M. Abadi, and J. Gilmer (2017) Adversarial patch. arXiv preprint arXiv:1712.09665.
  • [2] C. Cortes and V. Vapnik (1995) Support-vector networks. Machine Learning.
  • [3] P. Ekanayake, Z. Deng, C. Yang, X. Hong, and J. Yang (2019) Naïve approach for bounding box annotation and object detection towards smart retail systems. In SpaCCS.
  • [4] K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. Song (2017) Robust physical-world attacks on deep learning models. arXiv preprint arXiv:1707.08945.
  • [5] K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, and D. Song (2018) Robust physical-world attacks on deep learning models. In CVPR.
  • [6] P. Felzenszwalb, D. McAllester, and D. Ramanan (2008) A discriminatively trained, multiscale, deformable part model. In CVPR.
  • [7] R. Geirhos, P. Rubisch, C. Michaelis, M. Bethge, F. A. Wichmann, and W. Brendel (2018) ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. arXiv preprint arXiv:1811.12231.
  • [8] I. J. Goodfellow, J. Shlens, and C. Szegedy (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.
  • [9] D. Karmon, D. Zoran, and Y. Goldberg (2018) LaVAN: localized and visible adversarial noise. arXiv preprint arXiv:1801.02608.
  • [10] B. Kim, C. Rudin, and J. A. Shah (2014) The Bayesian case model: a generative approach for case-based reasoning and prototype classification. In NeurIPS.
  • [11] A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) ImageNet classification with deep convolutional neural networks. In NeurIPS.
  • [12] A. Kurakin, I. Goodfellow, and S. Bengio (2016) Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533.
  • [13] C. Li, D. Du, L. Zhang, T. Luo, Y. Wu, Q. Tian, L. Wen, and S. Lyu (2019) Data priming network for automatic check-out. arXiv preprint arXiv:1904.04978.
  • [14] A. Liu, X. Liu, J. Fan, Y. Ma, A. Zhang, H. Xie, and D. Tao (2019) Perceptual-sensitive GAN for generating adversarial patches. In AAAI.
  • [15] H. Liu, R. Ji, J. Li, B. Zhang, Y. Gao, Y. Wu, and F. Huang (2019) Universal adversarial perturbation via prior driven uncertainty approximation. In ICCV.
  • [16] A. Mohamed, G. E. Dahl, and G. Hinton (2011) Acoustic modeling using deep belief networks. IEEE Transactions on Audio, Speech, and Language Processing.
  • [17] S. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard (2017) Universal adversarial perturbations. In CVPR.
  • [18] K. R. Mopuri, A. Ganeshan, and V. B. Radhakrishnan (2018) Generalizable data-free objective for crafting universal adversarial perturbations. IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • [19] K. Reddy Mopuri, P. Krishna Uppala, and R. Venkatesh Babu (2018) Ask, acquire, and attack: data-free UAP generation using class impressions. In ECCV.
  • [20] R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, and D. Batra (2016) Grad-CAM: why did you say that? arXiv preprint arXiv:1611.07450.
  • [21] K. Simonyan, A. Vedaldi, and A. Zisserman (2013) Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034.
  • [22] P. Stock and M. Cisse (2018) ConvNets and ImageNet beyond accuracy: explanations, bias detection, adversarial examples and model criticism. In ECCV.
  • [23] I. Sutskever, O. Vinyals, and Q. Le (2014) Sequence to sequence learning with neural networks. In NeurIPS.
  • [24] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
  • [25] S. Thys, W. Van Ranst, and T. Goedemé (2019) Fooling automated surveillance cameras: adversarial patches to attack person detection. In CVPRW.
  • [26] X. Wei, Q. Cui, L. Yang, P. Wang, and L. Liu (2019) RPC: a large-scale retail product checkout dataset. arXiv preprint arXiv:1901.07249.
  • [27] C. Xiao, B. Li, J. Zhu, W. He, M. Liu, and D. Song (2018) Generating adversarial examples with adversarial networks. arXiv preprint arXiv:1801.02610.
  • [28] C. Zhang, A. Liu, X. Liu, Y. Xu, H. Yu, Y. Ma, and T. Li (2019) Interpreting and improving adversarial robustness of deep neural networks with neuron sensitivity. arXiv preprint arXiv:1909.06978.
  • [29] T. Zhang and Z. Zhu (2019) Interpreting adversarially trained convolutional neural networks. arXiv preprint arXiv:1905.09797.
  • [30] J. Zhu, T. Park, P. Isola, and A. A. Efros (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks. In ICCV.