1. Introduction
In the last few years, Deep Neural Networks (DNNs) have become tremendously popular in various domains, including image processing and computer vision. Recently, however, the robustness of DNNs when facing adversarial inputs has been questioned: the performance of DNNs can drop significantly even on slightly perturbed instances
(Szegedy et al., 2013). For the task of image classification, attackers constrain perturbations so that they remain unnoticeable to the human eye, yet still greatly deteriorate the performance of the model. Using machine learning methods that are vulnerable to adversarial attacks in systems where safety and security are critical factors may cause serious problems. It is therefore crucial to have models that are robust against adversaries, especially in security-sensitive domains like autonomous driving and medical imaging. To address this concern, recent studies have analyzed the vulnerability of deep learning methods in order to devise defense techniques against adversarial attacks
(Das et al., 2018; Bhagoji et al., 2017; Metzen et al., 2017; Papernot et al., 2016b). To measure the strength of a perturbation, an $\ell_2$ or $\ell_\infty$ norm is usually used. Adversarial perturbations are mostly designed to have a small norm and to be unnoticeable to human inspection. Designing a defense mechanism is a difficult task: typically, the defender only has access to the perturbed instances (and not the original ones, which would offer hope of identifying which parts have been tampered with) and should be able to defend against different types of perturbations. Moreover, a defense mechanism that specializes in a particular kind of attack can easily be defeated by new attacks optimized against its strategy. Therefore, a defense technique that captures a universal pattern across various attacks is highly desirable, since it will be able to defend against most adversarial attacks.
Shield, proposed by Das et al. (Das et al., 2018), is a real-time defense framework that performs JPEG compression with random quality levels over local patches of images to eliminate unnoticeable perturbations, which mostly appear in the high-frequency spectrum of images. However, Shield considers images in isolation and does not exploit the correlation between images when facing adversarial attacks. In this paper, we propose a tensor decomposition approach that computes a low-rank approximation of images, which significantly discards high-rank perturbations.
Our contributions are as follows:

Defense through the lens of factorization: We propose a novel defense against adversarial attacks on images which utilizes tensor decomposition to reconstruct a low-rank approximation of perturbed images before feeding them to the deep network for classification. Without any retraining of the model, our method can significantly mitigate adversarial attacks.

Efficient and effective method: Representing images as tensors allows processing images in batches as a 4-mode tensor, which captures the latent structure of perturbations across multiple images rather than from a single image and leads to further performance improvements.
The rest of this paper is organized as follows. In Section 2 we discuss related work. We introduce our proposed method in Section 3 and provide experimental results in Section 4. Finally, in Section 5 we offer conclusions and discuss future work.
2. Related Work
2.1. Adversarial Attacks
In this paper, we focus on defending against adversarial attacks on deep learning methods for the task of image classification. Here, we briefly outline some of the most popular adversarial attacks on images.
Given a classifier $f$, the goal of an adversarial attack is to modify an instance $x$ into a perturbed instance $x'$ such that $f(x') \neq f(x)$, while keeping the distance $\|x' - x\|$ between the perturbed and clean instance small. By $\|\cdot\|$ we denote some norm, which is also used to express the strength of the perturbations. Popular choices are the Euclidean distance ($\ell_2$ norm) and the Chebyshev distance ($\ell_\infty$ norm). Here, we discuss some of the popular attacks against which we evaluate our proposed method.

Fast Gradient Sign Method (FGSM) (Goodfellow et al., 2014): FGSM is a fast method to compute perturbations which is based on first-order gradients. FGSM generates adversarial images by introducing a perturbation as follows:
$x' = x + \epsilon \cdot \mathrm{sign}\left(\nabla_x J(\theta, x, y)\right)$   (1)
where $\epsilon$ is a user-defined threshold that determines the strength of the perturbation and controls its magnitude per pixel, $\theta$ denotes the parameters of the model, $y$ is the true label of the instance $x$, and $J(\theta, x, y)$ is the cost function used to train the neural network.
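As a concrete illustration, the FGSM update in Equation (1) can be sketched in a few lines of NumPy; the gradient below is a random stand-in for $\nabla_x J(\theta, x, y)$, which in practice comes from backpropagation through the model:

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """Sketch of the FGSM update: move every pixel by eps in the
    direction of the sign of the loss gradient, then clip to [0, 1]."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)

# Toy example: a random "image" and a random stand-in gradient.
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=(4, 4, 3))
grad = rng.normal(size=x.shape)
x_adv = fgsm_perturb(x, grad, eps=0.05)
```

Because the update depends only on the sign of the gradient, every pixel moves by exactly $\epsilon$, which is what bounds the $\ell_\infty$ norm of the perturbation.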
Iterative Fast Gradient Sign Method (I-FGSM) (Kurakin et al., 2016): I-FGSM is the iterative version of FGSM. In each iteration $t$, I-FGSM clips the pixel values to remain within the $\epsilon$-neighborhood of the corresponding values of the “clean” instance $x$:
$x'_{t+1} = \mathrm{clip}_{x,\epsilon}\left( x'_t + \alpha \cdot \mathrm{sign}\left(\nabla_x J(\theta, x'_t, y)\right) \right), \quad x'_0 = x$   (2)
Projected Gradient Descent (PGD) (Madry et al., 2017): PGD is one of the strongest gradient-based attacks (Madry et al., 2017). Given a clean image $x$, PGD aims to find a small perturbation $\delta$ to generate the perturbed instance $x' = x + \delta$. PGD starts from a random perturbation and iteratively updates it:
$\delta_{t+1} = \Pi_{\mathcal{S}}\left( \delta_t + \alpha \cdot \mathrm{sign}\left(\nabla_x J(\theta, x + \delta_t, y)\right) \right)$   (3)
where $\alpha$ is a fixed step size and $\Pi_{\mathcal{S}}$ projects the perturbation onto the set $\mathcal{S}$ of allowed perturbations in the $\epsilon$-neighborhood of the “clean” instance $x$.
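The PGD loop can be sketched as follows; `grad_fn` is a hypothetical stand-in for the model's loss gradient, and for an $\ell_\infty$ ball the projection $\Pi_{\mathcal{S}}$ reduces to an elementwise clip:

```python
import numpy as np

def pgd_attack(x, grad_fn, eps, alpha, steps, seed=0):
    """Sketch of PGD: random start inside the L-infinity eps-ball, then
    iterated signed-gradient steps, each followed by projection back
    onto the ball."""
    rng = np.random.default_rng(seed)
    delta = rng.uniform(-eps, eps, size=x.shape)  # random initial perturbation
    for _ in range(steps):
        g = grad_fn(x + delta)                    # gradient at current iterate
        delta = np.clip(delta + alpha * np.sign(g), -eps, eps)  # project
    return np.clip(x + delta, 0.0, 1.0)

# Toy loss gradient: push every pixel upward (a real attack would
# backpropagate through the classifier instead).
x = np.full((8, 8, 3), 0.5)
x_adv = pgd_attack(x, grad_fn=lambda z: np.ones_like(z),
                   eps=0.03, alpha=0.01, steps=10)
```

The random start is what distinguishes PGD from I-FGSM in this sketch; everything else is the same signed-gradient iteration.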
2.2. Defense Against Adversarial Attacks
Shield, proposed by Das et al. (Das et al., 2018), uses image preprocessing as a defense mechanism to reduce the effect of perturbations. Shield is based on the observation that the attacks described above are high-frequency; thus, eliminating those high frequencies (which are generally not visible to the human eye) sanitizes the image. As a preprocessing step, Shield performs Stochastic Local Quantization (SLQ): it applies JPEG compression with qualities 20, 40, 60, and 80 to the image and, for each block of the image, randomly selects the block from one of the compressed copies. Shield also retrains the model on images compressed with different JPEG qualities and uses an ensemble of these models to defend against adversarial attacks.
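For intuition, the block-wise randomization in SLQ can be sketched as follows. This is not Shield's implementation: coarse value quantization stands in here for the actual JPEG codec, and the block size is illustrative.

```python
import numpy as np

def slq_sketch(img, block=8, qualities=(20, 40, 60, 80), seed=0):
    """Illustrative sketch of Stochastic Local Quantization: every block
    is 'compressed' at a randomly chosen quality level. Coarse value
    quantization stands in for JPEG compression."""
    rng = np.random.default_rng(seed)
    out = img.astype(float).copy()
    h, w = img.shape[:2]
    for i in range(0, h, block):
        for j in range(0, w, block):
            q = int(rng.choice(qualities))
            levels = max(2, q // 10)  # lower quality -> fewer value levels
            patch = out[i:i + block, j:j + block]
            out[i:i + block, j:j + block] = np.round(patch * (levels - 1)) / (levels - 1)
    return out

img = np.random.default_rng(1).uniform(size=(32, 32, 3))
cleaned = slq_sketch(img)
```

The key property this preserves from Shield is that the attacker cannot predict which quality level will process any given block.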
In this paper, we preprocess images using tensor decomposition techniques to achieve a low-rank approximation of the image. We can significantly alleviate the effect of perturbations without performing any retraining. In a parallel approach, (Entezari et al., 2020)
employs singular value decomposition to compute a low-rank approximation of the graph to defend against adversarial attacks on graphs. However, this paper is the first to identify and leverage the observation that gradient-based attacks on deep learning image classifiers manifest themselves in the high-rank components of a decomposition of the image.
3. Proposed Method
In this section, we first investigate the characteristics of adversarial attacks on networks designed for the task of image classification. Then we propose a tensor-based defense mechanism against these attacks which improves the performance of the network.
3.1. Characteristics of Image Perturbations
Assume a trained model with high accuracy on clean images is given. Adversarial attacks perturb the clean images in a way that is imperceptible to humans, yet successfully deceives the model into misclassifying the perturbed instances. In other words, for a clean image $x$ and its corresponding perturbed image $x'$, the goal is to have $f(x') \neq f(x)$. Adversarial attacks do not preserve the spectral characteristics of images: to remain unnoticeable to the human eye, they add high-frequency components to the image (Das et al., 2018). Since perturbations in the image domain are crafted so that they mostly affect the high-frequency spectrum, discarding the high-frequency factors of the image, using approaches such as compression or low-rank approximation, can be a successful defense mechanism against this type of perturbation. In other words, a mechanism that keeps only the low-rank components of the image and discards the high-rank ones can succeed in discarding the perturbations. In (Das et al., 2018), the authors leverage JPEG compression to remove high-frequency components of the image and alleviate the effect of perturbations. In this paper, we study the problem from a “matrix spectrum” point of view (i.e., the singular value profile and the intrinsic low-rank dimensionality of the data) and use tensor decomposition techniques to achieve a low-rank approximation of perturbed images.
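The matrix analogue of this idea is a truncated SVD: keeping only the top singular components removes most of a small full-rank perturbation. A minimal sketch on synthetic data (a hypothetical rank-2 "clean" signal plus full-rank noise):

```python
import numpy as np

def lowrank_approx(M, r):
    """Keep only the top-r singular components of a matrix; everything in
    the higher-rank tail, where small perturbations tend to live, is
    discarded."""
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * S[:r]) @ Vt[:r]

# A rank-2 "clean" matrix plus small full-rank noise.
rng = np.random.default_rng(0)
clean = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 40))
noisy = clean + 0.01 * rng.normal(size=clean.shape)
denoised = lowrank_approx(noisy, r=2)
```

The truncated approximation is closer to the clean signal than the noisy input, because the noise energy outside the top-2 subspace is removed entirely.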
3.2. Tensorbased Defense Mechanism
In this section, we briefly describe concepts and notations used in the paper.
A tensor is a multi-way generalization of a matrix. The order of a tensor is its number of modes/ways, i.e., the number of indices required to index the tensor (Papalexakis et al., 2017). An RGB image is a three-mode tensor in which the first and second modes correspond to the pixels and the third mode corresponds to the red, green, and blue channels, i.e., the frontal slices are the red, green, and blue channels of the image. An RGB image of size $W \times H$ is thus a 3-mode tensor of size $W \times H \times 3$, where $W$ and $H$ are the width and height of the image, respectively.
To achieve a low-rank approximation of the perturbed images, we perform a tensor decomposition on the image and, by choosing small values for the rank, we reconstruct a low-rank approximation of the image, which is fed to the deep network. The low-rank approximation of the image discards high-frequency perturbations, which can improve the performance of the network on the perturbed images. However, traditional tensor decomposition techniques like CP/Parafac (Harshman, 1970) and Tucker (Tucker, 1966) are time-consuming and may slow down inference, which would make our proposed method impractical for real-time defense. To overcome this issue, we leverage the Tensor-Train decomposition (Oseledets, 2011)
which scales linearly with respect to the number of modes of the tensor and was specifically introduced to address the curse of dimensionality
(Oseledets, 2011). This highly desirable property of the Tensor-Train allows us to process images in batches, which form a 4-mode tensor, and to perform the Tensor-Train decomposition on 4-mode tensors quite fast. For a batch of $B$ images, the 4-mode tensor has size $B \times W \times H \times 3$. Generally, decomposing a 4-mode tensor is slower than decomposing a 3-mode one; however, by considering images in batches, some of the I/O overhead is reduced, resulting in almost the same processing time over the entire dataset. Furthermore, processing images in batches improves the performance of the model: decomposing images in batches extracts latent structure corresponding to perturbations from multiple images and captures general characteristics of the perturbations. For a 4-mode tensor, the Tensor-Train decomposition can be written as follows:
$\mathcal{X}(i, j, k, l) \approx \sum_{r_1=1}^{R_1} \sum_{r_2=1}^{R_2} \sum_{r_3=1}^{R_3} \mathcal{G}_1(i, r_1)\, \mathcal{G}_2(r_1, j, r_2)\, \mathcal{G}_3(r_2, k, r_3)\, \mathcal{G}_4(r_3, l)$   (4)
Figure 2 illustrates the TensorTrain decomposition of a 4mode tensor.
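While our experiments use an off-the-shelf library for the decomposition, the underlying TT-SVD procedure (a chain of truncated SVDs yielding one core per mode) can be sketched in plain NumPy; the batch and image sizes below are illustrative:

```python
import numpy as np

def tt_decompose(tensor, ranks):
    """TT-SVD sketch: a chain of truncated SVDs produces one core per
    mode. `ranks` lists the internal TT ranks r_1, ..., r_{d-1}."""
    shape, cores, r_prev = tensor.shape, [], 1
    C = tensor
    for k in range(len(shape) - 1):
        C = C.reshape(r_prev * shape[k], -1)
        U, S, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(ranks[k], S.size)
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        C = S[:r, None] * Vt[:r]  # carry the remainder forward
        r_prev = r
    cores.append(C.reshape(r_prev, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a dense (low-rank) tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(out.shape[1:-1])  # drop the boundary ranks of size 1

# Example: a batch of 5 tiny 8x8 RGB "images" as a 4-mode tensor.
rng = np.random.default_rng(0)
batch = rng.uniform(size=(5, 8, 8, 3))
approx = tt_reconstruct(tt_decompose(batch, ranks=[5, 6, 3]))
```

The rank triple `[5, 6, 3]` mirrors the configuration format used later in the paper: batch mode, pixel mode, and channel mode.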
Another possible representation for the batch of images is to convert the 4-mode tensor into a 3-mode tensor by stacking the images along the third mode, i.e., stacking the RGB channels; the resulting tensor has dimension $W \times H \times 3B$. Figure 3 illustrates a 3-mode stacked tensor of images.
There are other ways to convert a 4-mode tensor into a 3-mode one. For instance, one can flatten each RGB image into a matrix with three columns corresponding to the channels of the image; with this representation, the final tensor has size $WH \times 3 \times B$. One disadvantage of this representation is that flattening the image ignores the spatial relationships between pixels. Moreover, with this vectorized representation the first dimension is much larger than the other two, so a larger rank is required to obtain a reasonable approximation of the image, and larger ranks make the decomposition slower. For these reasons, we do not consider the vectorized representation in our study. In the experimental evaluation that follows, we examine different representations, including a single image versus a batch of images, and 3-mode versus 4-mode tensors.
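The representations discussed above differ only in how the batch axis is arranged; a short NumPy sketch with hypothetical sizes ($B = 5$, $W = H = 32$) makes the shapes concrete:

```python
import numpy as np

# A batch of B RGB images of width W and height H as a 4-mode tensor.
B, W, H = 5, 32, 32
batch = np.zeros((B, W, H, 3))

# 3-mode "stacked" representation: concatenate the B images along the
# channel mode, giving a W x H x 3B tensor.
stacked = np.concatenate(list(batch), axis=2)

# Vectorized representation (not used in our study): flatten each image
# into a (W*H) x 3 matrix, giving a (W*H) x 3 x B tensor.
vectorized = np.stack([img.reshape(W * H, 3) for img in batch], axis=2)
```

All three views hold the same values; only the mode structure, and hence what the decomposition can exploit, changes.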
4. Experimental Evaluation
In this section, we show how the proposed method can successfully remove adversarial perturbations, and we compare our results to Shield (SLQ). According to (Cornelius, 2019), the original Shield evaluation benefited from central cropping of images during evaluation, whereas the perturbations were generated with cropping turned off. In all our evaluations, we disable central cropping.
4.1. Experiment Setup
We performed experiments on the validation set of the ImageNet dataset, which includes 50,000 images from 1,000 classes. All experiments are performed on the ResNet-v2 50 model from the TF-Slim module of TensorFlow. The adversarial attacks are from the CleverHans package (https://github.com/tensorflow/cleverhans) (Papernot et al., 2016a). We performed the experiments on a machine with one NVIDIA Titan Xp (12 GB) GPU. We used the TensorLy library (https://github.com/tensorly/tensorly) in Python to perform the tensor decompositions (Kossaifi et al., 2019).
4.2. Parameter Tuning
In our evaluations, we express different configurations in the form of a list [tensor decomposition, tensor representation, batch size, rank], and we investigate the accuracy and runtime of ResNet-v2 50 on 1,000 images from the ImageNet dataset for different configurations. The possible values for each part of the configuration list are as follows:

Tensor decomposition: {Parafac, Tucker, Tensor-Train}

Tensor representation: {3-mode, 3-mode-stacked, 4-mode}

Batch size: {1, 5, 10, 20, 50}

Rank: varies by choice of tensor representation and decomposition.
Performing tensor decomposition on a batch of images reduces the decomposition overhead compared to decomposing single images and accelerates the entire evaluation process. Moreover, considering images in batches helps better capture the pattern of perturbations across multiple images. However, choosing the right batch size is important: a large batch of images needs larger decomposition ranks and can become very slow. Also, a large batch contains a greater variety of images from different classes, which deteriorates the performance of the decomposition. To find the best batch size, we perform a grid search over the values 5, 10, 20, and 50. The Tensor-Train decomposition of a 4-mode tensor requires setting three rank values: the first corresponds to compressing the batch mode, the second to compressing the image pixels, and the third to compressing the RGB channels. We fix the first rank to the batch size and the third rank to the number of channels, i.e., 3. For the second rank, we search within the range 40 to 150. Figure 4 shows the accuracy and runtime of the model for different batch sizes for the Tensor-Train decomposition with ranks ranging from 50 to 120 in steps of 5. The figure also shows how processing single images (batch size 1) differs from batch sizes of 5 and above: for single images, the runtime increases as the rank grows, whereas as the batch size increases the runtime becomes less sensitive to the rank, and for batch size 50 it is almost constant across all ranks. Batch size 5 produces the highest accuracy, while batch size 10 has the lowest runtime. There is a trade-off between runtime and accuracy; based on the priorities of the system, one might sacrifice accuracy for speed.
Figure 5 shows the effect of different batch sizes on the 3-mode-stacked representation. The plots for batch sizes 5, 10, and 20 are almost identical in both accuracy and runtime. Batch size 50 produces the highest accuracy with the 3-mode-stacked representation; however, this accuracy is still lower than the highest accuracy achieved with the 4-mode representation.
4.3. Results
As mentioned in Section 3, Tensor-Train is much faster than Parafac and Tucker. Therefore, for Parafac and Tucker we only report the result for the configuration with the maximum accuracy, as a reference for comparison against Tensor-Train. Table 1 shows the results.
Configurations                               PGD     FGSM    I-FGSM   Runtime (seconds)
No defense                                   11.10   18.40    7.49          -
[Tensor-Train, 4-mode, 5, [5,90,3]]          51.53   43.59   50.46         675
[Tensor-Train, 4-mode, 10, [10,100,3]]       51.01   43.10   49.95         605
[Tensor-Train, 3-mode, 1, 40]                49.75   42.32   48.52         530
[Tucker, 3-mode-stacked, 30, [105,105,90]]   49.37   40.07   48.79        1050
[Parafac, 3-mode, 1, 60]                     48.11   41.38   49.75        5500
SLQ                                          44.60   29.40   38.60         410
As illustrated in Table 1, Tensor-Train outperforms Tucker and Parafac with respect to both accuracy and runtime. Tensor-Train performed on the 4-mode tensor produces the highest accuracy. As explained earlier, processing images in batches better captures the latent components corresponding to perturbations by leveraging higher-order correlations. Tensor-Train can be used with different tensor representations (3-mode, 3-mode-stacked, or 4-mode) to meet the needs for higher accuracy or higher speed: while the 4-mode representation produces the highest accuracy, the 3-mode single-image representation can be used to speed up the process with only a small drop in accuracy. SLQ is the fastest among all the defenses, but it has the lowest accuracy.
Patch size    Ranks                                    Accuracy   Runtime (seconds)
[50,50]       [5,10,3], [5,20,3], [5,30,3], [5,70,3]    48.35          1100
[150,150]     [5,40,3], [5,50,3], [5,60,3], [5,70,3]    50.96           765
No patching   [5,90,3], [5,110,3]                       50.48           710
4.4. Introducing Randomness to the Defense Framework
Incorporating some randomness in the defense framework makes the attacker's job more difficult, since the attacker must now deal with a random strategy rather than a fixed one. By selecting randomly from a set of ranks, we can add randomness to the tensor decomposition process. Another way is to split the image into small patches, similar to the local patches of Shield, perform a decomposition of random rank on each patch, and stitch the patches back together to reconstruct a randomized low-rank approximation of the image. In the 4-mode tensor representation, splitting the images into patches creates smaller 4-mode tensors; e.g., splitting a 4-mode tensor containing a batch of 5 images into patches creates 6 smaller 4-mode tensors. Table 2 shows the results of incorporating randomness into the tensor decomposition.
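The patch-splitting step can be sketched as follows, with each patch paired with a randomly drawn rank; the image dimensions, patch size, and rank pool below are hypothetical stand-ins:

```python
import numpy as np

def random_patch_tensors(batch, p, seed=0):
    """Split a (B, W, H, 3) batch into p x p spatial patches, pairing
    each patch with a randomly drawn rank for its decomposition.
    Assumes W and H are divisible by p."""
    rng = np.random.default_rng(seed)
    B, W, H, _ = batch.shape
    rank_choices = [10, 20, 30]  # illustrative rank pool
    patches = []
    for i in range(0, W, p):
        for j in range(0, H, p):
            r = int(rng.choice(rank_choices))
            patches.append((batch[:, i:i + p, j:j + p, :], r))
    return patches

# A hypothetical batch of 5 images split into 6 patch tensors.
batch = np.zeros((5, 100, 150, 3))
patches = random_patch_tensors(batch, p=50)
```

Each patch tensor would then be decomposed at its own random rank and the low-rank patches stitched back into the full image.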
5. Conclusions
In this paper, we explored to what extent a low-rank tensor decomposition of perturbed images during preprocessing helps defend against adversarial attacks. The low-rank approximation of the perturbed image is fed to the deep network for the task of classification. We evaluated our method against popular adversarial attacks: FGSM, I-FGSM, and PGD. We showed that considering images in small batches better captures the latent structure of perturbations and helps improve the performance of the model. We also showed how different configurations allow trading off accuracy against runtime.
References
 Bhagoji et al. (2017) Arjun Nitin Bhagoji, Daniel Cullina, and Prateek Mittal. 2017. Dimensionality reduction as a defense against evasion attacks on machine learning classifiers. arXiv preprint (2017).
 Cornelius (2019) Cory Cornelius. 2019. The Efficacy of SHIELD under Different Threat Models. arXiv preprint arXiv:1902.00541 (2019).
 Das et al. (2018) Nilaksh Das, Madhuri Shanbhogue, Shang-Tse Chen, Fred Hohman, Siwei Li, Li Chen, Michael E Kounavis, and Duen Horng Chau. 2018. Shield: Fast, Practical Defense and Vaccination for Deep Learning using JPEG Compression. arXiv preprint arXiv:1802.06816 (2018).
 Entezari et al. (2020) Negin Entezari, Saba Al-Sayouri, Amirali Darvishzadeh, and Evangelos Papalexakis. 2020. All You Need Is Low (Rank): Defending Against Adversarial Attacks on Graphs. In 13th ACM International Conference on Web Search and Data Mining (WSDM).
 Goodfellow et al. (2014) Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. 2014. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2014).
 Harshman (1970) R.A. Harshman. 1970. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multimodal factor analysis. (1970).
 Kossaifi et al. (2019) Jean Kossaifi, Yannis Panagakis, Anima Anandkumar, and Maja Pantic. 2019. Tensorly: Tensor learning in python. The Journal of Machine Learning Research 20, 1 (2019), 925–930.
 Kurakin et al. (2016) Alexey Kurakin, Ian Goodfellow, and Samy Bengio. 2016. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533 (2016).
 Madry et al. (2017) Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2017. Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083 (2017).
 Metzen et al. (2017) Jan Hendrik Metzen, Tim Genewein, Volker Fischer, and Bastian Bischoff. 2017. On detecting adversarial perturbations. arXiv preprint arXiv:1702.04267 (2017).
 Oseledets (2011) Ivan V Oseledets. 2011. Tensor-train decomposition. SIAM Journal on Scientific Computing 33, 5 (2011), 2295–2317.
 Papalexakis et al. (2017) Evangelos E Papalexakis, Christos Faloutsos, and Nicholas D Sidiropoulos. 2017. Tensors for data mining and data fusion: Models, applications, and scalable algorithms. ACM Transactions on Intelligent Systems and Technology (TIST) 8, 2 (2017), 16.
 Papernot et al. (2016a) Nicolas Papernot, Nicholas Carlini, Ian Goodfellow, Reuben Feinman, Fartash Faghri, Alexander Matyasko, Karen Hambardzumyan, Yi-Lin Juang, Alexey Kurakin, and Ryan Sheatsley. 2016a. cleverhans v2.0.0: an adversarial machine learning library. arXiv preprint arXiv:1610.00768 (2016).
 Papernot et al. (2016b) Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. 2016b. Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP). IEEE, 582–597.
 Szegedy et al. (2013) Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. 2013. Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013).
 Tucker (1966) L.R. Tucker. 1966. Some mathematical notes on three-mode factor analysis. Psychometrika 31, 3 (1966), 279–311.