1 Introduction
Recently, deep neural networks (DNNs) have demonstrated great potential to surpass or approach human-level performance in multiple domains, such as object recognition [1], game AI [2], speech synthesis [3], neighborhood voting prediction [4], and others [5]. This stimulates the demand for deploying state-of-the-art deep learning algorithms in real-world applications to release labor from repetitive work. Under such circumstances, the security and robustness of deep neural networks become essential concerns that cannot be circumvented.
Adversarial examples [6] (a.k.a. adversarial attacks) are a well-known security issue of DNNs: magnitude-constrained input noise that humans cannot discern can cause the system to malfunction. Both attack and defense of adversarial examples at the input end of DNNs have been heavily investigated over the past few years [7, 6, 8] and are still in progress [9, 10, 11]. Nevertheless, the security of the network parameters themselves is not yet well explored. Recently, the development of fault injection attacks [12] has raised further security concerns about the storage of DNN parameters.
The lack of concern about the security of network parameters may have two causes: 1) neural networks are widely recognized as robust against parameter variations; 2) DNNs used to be deployed only on high-performance computing systems (e.g., CPUs, GPUs, and other accelerators [13, 14]), which normally contain a variety of mechanisms ensuring data integrity, so attacking the parameters was more of a system cyber-security topic. However, the situation has changed completely over the past few years. First, the robustness of neural networks to small perturbations has been put into the spotlight by adversarial examples on DNN inputs [6, 7]. Second, with the aid of DNN compression techniques (e.g., pruning [15] and quantization [16]) and compact network architectures [17, 18], deep neural networks are now friendly to resource-limited mobile devices as well. Such resource-limited platforms normally lack effective data-integrity check mechanisms, which makes the deployed DNN vulnerable to popular fault injection techniques, such as row hammer and laser beam [19].
Recently, a cohort of works [12, 20] has attempted to attack DNN parameters stored in DRAM using the Row Hammer Attack (RHA). However, the key limitation of these previous attack methods is that they primarily focus on the extremely vulnerable full-precision DNN model (i.e., parameters in floating-point format). Our simulations show that randomly flipping exponent bits of floating-point weights easily destroys the functionality of a DNN: flipping a bit in the exponent of a floating-point value can increase the weight to an extremely large value, leading to exploded outputs. As a result, attacking the weight-constrained DNN (i.e., weights quantized into fixed-point values), where the range of weight magnitude is bounded by the bit-width of the weights, is the primary focus of this work.
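The fragility of floating-point weights can be reproduced in a few lines of Python (an illustrative sketch, not part of the attack pipeline): flipping the most significant exponent bit of a small IEEE-754 float32 weight inflates its magnitude by a factor of about 2^128.

```python
import struct

def flip_bit_float32(value: float, bit: int) -> float:
    """Flip one bit (0 = mantissa LSB, 30 = exponent MSB, 31 = sign)
    in the IEEE-754 float32 encoding of `value`."""
    (encoded,) = struct.unpack("<I", struct.pack("<f", value))
    encoded ^= 1 << bit                      # the single bit-flip
    (flipped,) = struct.unpack("<f", struct.pack("<I", encoded))
    return flipped

w = 0.05                                     # a typical small DNN weight
print(flip_bit_float32(w, 30))               # ~1.7e37: the weight explodes
```

A single such flip on one of millions of weights is enough to propagate an astronomically large activation through the network; flipping the same bit again restores the original value.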
Overview of the Bit-Flip Attack:
In this work, we perform a parameter attack on the weights of quantized DNNs, whose weight magnitude is intrinsically constrained by the fixed-point representation. In order to conduct an efficient bit-flip attack on weights, we propose, for the first time, a Bit-Flip Attack (BFA) together with a Progressive Bit Search (PBS) technique, which can totally crush a fully functional quantized DNN and convert it into a random output generator with only a few bit-flips. Our proposed PBS combines gradient ranking and progressive search to locate the most vulnerable bits, while BFA performs bit-flip operations on the located bits along their gradient-ascending directions. In order to identify the vulnerable bits both within the same layer and across different layers, we perform an in-layer search and a cross-layer search in an iterative way. Thus, in each BFA iteration, only the most vulnerable bit elected by the PBS technique is flipped to its opposite binary value. Extensive experiments are conducted with various network structures, datasets, and quantization bit-widths. Strikingly, with our proposed attack, ResNet-18 becomes a random output generator on the ImageNet dataset (i.e., 0.1% top-1 accuracy) with only 13 bit-flips out of 93 million bits.
2 Related Work
Memory Bit-Flip in the Real World:
Flipping a memory cell bit within a memory system is a realistic and demonstrated threat model on existing computer systems. Kim et al. [21] demonstrated that memory bit-flips in DRAM can be caused merely through frequent data accesses, a technique now popularly known as the Row Hammer Attack (RHA). A malicious user can use RHA to modify the data stored in DRAM cells by flipping one bit at a time. [22] showed that, by creating a profile of the bit-flips in a DRAM, a row hammer attack can effectively flip a single bit at any address in the software stack. According to state-of-the-art investigations, common error detection and correction techniques, such as Error-Correcting Code (ECC) [23] and Intel SGX [24], are broken defense mechanisms against RHA. This existing memory bit-flip attack model (i.e., the row hammer attack) poses a huge challenge to the security of DNN-powered computing systems, since DNN parameters are normally stored in main memory (i.e., DRAM) to maximize computation throughput, which directly exposes them to the attacker. Moreover, the challenge becomes more severe considering that DNN-powered applications are widely deployed on resource-limited systems (e.g., smart IoT devices, mobile systems, edge devices) that lack the necessary data-integrity check mechanisms.
Previous Neural Network Parameter Attacks.
Adversarial example attacks have been widely explored [25] to evaluate the robustness of DNNs. However, we are still at a rudimentary stage in investigating the effect of network parameter attacks on neural network accuracy. Neural network parameters have been attacked using different levels of hardware trojans, which require a specific input pattern to trigger the trojan inside the network [26]. Moreover, such trojan attacks require hardware-level modifications, which may not be feasible in many practical applications. As a result, fault injection attacks have become a suitable alternative for attacking DNN parameters [12]. For example, the Single Bias Attack (SBA) attacks a certain bias term of a neuron to change the classification of the DNN to a different class [12]. Other works inject faults into the activation function of the neural network to misclassify a target input [20].

Limitations of previous works.
However, these previous attack algorithms were developed for full-precision models (i.e., network parameters stored in memory as floating-point numbers in the IEEE 754 format [27]), where we believe such attack algorithms are not efficient: it is extremely easy to make a DNN malfunction by just flipping the most significant exponent bits of randomly chosen floating-point weights. This simple method causes the malfunction by exponentially increasing the magnitude of particular weight parameters with just a few bit-flips. We conduct such an experiment to demonstrate this in Section 4.4. Based on our simulation results, a single bit-flip of the most significant exponent bit of one random floating-point weight can make a ResNet-18 network malfunction completely on the ImageNet dataset.

Why we need a bit search algorithm.
On the other hand, most recent deep neural network applications run on quantized platforms such as Google's Tensor Processing Unit (TPU) [28], which uses 8-bit operations for quantized networks. Such fixed-precision models are more robust to network parameter perturbations. We conducted another experiment randomly choosing quantized weights for bit-flip attack using RHA. The simulation results in Figure 4 show that 100 bit-flips in a quantized ResNet-18 cause only 0.6% accuracy degradation on ImageNet, which clearly indicates that randomly selecting quantized weight parameters to attack is neither efficient nor feasible. Thus, an efficient algorithm is required to search for the most vulnerable weights/bits in a quantized DNN.

3 Approach
In this section, we present a novel Bit-Flip Attack (BFA) method that maliciously causes a DNN system to malfunction by flipping an extremely small number of vulnerable weight bits. Our proposed algorithm, called Progressive Bit Search (PBS), identifies those vulnerable DNN weight parameters (stored as memory bits in DRAM) that maximize the accuracy degradation with the minimum number of bit-flips. It is worth noting that this work focuses on BFA against the more robust DNN with quantized weight parameters, instead of floating-point weights, as discussed earlier.
3.1 Problem Definition
Given a quantized DNN with L convolutional/fully-connected layers, the original floating-point weights are symmetrically quantized into 2^{N_q} levels by an N_q-bit uniform quantizer. The quantized weights W are arithmetically represented as N_q-bit signed integers. In the computing memory system, W is stored in two's-complement format, denoted as B in this work; all binary weights mentioned hereinafter refer to weights in two's complement. More details of weight quantization are given in Section 3.2. The goal of this work is to find the optimal combination of vulnerable weight bits to perform BFA, thus maximizing the inference loss of the DNN parameterized by the perturbed weights, whose two's-complement representation is \{\hat{B}_l\}. This vulnerable-bit search can be formulated as the optimization problem:

\max_{\{\hat{B}_l\}} \mathcal{L}\big( f(x; \{\hat{B}_l\}_{l=1}^{L}), t \big) \quad \text{s.t.} \quad \sum_{l=1}^{L} \mathcal{D}(\hat{B}_l, B_l) \le N_b    (1)

where x and t are the vectorized input and target output; note that all targets in this work are not the ground-truth labels, but the outputs of the clean DNN w.r.t. the input data. Taking x as input, the inference computation of the network parameterized by \{\hat{B}_l\} is expressed as f(x; \{\hat{B}_l\}). \mathcal{L}(\cdot, \cdot) calculates the loss between the DNN output and the target, \mathcal{D}(\hat{B}_l, B_l) computes the Hamming distance between the clean and perturbed binary weight tensors, and N_b is the maximum Hamming distance allowed through the entire DNN.

3.2 Quantization and Encoding
Weight quantization.
In this work, we adopt a layer-wise N_q-bit uniform quantizer for weight quantization. For the l-th layer, the quantization from the floating-point base W_l^{fp} to its fixed-point (signed integer) counterpart can be described as:

\Delta w_l = \max(|W_l^{fp}|) / (2^{N_q - 1} - 1)    (2)

W_l = \mathrm{round}(W_l^{fp} / \Delta w_l) \cdot \Delta w_l    (3)

where W_l^{fp} \in \mathbb{R}^d (d is the dimension of the weight tensor) and \Delta w_l is the step size of the weight quantizer. To train the quantized DNN with the non-differentiable staircase function in Eq. (3), we use the straight-through estimator [29], as in other works [16]. Note that, since \Delta w_l is a coefficient shared by all the weights in the l-th layer, we only store the fixed-point part W_l / \Delta w_l rather than W_l.

Weight Encoding.
The computing system normally stores signed integers in two's-complement representation, owing to its efficiency in arithmetic operations (e.g., multiplication). Given one weight element w, the conversion from its N_q-bit two's-complement representation b = [b_{N_q-1}, \ldots, b_1, b_0] can be expressed as:

w = g(b) = \big( -2^{N_q-1} \cdot b_{N_q-1} + \sum_{i=0}^{N_q-2} 2^i \cdot b_i \big) \cdot \Delta w    (4)

With the conversion relation g(\cdot) described in Eq. (4), we can also inversely obtain the binary representation B of the weights from their fixed-point counterpart.
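The quantizer of Eqs. (2)-(3) and the two's-complement encoding of Eq. (4) can be sketched in NumPy as follows (an illustrative re-implementation under the definitions above; the function and variable names are ours):

```python
import numpy as np

N_q = 8  # quantization bit-width

def quantize(w_fp):
    """Eqs. (2)-(3): layer-wise N_q-bit uniform quantizer.
    Returns the stored fixed-point part W/Δw and the step size Δw."""
    step = np.max(np.abs(w_fp)) / (2 ** (N_q - 1) - 1)       # Eq. (2)
    w_int = np.round(w_fp / step).astype(np.int64)           # Eq. (3), before re-scaling
    return w_int, step

def to_twos_complement(w_int):
    """Inverse of Eq. (4): signed integer -> bit vector [b_{N_q-1}, ..., b_0]."""
    unsigned = np.where(w_int < 0, w_int + 2 ** N_q, w_int)  # wrap negatives
    return np.stack([(unsigned >> i) & 1 for i in range(N_q - 1, -1, -1)], axis=-1)

def from_twos_complement(bits):
    """Eq. (4): w/Δw = -2^{N_q-1} b_{N_q-1} + sum_i 2^i b_i."""
    place = np.array([-(2 ** (N_q - 1))] + [2 ** i for i in range(N_q - 2, -1, -1)])
    return bits @ place

w_fp = np.array([0.31, -0.12, 0.05, -0.44])
w_int, step = quantize(w_fp)                 # e.g. -0.44 -> -127
bits = to_twos_complement(w_int)
assert np.array_equal(from_twos_complement(bits), w_int)     # lossless round trip
```

The round trip is exact by construction, which is what makes bit-level attacks on the stored representation well defined.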
3.3 Bit-Flip Attack
In this work, we perform BFA using a mechanism similar to FGSM [6], which was originally used to generate adversarial examples. The key idea of BFA is to flip the bits along the gradient-ascending direction w.r.t. the loss of the DNN. We take the binary vector b from Eq. (4) as an example and attempt to perform BFA upon it. We first calculate the gradient of the loss w.r.t. b:

\nabla_b \mathcal{L} = \big[ \partial\mathcal{L}/\partial b_{N_q-1}, \, \ldots, \, \partial\mathcal{L}/\partial b_0 \big]    (5)

where \mathcal{L} is the inference loss of the DNN parameterized by b. A naive operation would be to directly perform the bit-flip using the gradients obtained in Eq. (5) and get the perturbed bits as:

\hat{b} = b + \mathrm{sign}(\nabla_b \mathcal{L})    (6)

where \hat{b} denotes the perturbed bits. However, since each bit value is constrained to 0 or 1 (b_i \in \{0, 1\}), flipping the bits as in Eq. (6) could lead to data overflow. Ideally, BFA should follow the truth table in Table 1. Thus, we mathematically redefine BFA as follows:

m = b \oplus \big( (\mathrm{sign}(\nabla_b \mathcal{L}) + 1) / 2 \big)    (7)

\hat{b} = b \oplus m    (8)

where \oplus is the bitwise XOR operator and m is the mask indicating whether to perform the bit-flip operation on each bit.
Table 1: Truth table of the bit-flip operation in BFA.

b_i | sign(\partial\mathcal{L}/\partial b_i) | \hat{b}_i | m_i
0 | 1 (+) | 1 | 1
0 | 0 (−) | 0 | 0
1 | 1 (+) | 1 | 0
1 | 0 (−) | 0 | 1
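Eqs. (7)-(8) can be sketched in NumPy and checked against Table 1 (illustrative code, not the authors' released implementation):

```python
import numpy as np

def bfa_flip(b, grad):
    """Flip bits along the gradient-ascending direction (Eqs. (7)-(8)).
    b: bit vector in {0, 1}; grad: dL/db_i for each bit."""
    target = ((np.sign(grad) + 1) // 2).astype(b.dtype)  # + grad -> 1, - grad -> 0
    m = b ^ target                                       # Eq. (7): mask, 1 = flip
    b_hat = b ^ m                                        # Eq. (8): stays in {0, 1}
    return b_hat, m

b = np.array([0, 0, 1, 1])
grad = np.array([0.7, -0.3, 0.5, -0.9])    # gradient signs: +, -, +, -
b_hat, m = bfa_flip(b, grad)
print(b_hat.tolist(), m.tolist())          # [1, 0, 1, 0] [1, 0, 0, 1], as in Table 1
```

Only the rows of Table 1 with m_i = 1 change the stored bit; the XOR formulation keeps \hat{b} within {0, 1} by construction, avoiding the overflow of Eq. (6).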
3.4 Progressive Bit Search
Rather than performing BFA on every bit throughout the entire network, our goal is to perform BFA in a more precise and effective fashion. In this subsection, we propose a method called Progressive Bit Search (PBS), which combines gradient ranking and progressive search. PBS attempts to identify and flip the n_b most vulnerable bits per BFA iteration (n_b = 1 by default), thus progressively degrading the performance of the DNN until it either reaches the minimum accuracy or exhausts the preset number of iterations. As depicted in the flowchart of Fig. 1, for each attack iteration the bit search is divided into two successive steps: 1) In-layer search: elect the most vulnerable bits within the selected layer, then record the inference loss when those elected bits are flipped. 2) Cross-layer search: with the in-layer search conducted on each layer of the network independently, compare the recorded loss increments caused by BFA and identify the top vulnerable bits across different layers. The details of each step are described as follows.
In-layer Search.
For PBS in the k-th iteration, the in-layer search for the n_b most vulnerable bits of B_l in the l-th layer is performed through gradient ranking. With the given vectorized input x and target t, the inference and back-propagation are performed successively to calculate the gradients of the bits w.r.t. the inference loss. Then, we rank the bits in descending order of the absolute value of their gradients and elect the bits whose gradients rank in the top n_b. This process can be written as:

\mathrm{Top}_{n_b} \big( |\nabla_{B_l} \mathcal{L}| \big)    (9)

where the function \mathrm{Top}_{n_b} returns the pointers to the storage of the elected vulnerable bits. Then, we apply BFA on those elected bits as:

\hat{B}_l = B_l \oplus m    (10)

where the mask m is generated following Eq. (7). Now, with the in-layer search and BFA performed on the l-th layer, we evaluate the loss increment caused by the BFA of Eq. (10):

\mathcal{L}_l^k = \mathcal{L}\big( f(x; \{B_1, \ldots, \hat{B}_l, \ldots, B_L\}), t \big)    (11)

where the only difference between \hat{B}_l and B_l are the bits flipped in Eq. (10). Note that the bits flipped in Eq. (10) are restored to their original values after the loss evaluation is finished.
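The in-layer search of Eqs. (9)-(11) can be sketched as follows (illustrative code: bits and gradients are flattened 1-D arrays, `loss_fn` stands in for the forward pass L(f(x; B), t), and the per-bit gradients are assumed to have been computed by back-propagation):

```python
import numpy as np

def in_layer_search(bits, bit_grad, loss_fn, n_b=1):
    """Elect the n_b bits with the largest |dL/db| (Eq. 9), flip them along
    the gradient-ascending direction (Eq. 10) on a copy, and record the
    resulting loss (Eq. 11). The caller's bits are left untouched, which
    implements the restore step."""
    top = np.argsort(np.abs(bit_grad))[-n_b:]            # Eq. (9): top-n_b bits
    trial = bits.copy()
    trial[top] = ((np.sign(bit_grad[top]) + 1) // 2).astype(bits.dtype)  # Eq. (10)
    return loss_fn(trial), top                           # Eq. (11)

bits = np.array([0, 1, 1, 1])
grad = np.array([2.0, -0.1, 0.3, -5.0])
loss, top = in_layer_search(bits, grad, lambda b: float(b.sum()))
print(loss, top.tolist())  # 2.0 [3]: bit 3 (largest |grad|, negative) flips 1 -> 0
```

The toy `loss_fn` that sums the bits is only a stand-in so the step is runnable; in the real attack it is a full forward pass on the sampled inputs.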
Cross-layer Search.
While the aforementioned in-layer search performs the layer-wise vulnerable-bit election and BFA evaluation, the cross-layer search evaluates BFA across the entire network. For PBS in the k-th iteration, the cross-layer search first conducts the in-layer search on each layer independently, generating the loss set \{\mathcal{L}_1^k, \ldots, \mathcal{L}_L^k\}. Then, we identify the layer with the maximum loss and re-perform the BFA (without restoring) on the bits elected in that layer:

j = \arg\max_l \mathcal{L}_l^k    (12)

After that, PBS enters the (k+1)-th iteration.
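One full PBS iteration can then be sketched as below (illustrative code: each layer's bits and gradients are 1-D arrays, gradients are held fixed within the iteration, and `loss_fn` stands in for the network forward pass; the actual attack recomputes gradients by back-propagation every iteration):

```python
import numpy as np

def pbs_iteration(layer_bits, layer_grads, loss_fn, n_b=1):
    """In-layer search on every layer, then cross-layer selection (Eq. 12):
    keep only the flip in the layer whose trial flip maximizes the loss."""
    candidates = []
    for l, (bits, grad) in enumerate(zip(layer_bits, layer_grads)):
        top = np.argsort(np.abs(grad))[-n_b:]                       # Eq. (9)
        trial = [b.copy() for b in layer_bits]                      # virtual flip
        trial[l][top] = ((np.sign(grad[top]) + 1) // 2).astype(bits.dtype)
        candidates.append((loss_fn(trial), l, top))                 # Eq. (11)
    loss, j, top = max(candidates, key=lambda c: c[0])              # Eq. (12)
    layer_bits[j][top] = (
        (np.sign(layer_grads[j][top]) + 1) // 2
    ).astype(layer_bits[j].dtype)                                   # re-perform, no restore
    return loss, j

layer_bits = [np.array([0, 0]), np.array([1, 1])]
layer_grads = [np.array([0.1, 0.2]), np.array([-3.0, 0.5])]
bit_sum = lambda bs: float(sum(b.sum() for b in bs))  # toy stand-in loss
loss, j = pbs_iteration(layer_bits, layer_grads, bit_sum)
print(loss, j, layer_bits[0].tolist())  # 3.0 0 [0, 1]: layer 0's trial flip wins
```

Only the winning layer keeps its flip; all other layers' trial flips are discarded, matching the restore-then-re-perform description above.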



Table 2: BFA on CIFAR-10 across ResNet-20/32/44/56, reporting post-quantization accuracy (Acc., %), the number of bit-flips N_flip required to degrade accuracy below 11%, and the Hamming distance D (each listed over 5 trials) for quantization bit-widths N_q = 8/6/4.

Architecture | FP Acc. | N_q = 8: Acc. / N_flip / D | N_q = 6: Acc. / N_flip / D | N_q = 4: Acc. / N_flip / D
ResNet-20 | 92.11 | 92.28 / [7,10,10,12,17] / [7,10,10,12,17] | 91.89 / [8,8,11,12,13] / [8,8,11,12,13] | 91.85 / [7,7,7,8,12] / [7,7,7,8,12]
ResNet-32 | 92.77 | 92.32 / [8,9,12,13,31] / [8,9,12,13,31] | 93.09 / [9,10,12,14,23] / [9,10,12,14,23] | 92.31 / [10,12,14,14,17] / [10,12,14,14,17]
ResNet-44 | 93.10 | 93.60 / [6,10,11,13,22] / [6,10,11,13,22] | 93.39 / [13,13,15,16,17] / [13,13,15,16,17] | 91.52 / [14,14,15,16,50] / [14,14,15,16,50]
ResNet-56 | 92.59 | 93.14 / [16,17,18,22,22] / [16,17,18,22,22] | 93.56 / [16,16,17,20,21] / [16,16,17,20,21] | 92.53 / [9,21,21,23,24] / [9,21,21,21,24]
4 Experiments
4.1 Experimental setup
Datasets:
We take two visual datasets for the object classification task: CIFAR-10 [30] and ImageNet [31]. CIFAR-10 contains 60K RGB images of size 32×32. Following standard practice, 50K examples are used for training and the remaining 10K for testing. The images are drawn evenly from 10 classes. The ImageNet dataset contains 1.2M training images divided into 1000 distinct classes. The data augmentation used in this work is identical to the methods in [32]. Note that the proposed BFA is performed by randomly drawing a sample of input images from the test/validation set, with a default sample size of 128 for CIFAR-10 and 256 for ImageNet. Only this sample is used to perform BFA; the remaining data and the ground-truth labels are withheld from the attacker. Moreover, each experimental configuration is run for 5 trials to alleviate the error caused by the randomness of input sampling.
Network Architectures and quantization:
For CIFAR-10, experiments are conducted on a series of residual networks (ResNet-20/32/44/56) [32], whose weights are quantized to 4/6/8-bit width with retraining. For ImageNet, we choose a variety of well-known network structures, including AlexNet and ResNet-18/34/50. Based on our observation, with a high-bit-width quantizer (e.g., N_q = 8), directly quantizing the pretrained full-precision DNN without retraining (i.e., fine-tuning) shows only negligible accuracy degradation. Therefore, for fast evaluation of our proposed BFA on the ImageNet dataset and its various network structures, we directly perform the weight quantization without retraining before conducting BFA.
Attack Formulation:
Traditional attacks mostly focus on attacking a DNN by feeding perturbed inputs [6] to the network. Such adversarial attacks fall into two major categories: 1) white-box attacks [6, 7], where the adversary has full access to the network architecture and parameters, and 2) black-box attacks [33, 34], where the adversary can only access the input and output of a DNN without its internal configuration. Our proposed BFA demands full access to the DNN's weights and gradients, so it can be considered a white-box attack. However, we assume that even under the white-box setup, the attacker has no access to the training dataset, the training algorithm, or the hyper-parameters used during training.
4.2 BFA on CIFAR-10
Our bit-flip attack is evaluated across different architectures (i.e., ResNet-20/32/44/56) using varying quantization bit-widths (i.e., N_q = 4/6/8) on the CIFAR-10 dataset in Table 2. Without BFA, the quantized models show negligible accuracy degradation, or even higher accuracy, in comparison to their full-precision counterparts. The noise introduced by weight quantization acts as a form of regularization, which may contribute to the accuracy improvement when the model training is over-fitting.
Since the CIFAR-10 dataset has 10 object classes, degrading the model's accuracy down to 10% is equivalent to turning the model into a random output generator. In contrast to adversarial examples (e.g., the PGD attack [7]), our proposed BFA cannot degrade the network accuracy to 0%. The reason is that an adversarial example is an input-specific attack designed to misclassify each input separately, while our proposed BFA attempts to misclassify the images of all object categories using one identical attacked model. Consequently, a successful BFA makes the DNN generate outputs randomly. Therefore, for CIFAR-10, we report the number of bit-flips N_flip required to drive the DNN's test accuracy below 11% as the measurable indicator of BFA performance.
As the experimental results in Table 2 show, for all ResNet architectures with varying quantization bit-widths, the number of bit-flips required to make the DNN malfunction is mostly below 20. Besides N_flip, we take the Hamming distance D between the clean and the perturbed model as another measurable indicator. The intuition is that our proposed BFA flips the selected bits without considering their original status; thus, there is a probability that some bits are flipped repeatedly, an even number of times. In practice, however, such back-and-forth bit-flips rarely happen throughout all experiments. Across the quantization configurations, there is no obvious relation between the quantization bit-width and the required number of bit-flips (i.e., the robustness of the DNN against BFA).
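For reference, the Hamming distance D(\hat{B}, B) is just the population count of the XOR between the clean and perturbed bit tensors (a small illustrative helper):

```python
import numpy as np

def hamming_distance(bits_clean, bits_perturbed):
    """D(B_hat, B): number of positions where the two bit tensors differ."""
    return int(np.sum(bits_clean ^ bits_perturbed))

clean = np.array([0, 1, 1, 0, 1, 0, 0, 1])
perturbed = np.array([0, 1, 0, 0, 1, 0, 1, 1])
print(hamming_distance(clean, perturbed))  # 2
```

Since N_flip counts flip operations while D counts net differing bits, D ≤ N_flip, with equality whenever no bit is flipped back to its original value.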
4.3 BFA on ImageNet
The evaluation of our attack on the ImageNet dataset is summarized in Table 3. We report both the baseline and the 8-bit quantized network accuracies for four popular image classification architectures on ImageNet. We observe roughly a 0.1-0.4% reduction in top-1 classification accuracy after quantizing the network weights to 8 bits. Since the ImageNet dataset has 1000 object classes, a classification accuracy of 0.1% can be considered random output. Thus, reporting the number of bit-flips required to degrade the accuracy below 0.2% is sufficient to demonstrate the attack's effectiveness.






Table 3: BFA on ImageNet with 8-bit quantized networks, reporting top-1/top-5 accuracy (%) before and after quantization, the median number of bit-flips N_flip over 5 trials required to degrade top-1 accuracy below 0.2%, and the corresponding Hamming distance D.

Architecture | FP Top-1/Top-5 | 8-bit Top-1/Top-5 | N_flip | D
AlexNet | 56.55/79.08 | 56.13/78.94 | 17 | 17
ResNet-18 | 69.76/89.08 | 69.50/88.98 | 13 | 13
ResNet-34 | 73.30/91.42 | 73.13/91.38 | 11 | 11
ResNet-50 | 76.15/92.87 | 75.84/92.82 | 11 | 11
For ImageNet, BFA with PBS requires only 17 bit-flips (median of 5 trials) out of 480 million bits to crush AlexNet. N_flip decreases further when the attack is performed on ResNet architectures. Figure 3 shows the accuracy degradation for the ResNet models, which has a much steeper slope than for AlexNet; AlexNet has no residual connections, which may cause a different response to such gradient-based attacks. For the ResNet networks, as the number of network parameters increases, fewer bit-flips are required to attack the network. Finally, our attack makes a ResNet-50 architecture dysfunctional by flipping only 11 out of 200 million bits, i.e., by modifying roughly 0.000003% of the bits of a fully functional DNN. The gravity of this security concern can thus be summarized as follows: two models sharing the same ~50M weights except for a 0.000003% error in the parameters generate totally different outputs, causing a 63% degradation in test accuracy.

4.4 Ablation study
PBS with various sample sizes.
In our experiments, we randomly sample a set of input images from the test/validation subset to perform BFA; we call this set the attack sample. We then evaluate the effectiveness of the attack on the whole test dataset, which serves as validation. We perform the validation on the whole test dataset, including the random batch originally selected for the attack, because the sample size is very small compared to the whole test dataset for both ImageNet and CIFAR-10. In this section, we perform an ablation study on the attack sample size. In Figure 2, we vary the sample size from 16 to 256 and plot the top-1 validation accuracy, top-5 validation accuracy, sample loss, and validation loss, respectively.
Ranking the attack performance by attack sample size, a sample size of 128 performs best: even though the sample size does not greatly affect the attack strength, with a sample size of 128 our attack requires the fewest bit-flips to reach 10% accuracy. On the other hand, with a sample size of 16, the attack strength slightly degrades. This observation suggests selecting an attack sample size that is neither too large nor too small. One probable explanation is that if we compute the gradient with respect to a very large sample, the attack may fail to properly maximize the loss with respect to every input in it; conversely, if the sample size is too small, the sample loss may not be representative of the whole test dataset.
PBS versus random bit-flips.
In this section, we perform an ablation study on randomly flipping bits of randomly chosen weights in the network. First, we test random bit-flips on full-precision (i.e., floating-point) weights of a ResNet-18 model. For floating-point weights represented in the standard IEEE 754 format, changing the most significant bits of the exponent section changes the weight value by a huge amount. As a result, the trained ResNet-18 network starts malfunctioning after just one random bit-flip.
Then, we apply random bit-flips to the 8-bit quantized ResNet-18 architecture, as shown in Figure 4. Even after flipping 100 random bits, the top-1 accuracy on the ImageNet dataset degrades by less than 1%. This demonstrates the need for an efficient bit search algorithm to identify the most vulnerable bits, as randomly flipping bits barely hampers the neural network. In comparison, our attack algorithm requires just 13 out of 93M bit-flips to make ResNet-18 completely malfunction on the ImageNet dataset.
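The contrast with the floating-point case can be made concrete with a small sketch (illustrative code, assuming the 8-bit two's-complement storage of Section 3.2): whatever bit is hit, a flipped int8 weight moves by at most 2^7 quantization steps, so a random flip perturbs the model only mildly, unlike the ~2^128x blow-up of a float32 exponent flip.

```python
import numpy as np

rng = np.random.default_rng(0)

def flip_random_bit(w_int8):
    """Flip one uniformly random bit in each 8-bit two's-complement weight."""
    raw = w_int8.view(np.uint8)                              # reinterpret raw bits
    masks = np.uint8(1) << rng.integers(0, 8, size=raw.shape, dtype=np.uint8)
    return (raw ^ masks).view(np.int8)

w = np.array([37, -5, 100, -90], dtype=np.int8)
w_flipped = flip_random_bit(w)
# Each result differs from the original in exactly one bit and, by dtype,
# remains inside [-128, 127]: the weight magnitude stays bounded.
```

This bounded perturbation is exactly why a targeted search such as PBS is needed to damage a quantized model.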
4.5 Comparison to other methods
Progressive Bit Search is the first bit-search attack algorithm developed to make a quantized neural network malfunction by perturbing stored model parameters through a row hammer attack. We showed in the previous section that prior attack algorithms [12, 20] on floating-point model parameters are not efficient: they do not consider that attacking a floating-point DNN model is as easy as flipping the most significant exponent bits of any random weights. Our BFA with PBS is the first work emphasizing the need for attack algorithms that properly scrutinize the security of DNN model parameters. Our attack can crush a DNN model, demonstrating the DNN's vulnerability to intentional malicious bit-flips. Furthermore, our algorithm should encourage future work on both the attack and defense fronts, toward making neural networks more resilient and robust.
5 Discussion
Why can just a few bit-flips cause such a destructive phenomenon?
In analyzing the existence of adversaries in deep neural networks, Goodfellow et al. [6] concluded that deep neural networks are vulnerable to adversarial examples due to their extremely linear behavior; this linearity is why they cannot resist adversaries. The theory suggests that, with a sufficiently large input dimension, a network will always be vulnerable to noise injected at any layer. Our proposed BFA with PBS also introduces noise at different layers of the DNN, and any noise injected at an intermediate layer is amplified as it is multiplied by the input features of the subsequent layers.
Table 4: Attacking different layers of the network.

Layer to attack | N_flip | Accuracy (%)
First conv. layer | 20 | 10.06
Last linear layer | 20 | 84.61
For the VGG-16 network, we observed a similar phenomenon: among the 15 bit-flips required to degrade the accuracy to 10%, 9 are in the first six layers. Additionally, we confirm this hypothesis of noise propagation across layers with the experiment shown in Table 4: we attack the model while freezing all layers (making them inaccessible to the attacker) except the first, and then do the opposite, freezing all layers except the last. As expected, attacking the first layer achieves higher attack success. However, this linearity theory may be too simple to explain other complex phenomena inside a DNN and may not hold across different architectures. For example, the ResNet architecture, which has skip connections, tends to distribute the bit-flips evenly across different layers.
BFA with PBS does not suffer from gradient obfuscation.
Generating adversarial examples for a quantized network trained with the straight-through estimator can suffer from gradient obfuscation [36, 37]: attacking a quantized network becomes tricky because such a network shows signs of gradient scattering [36]. In this work, we also use a quantized network with a uniform quantizer. However, our network performs inference directly on the quantized weights after training, and we calculate the gradients directly with respect to the quantized weights, thereby avoiding gradient obfuscation. Moreover, the performance of BFA against 4/6/8-bit quantized networks shows that the effectiveness of BFA does not degrade due to the presence of a non-differentiable function in the forward path.
Potential Defense Methods.
To defend against adversarial examples, the most common approach nowadays is to train the network with a mixture of clean and adversarial examples [6, 7]. One proposed defense against BFA would be to train the network to solve Madry's min-max optimization problem [7]. Their approach, called adversarial training, minimizes two losses: one from the real image and one from the adversarial image. Analogously, we perform adversarial training against BFA with PBS by minimizing two such losses: one computed from the original network and the other computed from the same network with one bit flipped, for each batch.
However, unlike adversarial training against input perturbations, such a training scheme does not improve the robustness of the network. Our attack can bypass this adversarial training scheme primarily because of the large search space of close to 93M bits: even if we train the network to be resilient to several bit-flips, some bits will always remain vulnerable. Another potential defense against BFA could be quantization itself; yet our observations in Table 2 show no correlation between the number of quantization bits and the number of bit-flips required. Thus, some popular adversarial defense methods [7, 37] fail against our BFA, which makes the attack even more threatening for deep learning applications.
6 Conclusion
Our proposed attack is the first work on vulnerable-bit search in quantized neural networks. BFA highlights why the security analysis of neural network parameters needs more attention. We demonstrate through extensive experiments and analysis that the vulnerability of DNN parameters to malicious bit-flips is far more severe than anticipated. We encourage further investigation on both the attack and defense fronts, toward developing more resilient networks for deep learning applications.
References

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 1026–1034, 2015.
[2] David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of go without human knowledge. Nature, 550(7676):354, 2017.
[3] Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499, 2016.
[4] Timnit Gebru, Jonathan Krause, Yilun Wang, Duyun Chen, Jia Deng, Erez Lieberman Aiden, and Li Fei-Fei. Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Proceedings of the National Academy of Sciences, 114(50):13108–13113, 2017.
[5] Leon A Gatys, Alexander S Ecker, and Matthias Bethge. A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576, 2015.
[6] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
[7] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.

[8] Zhun Sun, Mete Ozay, Yan Zhang, Xing Liu, and Takayuki Okatani. Feature quantization for defending against distortion of images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7957–7966, 2018.
[9] Adnan Siraj Rakin, Zhezhi He, and Deliang Fan. Parametric noise injection: Trainable randomness to improve deep neural network robustness against adversarial attack. arXiv preprint arXiv:1811.09310, 2018.
 [10] Aaditya Prakash, Nick Moran, Solomon Garber, Antonella DiLillo, and James Storer. Deflecting adversarial attacks with pixel deflection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8571–8580, 2018.
 [11] Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Jun Zhu, and Xiaolin Hu. Defense against adversarial attacks using high-level representation guided denoiser. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1778–1787, 2018.
 [12] Yannan Liu, Lingxiao Wei, Bo Luo, and Qiang Xu. Fault injection attack on deep neural network. In 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 131–138. IEEE, 2017.
 [13] Vivek Seshadri, Donghyuk Lee, Thomas Mullins, Hasan Hassan, Amirali Boroumand, Jeremie Kim, Michael A Kozuch, Onur Mutlu, Phillip B Gibbons, and Todd C Mowry. Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology. In Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, pages 273–287. ACM, 2017.
 [14] Shaahin Angizi, Zhezhi He, Adnan Siraj Rakin, and Deliang Fan. CMP-PIM: An energy-efficient comparator-based processing-in-memory neural network accelerator. In Proceedings of the 55th Annual Design Automation Conference, page 105. ACM, 2018.
 [15] Song Han, Huizi Mao, and William J Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149, 2015.
 [16] Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, and Yuheng Zou. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160, 2016.
 [17] Forrest N Iandola, Song Han, Matthew W Moskewicz, Khalid Ashraf, William J Dally, and Kurt Keutzer. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360, 2016.
 [18] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4510–4520, 2018.
 [19] Alessandro Barenghi, Luca Breveglieri, Israel Koren, and David Naccache. Fault injection attacks on cryptographic devices: Theory, practice, and countermeasures. Proceedings of the IEEE, 100(11):3056–3076, 2012.
 [20] J Breier, X Hou, D Jap, L Ma, S Bhasin, and Y Liu. DeepLaser: Practical fault attack on deep neural networks. arXiv preprint, 2018.
 [21] Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. Flipping bits in memory without accessing them: An experimental study of dram disturbance errors. In ACM SIGARCH Computer Architecture News, volume 42, pages 361–372. IEEE Press, 2014.
 [22] Kaveh Razavi, Ben Gras, Erik Bosman, Bart Preneel, Cristiano Giuffrida, and Herbert Bos. Flip Feng Shui: Hammering a needle in the software stack. In 25th USENIX Security Symposium (USENIX Security 16), pages 1–18, 2016.
 [23] Lucian Cojocar, Kaveh Razavi, Cristiano Giuffrida, and Herbert Bos. Exploiting correcting codes: On the effectiveness of ECC memory against Rowhammer attacks.
 [24] Daniel Gruss, Moritz Lipp, Michael Schwarz, Daniel Genkin, Jonas Juffinger, Sioli O’Connell, Wolfgang Schoechl, and Yuval Yarom. Another flip in the wall of rowhammer defenses. In 2018 IEEE Symposium on Security and Privacy (SP), pages 245–261. IEEE, 2018.
 [25] Xiaoyong Yuan, Pan He, Qile Zhu, and Xiaolin Li. Adversarial examples: Attacks and defenses for deep learning. IEEE Transactions on Neural Networks and Learning Systems, 2019.
 [26] Joseph Clements and Yingjie Lao. Hardware trojan attacks on neural networks. arXiv preprint arXiv:1806.05768, 2018.
 [27] Steve Hollasch. IEEE standard 754 floating point numbers, 2005.
 [28] Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
 [29] Yoshua Bengio, Nicholas Léonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
 [30] Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. CIFAR-10 (Canadian Institute for Advanced Research). URL http://www.cs.toronto.edu/kriz/cifar.html, 2010.
 [31] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
 [32] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
 [33] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15–26. ACM, 2017.
 [34] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pages 506–519. ACM, 2017.
 [35] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
 [36] Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. arXiv preprint arXiv:1802.00420, 2018.
 [37] Ji Lin, Chuang Gan, and Song Han. Defensive quantization: When efficiency meets robustness. In International Conference on Learning Representations, 2019.