Nowadays, state-of-the-art Deep Neural Networks (DNNs) have achieved human-surpassing, record-breaking performance, which inspires more and more applications to adopt DNNs for cognitive computing tasks [1, 2, 3, 4, 5]. Nevertheless, DNNs trained by back-propagation on massive data are vulnerable to various attacks in real-world deployment. Among them, the major security concerns are adversarial input attacks [6, 7], network parameter attacks [8, 9] and Trojan attacks [10, 11].
In this work, our effort is to breach the security of DNNs, focusing on neural Trojan attacks. Recently, several works have proposed methods to inject a Trojan into a DNN which can be activated through designated input patterns [10, 11, 12]. Figure 1 depicts the standard Trojan attack setup delineated by previous works. Before the attack, the original DNN, labelled the clean DNN, performs accurate classification on most input images. However, at the bottom, the Trojan-inserted model misclassifies all inputs to a targeted class 'Bird' with high confidence when a specially designed input pattern, known as the trigger, is embedded in the input. Given trigger-free data as input, the Trojan-inserted DNN maintains normal operation with negligible accuracy difference compared to its clean counterpart; however, input data containing the unique trigger will undoubtedly be misclassified to the designated class.
Recent Trojan attacks all assume the attacker can access the DNN supply chain (e.g., data collection/training/production). A common assumption [11, 13, 10] is that the computing-resource-hungry DNN training procedure is outsourced to a powerful high-performance cloud server, while the trained DNN model is deployed on resource-constrained hardware such as an edge server or mobile device. Almost all existing DNN Trojan attack techniques [10, 11, 14] are conducted during the training phase, namely inserting the Trojan before deploying the trained model to the inference computing platform. For example, Gu et al. assume the attacker acquires edit permission on the training data for network poisoning. Rather than poisoning the clean data, another Trojan attack generates its own re-training data, where the Trojan insertion is conducted by re-training the target DNN using the generated poisoned data. In contrast to previous works, accessing the DNN supply chain is unnecessary in this work. To the best of our knowledge, this is the first DNN Trojan attack performed on the deployed model during inference. Compared to all existing works that require completely re-training the model, our Targeted Bit Trojan (TBT) attack inserts a Trojan into the DNN by flipping a small number of bits of the weight parameters stored in main memory.
In a related line of work, several studies have shown how to attack parameters stored in memory [15, 9]. In particular, flipping certain memory bits to poison neural network parameters is a demonstrated technique [8, 9]. Weights stored in binary format (i.e., multi-bit representation) are thus vulnerable to several fault-injection attacks, for example the row-hammer attack [15, 9, 16]. Such bit-flip attacks can replace the traditional re-training method of Trojan insertion. In this work, we propose a novel Trojan attack scheme specifically designed to insert a Trojan through only several bit-flips, where our main contribution lies in a new algorithm that enables targeted bit Trojan insertion into a DNN model.
Overview of Targeted Bit Trojan (TBT)
In this work, we propose a novel network parameter attack with the objective of injecting a Trojan into a clean DNN model. Our proposed Targeted Bit Trojan (TBT) first utilizes a proposed Neural Gradient Ranking (NGR) algorithm to identify certain vulnerable weights and neurons of the DNN. This algorithm enables an efficient Trojan trigger-generation method, where the generated trigger is specifically designed for a targeted attack. Then, TBT locates certain vulnerable bits of the DNN weight parameters through Trojan Bit Search (TBS), with the following objectives: after flipping this set of weight bits through row-hammer, the network maintains on-par inference accuracy w.r.t. the clean DNN counterpart when the designed trigger is absent; however, the presence of the trigger in the input image forces any input to be classified to a particular target class. We perform extensive experiments on several datasets using various DNN architectures to demonstrate the effectiveness of our proposed method. TBT requires only 82 bit-flips out of 88 million to make a ResNet-18 model successfully classify 93% of test images to a target class on the CIFAR-10 dataset.
2 Related Work and Background
Previous Trojan attacks and their limitations
Trojan attacks on neural networks have received extensive attention recently [17, 11, 10, 12, 14, 18]. Initially, similar to hardware Trojans, some works proposed adding extra circuitry to inject Trojan behaviour; such additional connections are activated by specific input patterns [17, 19, 12]. Another direction for injecting neural Trojans assumes the attacker has access to the training dataset; such attacks are performed by poisoning the training data [11, 13]. However, the assumption that the attacker can access the training process or data is too strong and may not be practical in many real-world scenarios. Besides, such poisoning attacks also suffer from poor stealthiness (i.e., poor test accuracy on clean data).
Recently, a novel algorithm was proposed to generate a specific trigger and sample input data to inject a Trojan without accessing the original training data. Most Trojan attacks have since evolved to generate triggers that improve stealthiness [14, 10] without access to the training data. However, such works focus specifically on re-training the original target model. If the attacker re-trains the model before the inference phase, then the attack is susceptible to various Trojan detection algorithms [18, 20, 21]; such detection schemes are likely to test the model's integrity just before it is deployed for inference. In addition, the assumption that the attacker can re-train the clean model may not always be practical.
Practical feasibility of our attack.
In contrast to previous works, our attack method identifies and flips only a small number of vulnerable weight-parameter bits in memory to inject the Trojan, without model re-training. Note that our proposed TBT does not require access to the training data. The physical bit-flip operation is implemented by the recently discovered row-hammer attack on the computer's main memory. Several works have successfully demonstrated the feasibility of using row-hammer to attack neural network parameters [8, 9]. Thus, our attack method can inject a Trojan at run time, after the DNN model has been deployed to the inference computing platform, without re-training.
Threat Model definition
Our threat model adopts the conventional white-box attack setup delineated in several adversarial attack works [7, 6, 22] and network parameter (i.e., weights, biases, etc.) attack works [8, 9]. In the white-box setup, the attacker has complete knowledge of the target DNN model, including model parameters and network structure. Note that adversarial input attacks (i.e., adversarial examples [6, 7]) assume the attacker can access every test input during the inference phase. In contrast, our method uses a set of randomly sampled data to conduct the attack, instead of synthetic data.
However, our threat model assumes the attacker does not know the training data, training method, or the hyper-parameters used during training. Another major advantage of our threat model is that we assume the attacker cannot re-train the target model: even though the attacker knows the exact configuration and parameters of the target model, he/she does not have the authority to perform re-training on the deployed physical model. Finally, since we conduct experiments with 8-bit quantized networks, we assume the attacker is aware of the weight quantization and encoding methods as well. In this section, we briefly describe the weight quantization and encoding method assumed by our attack model.
where $d$ is the dimension of the weight tensor and $\Delta w$ is the step size of the weight quantizer. For training the quantized DNN with the non-differentiable stair-case function (in equation 2), we use the straight-through estimator, as in other works.
Traditional storage schemes of computing systems adopt two's complement representation for quantized weights, and we use a similar weight representation. If we consider one weight element $w$ with $N_q$-bit two's complement representation $\mathbf{b} = [b_{N_q-1}, \dots, b_0]$, the conversion from its binary representation can be expressed as:
$$ w = \Delta w \cdot \Big( -2^{N_q-1}\, b_{N_q-1} + \sum_{i=0}^{N_q-2} 2^i\, b_i \Big) $$
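The encoding above can be sketched in a few lines of Python. This is our own illustration, not the authors' code; the 8-bit width and the unit step size in the example are assumptions:

```python
# Minimal sketch of the two's complement weight encoding described above:
# a quantized weight index q is stored as N bits [b_(N-1), ..., b_0] and
# recovered as w = dw * (-2^(N-1) * b_(N-1) + sum_i 2^i * b_i).

N = 8  # quantization bit-width (assumption for this example)

def to_twos_complement(q, n_bits=N):
    """Encode an integer weight index q in [-2^(n-1), 2^(n-1)-1] as a
    two's complement bit list, most significant bit first."""
    u = q & ((1 << n_bits) - 1)  # wrap negatives into the unsigned range
    return [(u >> i) & 1 for i in reversed(range(n_bits))]

def from_twos_complement(bits, dw=1.0):
    """Decode two's complement bits back to the real-valued weight w = dw * q."""
    n = len(bits)
    q = -bits[0] * (1 << (n - 1)) + sum(
        b << i for i, b in zip(reversed(range(n - 1)), bits[1:]))
    return dw * q

bits = to_twos_complement(-3)
assert bits == [1, 1, 1, 1, 1, 1, 0, 1]   # -3 in 8-bit two's complement
assert from_twos_complement(bits) == -3.0  # round-trips back to the weight
```

Flipping any single element of `bits` and decoding again shows how one memory bit-flip perturbs the stored weight value.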
3 Proposed Method
In this work, we propose a Trojan insertion technique named Targeted Bit Trojan (TBT), which flips weight bits of the deployed DNN model. Our proposed attack consists of three major steps: 1) The first step is unique trigger generation, which utilizes the proposed Neural Gradient Ranking (NGR). NGR identifies important neurons connected to a target output class to enable efficient Trojan trigger generation, such that all inputs with this trigger are classified to the targeted class. 2) The second step identifies certain vulnerable bits, using the proposed Trojan Bit Search (TBS) algorithm, as the bit Trojan to be inserted into the target DNN. 3) The final step conducts the physical bit-flips [15, 9], based on the bit Trojan identified in the previous step.
3.1 Trigger Generation
For our bit Trojan attack, the first step is trigger generation, similar to other related Trojan attacks. The entire trigger-generation pipeline is introduced sequentially as follows:
3.1.1 Significant neuron identification
In this work, our goal is to force the DNN to misclassify trigger-embedded inputs to the targeted class. Given a DNN model for a classification task, the model has $M$ output categories/classes and $t$ is the index of the targeted attack class. Moreover, the last layer of the model is a fully-connected classifier with $M$ output- and $N$ input-neurons, whose weight matrix is denoted by $\hat{W} \in \mathbb{R}^{M \times N}$. Given a set of sample data $X$ and their labels $Y$, we can calculate the gradients through back-propagation; the accumulated gradients can be described as:
where $\mathcal{L}$ is the loss function of the model. Since the targeted mis-classification category is indexed by $t$, we take all the weights connected to the $t$-th output neuron as $\hat{W}_t$ (highlighted in Eq. 4). Then, we attempt to identify the neurons that have the most significant impact on the targeted $t$-th output neuron, using the proposed Neural Gradient Ranking (NGR) method. The process of NGR can be expressed as:
where the above function returns the indexes of the $k$ gradients with the highest absolute value. Note that the returned indexes also correspond to the input neurons of the last layer that have the highest impact on the $t$-th output neuron.
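The NGR selection step can be sketched as follows. This is our numpy illustration under the notation above, not the authors' code; the toy gradient matrix and the choice of $k$ are assumptions:

```python
# Sketch of NGR: given the accumulated last-layer weight gradients G of
# shape (M, N), keep the row for target class t and return the indexes of
# the k input neurons with the largest absolute gradient magnitude.
import numpy as np

def ngr_top_k(G, t, k):
    """Return indexes of the k most significant input neurons for class t."""
    g_t = np.abs(G[t])                  # gradients of weights feeding output t
    return np.argsort(g_t)[::-1][:k]    # indexes sorted by descending magnitude

rng = np.random.default_rng(0)
G = rng.normal(size=(10, 512))          # toy gradients: 10 classes, 512 neurons
idx = ngr_top_k(G, t=2, k=8)
assert len(idx) == 8
assert np.abs(G[2, idx[0]]) == np.abs(G[2]).max()  # top index is the largest
```

The returned `idx` plays the role of the selected-neuron index set used in the trigger-generation and TBS steps.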
3.1.2 Data-independent trigger generation
In this step, we use the significant neurons identified above. Consider the output of the identified neurons as $g(\hat{x}; \theta')$, where $g$ is the model inference function and $\theta'$ denotes the parameters of the model without the last layer. An artificial target value $t_a$ is created for trigger generation, where we set the constant $t_a$ to 10 in this work. Thus, the trigger generation can be mathematically described as:
where the above minimization is performed through back-propagation, while $\theta'$ is kept fixed. $\hat{x}$ is the generated trigger pattern, which will be zero-padded to the input shape of the model. The $\hat{x}$ generated by the optimization forces the neurons identified in the last step to fire at a large value (i.e., $t_a$).
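The trigger-generation objective can be sketched with a toy example. This is our illustration under simplifying assumptions, not the authors' code: a single fixed linear map `A` stands in for the frozen network up to the last layer, and the neuron indexes, learning rate and sizes are made up for the example:

```python
# Toy sketch of trigger generation: drive the activations of the identified
# neurons toward the artificial target t_a = 10 by gradient descent on the
# trigger pixels only; the network parameters stay fixed throughout.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(32, 64)) * 0.1   # frozen "feature extractor" (assumption)
sel = np.array([3, 7, 11])            # neuron indexes returned by NGR
t_a = 10.0                            # artificial target activation value

x = np.zeros(64)                      # trigger pattern, optimized from zero
lr = 0.5
for _ in range(500):
    act = A[sel] @ x                  # activations of the selected neurons
    grad = 2.0 * A[sel].T @ (act - t_a) / len(sel)  # d/dx of the MSE loss
    x -= lr * grad                    # update trigger; network stays fixed

# the selected neurons now fire near the target value t_a
assert np.allclose(A[sel] @ x, t_a, atol=1e-2)
```

In the real attack the gradient of the selected activations w.r.t. the trigger pixels would come from back-propagation through the full frozen network rather than a closed form.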
3.2 Trojan Bit Search (TBS)
In this work, we assume access to a sample test input batch $X$ with target label $t$. After the attack, each input sample with the trigger $\hat{x}$ will be classified to the target class. We have already identified the most important last-layer weights in the NGR step, whose indexes were returned above. Using the stochastic gradient descent method, we update those weights to achieve the following objective:
After several iterations, the above loss function is minimized to produce a final modified weight matrix $\hat{W}'$. In our experiments, we use an 8-bit quantized network whose weights are represented in binary form, as shown in the weight-encoding section. Thus, after the optimization, the difference between $\hat{W}$ and $\hat{W}'$ amounts to only several bits. If we consider the two's complement bit representations of $\hat{W}$ and $\hat{W}'$ to be $B$ and $B'$ respectively, then the total number of bits ($n_b$) that need to be flipped can be calculated as:
$$ n_b = \mathcal{D}(B, B') $$
where $\mathcal{D}(\cdot, \cdot)$ computes the Hamming distance between the clean and perturbed binary weight tensors. The bit-wise difference $B \oplus B'$ gives the exact locations of the bit flips, and $n_b$ gives the total number required to inject the Trojan into the clean model.
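The TBS bit count can be sketched as below. This is our illustration in the notation above, not the authors' code; the toy weight values are made up:

```python
# Sketch of the TBS bit count: given clean and attacked 8-bit quantized
# weights as integer indexes, XOR their two's complement encodings and
# count the set bits, i.e. n_b = D(B, B') (Hamming distance).
import numpy as np

def count_bit_flips(q_clean, q_attacked, n_bits=8):
    """Hamming distance between the two's complement encodings."""
    mask = (1 << n_bits) - 1
    diff = (np.asarray(q_clean) & mask) ^ (np.asarray(q_attacked) & mask)
    return int(sum(bin(d).count("1") for d in diff.ravel()))  # popcount

q_clean    = np.array([12, -3, 0, 55])
q_attacked = np.array([12, -4, 0, 55])   # only one weight changed (-3 -> -4)
assert count_bit_flips(q_clean, q_attacked) == 1  # -3 and -4 differ in 1 bit
```

The nonzero entries of the XOR array also locate exactly which bits of which weights must be flipped by row-hammer.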
3.3 Targeted Bit Trojan (TBT)
The last step is to put all the pieces of the previous steps together, as shown in figure 2. The attacker performs the previous steps offline (i.e., without modifying the target model). After the offline implementation of NGR and TBS, the attacker has the set of bits that he/she can flip to insert the designed Trojan into the clean model. Additionally, the attacker knows the exact input pattern (i.e., trigger) to activate the Trojan. The final step is to flip the targeted bits to implement the designed Trojan insertion and leverage the trigger to activate the attack. Several attack methods have been developed to practically realize bit-flips in the weights of a DNN stored in main memory (i.e., DRAM) [9, 15]. The attacker can locate the set of targeted bits in memory and use a row-hammer attack to flip the identified bits. TBT can thus infect a clean model with a Trojan through only a few bit-flips. After injecting the Trojan, only the attacker can activate the attack, through the specific trigger he/she designed, to force all inputs to be classified into the target class.
4 Experimental Setup
Dataset and Architecture.
Our attack is evaluated on the popular visual dataset CIFAR-10 for object classification. CIFAR-10 contains 60K RGB images of size 32×32. We follow the standard practice where 50K examples are used for training and the remaining 10K for testing. Most of our analysis is performed on the ResNet-18 architecture, a popular state-of-the-art image classification network; we also evaluate the attack on the popular VGG-16 network. We quantize all networks to an 8-bit quantization level. For CIFAR-10, we assume the attacker has access to one random test batch of size 128. We also evaluate the attack on the SVHN dataset, a set of street-number images with 73,257 training images, 26,032 test images and 10 classes. For SVHN, we assume the attacker has access to three random test batches of size 128, keeping the ratio between total test samples and attacker-accessible data constant for both datasets. Finally, we conduct experiments on ImageNet, a larger dataset with 1000 classes. For ImageNet, we perform 8-bit quantization directly on a pre-trained ResNet-18.
Baseline methods and Attack parameters.
We compare our work with two popular and successful Trojan attacks following two different attack methodologies. The first one is BadNet, which poisons the training data to insert the Trojan. To generate the trigger for BadNet, we use a square mask with pixel value 1; the trigger size is the same as our mask to make the comparison fair. We use a multiple-pixel attack with backdoor strength K=1. Additionally, we compare with another strong attack, Trojan NN, which uses a different trigger-generation and Trojan-insertion technique than ours. We implement their Trojan-generation technique on the VGG-16 network, but do not use their data-generation and denoising techniques, as our work assumes the attacker has access to a set of random test batches. To make the comparison fair, we use a similar trigger area, number of neurons and other parameters for all baseline methods. Finally, for all methods we run the attack 5 times and report the average performance.
4.1 Evaluation Metrics
Test Accuracy (TA). Percentage of test samples correctly classified by the DNN model.
Attack Success Rate (ASR). Percentage of test samples classified to the target class by the Trojaned DNN model when the trigger is present.
Number of Weights Changed ($w_c$). The number of weights whose values differ between the model before the attack (i.e., the clean model) and the model after Trojan insertion (i.e., the attacked model).
Stealthiness Ratio (SR). The ratio of (test accuracy × attack success rate) to $n_b$, where a higher SR indicates the attack does not alter the normal operation of the model and is less likely to be detected, while a lower SR indicates the attacker's inability to conceal the attack.
Number of Bits Flipped ($n_b$). The number of bits the attacker needs to flip to transform a clean model into an attacked model.
Trigger Area Percentage (TAP). The percentage of the input image area the attacker needs to replace with the trigger.
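The metrics above can be computed as sketched below on toy predictions. This is our illustration; the prediction arrays and bit count are made up, and the SR formula follows our reading of the garbled definition (TA × ASR over $n_b$), which is an assumption:

```python
# Toy computation of the evaluation metrics defined above.
import numpy as np

y_true  = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])  # ground-truth labels
y_clean = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 1])  # predictions on clean inputs
y_trig  = np.array([2, 2, 2, 2, 2, 0, 2, 2, 2, 2])  # predictions on triggered inputs
target  = 2                                          # attacker's target class
n_b     = 82                                         # bits flipped (example value)

TA  = float(np.mean(y_clean == y_true))   # test accuracy on clean data
ASR = float(np.mean(y_trig == target))    # attack success rate with trigger
SR  = TA * ASR / n_b                      # stealthiness ratio (assumed form)

assert TA == 0.9 and ASR == 0.9
```

A higher SR here corresponds to an attack that preserves clean accuracy while needing very few bit-flips.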
5 Experimental Results
5.1 CIFAR-10 Results
Table 1 summarizes the test accuracy and attack success rate for different target classes of the CIFAR-10 dataset. The baseline 8-bit quantized ResNet-18 achieves 93.01% test accuracy on CIFAR-10. We observe a certain drop in test accuracy for all targeted classes; the highest post-attack test accuracy, 91.16%, occurs when class 9 is chosen as the target class.
We find that attacking classes 3, 4 and 6 is the most difficult; these target classes also suffer from poor test accuracy after the attack. We suspect the location of the trigger may be critical to improving the ASR for classes 3, 4 and 6, since not all classes have their important input features at the same location. We further investigate different classes and trigger locations in the following discussion section. For now, we choose class 2 as the target class for the subsequent investigation and comparison sections.
Observing the attack success rate (ASR) column, it is evident that certain classes are more vulnerable to the targeted bit Trojan attack than others. Table 1 shows that classes 1 and 0 are much easier to attack, as reflected by their higher ASR values. We do not observe any obvious relation between test accuracy and attack success rate, but it is fair to say that if the test accuracy remains relatively high for a certain target class, that class will likely yield a higher attack success rate as well.
5.2 Ablation Study.
Effect of Trigger Area.
In this section, we vary the trigger area (TAP) and summarize the results in table 2. In this ablation study, we keep the number of weights modified from the clean model fairly constant (140-146). Increasing the trigger area clearly improves the attack strength and thus the ASR. However, increasing the TAP beyond 9.76% hampers the test accuracy severely. As a result, for all following experiments, we use 9.76% as the trigger area.
One key observation is that even though we keep $w_c$ fairly constant, the value of $n_b$ still decreases with increasing trigger area. This implies that a larger trigger area requires fewer vulnerable bit-flips to inject the bit Trojan. Thus, considering practical constraints such as time, if the attacker is restricted to a limited number of bit-flips using row-hammer, he/she can increase the trigger area to decrease the bit-flip requirement. However, increasing the trigger area may also expose the attacker to detection-based defenses.
Effect of $w_c$.
Next, we keep the trigger area constant but vary the number of weights modified, as shown in table 3. As expected, $n_b$ increases with increasing $w_c$. The attack success rate also improves with increasing values of $w_c$.
We observe that by modifying only 25 weights, TBT can achieve close to 92.36% ASR, even though the test accuracy is low (83.65%). A $w_c$ of around 140 appears optimal for both test accuracy (90.46%) and attack success rate (93.48%). Increasing $w_c$ beyond this point is undesirable for two reasons: first, the test accuracy suffers heavily; second, it requires far too many bit-flips to implement the Trojan insertion.
In the last row of table 3, we change the TAP to 11.82% to demonstrate that TBT can achieve 93.14% ASR and 85.1% TA with just 85 bit-flips. Our attack gives the attacker a wide range of attack-strength choices, such as $w_c$ and TAP, to trade off TA, ASR and $n_b$.
5.3 Comparison to other competing methods.
Table 4 summarizes TBT's performance against the baseline methods. For CIFAR-10 and SVHN, we use trigger areas of 9.76% and 11.82%, respectively. We ensure all other hyper-parameters and model parameters are the same across the baseline methods for a fair comparison.
For CIFAR-10, the VGG-16 model before the attack has a test accuracy of 91.42%. After the attack, we observe a test accuracy drop in all cases; despite this, our method achieves the highest test accuracy of 87.87%. Our proposed Trojan successfully classifies 92.36% of test data to the target class. Our attack shows a 3% and 4% drop in attack success rate compared to the two baselines, Trojan NN and BadNet, respectively. However, the major contribution of our work is highlighted in the $w_c$ column, as our model requires by far the least number of modified weights to insert the Trojan. Such a low $w_c$ ensures our method can be implemented online in the deployed inference engine through a row-hammer-based bit-flip attack, requiring only a few bit-flips to poison a DNN. Additionally, since we modify only a very small portion of the DNN model, our method is less susceptible to attack-detection schemes, and it reports a much higher SR score than all the baseline methods.
For SVHN, our observations follow the same pattern. Our attack achieves a moderate test accuracy of 84.58%, and TBT performs on par with Trojan NN with similar ASR. BadNet outperforms the other methods with higher TA and ASR, but this dominance can be attributed to the assumption that the attacker is in the supply chain and can poison the training data. Practically, access to the training data is a much stronger requirement. Further, it has already been shown that BadNet is vulnerable to the Trojan detection schemes proposed in previous works [18, 21].
We are the first to evaluate a Trojan attack on a large-scale dataset such as ImageNet. For ImageNet, we choose a TAP of 11.82% and a $w_c$ of 150. Our proposed TBT achieves a 60.35% attack success rate, but due to the presence of 1000 output classes, the Top-1 and Top-5 test accuracies drop to 45.67% and 73.86%, respectively. We also test the Trojan NN attack on ImageNet to evaluate the relative performance: even though Trojan NN achieves a higher ASR, its Top-1 test accuracy collapses to 1% when we attempt to train with triggered images. As a result, our result on ImageNet can be considered the first successful (ASR 60%) implementation of a Trojan attack on the ImageNet dataset.
Relationship between $n_b$ and ASR.
As discussed, an attacker may face various limitations depending on the application. In an attack scenario where the attacker does not need to worry about test accuracy or stealthiness, he/she can choose an aggressive approach and attack the DNN with a minimum number of bit-flips. Figure 3 shows that just around 82 bit-flips yield an aggressive attack: it achieves the highest attack success rate of 93% with a lower (83%) test accuracy. Flipping more than 82 bits does not improve attack strength, but helps ensure higher test accuracy.
Trojan Location and Target Class Analysis.
We attribute the low ASR of our attack in table 1 for certain classes (i.e., 3, 4, 6) to the trigger location. We conjecture that not all classes have their important features in the same location; thus, keeping the trigger location constant for all classes may hamper attack strength. As a result, for target classes 3, 4 and 6, we vary the Trojan location across three positions: bottom right, top left and center.
Table 5 shows that the optimum trigger location differs across classes. If the trigger is located at the top-left section of the image, we can successfully attack classes 3 and 6, which might indicate that the important features of these classes lie near the top-left region. For class 4, we find a center trigger works best. Thus, one key decision for the attacker before the attack is the optimum trigger location, since the attack performance on a certain target class is heavily linked to it.
Trigger Noise Level.
In neural Trojan attacks, the trigger is usually visible to the human eye [10, 11]. Depending on the attack scenario, however, the attacker may need to hide the trigger. Thus, we experiment with restricting the noise level of the trigger to 6%, 0.2% and 0.02% in figure 4 (the noise level is defined in the caption of figure 4). We find that the trigger's noise level is strongly correlated with the attack success rate. The proposed TBT still fools the network with a 79% success rate even when we restrict the noise level to 0.2% of the maximum pixel value. If the attacker chooses to make the trigger less detectable by Trojan detection schemes, he/she needs to sacrifice attack strength.
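The noise-level restriction can be sketched as below. This is our assumption about the mechanism (element-wise clamping of the trigger perturbation to a fraction of the maximum pixel value), not the authors' exact procedure; the patch size and values are made up:

```python
# Sketch of restricting the trigger noise level: clamp the trigger
# perturbation to eps * max_pixel before embedding it in the input.
import numpy as np

def clamp_trigger(trigger, eps, max_pixel=255.0):
    """Limit each trigger pixel's deviation to eps * max_pixel."""
    bound = eps * max_pixel
    return np.clip(trigger, -bound, bound)

rng = np.random.default_rng(0)
trigger = rng.normal(scale=50.0, size=(8, 8))  # unconstrained trigger patch
visible = clamp_trigger(trigger, eps=0.06)     # 6% noise level
subtle  = clamp_trigger(trigger, eps=0.002)    # 0.2% noise level
assert np.abs(subtle).max() <= 0.002 * 255.0
assert np.abs(visible).max() <= 0.06 * 255.0
```

Tightening `eps` makes the trigger harder to spot but, as reported above, also lowers the achievable attack success rate.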
Potential Defense Methods
Trojan detection and defense schemes
As the development of Trojan attacks accelerates, the corresponding defense techniques demand thorough investigation as well. Recently, a few defenses have been proposed to detect the presence of a potential Trojan in a DNN model [10, 21, 20, 18]. The Neural Cleanse method uses a combination of pruning, input filtering and unlearning to identify backdoor attacks on the model. Fine-Pruning is a similar method that fine-prunes the Trojaned model after the backdoor attack has been deployed. Activation clustering has also been found effective in detecting Trojan-infected models. Additionally, checking the distribution of falsely classified test samples has been proposed to detect potential anomalies in the model. These defenses have been successful in detecting several popular Trojan attacks [10, 11]; their effectiveness makes most of the previous attacks essentially impractical.
However, one major limitation of these defenses is that they can only detect a Trojan inserted during the training process or in the supply chain. None of them can effectively defend at run time, once inference has already started. As a result, our online Trojan insertion makes TBT immune to the proposed defenses: only the attacker decides when he/she flips the bits, and it is impractical to perform fine-pruning or activation clustering continuously during run time. Thus, our attack can be implemented after the model has passed through the security checks of Trojan detection.
Data Integrity Check on the Model
The proposed TBT relies on flipping bits of the model parameters stored in main memory, so one possible defense is a data integrity check on the model parameters. Popular techniques to ensure data integrity include Error-Correcting Code (ECC) memory and Intel's SGX. However, row-hammer attacks are becoming strong enough to bypass such security measures, including ECC and Intel's SGX. This defense analysis makes our proposed TBT an extremely strong attack method, leaving modern DNNs more vulnerable than ever, and our work thus encourages further investigation into defending neural networks against such online attacks.
Our proposed Targeted Bit Trojan attack is the first work to implant a Trojan into a DNN model without any re-training. The proposed algorithm enables Trojan insertion into a DNN model through only several bit-flips using a row-hammer attack. Such a run-time attack puts DNN security under severe scrutiny. As a result, TBT calls for more vulnerability analysis of DNNs at run time to ensure the secure deployment of DNNs in practical applications.
-  Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, pages 1026–1034, 2015.
-  Geoffrey Hinton, Nitish Srivastava, and Kevin Swersky. Neural networks for machine learning. Coursera, video lectures, 264, 2012.
-  Arjun Nitin Bhagoji, Daniel Cullina, Chawin Sitawarin, and Prateek Mittal. Enhancing robustness of machine learning systems via data transformations. 2018 52nd Annual Conference on Information Sciences and Systems (CISS), pages 1–5, 2018.
-  Adnan Siraj Rakin, Shaahin Angizi, Zhezhi He, and Deliang Fan. Pim-tgan: A processing-in-memory accelerator for ternary generative adversarial networks. In 2018 IEEE 36th International Conference on Computer Design (ICCD), pages 266–273. IEEE, 2018.
-  Shaahin Angizi, Zhezhi He, Adnan Siraj Rakin, and Deliang Fan. Cmp-pim: an energy-efficient comparator-based processing-in-memory neural network accelerator. In Proceedings of the 55th Annual Design Automation Conference, page 105. ACM, 2018.
-  Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
-  Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
-  Adnan Siraj Rakin, Zhezhi He, and Deliang Fan. Bit-flip attack: Crushing neural network with progressive bit search. arXiv preprint arXiv:1903.12269, 2019.
-  Sanghyun Hong, Pietro Frigo, Yiğitcan Kaya, Cristiano Giuffrida, and Tudor Dumitraş. Terminal brain damage: Exposing the graceless degradation in deep neural networks under hardware fault attacks. arXiv preprint arXiv:1906.01017, 2019.
-  Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. Trojaning attack on neural networks. In 25th Annual Network and Distributed System Security Symposium, NDSS 2018, San Diego, California, USA, February 18-21, 2018. The Internet Society, 2018.
-  Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. Badnets: Identifying vulnerabilities in the machine learning model supply chain. arXiv preprint arXiv:1708.06733, 2017.
-  Minhui Zou, Yang Shi, Chengliang Wang, Fangyu Li, WenZhan Song, and Yu Wang. Potrojan: powerful neural-level trojan designs in deep learning models. arXiv preprint arXiv:1802.03043, 2018.
-  Yuntao Liu, Yang Xie, and Ankur Srivastava. Neural trojans. In 2017 IEEE International Conference on Computer Design (ICCD), pages 45–48. IEEE, 2017.
-  Tao Liu, Wujie Wen, and Yier Jin. Sin 2: Stealth infection on neural network—a low-cost agile neural trojan attack methodology. In 2018 IEEE International Symposium on Hardware Oriented Security and Trust (HOST), pages 227–230. IEEE, 2018.
-  Yannan Liu, Lingxiao Wei, Bo Luo, and Qiang Xu. Fault injection attack on deep neural network. In 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 131–138. IEEE, 2017.
-  Qutaiba Alasad, Jiann Yuan, and Jie Lin. Resilient aes against side-channel attack using all-spin logic. In Proceedings of the 2018 on Great Lakes Symposium on VLSI, pages 57–62. ACM, 2018.
-  Joseph Clements and Yingjie Lao. Hardware trojan attacks on neural networks. arXiv preprint arXiv:1806.05768, 2018.
-  Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y Zhao. Neural cleanse: Identifying and mitigating backdoor attacks in neural networks. In 2019 IEEE Symposium on Security and Privacy (SP), 2019.
-  Wenshuo Li, Jincheng Yu, Xuefei Ning, Pengjun Wang, Qi Wei, Yu Wang, and Huazhong Yang. Hu-fu: Hardware and software collaborative attack framework against neural networks. In 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pages 482–487. IEEE, 2018.
-  Kang Liu, Brendan Dolan-Gavitt, and Siddharth Garg. Fine-pruning: Defending against backdooring attacks on deep neural networks. In International Symposium on Research in Attacks, Intrusions, and Defenses, pages 273–294. Springer, 2018.
-  Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, and Biplav Srivastava. Detecting backdoor attacks on deep neural networks by activation clustering. arXiv preprint arXiv:1811.03728, 2018.
-  Zhezhi He, Adnan Siraj Rakin, and Deliang Fan. Parametric noise injection: Trainable randomness to improve deep neural network robustness against adversarial attack. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 588–597, 2019.
-  Szymon Migacz. 8-bit Inference with TensorRT. NVIDIA, 2018.
-  Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, and Yuheng Zou. Dorefa-net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160, 2016.
-  Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. Binaryconnect: Training deep neural networks with binary weights during propagations. In Advances in neural information processing systems, pages 3123–3131, 2015.
-  Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. Cifar-10 (canadian institute for advanced research). URL http://www.cs.toronto.edu/kriz/cifar.html, 2010.
-  Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
-  Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
-  Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. In NIPS workshop on deep learning and unsupervised feature learning, page 5, 2011.
-  Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
-  Lucian Cojocar, Kaveh Razavi, Cristiano Giuffrida, and Herbert Bos. Exploiting correcting codes: On the effectiveness of ecc memory against rowhammer attacks. S&P’19, 2019.
-  Daniel Gruss, Moritz Lipp, Michael Schwarz, Daniel Genkin, Jonas Juffinger, Sioli O’Connell, Wolfgang Schoechl, and Yuval Yarom. Another flip in the wall of rowhammer defenses. In 2018 IEEE Symposium on Security and Privacy (SP), pages 245–261. IEEE, 2018.