ISA4ML: Training Data-Unaware Imperceptible Security Attacks on Machine Learning Modules of Autonomous Vehicles

11/02/2018 · by Faiq Khalid, et al. · TU Wien

Due to their big data analysis capability, machine learning (ML) algorithms are becoming popular for several applications in autonomous vehicles. However, ML algorithms possess inherent security vulnerabilities, which increases the demand for robust ML algorithms. Recently, various groups have demonstrated how these vulnerabilities can be exploited to perform several security attacks, i.e., confidence reduction and random/targeted misclassification, using data manipulation techniques. Such traditional data manipulation techniques, especially during the training stage, introduce random visual noise. However, this visual noise can be detected during the attack or testing stage through noise detection/filtering or a human in the loop. In this paper, we propose a novel methodology to automatically generate an "imperceptible attack" by exploiting the back-propagation property of trained deep neural networks (DNNs). Unlike state-of-the-art inference attacks, our methodology does not require any knowledge of the training dataset during attack image generation. To illustrate the effectiveness of the proposed methodology, we present a case study for traffic sign detection in an autonomous driving use case. We deploy the state-of-the-art VGGNet DNN trained on the German Traffic Sign Recognition Benchmarks (GTSRB) dataset. Our experimental results show that the generated attacks are imperceptible in both subjective tests (i.e., visual perception) and objective tests (i.e., without any noticeable change in the correlation and structural similarity index), yet still perform successful misclassification attacks.




I Introduction

With the substantial growth in technologies for smart cyber-physical systems, there has been a significant impact on the emergence of autonomous vehicles. For example, the number of autonomous vehicles in the US, China, and Europe is increasing exponentially, as shown in Fig. 1, and will reach approximately 80 million by 2030 [1][2]. However, the amount of data generated by the multiple sensor nodes, i.e., LiDAR, navigation, camera, radar, and other sensors, is huge (4 terabytes per day, see Fig. 1). Therefore, to efficiently handle this amount of data, the following research challenges need to be addressed:

  1. How to increase the computing capability to process this data with minimum energy overhead?

  2. How to increase the storage capability to store this data in an interpretable form with minimum energy and area overhead?

Thus, there is a dire need to develop computing architectures, methodologies, frameworks, algorithms, and tools for handling the big data in autonomous vehicles. Machine learning (ML) algorithms, especially deep neural networks (DNNs), serve as a prime solution because of their ability to efficiently process big data to solve tough problems in recognition, mining, and synthesis [3][4]. DNNs in autonomous vehicles not only address the huge data processing requirements but have also revolutionized several aspects of autonomous vehicles [5], e.g., obstacle detection, traffic sign detection, etc.

I-A Security Threats in ML Modules of Autonomous Vehicles

Several key aspects, i.e., collision avoidance, traffic sign detection, and navigation with path following, are based on machine learning (ML) [5]. Therefore, these aspects are exposed to several security threats against ML algorithms, as shown in Fig. 2, which arise from the unpredictability of the computations in the hidden layers of these algorithms [6]. As a result, autonomous vehicles are becoming more vulnerable to several security threats [7]. For instance, misclassification in object detection or traffic sign detection may lead to catastrophic incidents [8][9][10]. Fig. 3 shows two scenarios where an attacker can target a traffic sign for misclassification.

Fig. 1: Increasing Trend of Automation in Self-driving Cars; the amount of data generated per day in autonomous vehicles [1].
Fig. 2: An Overview of Security Threats/Attacks and their respective payloads for Machine Learning Algorithms during Training, Inference, and their respective Hardware Implementations.
Fig. 3: Attack Scenarios and corresponding Threat Model; In both scenarios, the attacker adds attack noise to the inference data. However, the noise is visible in scenario II, whereas in scenario III it is not.

Unlike traditional systems, the manufacturing cycle of DNNs is based on three stages, i.e., training, hardware implementation, and inference at real time. Each stage possesses its own security vulnerabilities [7], like data manipulation, and corresponding payloads (i.e., confidence reduction, given as ambiguity in classification, and random or targeted misclassification) [11, 12, 3], as shown in Fig. 2. For example, during the training stage [13, 14], the dataset [15], tools, or architecture/model are vulnerable to security attacks, i.e., adding parallel layers or neurons [16][17], to perform security attacks [18, 19]. Similarly, during the hardware implementation and inference stages, the computational hardware and the real-time dataset can be exploited to perform security attacks [20].

In the context of autonomous vehicles, data poisoning is one of the most commonly used attacks on ML modules. Typically, these attacks can be performed in two different ways:

  1. Training Data Poisoning (TDP): This attack targets training samples by introducing small-pattern noise in the data samples to train the network on that particular noise pattern [21]. However, for successful execution, the attacker requires complete access to the training dataset and the inference data acquisition blocks. Moreover, for most of these techniques, the noise pattern is visible and can be removed by inspection.

  2. Inference Data Poisoning (IDP): This attack exploits the black-box model of the ML modules to learn a noise pattern that can perform misclassification or confidence reduction [22][23]. These learned noise patterns can be visible [24] or imperceptible. For example, adversarial attacks are a prime example of imperceptible attacks on ML modules, i.e., the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method [25], the fast gradient sign method (FGSM) [26, 27, 20], the Jacobian-based Saliency Map Attack (JSMA) [7], etc. Similarly, these attacks also require access to the inference data acquisition block. Though adversarial examples perform imperceptible security attacks, their optimization algorithms require reference samples from the training dataset.

I-B Associated Research Challenges

The visibility problem of TDP and the training sample dependency of IDP raise the following research challenges:

  1. How to introduce a simple but imperceptible attack noise pattern into the dataset?

  2. How to avoid the dependency on training data samples or reference images when generating the attack noise?

I-C Our Novel Contribution

To address the above-mentioned research challenges, in this paper, we propose an iterative methodology to develop an imperceptible attack without any knowledge of the training data samples. Our contributions in a nutshell:

  1. Analysis of Training Data Poisoning: We present an analysis to understand the impact of direct training data poisoning (i.e., introducing noise into the original samples) on the overall accuracy of a given DNN, and propose the alternative of an extended data poisoning attack (i.e., extending the original dataset with intruded samples).

  2. Methodology to Design ISA4ML: We propose a methodology to develop an imperceptible attack without any dependency on the training samples.

  3. Open-Source Contributions: For reproducible research, we will release the implementation of our imperceptible attack, design files, and model sources online at: http://LinkHiddenForBlindReview.

I-D Paper organization

Section II discusses training data poisoning and presents an analysis for understanding the effects of direct and indirect training data poisoning. Section III discusses the limitations of different inference data poisoning attacks along with our analysis results. Section IV presents our proposed methodology for developing imperceptible attacks, followed by a case study on employing such attacks on the ML modules deployed in autonomous vehicles. Section V concludes the paper.

Fig. 4: Data poisoning attacks during the training of ML Modules for random misclassification and confidence reduction.

II Data Poisoning Attacks on ML Training

In this section, we present an analysis of direct data poisoning of the training dataset, which shows its impact on the accuracy of the ML module. Moreover, based on this analysis, we propose a modification of the data poisoning attack that has relatively less impact on the inference accuracy.

II-A Experimental Setup

For this analysis, we use the following experimental setup:

  1. Neural Network and Dataset: We use a VGGNet [28] trained on the German Traffic Sign Recognition Benchmarks (GTSRB) dataset [29].

  2. Noise: To analyze the effect of data poisoning, we mimic TDP by introducing salt-and-pepper noise with different strengths (1% to 5%) into the input data samples.

  3. Number of Intruded Samples: We perturb only 0.5% of the data samples.

  4. Threat Model: As is typical in the security community, we assume that the attacker has complete access to the training data samples and can also manipulate the trained neural network.

  5. Data Poisoning Attacks: We implemented the following two attacks:

    1. Attack 1: In this attack, the attacker adds the attack noise to some samples of the original training dataset and then uses this perturbed dataset for training the DNN, which learns this pattern to perform random misclassification or confidence reduction, as shown in Fig. 4.

    2. Attack 2: In this attack, we propose to extend the original dataset with perturbed copies of some of the training samples, as shown in Fig. 4.
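The two attacks above can be sketched as follows. This is a minimal NumPy illustration of our setup, not the exact implementation: the function names, image shapes, and random seeds are our own (hypothetical) choices, while the salt-and-pepper noise and the 0.5% poisoning fraction follow the experimental setup.

```python
import numpy as np

def salt_and_pepper(img, strength=0.05, rng=None):
    # Flip roughly a `strength` fraction of pixels to 0 (pepper) or 1 (salt).
    rng = np.random.default_rng(0) if rng is None else rng
    noisy = img.copy()
    mask = rng.random(img.shape) < strength
    noisy[mask] = rng.choice([0.0, 1.0], size=int(mask.sum()))
    return noisy

def poison_dataset(images, labels, fraction=0.005, extend=True, rng=None):
    # Attack 1 (extend=False): overwrite a fraction of samples in place.
    # Attack 2 (extend=True): append perturbed copies, keeping the originals.
    rng = np.random.default_rng(1) if rng is None else rng
    n = max(1, int(fraction * len(images)))
    idx = rng.choice(len(images), size=n, replace=False)
    perturbed = np.stack([salt_and_pepper(images[i], rng=rng) for i in idx])
    if extend:
        return (np.concatenate([images, perturbed]),
                np.concatenate([labels, labels[idx]]))
    poisoned = images.copy()
    poisoned[idx] = perturbed
    return poisoned, labels

# Toy usage: 200 grayscale 8x8 "images", 0.5% of them poisoned
images = np.random.default_rng(2).random((200, 8, 8))
labels = np.zeros(200, dtype=int)
ext_imgs, ext_lbls = poison_dataset(images, labels, extend=True)   # Attack 2
dir_imgs, dir_lbls = poison_dataset(images, labels, extend=False)  # Attack 1
```

Note the design difference: Attack 2 leaves every original sample in the training set, which is what preserves the learning behavior on the clean data in our analysis below.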

II-B Experimental Analysis

In this section, we analyze the impact of the direct and the modified data poisoning attacks under the assumptions mentioned in the experimental setup. We make the following observations from this analysis:

  1. The analysis shows that both attacks manage to reduce the confidence and perform random misclassification with low probability. For example, in the presence of 5% salt-and-pepper noise, attacks 1 and 2 can misclassify the Stop sign as a Speed Limit 60 km/h sign with confidences of 25.634% and 15.273%, respectively.

  2. The analysis in Fig. 5 shows that the impact of attack 1, which directly perturbs the original dataset, is almost 10% higher than that of attack 2. The reason for this behavior is that, in attack 2, the implemented VGGNet is trained on the complete dataset; the perturbation is learned through the additional perturbed data samples, while the learning behavior w.r.t. the original dataset is preserved.

II-C Key Insights

  1. The number of perturbed samples directly impacts the inference accuracy but, on the other hand, also increases the effectiveness of the attack. Therefore, a proper analysis is required to choose this parameter.

  2. Directly perturbing the training dataset limits the effectiveness of the poisoning attack, because it reduces the feasible range for the number of intruded samples.

  3. To perform a successful data poisoning attack, choose the number of intruded samples while considering the impact on inference accuracy, and avoid direct perturbation of the dataset.

Fig. 5: Experimental results for the two data poisoning attacks (attacks 1 and 2, see Fig. 4). This analysis shows that the direct training data poisoning attack has a significant impact on the overall accuracy (up to 15%), while the indirect attack has relatively less impact.

II-D Limitations

Though the training data poisoning attacks are effective, they possess the following limitations:

  1. These attacks require complete access to the entire training dataset, the baseline neural network, and the data acquisition during inference.

  2. Most training data poisoning attacks introduce noise which is visible and can be countered by careful inspection of the data acquisition and the implemented system.

III Data Security Attacks on ML Inference

In the development/manufacturing cycle of ML-driven systems, like that of traditional systems, the inference stage of ML algorithms comes with security vulnerabilities, i.e., manipulation of the data acquisition block, manipulation of communication channels, and side-channel analysis to manipulate the inference data [30][31]. Remote cyber-attacks and side-channel attacks come with high computational costs and are therefore less frequently used [22][32]. Therefore, to design/implement Inference Data Poisoning (IDP) attacks, we need to consider the following challenges:

  1. How to relax the assumption of having access to the inference data acquisition block?

  2. How to generate an attack noise pattern which is imperceptible?

To address these research challenges, several imperceptible IDPs have been proposed; among them, adversarial examples are one of the most effective IDP attacks. Therefore, in this section, we analyze and discuss two adversarial example attacks, i.e., the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method and the Fast Gradient Sign Method (FGSM).

III-1 L-BFGS Method

This method was proposed by Szegedy et al. to generate adversarial examples for deep neural networks [25, 33]. The basic principle of the L-BFGS method is to iteratively optimize a perturbation r with respect to a sample image x from the dataset, as shown in Equation 1:

minimize ||r||_2 subject to f(x + r) = l and x + r in [0, 1]^m, (1)

where l is the target label and the minimization of ||r||_2 represents the imperceptibility of the perturbation. To illustrate the effectiveness of this method, we demonstrated this attack on the VGGNet trained on the GTSRB, as shown in Fig. 6. This experimental analysis shows that by introducing the adversarial noise to the image, the input is misclassified, i.e., from a Stop sign to a Speed Limit 60 km/h sign.


Although the L-BFGS method generates adversarial examples with imperceptible noise, it relies on a basic line search algorithm to update the noise during optimization, which makes it computationally expensive and slow.

III-2 Fast Gradient Sign Method (FGSM)

To address the above-mentioned limitation of L-BFGS, Goodfellow et al. proposed the Fast Gradient Sign Method to generate adversarial examples [26, 27]. FGSM is faster and requires less computation because it performs a one-step gradient update along the direction of the sign of the gradient at each pixel, i.e., x_adv = x + e * sign(grad_x J(theta, x, y)). To demonstrate the effect of this attack, we implemented it on the VGGNet trained on the GTSRB, as shown in Fig. 6. The experimental results show that it can perform misclassification without any visible noise in the sample.
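The one-step FGSM update can be sketched as follows. To keep the sketch self-contained we stand in a simple logistic-regression "model" for the VGGNet, so the weights, input, and epsilon here are purely hypothetical; only the update rule x_adv = x + e * sign(d loss / d x) is the method itself.

```python
import numpy as np

def fgsm_perturb(x, w, b, y_true, eps=0.1):
    # One-step FGSM on a logistic-regression stand-in for a DNN:
    # perturb the input along the sign of the loss gradient w.r.t. x.
    z = x @ w + b                    # logit
    p = 1.0 / (1.0 + np.exp(-z))     # sigmoid probability
    # Gradient of binary cross-entropy w.r.t. the input x is (p - y) * w
    grad_x = (p - y_true) * w
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

# Toy usage: a 4-pixel "image" classified by fixed, hypothetical weights
w = np.array([1.5, -2.0, 0.5, 1.0])
b = -0.2
x = np.array([0.6, 0.1, 0.8, 0.3])
x_adv = fgsm_perturb(x, w, b, y_true=1.0, eps=0.05)
```

Because the update is a single gradient-sign step, each pixel moves by exactly eps (before clipping), which is what makes FGSM cheap compared to the iterative line search of L-BFGS.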

Fig. 6: Experimental Analysis of the L-BFGS and FGSM attacks.

III-A Limitations

Most of these attacks generate imperceptible noise patterns, but their optimization algorithms require reference sample(s) from the training dataset, which limits their attack strength. This limitation raises a research question: How to generate an imperceptible Inference Data Poisoning attack without any access to the training dataset?

IV A Methodology for Designing a Training Data-Unaware Imperceptible Attack

Fig. 7: Proposed Methodology for Training Data-Unaware Imperceptible Security Attacks on Machine Learning Algorithms

To address the above-mentioned limitations regarding imperceptibility and the dependency on training data samples, we propose an attack methodology which leverages the back-propagation algorithm to optimize the reference image. The proposed methodology consists of the following steps, as shown in Fig. 7:

  1. We initially choose a classification/prediction probability distribution for the targeted class, P_t, and compute the classification/prediction probability distribution of the target image, P(x). To identify the difference between these prediction/classification probabilities, we compute a cost function C(P_t, P(x)).

  2. Afterwards, we minimize the cost function for classification with higher confidence by iteratively comparing it with zero. If its value is greater than zero, we back-propagate this effect to the target image, i.e., we update the image along the gradient of the cost function with respect to the input pixels, computed through the chain rule over the activation functions, the corresponding outputs of the neurons, and the weights of the previous layers. We repeat this step until the cost function is approximately equal to zero.

  3. We then compute and compare the correlation coefficient (CC) and the structural similarity index (SSIM) against predefined bounds to ensure imperceptibility. We use this two-level similarity check because the correlation coefficient is only useful when there is a significant number of pixel changes; therefore, we additionally use the structural similarity index to ensure a minimum change in the input samples. In this methodology, the boundary values of CC and SSIM relate to the strength of the imperceptibility.
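The iterative loop above can be sketched as follows. This is a simplified NumPy illustration, not our released implementation: `grad_fn` is a stand-in gradient oracle for back-propagation through the trained DNN, the SSIM here is a global (single-window) simplification of the usual windowed SSIM, and the 0.95 bounds, step count, and learning rate are illustrative assumptions.

```python
import numpy as np

def correlation(a, b):
    # Pearson correlation coefficient between two images.
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

def ssim(a, b, c1=1e-4, c2=9e-4):
    # Global (single-window) structural similarity index.
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return float(((2 * mu_a * mu_b + c1) * (2 * cov + c2)) /
                 ((mu_a**2 + mu_b**2 + c1) * (va + vb + c2)))

def imperceptible_attack(x, grad_fn, steps=50, lr=0.01,
                         cc_min=0.95, ssim_min=0.95):
    # Nudge the image along -dC/dx (from `grad_fn`) and stop as soon as
    # either similarity metric would fall below its imperceptibility bound.
    x0, x_adv = x.copy(), x.copy()
    for _ in range(steps):
        cand = np.clip(x_adv - lr * grad_fn(x_adv), 0.0, 1.0)
        if correlation(x0, cand) < cc_min or ssim(x0, cand) < ssim_min:
            break
        x_adv = cand
    return x_adv

# Toy usage with a dummy image and a constant gradient oracle
x = np.linspace(0.0, 1.0, 64).reshape(8, 8)
grad_fn = lambda z: np.full_like(z, 0.1)
x_adv = imperceptible_attack(x, grad_fn)
```

Checking both CC and SSIM after every update mirrors step 3: CC catches large-scale changes, while SSIM catches small structural ones that CC misses.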

IV-A Experimental Analysis

To demonstrate the effectiveness of the proposed methodology, we implemented it on the VGGNet trained on the GTSRB. The experimental results in Fig. 8 show how the perceptibility of the noise is removed over the iterations. From this analysis, we identify the following key observations:

  1. Though we achieve the targeted misclassification after the first iteration, the intensity of the attack noise is very high and is clearly visible in the images of Fig. 8 under the "I1" label. Moreover, the corresponding values of CC and SSIM are below the defined bounds.

  2. The top-5 accuracy and the corresponding values of CC and SSIM are presented in the analysis graph of Fig. 8. It shows that as the imperceptibility increases, the attack accuracy of the attack image and the corresponding values of CC and SSIM also increase.

IV-B Key Insights

  1. Instead of using a reference image from the training dataset to generate the imperceptible attack noise, the prediction/classification probability distribution of the targeted class can be used, as shown in Fig. 8.

  2. To ensure maximum imperceptibility, the attacker should not rely on a single parameter. For example, after 25 iterations the value of CC is greater than the defined bound, but the noise is still perceptible, as shown in Fig. 8.

Fig. 8: Experimental results for the proposed imperceptible security attack on VGGNet, a DNN trained on the GTSRB dataset. This analysis shows that as the number of iterations increases, the attack noise becomes invisible, as depicted by the image sequences.

V Conclusion

In this paper, we proposed a novel training data-unaware methodology to automatically generate an imperceptible adversarial attack by exploiting the back-propagation property of trained deep neural networks (DNNs). We successfully demonstrated the attack on an advanced deep neural network, VGGNet, deployed in an autonomous driving use case for traffic sign detection. The VGGNet is trained on the German Traffic Sign Recognition Benchmarks (GTSRB) dataset. Our experiments showed that the generated attacks go unnoticed in both subjective and objective tests, with close to 100% correlation and structural similarity index w.r.t. the clean input image. We will make all of our design and attack files open-source. Our study shows that such attacks can be very powerful and would require new security-aware design methods for robust machine learning.