ISA4ML: Training Data-Unaware Imperceptible Security Attacks on Machine Learning Modules of Autonomous Vehicles

11/02/2018 · by Faiq Khalid, et al. · TU Wien

Due to their big data analysis capability, machine learning (ML) algorithms are becoming popular for several applications in autonomous vehicles. However, ML algorithms possess inherent security vulnerabilities, which increases the demand for robust ML algorithms. Recently, various groups have demonstrated how these vulnerabilities can be exploited to perform several security attacks, i.e., confidence reduction and random/targeted misclassification, using data manipulation techniques. Such traditional data manipulation techniques, especially during the training stage, introduce random visual noise. However, this visual noise can be detected during the attack or testing stage through noise detection/filtering or a human in the loop. In this paper, we propose a novel methodology to automatically generate an "imperceptible attack" by exploiting the back-propagation property of trained deep neural networks (DNNs). Unlike state-of-the-art inference attacks, our methodology does not require any knowledge of the training dataset during attack image generation. To illustrate the effectiveness of the proposed methodology, we present a case study for traffic sign detection in an autonomous driving use case. We deploy the state-of-the-art VGGNet DNN trained on the German Traffic Sign Recognition Benchmarks (GTSRB) dataset. Our experimental results show that the generated attacks are imperceptible in both subjective tests (i.e., visual perception) and objective tests (i.e., without any noticeable change in the correlation and structural similarity index), yet still perform successful misclassification attacks.




I Introduction

With the substantial growth in technologies for smart cyber-physical systems, there has been a significant impact on the emergence of autonomous vehicles. For example, the number of autonomous vehicles in the US, China, and Europe is increasing exponentially, as shown in Fig. 1, and will reach approximately 80 million by 2030 [1][2]. However, the amount of data generated by the multiple sensor nodes, i.e., LiDAR, navigation, camera, radar, and other sensors, is huge (4 terabytes per day, see Fig. 1). Therefore, to efficiently handle this amount of data, the following research challenges need to be addressed:

  1. How to increase the computing capability to process this data with minimum energy overhead?

  2. How to increase the storage capability to store this data in an interpretable form with minimum energy and area overhead?

Thus, there is a dire need to develop computing architectures, methodologies, frameworks, algorithms, and tools for handling the big data in autonomous vehicles. Machine learning (ML) algorithms, especially deep neural networks (DNNs), serve as a prime solution because of their ability to efficiently process big data to solve tough problems in recognition, mining, and synthesis [3][4]. DNNs in autonomous vehicles not only address the huge data processing requirements but have also revolutionized several aspects of autonomous vehicles [5], e.g., obstacle detection, traffic sign detection, etc.

I-A Security Threats in ML Modules of Autonomous Vehicles

Several key aspects, i.e., collision avoidance, traffic sign detection, and navigation with path following, are based on machine learning (ML) [5]. Therefore, these aspects are exposed to several security threats against ML algorithms, as shown in Fig. 2, which arise from the unpredictability of the computations in the hidden layers of these algorithms [6]. As a result, autonomous vehicles are becoming more vulnerable to several security threats [7]. For instance, misclassification in object detection or traffic sign detection may lead to catastrophic incidents [8][9][10]. Fig. 3 shows two scenarios where an attacker can target a traffic sign for misclassification.

Fig. 1: Increasing Trend of Automation in Self-driving Cars; the amount of data generated per day in autonomous vehicles [1].
Fig. 2: An Overview of Security Threats/Attacks and their respective payloads for Machine Learning Algorithms during Training, Inference, and their respective Hardware Implementations.
Fig. 3: Attack Scenarios and corresponding Threat Model; In both scenarios, the attacker adds attack noise to the inference data. However, the noise is visible in scenario II, whereas in scenario III it is not.

Unlike traditional systems, the manufacturing cycle of DNNs is based on three stages, i.e., training, hardware implementation, and inference at real time. Each stage possesses its own security vulnerabilities [7], like data manipulation, and corresponding payloads (i.e., confidence reduction, given as ambiguity in classification, and random or targeted misclassification) [11, 12, 3], as shown in Fig. 2. For example, during the training stage [13, 14], the dataset [15], tools, or architecture/model are vulnerable to security attacks, i.e., adding parallel layers or neurons [16][17], to perform security attacks [18, 19]. Similarly, during the hardware implementation and inference stages, the computational hardware and the real-time dataset can be exploited to perform security attacks [20].

In the context of autonomous vehicles, data poisoning is one of the most commonly used attacks on ML modules. Typically, these attacks can be performed in two different ways:

  1. Training Data Poisoning (TDP): This attack targets training samples by introducing small-pattern noise in the data samples to train the network on that particular noise pattern [21]. However, for successful execution, the attacker requires complete access to the training dataset and the inference data acquisition blocks. Moreover, for most of these techniques, the noise pattern is visible and can be removed by inspection.

  2. Inference Data Poisoning (IDP): This attack exploits the black-box model of the ML modules to learn a noise pattern that can perform misclassification or confidence reduction [22][23]. These learned noise patterns can be visible [24] or imperceptible. For example, adversarial attacks are a prime example of imperceptible attacks on ML modules, i.e., the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method [25], the fast gradient sign method (FGSM) [26, 27, 20], the Jacobian-based Saliency Map Attack (JSMA) [7], etc. Similarly, these attacks also require access to the inference data acquisition block. Though adversarial examples perform imperceptible security attacks, their optimization algorithms require reference samples from the training dataset.

I-B Associated Research Challenges

The visibility problem of TDP and the training sample dependency of IDP raise the following research challenges:

  1. How to introduce a simple but imperceptible attack noise pattern into the dataset?

  2. How to avoid the dependency on training data samples or reference images when generating the attack noise?

I-C Our Novel Contribution

To address the above-mentioned research challenges, in this paper, we propose an iterative methodology to develop an imperceptible attack without any knowledge of the training data samples. Our contributions in a nutshell:

  1. Analysis of Training Data Poisoning: We present an analysis to understand the impact of direct training data poisoning (i.e., introducing noise into the original samples) on the overall accuracy of a given DNN, and propose the alternative of an extended data poisoning attack (i.e., extending the original dataset with intruded samples).

  2. Methodology to Design ISA4ML: We propose a methodology to develop an imperceptible attack without any dependency on the training samples.

  3. Open-Source Contributions: For reproducible research, we will release the implementation of our imperceptible attack, design files, and model sources online at: http://LinkHiddenForBlindReview.

I-D Paper organization

Section II discusses training data poisoning and presents an analysis for understanding the effects of direct and indirect training data poisoning. Section III discusses the limitations of different inference data poisoning attacks along with our analysis results. Section IV presents our proposed methodology for developing imperceptible attacks, followed by a case study on employing such attacks on the ML modules deployed in autonomous vehicles. Section V concludes the paper.

Fig. 4: Data poisoning attacks during the training of ML Modules for random misclassification and confidence reduction.

II Data Poisoning Attacks on ML Training

In this section, we present an analysis of direct data poisoning of the training dataset, which shows its impact on the accuracy of the ML module. Moreover, based on this analysis, we propose a modification of the data poisoning attack that has relatively less impact on the inference accuracy.

II-A Experimental Setup

For this analysis, we use the following experimental setup:

  1. Neural Network and Dataset: We use a VGGNet [28] trained on the German Traffic Sign Recognition Benchmarks (GTSRB) dataset [29].

  2. Noise: To analyze the effect of data poisoning, we mimic TDP by introducing salt-and-pepper noise with different strengths (1% to 5%) into the input data samples.

  3. Number of Intruded Samples: We perturb only 0.5% of the data samples.

  4. Threat Model: As is typical in the security community, we assume that the attacker has complete access to the training data samples and can also manipulate the trained neural network.

  5. Data Poisoning Attacks: We implemented the following two attacks:

    1. Attack 1: In this attack, the attacker adds the attack noise to some samples of the original training dataset and then uses this perturbed dataset for training the DNN, which learns this pattern to perform random misclassification or confidence reduction, as shown in Fig. 4.

    2. Attack 2: In this attack, we propose to extend the original dataset with perturbed copies of some of the training samples, as shown in Fig. 4.
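The two attacks above can be sketched as follows. This is a minimal NumPy illustration of our setup, not the exact implementation: the function names, image shapes, and random seeds are our own (hypothetical) choices, while the salt-and-pepper noise and the 0.5% poisoning fraction follow the experimental setup.

```python
import numpy as np

def salt_and_pepper(img, strength=0.05, rng=None):
    # Flip roughly a `strength` fraction of pixels to 0 (pepper) or 1 (salt).
    rng = np.random.default_rng(0) if rng is None else rng
    noisy = img.copy()
    mask = rng.random(img.shape) < strength
    noisy[mask] = rng.choice([0.0, 1.0], size=int(mask.sum()))
    return noisy

def poison_dataset(images, labels, fraction=0.005, extend=True, rng=None):
    # Attack 1 (extend=False): overwrite a fraction of samples in place.
    # Attack 2 (extend=True): append perturbed copies, keeping the originals.
    rng = np.random.default_rng(1) if rng is None else rng
    n = max(1, int(fraction * len(images)))
    idx = rng.choice(len(images), size=n, replace=False)
    perturbed = np.stack([salt_and_pepper(images[i], rng=rng) for i in idx])
    if extend:
        return (np.concatenate([images, perturbed]),
                np.concatenate([labels, labels[idx]]))
    poisoned = images.copy()
    poisoned[idx] = perturbed
    return poisoned, labels

# Toy usage: 200 grayscale 8x8 "images", 0.5% of them poisoned
images = np.random.default_rng(2).random((200, 8, 8))
labels = np.zeros(200, dtype=int)
ext_imgs, ext_lbls = poison_dataset(images, labels, extend=True)   # Attack 2
dir_imgs, dir_lbls = poison_dataset(images, labels, extend=False)  # Attack 1
```

Note the design difference: Attack 2 leaves every original sample in the training set, which is what preserves the learning behavior on the clean data in our analysis below.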

II-B Experimental Analysis

In this section, we analyze the impact of the direct and the modified data poisoning attacks under the assumptions mentioned in the experimental setup. We make the following observations from this analysis:

  1. The analysis shows that both attacks manage to reduce the confidence and perform random misclassification with low probability. For example, in the presence of 5% salt-and-pepper noise, attacks 1 and 2 can misclassify the Stop sign as a Speed Limit 60 km/h sign with confidences of 25.634% and 15.273%, respectively.

  2. The analysis in Fig. 5 shows that the impact of attack 1, which directly perturbs the original dataset, is almost 10% higher than that of attack 2. The reason for this behavior is that, in attack 2, the implemented VGGNet is trained on the complete dataset; the perturbation is learned through the additional perturbed data samples, while the learning behavior w.r.t. the original dataset is preserved.

II-C Key Insights

  1. The number of perturbed samples directly impacts the inference accuracy but, on the other hand, also increases the effectiveness of the attack. Therefore, a proper analysis is required to choose this parameter.

  2. Directly perturbing the training dataset limits the effectiveness of the poisoning attack, because it reduces the feasible range for the number of intruded samples.

  3. To perform a successful data poisoning attack, choose the number of intruded samples while considering the impact on inference accuracy, and avoid direct perturbation of the dataset.

Fig. 5: Experimental results for the two data poisoning attacks (attacks 1 and 2, see Fig. 4). This analysis shows that the direct training data poisoning attack has a significant impact on the overall accuracy (up to 15%), while the indirect attack has relatively less impact.

II-D Limitations

Though the training data poisoning attacks are effective, they possess the following limitations:

  1. These attacks require complete access to the entire training dataset, the baseline neural network, and the data acquisition during inference.

  2. Most training data poisoning attacks introduce noise which is visible and can be countered by careful inspection of the data acquisition and the implemented system.

III Data Security Attacks on ML Inference

In the development/manufacturing cycle of ML-driven systems, like that of traditional systems, the inference stage of ML algorithms comes with security vulnerabilities, i.e., manipulation of the data acquisition block, manipulation of communication channels, and side-channel analysis to manipulate the inference data [30][31]. Remote cyber-attacks and side-channel attacks come with high computational costs and are therefore less frequently used [22][32]. Therefore, to design/implement Inference Data Poisoning (IDP) attacks, we need to consider the following challenges:

  1. How to relax the assumption of having access to the inference data acquisition block?

  2. How to generate an attack noise pattern which is imperceptible?

To address these research challenges, several imperceptible IDPs have been proposed; among them, adversarial examples are one of the most effective IDP attacks. Therefore, in this section, we analyze and discuss two adversarial example attacks, i.e., the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) method and the Fast Gradient Sign Method (FGSM).

III-1 L-BFGS Method

This method was proposed by Szegedy et al. to generate adversarial examples for deep neural networks [25, 33]. The basic principle of the L-BFGS method is to iteratively optimize a perturbation r with respect to a sample image x from the dataset, as shown in Equation 1:

minimize ||r||_2 subject to f(x + r) = l and x + r in [0, 1]^m, (1)

where l is the target label and the minimization of ||r||_2 represents the imperceptibility of the perturbation. To illustrate the effectiveness of this method, we demonstrated this attack on the VGGNet trained on the GTSRB, as shown in Fig. 6. This experimental analysis shows that by introducing the adversarial noise to the image, the input is misclassified, i.e., from a Stop sign to a Speed Limit 60 km/h sign.


Although the L-BFGS method generates adversarial examples with imperceptible noise, it relies on a basic line search algorithm to update the noise during optimization, which makes it computationally expensive and slow.

III-2 Fast Gradient Sign Method (FGSM)

To address the above-mentioned limitation of L-BFGS, Goodfellow et al. proposed the Fast Gradient Sign Method to generate adversarial examples [26, 27]. FGSM is faster and requires less computation because it performs a one-step gradient update along the direction of the sign of the gradient at each pixel, i.e., x_adv = x + e * sign(grad_x J(theta, x, y)). To demonstrate the effect of this attack, we implemented it on the VGGNet trained on the GTSRB, as shown in Fig. 6. The experimental results show that it can perform misclassification without any visible noise in the sample.
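The one-step FGSM update can be sketched as follows. To keep the sketch self-contained we stand in a simple logistic-regression "model" for the VGGNet, so the weights, input, and epsilon here are purely hypothetical; only the update rule x_adv = x + e * sign(d loss / d x) is the method itself.

```python
import numpy as np

def fgsm_perturb(x, w, b, y_true, eps=0.1):
    # One-step FGSM on a logistic-regression stand-in for a DNN:
    # perturb the input along the sign of the loss gradient w.r.t. x.
    z = x @ w + b                    # logit
    p = 1.0 / (1.0 + np.exp(-z))     # sigmoid probability
    # Gradient of binary cross-entropy w.r.t. the input x is (p - y) * w
    grad_x = (p - y_true) * w
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

# Toy usage: a 4-pixel "image" classified by fixed, hypothetical weights
w = np.array([1.5, -2.0, 0.5, 1.0])
b = -0.2
x = np.array([0.6, 0.1, 0.8, 0.3])
x_adv = fgsm_perturb(x, w, b, y_true=1.0, eps=0.05)
```

Because the update is a single gradient-sign step, each pixel moves by exactly eps (before clipping), which is what makes FGSM cheap compared to the iterative line search of L-BFGS.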

Fig. 6: Experimental Analysis of the L-BFGS and FGSM attacks.

III-A Limitations

Most of these attacks generate imperceptible noise patterns, but their optimization algorithms require reference sample(s) from the training dataset, which limits their attack strength. This limitation raises a research question: How to generate an imperceptible Inference Data Poisoning attack without any access to the training dataset?

IV A Methodology for Designing a Training Data-Unaware Imperceptible Attack

Fig. 7: Proposed Methodology for Training Data-Unaware Imperceptible Security Attacks on Machine Learning Algorithms

To address the above-mentioned limitations regarding imperceptibility and the dependency on training data samples, we propose an attack methodology which leverages the back-propagation algorithm to optimize the reference image. The proposed methodology consists of the following steps, as shown in Fig. 7:

  1. We initially choose a classification/prediction probability distribution for the targeted class, P_t, and compute the classification/prediction probability distribution of the target image, P(x). To identify the difference between these prediction/classification probabilities, we compute a cost function C(P_t, P(x)).

  2. Afterwards, we minimize the cost function for classification with higher confidence by iteratively comparing it with zero. If its value is greater than zero, we back-propagate this effect to the target image, i.e., we update the image along the gradient of the cost function with respect to the input pixels, computed through the chain rule over the activation functions, the corresponding outputs of the neurons, and the weights of the previous layers. We repeat this step until the cost function is approximately equal to zero.

  3. We then compute and compare the correlation coefficient (CC) and the structural similarity index (SSIM) against predefined bounds to ensure imperceptibility. We use this two-level similarity check because the correlation coefficient is only useful when there is a significant number of pixel changes; therefore, we additionally use the structural similarity index to ensure a minimum change in the input samples. In this methodology, the boundary values of CC and SSIM relate to the strength of the imperceptibility.
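The iterative loop above can be sketched as follows. This is a simplified NumPy illustration, not our released implementation: `grad_fn` is a stand-in gradient oracle for back-propagation through the trained DNN, the SSIM here is a global (single-window) simplification of the usual windowed SSIM, and the 0.95 bounds, step count, and learning rate are illustrative assumptions.

```python
import numpy as np

def correlation(a, b):
    # Pearson correlation coefficient between two images.
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

def ssim(a, b, c1=1e-4, c2=9e-4):
    # Global (single-window) structural similarity index.
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return float(((2 * mu_a * mu_b + c1) * (2 * cov + c2)) /
                 ((mu_a**2 + mu_b**2 + c1) * (va + vb + c2)))

def imperceptible_attack(x, grad_fn, steps=50, lr=0.01,
                         cc_min=0.95, ssim_min=0.95):
    # Nudge the image along -dC/dx (from `grad_fn`) and stop as soon as
    # either similarity metric would fall below its imperceptibility bound.
    x0, x_adv = x.copy(), x.copy()
    for _ in range(steps):
        cand = np.clip(x_adv - lr * grad_fn(x_adv), 0.0, 1.0)
        if correlation(x0, cand) < cc_min or ssim(x0, cand) < ssim_min:
            break
        x_adv = cand
    return x_adv

# Toy usage with a dummy image and a constant gradient oracle
x = np.linspace(0.0, 1.0, 64).reshape(8, 8)
grad_fn = lambda z: np.full_like(z, 0.1)
x_adv = imperceptible_attack(x, grad_fn)
```

Checking both CC and SSIM after every update mirrors step 3: CC catches large-scale changes, while SSIM catches small structural ones that CC misses.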

IV-A Experimental Analysis

To demonstrate the effectiveness of the proposed methodology, we implemented it on the VGGNet trained on the GTSRB. The experimental results in Fig. 8 show how the perceptibility of the noise is removed over the iterations. From this analysis, we identify the following key observations:

  1. Though we achieve the targeted misclassification after the first iteration, the intensity of the attack noise is very high and is clearly visible in the images of Fig. 8 under the "I1" label. Moreover, the corresponding values of CC and SSIM are below the defined bounds.

  2. The top-5 accuracy and the corresponding values of CC and SSIM are presented in the analysis graph of Fig. 8. It shows that as the imperceptibility increases, the attack accuracy of the attack image and the corresponding values of CC and SSIM also increase.

IV-B Key Insights

  1. Instead of using a reference image from the training dataset to generate the imperceptible attack noise, the prediction/classification probability distribution of the targeted class can be used, as shown in Fig. 8.

  2. To ensure maximum imperceptibility, the attacker should not rely on a single parameter. For example, after 25 iterations the value of CC is greater than the defined bound, but the noise is still perceptible, as shown in Fig. 8.

Fig. 8: Experimental results for the proposed imperceptible security attack on VGGNet, a DNN trained on the GTSRB dataset. This analysis shows that as the number of iterations increases, the attack noise becomes invisible, as depicted by the image sequences.

V Conclusion

In this paper, we proposed a novel training data-unaware methodology to automatically generate an imperceptible adversarial attack by exploiting the back-propagation property of trained deep neural networks (DNNs). We successfully demonstrated the attack on an advanced deep neural network, VGGNet, deployed in an autonomous driving use case for traffic sign detection. The VGGNet is trained on the German Traffic Sign Recognition Benchmarks (GTSRB) dataset. Our experiments showed that the generated attacks go unnoticed in both subjective and objective tests, with close to 100% correlation and structural similarity index w.r.t. the clean input image. We will make all of our design and attack files open-source. Our study shows that such attacks can be very powerful and would require new security-aware design methods for robust machine learning.