Adversarial Sensor Attack on LiDAR-based Perception in Autonomous Driving

07/16/2019 ∙ by Yulong Cao, et al. ∙ University of Michigan ∙ University of California, Irvine

In Autonomous Vehicles (AVs), one fundamental pillar is perception, which leverages sensors like cameras and LiDARs (Light Detection and Ranging) to understand the driving environment. Due to its direct impact on road safety, multiple prior efforts have been made to study the security of perception systems. In contrast to prior work that concentrates on camera-based perception, in this work we perform the first security study of LiDAR-based perception in AV settings, which is highly important but unexplored. We consider LiDAR spoofing attacks as the threat model and set the attack goal as spoofing obstacles close to the front of a victim AV. We find that blindly applying LiDAR spoofing is insufficient to achieve this goal due to the machine learning-based object detection process. Thus, we then explore the possibility of strategically controlling the spoofed attack to fool the machine learning model. We formulate this task as an optimization problem and design modeling methods for the input perturbation function and the objective function. We also identify the inherent limitations of directly solving the problem using optimization and design an algorithm that combines optimization and global sampling, which improves the attack success rates to around 75%. As a case study to understand the attack impact at the AV driving decision level, we construct and evaluate two attack scenarios that may damage road safety and mobility. We also discuss defense directions at the AV system, sensor, and machine learning model levels.


1. Introduction

Autonomous vehicles, or self-driving cars, are under rapid development, with some vehicles already found on public roads (way, 2018; lyf, 2018; bai, 2018b). In AV systems, one fundamental pillar is perception, which leverages sensors like cameras and LiDARs (Light Detection and Ranging) to understand the surrounding driving environment. Since this function is directly related to safety-critical driving decisions such as collision avoidance, multiple prior research efforts have been made to study the security of camera-based perception in AV settings. For example, prior work has reported sensor-level attacks such as camera blinding (Petit et al., 2015), physical-world camera attacks such as adding stickers to traffic signs (Eykholt et al., 2018b, a), and trojan attacks on the neural networks for AV camera input (Liu et al., 2018).

Despite the research efforts in camera-based perception, there is no thorough exploration into the security of LiDAR-based perception in AV settings. LiDARs, which measure distances to surrounding obstacles using infrared lasers, can provide 360-degree viewing angles and generate 3-dimensional representations of the road environment instead of just 2-dimensional images for cameras. Thus, they are generally considered as more important sensors than cameras for AV driving safety (lid, 2018b, 2017a) and are adopted by nearly all AV makers today (lid, 2017b, c, 2018a; apo, 2017). A few recent works demonstrated the feasibility of injecting spoofed points into the sensor input from the LiDAR (Petit et al., 2015; Shin et al., 2017). Since such input also needs to be processed by an object detection step in the AV perception pipeline, it is largely unclear whether such spoofing can directly lead to semantically-impactful security consequences, e.g., adding spoofed road obstacles, in the LiDAR-based perception in AV systems.

In this work, we perform the first study to explore the security of LiDAR-based perception in AV settings. To perform the analysis, we target the LiDAR-based perception implementation in Baidu Apollo, an open-source AV system that has over 100 partners and has reached a mass production agreement with multiple partners such as Volvo and Ford (bai, 2018a, b). We consider a LiDAR spoofing attack, i.e., injecting spoofed LiDAR data points by shooting lasers, as our threat model since it has demonstrated feasibility in previous work (Petit et al., 2015; Shin et al., 2017). With this threat model, we set the attack goal as adding spoofed obstacles in close distances to the front of a victim AV (or front-near obstacles) in order to alter its driving decisions.

In our study, we first reproduce the LiDAR spoofing attack from the work done by Shin et al. (2017) and try to exploit Baidu Apollo’s LiDAR-based perception pipeline, which leverages machine learning for object detection, as do the majority of the state-of-the-art LiDAR-based AV perception techniques (kit, 2017). We enumerate different spoofing patterns from the previous work, e.g., a spoofed wall, and different spoofing angles and shapes, but none of them succeed in generating a spoofed road obstacle after the machine learning step. We find that a potential reason is that the current spoofing technique can only cover a very narrow spoofing angle, i.e., 8° horizontally in our experiments, which is not enough to generate a point cloud of a road obstacle near the front of a vehicle. Thus, blindly applying existing spoofing techniques cannot easily succeed.

To achieve the attack goal with existing spoofing techniques, we explore the possibility of strategically controlling the spoofed points to fool the machine learning model in the object detection step. While it is known that machine learning output can be maliciously altered by carefully-crafted perturbations to the input (Papernot et al., 2017; Eykholt et al., 2018b; Carlini et al., 2016; Yuan et al., 2018; Carlini and Wagner, 2018), no prior work studied LiDAR-based object detection models for AV systems. To approach this problem, we formulate the attack task as an optimization problem, which has been shown to be effective in previous machine learning security studies (Carlini and Wagner, 2017b; Cisse et al., 2017; Xie et al., 2017; Xiao et al., 2018a; Cheng et al., 2018; Xiao et al., 2018b). Specific to our study, two functions need to be newly formulated: (1) an input perturbation function that models LiDAR spoofing capability in changing machine learning model input, and (2) an objective function that can reflect the attack goal. For the former, since previous work did not perform detailed measurements for the purpose of such modeling, we experimentally explore the capability of controlling the spoofed data points, e.g., the number of points and their positions. Next, we design a set of global spatial transformation functions to model these observed attack capabilities at the model input level. In this step, both the quantified attack capabilities and the modeling methodology are useful for future security studies of LiDAR-related machine learning models.

For the attack goal of adding front-near obstacles, designing an objective function is also non-trivial since the machine learning model output is post-processed in the perception module of Baidu Apollo before it is converted to a list of perceived obstacles. To address this, we study the post-processing logic, extract the key strategies for transforming model output into perceived obstacles, and formulate them into the objective function.

With the optimization problem mathematically formulated, we start by directly solving it using optimization algorithms like previous studies (Carlini and Wagner, 2017b). However, we find that the average success rate of adding front-near obstacles is only 30%. We find that this is actually caused by the nature of the problem, which makes it easy for any optimization algorithm to get trapped in local extrema. To solve this problem, we design an algorithm that combines global sampling and optimization, which is able to successfully increase the average success rates to around 75%.
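The idea of combining global sampling with optimization can be illustrated on a toy problem. The sketch below is not the paper's algorithm: the one-dimensional loss, the step size, and the sampling range are all hypothetical choices, picked only to show how a single gradient-descent run can stall in a local minimum while several sampled starting points reach a better one.

```python
import math
import random


def loss(theta):
    """Toy non-convex loss with several local minima (purely illustrative)."""
    return math.sin(3.0 * theta) + 0.1 * (theta - 2.0) ** 2


def grad(theta, eps=1e-6):
    """Central-difference estimate of d(loss)/d(theta)."""
    return (loss(theta + eps) - loss(theta - eps)) / (2.0 * eps)


def local_descent(theta, lr=0.01, steps=2000):
    """Plain gradient descent from a single starting point."""
    for _ in range(steps):
        theta -= lr * grad(theta)
    return theta


def sample_then_optimize(n_samples, low, high, seed=0):
    """Run local descent from several sampled starts and keep the best result."""
    rng = random.Random(seed)
    starts = [rng.uniform(low, high) for _ in range(n_samples)]
    candidates = [local_descent(start) for start in starts]
    return min(candidates, key=loss)


# A single start far from the best basin versus 20 sampled starts.
single = local_descent(-4.0)
best = sample_then_optimize(n_samples=20, low=-5.0, high=8.0)
print(f"single start:   theta={single:.3f}, loss={loss(single):.3f}")
print(f"sampled starts: theta={best:.3f}, loss={loss(best):.3f}")
```

The sampled variant costs proportionally more compute, which is the usual trade-off when a loss surface has many local extrema.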

As a case study for understanding the impact of the discovered attack input at the AV driving decision level, we construct two attack scenarios: (1) emergency brake attack, which may force a moving AV to suddenly brake and thus injure the passengers or cause rear-end collisions, and (2) AV freezing attack, which may cause an AV waiting for the red light to be permanently “frozen” in the intersection and block traffic. Using real-world AV driving data traces released by the Baidu Apollo team, both attacks successfully trigger the attacker-desired driving decisions in Apollo’s simulator.

Based on the insights from our security analysis, we propose defense solutions not only at AV system level, e.g., filtering out LiDAR data points from ground reflection, but also at sensor and machine learning model levels.

In summary, this work makes the following contributions:

  • We perform the first security study of LiDAR-based perception for AV systems. We find that blindly applying existing LiDAR spoofing techniques cannot easily succeed in generating semantically-impactful security consequences after the machine learning-based object detection step. To achieve the attack goal with existing spoofing techniques, we then explore the possibility of strategically controlling the spoofed points to fool the machine learning model, and formulate the attack as an optimization problem.

  • To perform analysis for the machine learning model used in LiDAR-based AV perception, we make two methodology-level contributions. First, we conduct experiments to analyze the LiDAR spoofing attack capability and design a global spatial transformation based method to model such capability in mathematical forms. Second, we identify inherent limitations of directly solving our problem using optimization, and design an algorithm that combines optimization and global sampling. This is able to increase the attack success rates to around 75%.

  • As a case study to understand the impact of the attacks at the AV driving decision level, we construct two potential attack scenarios: emergency brake attack, which may hurt the passengers or cause a rear-end collision, and AV freezing attack, which may block traffic. Using a simulation based evaluation on real-world AV driving data, both attacks successfully trigger the attacker-desired driving decisions. Based on the insights, we discuss defense directions at AV system, sensor, and machine learning model levels.

2. Background

2.1. LiDAR-based Perception in AV Systems

Figure 1. Overview of the data processing pipeline for LiDAR-based perception in Baidu Apollo.

AVs rely on various sensors to perform real-time positioning (also called localization) and environment perception (or simply perception). LiDAR, camera, radar, and GPS/IMU are the major sensors used by various autonomous driving systems. The data collected from those sensors are transformed and processed before they become useful information for AV systems. Fig. 1 shows the data processing pipeline of LiDAR sensor data in the perception module of Baidu Apollo (apo, 2017). As shown, it involves three main steps as follows:

Step 1: Pre-processing. The raw LiDAR sensor input is called the 3D point cloud, and we denote it as $X$. The dimension of $X$ is $n \times 4$, where $n$ denotes the number of data points and each data point is a 4-dimensional vector with the 3D coordinates, $w_x$, $w_y$, and $w_z$, and the intensity of the point. In the pre-processing step, $X$ is first transformed into an absolute coordinate system. Next, the Region of Interest (ROI) filter removes unrelated portions of the 3D point cloud data, e.g., those that are outside of the road, based on HDMap information. Next, a feature generation process generates a feature matrix $x$, which is the input to the subsequent machine learning model. In this process, the ROI-filtered 3D point cloud within the range (60 meters by default) is mapped to a 2D grid of cells according to the $w_x$ and $w_y$ coordinates. In each cell, the assigned points are used to generate the features listed in Table 1.

Step 2: DNN-based object detection. A Deep Neural Network (DNN) then takes the feature matrix $x$ as input and produces a set of output metrics for each cell, e.g., the probability of the cell being a part of an obstacle. These output metrics are listed in Table 2.

Step 3: Post-processing. The clustering process only considers cells with objectness values (one of the output metrics listed in Table 2) greater than a given threshold. Then, the process constructs candidate object clusters by building a connected graph using the cells’ output metrics. Candidate object clusters are then filtered by selecting clusters with average positiveness values (another output metric) greater than a given threshold. The box builder then reconstructs the bounding box, including the height, width, and length, of an obstacle candidate from the 3D point cloud assigned to it. Finally, the tracker integrates consecutive frames of processed results to generate tracked obstacles, augmented with additional information such as speed, acceleration, and turning rates, as the output of the LiDAR-based perception.

With the information of perceived obstacles such as their positions, shapes, and obstacle types, the Apollo system then uses such information to make driving decisions. The perception output is further processed by the prediction module which predicts the future trajectories of perceived obstacles, and then the planning module which plans the future driving routes and makes decisions such as stopping, lane changing, yielding, etc.
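The feature generation in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not Apollo's implementation: the function name `cell_features`, the tiny `grid_size`, and the dictionary layout are hypothetical, and the feature definitions simply follow the descriptions in Table 1.

```python
import math
from collections import defaultdict


def cell_features(points, grid_size=8, max_range=60.0):
    """Map (x, y, z, intensity) points to grid cells and compute per-cell features.

    grid_size is a hypothetical small value chosen for readability.
    """
    cell_width = 2.0 * max_range / grid_size
    buckets = defaultdict(list)
    for x, y, z, intensity in points:
        if abs(x) >= max_range or abs(y) >= max_range:
            continue  # point lies outside the mapped range
        row = int((x + max_range) / cell_width)
        col = int((y + max_range) / cell_width)
        buckets[(row, col)].append((z, intensity))

    features = {}
    for (row, col), members in buckets.items():
        heights = [z for z, _ in members]
        intensities = [i for _, i in members]
        # Coordinates of the cell center relative to the sensor origin.
        center_x = -max_range + (row + 0.5) * cell_width
        center_y = -max_range + (col + 0.5) * cell_width
        features[(row, col)] = {
            "max_height": max(heights),
            "max_intensity": max(intensities),
            "mean_height": sum(heights) / len(heights),
            "mean_intensity": sum(intensities) / len(intensities),
            "count": len(members),
            "direction": math.atan2(center_y, center_x),
            "distance": math.hypot(center_x, center_y),
            "non_empty": 1,
        }
    return features


example_points = [
    (1.0, 1.0, 0.5, 10.0),
    (2.0, 3.0, 1.5, 30.0),
    (-50.0, 20.0, 0.2, 5.0),
    (100.0, 0.0, 1.0, 1.0),  # beyond 60 m, so it is dropped
]
example_features = cell_features(example_points)
```

Cells that receive no points are simply absent from the returned dictionary, which corresponds to a non-empty value of 0.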

Feature Description
Max height Maximum height of points in the cell.
Max intensity Intensity of the brightest point in the cell.
Mean height Mean height of points in the cell.
Mean intensity Mean intensity of points in the cell.
Count Number of points in the cell.
Direction Angle of the cell’s center with respect to the origin.
Distance Distance between the cell’s center and the origin.
Non-empty Binary value indicating whether the cell is empty or occupied.

Table 1. DNN model input features.
Metrics Description
Center offset Offset to the predicted center of the cluster the cell belongs to.
Objectness The probability of a cell belonging to an obstacle.
Positiveness The confidence score of the detection.
Object height The predicted object height.
Class probability The probability of the cell being a part of a vehicle, pedestrian, etc.

Table 2. DNN model output metrics.
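The two-threshold filtering in Step 3 can be sketched with the objectness and positiveness metrics from Table 2. The sketch makes simplifying assumptions: it links cells by plain 4-neighbor adjacency rather than by the connected graph Apollo builds from the model output, and both threshold values are hypothetical placeholders.

```python
def cluster_cells(objectness, positiveness, obj_thresh=0.5, pos_thresh=0.5):
    """Group adjacent high-objectness cells, then keep clusters whose
    mean positiveness exceeds pos_thresh. Both thresholds are hypothetical."""
    rows, cols = len(objectness), len(objectness[0])
    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or objectness[r][c] <= obj_thresh:
                continue
            # Depth-first flood fill over 4-connected neighbors.
            stack, cluster = [(r, c)], []
            seen.add((r, c))
            while stack:
                cr, cc = stack.pop()
                cluster.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and objectness[nr][nc] > obj_thresh):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            mean_pos = sum(positiveness[a][b] for a, b in cluster) / len(cluster)
            if mean_pos > pos_thresh:
                clusters.append(sorted(cluster))
    return clusters


objectness_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.1, 0.7, 0.1, 0.9],
    [0.0, 0.0, 0.1, 0.8],
]
positiveness_map = [
    [0.9, 0.8, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.2],
    [0.0, 0.0, 0.0, 0.3],
]
kept_clusters = cluster_cells(objectness_map, positiveness_map)
```

In this example the right-hand group of cells passes the objectness threshold but is discarded because its average positiveness is too low, which is why changing the model output alone does not guarantee a perceived obstacle.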

2.2. LiDAR Sensor and Spoofing Attacks

To understand the principles underlying our security analysis methodology, it is necessary to understand how the LiDAR sensor generates a point cloud and how it is possible to alter it in a controlled way using spoofing attacks.

LiDAR sensor. A LiDAR sensor functions by firing laser pulses and capturing their reflections using photodiodes. Because the speed of light is constant, the time it takes for the echo pulses to reach the receiving photodiode provides an accurate measurement of the distance between a LiDAR and a potential obstacle. By firing the laser pulses at many vertical and horizontal angles, a LiDAR generates a point cloud used by the AV systems to detect objects.
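The time-of-flight relationship described above amounts to halving the round-trip path of the pulse. A minimal sketch, with hypothetical function names:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_echo(round_trip_s):
    """Distance to the reflecting surface given the round-trip time of a pulse."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


def echo_delay_for_distance(distance_m):
    """Round-trip time a pulse needs to reach a surface at distance_m and return."""
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S


delay_5m = echo_delay_for_distance(5.0)
print(f"echo delay for 5 m: {delay_5m * 1e9:.2f} ns")
```

The same relationship is what a spoofer exploits: an echo that arrives after a chosen delay is interpreted as a reflection at the corresponding distance.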

LiDAR spoofing attack. Sensor spoofing attacks use the same physical channels as the targeted sensor to manipulate the sensor readings. This strategy makes it very difficult for the sensor system to recognize such an attack, since the attack doesn’t require any physical contact or tampering with the sensor, and it doesn’t interfere with the processing and transmission of the digital sensor measurement. These types of attacks could trick the victim sensor into providing seemingly legitimate but actually erroneous data.

LiDAR has been shown to be vulnerable to laser spoofing attacks in prior work. Petit et al. demonstrated that a LiDAR spoofing attack can be performed by replaying the LiDAR laser pulses from a different position to create fake points farther away than the location of the spoofer (Petit et al., 2015). Shin et al. showed that it is possible to generate a fake point cloud at different distances, even closer than the spoofer location (Shin et al., 2017). In this paper, we build upon this prior work to study the effect of this attack vector on the security of AV perception.

2.3. Adversarial Machine Learning

Neural networks. A neural network is a function consisting of connected units called (artificial) neurons that work together to represent a differentiable function that outputs a distribution. A given neural network (e.g., for classification) can be defined by its model architecture $M$ and parameters $\theta$. An optimizer such as Adam (Kingma and Ba, 2014) is used to update the parameters $\theta$ with respect to the objective function $L$.

Adversarial examples. Given a machine learning model $M$, input $x$ and its corresponding label $y$, an adversarial attacker aims to generate adversarial examples $x'$ so that $M(x') \neq y$ (untargeted attack) or $M(x') = y'$, where $y'$ is a target label (targeted attack). Carlini and Wagner (2017b) proposed to generate an adversarial perturbation for a targeted attack by optimizing an objective function as follows:

$$\min \; \lVert x - x' \rVert_p \quad \text{s.t.} \quad M(x') = y' \;\; \text{and} \;\; x' \in \mathcal{X}$$

where $M(x') = y'$ is the target adversarial goal and $x' \in \mathcal{X}$ denotes that the adversarial examples should be in a valid set. Further, optimization-based algorithms have been leveraged to generate adversarial examples on various kinds of machine learning tasks successfully, such as segmentation (Xie et al., 2017; Cisse et al., 2017), human pose estimation (Cisse et al., 2017), object detection (Xie et al., 2017), Visual Question Answer system (Xu et al., 2017), image caption translation (Cheng et al., 2018), etc. In this paper, we also leverage an optimization-based method to generate adversarial examples to fool LiDAR-based AV perception.
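An optimization-based targeted attack of this kind can be illustrated on a toy model. The sketch below is an assumption-laden example rather than any cited method: the two-feature logistic model, its weights, the penalty weight `c`, and the step size are hypothetical, and the hard constraint on the model output is relaxed into a cross-entropy penalty, which is a common way to make such problems amenable to gradient descent.

```python
import math

# Hypothetical toy model: logistic regression over two input features.
WEIGHTS = [2.0, -1.0]
BIAS = -0.5


def predict_prob(x):
    """Probability that the toy model assigns to class 1."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def targeted_attack(x, target, c=5.0, lr=0.05, steps=500):
    """Minimize ||x' - x||^2 + c * cross_entropy(M(x'), target) by gradient descent."""
    x_adv = list(x)
    for _ in range(steps):
        p = predict_prob(x_adv)
        for i in range(len(x_adv)):
            grad_dist = 2.0 * (x_adv[i] - x[i])
            # Gradient of the cross-entropy w.r.t. x_i for a sigmoid output.
            grad_ce = (p - target) * WEIGHTS[i]
            x_adv[i] -= lr * (grad_dist + c * grad_ce)
    return x_adv


x_clean = [-1.0, 1.0]
x_adv = targeted_attack(x_clean, target=1.0)
print(f"clean prob: {predict_prob(x_clean):.3f}, adversarial prob: {predict_prob(x_adv):.3f}")
```

The penalty weight trades off perturbation size against how firmly the output moves toward the target label.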

3. Attack Goal and Threat Model

Attack goal. To cause semantically-impactful security consequences in AV settings, we set the attack goal as fooling the LiDAR-based perception into perceiving fake obstacles in front of a victim AV in order to maliciously alter its driving decisions. More specifically, we target front-near fake obstacles, i.e., those that are at a close distance to the front of a victim AV, since they have the highest potential to trigger immediate erroneous AV driving decisions. In this work, we define front-near obstacles as those that are around 5 meters to the front of a victim AV.

Threat model. To achieve the attack goal above, we consider LiDAR spoofing attacks as our threat model, which is a demonstrated practical attack vector for LiDAR sensors (Petit et al., 2015; Shin et al., 2017) as described in §2.2. In AV settings, there are several possible scenarios to perform such an attack. First, the attacker can place an attacking device at the roadside to shoot malicious laser pulses at AVs passing by. Second, the attacker can drive an attack vehicle in close proximity to the victim AV, e.g., in the same lane or adjacent lanes. To perform the attack, the attack vehicle is equipped with an attacking device that shoots laser pulses at the victim AV’s LiDAR. To perform laser aiming in these scenarios, the attacker can use techniques such as camera-based object detection and tracking. In AV settings, these attacks are stealthy since the laser pulses are invisible and laser shooting devices are relatively small in size.

As a first security analysis, we assume that the attacker has white-box access to the machine learning model and the perception system. We consider this threat model reasonable since the attacker could obtain white-box access through additional engineering effort to reverse engineer the software.

Figure 2. Overview of the Adv-LiDAR methodology.

4. Limitation of Blind Sensor Spoofing

To understand the security of LiDAR-based perception under LiDAR spoofing attacks, we first reproduce the state-of-the-art LiDAR spoofing attack by Shin et al. (Shin et al., 2017), and explore the effectiveness of directly applying it to attack the LiDAR-based perception pipeline in Baidu Apollo (apo, 2017), an open-source AV system that has over 100 partners and has reached mass production agreement with multiple partners such as Volvo, Ford, and King Long (bai, 2018a, b).

Spoofing attack description. The attack by Shin et al. (Shin et al., 2017) consists of three components: a photodiode, a delay component, and an infrared laser, which are shown in Fig. 3. The photodiode is used to synchronize with the victim LiDAR. The photodiode triggers the delay component whenever it captures laser pulses fired from the victim LiDAR. Then the delay component triggers the attack laser after a certain amount of time to attack the following firing cycles of the victim LiDAR. Since the firing sequence of laser pulses is consistent, an adversary can choose which fake points will appear in the point cloud by crafting a pulse waveform to trigger the attack laser.

Figure 3. Illustration of LiDAR spoofing attack. The photodiode receives the laser pulses from the LiDAR and activates the delay component that triggers the attacker laser to simulate real echo pulses.

Experimental setup. We perform spoofing attack experiments on a VLP-16 PUCK LiDAR System from Velodyne (Inc., 2018). The VLP-16 uses a vertical array of 16 separate laser diodes to fire laser pulses at different angles. It has a 30° vertical angle range from -15° to +15°, with 2° of angular resolution. The VLP-16 rotates horizontally around a center axis to send pulses in a 360° horizontal range, with a varying azimuth resolution between 0.1° and 0.4°. The laser firing sequence follows the pattern shown in Figure 4. The VLP-16 fires 16 laser pulses in a cycle every 55.296 µs, with a period of 2.304 µs. The receiving time window is about 667 ns. We chose this sensor because it is compatible with Baidu Apollo and uses the same design principle as the more advanced HDL-64E LiDARs used in many AVs. The similar design indicates that the same laser attacks that affect the VLP-16 can be extended to high-resolution LiDARs like the HDL-64E.

We use the OSRAM SFH 213 FA as our photodiode, with a comparator circuit similar to the one used by Shin et al. We use a Tektronix AFG3251 function generator as the delay component with the photodiode circuit as an external trigger. In turn, the function generator provides the trigger to the laser driver module PCO-7114 that drives the attack laser diode OSRAM SPL PL90. With the PCO-7114 laser driver, we were able to fire the laser pulses at the same pulse period as the VLP-16, 2.304 µs, compared to 100 µs in the previous work. An optical lens with a diameter of 30 mm and a focal length of 100 mm was used to focus the beam, making it more effective for ranges farther than 5 meters. We generate the custom pulse waveform using the Tektronix software ArbExpress (arb, 2016) to create different shapes and the Velodyne software VeloView (vel, 2018) to analyze and extract the point clouds.

Figure 4. The consistent firing sequence of the LiDAR allows an attacker to choose the angles and distances from which spoofed points appear. For example, applying the attacker signal, fake dots will appear at 1°, 3°, -3°, and -1° angles (0° is the center of the LiDAR).
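The timing behind this choice of angles and distances can be sketched with the numbers from the experimental setup (a 55.296 µs cycle, a 2.304 µs firing period, and a 667 ns receive window). The sketch rests on assumptions that are not stated in the text: the laser-ID-to-angle ordering is the one commonly documented for the VLP-16, and the delay model ignores the propagation time between attacker and sensor as well as any circuit latency. The function names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
FIRING_PERIOD_S = 2.304e-6    # time between consecutive laser firings
CYCLE_PERIOD_S = 55.296e-6    # time between consecutive firing cycles
RECEIVE_WINDOW_S = 667e-9     # listening window after each firing
NUM_LASERS = 16


def vertical_angle_deg(laser_id):
    """Assumed VLP-16 ordering: even IDs cover -15..-1, odd IDs cover 1..15."""
    return laser_id - 15 if laser_id % 2 == 0 else laser_id


def laser_id_for_angle(angle_deg):
    """Find which laser in the firing sequence points at the given vertical angle."""
    for laser_id in range(NUM_LASERS):
        if vertical_angle_deg(laser_id) == angle_deg:
            return laser_id
    raise ValueError(f"no laser fires at {angle_deg} degrees")


def spoof_delay_s(angle_deg, distance_m, cycles_ahead=1):
    """Delay from detecting the first pulse of a cycle to firing the attack
    laser so that the fake echo lands in the chosen laser's receive window
    of a later cycle (simplified model, see the assumptions above)."""
    echo_s = 2.0 * distance_m / SPEED_OF_LIGHT_M_PER_S
    if echo_s > RECEIVE_WINDOW_S:
        raise ValueError("echo would arrive outside the receive window")
    slot_s = laser_id_for_angle(angle_deg) * FIRING_PERIOD_S
    return cycles_ahead * CYCLE_PERIOD_S + slot_s + echo_s


for angle in (1, 3, -3, -1):
    print(f"{angle:+d} deg at 5 m: delay {spoof_delay_s(angle, 5.0) * 1e6:.3f} us")
```

Under this model, the spoofed distance is set by the last term of the delay and the vertical angle by the firing slot that the fake echo is aligned with.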

Experiment results. The prior work of Shin et al. is able to spoof a maximum of 10 fake dots in a single horizontal line. With our setup improvements (a faster firing rate and a lens to focus the beam), fake points can be generated at all of the 16 vertical viewing angles and an 8° horizontal angle at greater than 10 meters away. In total, around 100 dots can be spoofed by covering these horizontal and vertical angles (illustrated in Fig. 14 in Appendix). These spoofed dots can also be shaped by modifying the custom pulse waveform used to fire the attack laser. Note that even though around 100 dots can be spoofed, they are not all spoofed stably. The attacker is able to spoof points at different angles because the spoofed laser pulses hit a certain area on the victim LiDAR due to the optical lens focusing. The closer to the center of the area, the stronger and more stable the laser pulses received by the victim LiDAR. We find that, among them, about 60 points in the center 8-10 vertical lines can be stably spoofed with high intensity.

4.1. Blind LiDAR Spoofing Experiments

After reproducing the LiDAR spoofing attack, we then explore whether blindly applying such an attack can directly generate spoofed obstacles in the LiDAR-based perception in Baidu Apollo. Since our LiDAR spoofing experiments are performed in indoor environments, we synthesize the on-road attack effect by adding spoofed LiDAR points to the original 3D point cloud collected by the Baidu Apollo team on local roads in Sunnyvale, CA. The synthesizing process is illustrated in Fig. 5. After this process, we run Apollo’s perception module with the attacker-perturbed 3D point cloud as input to obtain the object detection output. In this analysis, we explore three blind attack experiments as follows:

Experiment 1: Directly apply original spoofing attack traces. In this experiment, we directly replay spoofing attack traces to attack LiDAR-based perception in Apollo. More specifically, we experiment with attack traces obtained from two sources: (1) the original spoofing attack traces from Shin et al. (Shin et al., 2017), and (2) the attack traces generated from the spoofing attack reproduced by us, which can inject more dots after our setup improvements. However, we are not able to observe a spoofed obstacle for any of these traces at the output of the LiDAR-based perception pipeline.

Figure 5. Generating the attacker-perturbed 3D point cloud by synthesizing the pristine 3D point cloud with the attack trace to spoof a front-near obstacle 5 meters away from the victim AV.

Experiment 2: Apply spoofing attack traces at different angles. To understand whether successfully spoofing an obstacle depends on the angle of the spoofed points, in this experiment we inject spoofed points at different locations. More specifically, we uniformly sample 100 different angles out of 360 degrees around the victim AV, and inject the spoofing attack traces reproduced by us. However, we are not able to observe spoofed obstacles for any of these angles.

Experiment 3: Apply spoofing attack traces with different shapes. To understand whether successfully spoofing an obstacle depends on the pattern of the spoofed points, in this experiment we inject points with different spoofing patterns. More specifically, we generate random patterns of spoofed points by randomly setting distances for each point at different angles. We generate 160 points covering 16 vertical lines, 10 points for each line with continuous horizontal angles. To trigger immediate control decision changes in an AV, the spoofed obstacle needs to be close to the victim AV. Thus, we set the generated distances of the spoofed point to be within 4 to 6 meters to the victim AV. We generate 100 different spoofed patterns in total, but we are not able to observe spoofed obstacles for any of these patterns.
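The random patterns of Experiment 3 and the synthesizing step can be sketched as below. The sketch is illustrative only: the azimuth step, the fixed intensity value, the interpretation of the 16 lines as the 16 vertical angles of the sensor, and the function names are assumptions, while the point counts and the 4 to 6 meter distance range come from the description above.

```python
import math
import random


def random_spoof_pattern(center_azimuth_deg=0.0, azimuth_step_deg=0.2,
                         points_per_line=10, seed=0):
    """Generate 16 lines x points_per_line spoofed points with random
    distances between 4 and 6 meters, as (x, y, z, intensity) tuples."""
    rng = random.Random(seed)
    points = []
    for vertical_deg in range(-15, 16, 2):  # 16 vertical angles
        for k in range(points_per_line):
            # Consecutive azimuths centered on center_azimuth_deg.
            offset = (k - (points_per_line - 1) / 2.0) * azimuth_step_deg
            azimuth = math.radians(center_azimuth_deg + offset)
            elevation = math.radians(vertical_deg)
            distance = rng.uniform(4.0, 6.0)
            x = distance * math.cos(elevation) * math.cos(azimuth)
            y = distance * math.cos(elevation) * math.sin(azimuth)
            z = distance * math.sin(elevation)
            points.append((x, y, z, 255.0))  # hypothetical intensity value
    return points


def synthesize(pristine_points, spoofed_points):
    """Attacker-perturbed point cloud: pristine points plus spoofed points."""
    return list(pristine_points) + list(spoofed_points)


spoofed = random_spoof_pattern()
pristine = [(20.0, 1.0, 0.3, 40.0), (35.0, -2.0, 0.8, 55.0)]
perturbed = synthesize(pristine, spoofed)
```

Varying `center_azimuth_deg` over uniformly sampled angles gives the kind of sweep used in Experiment 2.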

Summary. In these experiments, we try various blind spoofing attack strategies directly derived from the state-of-the-art LiDAR spoofing attack, but none of them succeed in generating spoofed obstacles in the LiDAR-based perception pipeline in Baidu Apollo. There are two potential reasons. First, as described earlier, the current attack methodology can only cover a very narrow spoofing angle, i.e., 8° of horizontal angle even after our setup improvements. Second, the coverage of vertical angles is limited by the frequency of spoofing laser pulses. A 64-line LiDAR takes a similar time to a 16-line LiDAR to scan its vertical angles, so when attacking a LiDAR with more vertical angles, the attacker cannot spoof more vertical angles than for a 16-line LiDAR. Thus, the current methodology limits the number of spoofed points, making it hard to generate enough points to mimic an important road obstacle.

To illustrate this, as shown in Fig. 6, the point cloud for a real vehicle has a much wider angle and many more points than the attack traces reproduced by us. Thus, blindly applying the spoofing attack cannot easily fool the machine learning based object detection process in the LiDAR-based perception pipeline. In the next section, we explore the possibility of further exploiting machine learning model vulnerabilities to achieve our attack goal.

Figure 6. The point cloud from a real vehicle reflection (left) and from the spoofing attack (right) in a 64-line HDL-64E LiDAR. The vehicle is around 7 meters in front of the AV.

5. Improved Methodology: Adv-LiDAR

As discussed in §4, without considering the machine learning model used in LiDAR-based perception, blindly applying existing LiDAR spoofing attacks can hardly achieve the attack goal of generating front-near obstacles. Since it is known that machine learning output can be maliciously altered by carefully-crafted perturbations to the input (Papernot et al., 2017; Eykholt et al., 2018b; Carlini et al., 2016; Yuan et al., 2018; Carlini and Wagner, 2018), we are then motivated to explore the possibility of strategically controlling the spoofed points to fool the machine learning model in LiDAR-based perception. In this section, we first describe the technical challenges after involving adversarial machine learning analysis in this research problem, and then present our solution methodology overview, called Adv-LiDAR.

5.1. Technical Challenges

Even though previous studies have shown promising results in attacking machine learning models, none of them studied LiDAR-based object detection models, and their approaches have limited applicability to our analysis goal due to three challenges:

First, attackers have limited capability of perturbing machine learning model inputs in our problem. Unlike perturbing pixels in an image, perturbing machine learning inputs under AV settings requires perturbing the raw 3D point cloud data through a sensor attack and passing through the associated pre-processing steps. Therefore, such perturbation capability needs to be quantified and modeled.

Second, optimization-based methods for generating adversarial examples in previous studies may not be directly suitable for our analysis problem due to the limited model input perturbation capability. As shown in §7, we find that optimization-based methods are inherently limited due to the nature of our problem, and can only achieve a very low success rate in generating front-near obstacles.

Third, in our problem, successfully changing the machine learning model output does not directly lead to successes in achieving our attack goal in AV settings. As detailed later in §7, in AV systems such as Baidu Apollo, machine learning model output is post-processed before it is converted to a list of perceived obstacles. Thus, an objective function that can effectively reflect our attack goal needs to be newly designed.

5.2. Adv-LiDAR Methodology Overview

Notation Description
$X$ 3D point cloud
$x$ Input feature matrix
$X'$ Adversarial 3D point cloud
$x'$ Adversarial input feature matrix
$T$ Spoofed 3D point cloud
$t$ Spoofed input feature matrix
$T'$ Adversarial spoofed 3D point cloud
$t'$ Adversarial spoofed input feature matrix
$w_x, w_y, w_z$ 3D Cartesian coordinate
$\theta_\tau$ Upper bound of $\theta$ during sampling
$(u, v)$ Coordinate of $t$
$(u', v')$ Coordinate of $t'$
$M$ Machine learning model
$I_{\cdot}$ Model outputs
$N(u, v)$ 4-pixel neighbor at the location $(u, v)$
$S_h$ Height scaling function
$A$ Spoofing attack capability
$\Phi(\cdot)$ Mapping function (3D → 2D)
$Q(M, \cdot)$ Extraction function
$\oplus$ Merge function
$M(p_x, p_y)$ Gaussian mask
$(p_x, p_y)$ Center points of the Gaussian mask
$f(\cdot)$ Objective function
$L_{adv}$ Adversarial loss
$H$ 2D homography matrix (rotation, scaling, translation)
$s_h$ Height scaling ratio
$S_T$ Set of spoofed 3D point cloud
$S_t$ Set of spoofed input feature matrix
$G_T$ Global spatial transformation function for 3D point cloud
$G_t$ Global spatial transformation function for input feature matrix

Table 3. Notations adopted in this work.

In this section, we provide an overview of our solution methodology, which we call Adv-LiDAR, that addresses the three challenges above. At a high level, to identify adversarial examples for the machine learning model, we adopt an optimization-based approach, which previous studies have shown to be both efficient and effective for machine learning models across different domains (Carlini and Wagner, 2017b; Cisse et al., 2017; Xiao et al., 2018c, b). To help explain the formulation of the optimization problem, we summarize the notations in Table 3. Specifically, the problem is formulated as follows:

(1)

where the pristine 3D point cloud is mapped by the pre-processing function (§2.1) into the corresponding 2D input feature matrix. The adversarial spoofed 3D point cloud and the adversarial spoofed input feature matrix are related by the same mapping. The spoofing attack capability is a set of spoofed 3D point clouds generated from LiDAR spoofing attacks. The adversarial loss is designed to achieve the adversarial goal given the machine learning model. The constraints guarantee that the generated adversarial examples satisfy the spoofing attack capability.
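As a reading aid, the problem described above can be written compactly as follows. This is only a sketch under assumed notation: $X$ and $x$ stand for the pristine 3D point cloud and its input feature matrix, $T'$ and $t'$ for the adversarial spoofed 3D point cloud and input feature matrix, $\Phi$ for the pre-processing function, $\oplus$ for the merging function, $\mathcal{A}$ for the set of spoofed point clouds the attacker can produce, $M$ for the model, and $\mathcal{L}_{adv}$ for the adversarial loss. All of these symbol choices are assumptions made here for illustration.

```latex
\min_{t'} \; \mathcal{L}_{adv}\left(x \oplus t';\, M\right)
\quad \text{s.t.} \quad
t' \in \left\{ \Phi(T') \;\middle|\; T' \in \mathcal{A} \right\},
\qquad x = \Phi(X)
```

Read this way, the objective acts on the merged feature matrix, while the constraint keeps the perturbation inside what the sensor attack can physically produce.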

Figure 2 overviews the analysis tasks needed to solve the optimization problem. First, we need to conduct an input perturbation analysis that formulates the spoofing attack capability and the merging function. Second, we need to perform a model analysis to design an objective function for generating adversarial examples. Third, as a case study to understand the impact of the attacks at the AV driving decision level, we further perform a driving decision analysis using the identified adversarial examples. More details about these tasks are as follows:

Input perturbation analysis. Formulating the spoofing attack capability and the merging function is non-trivial. First, previous work on LiDAR spoofing attacks neither provided detailed measurements of the attacker’s capability in perturbing the 3D point cloud nor expressed it in a closed-form expression. Second, point cloud data is pre-processed in several steps, as shown in Section 2.1, before becoming machine learning input, which means the merging function cannot be directly expressed. To address these two challenges, as will be detailed later in §6, we first conduct spoofing attacks on LiDAR to collect a set of possible spoofed 3D point clouds. Using these spoofed 3D point clouds, we model the spoofing attack capability. We further analyze the pre-processing program to obtain the additional constraints on the machine learning input perturbation, i.e., the spoofed input feature matrix. Based on this analysis, we formulate the spoofed input feature matrix as a differentiable function using global spatial transformations, which is required for the model analysis.

Objective function design and model analysis. As introduced earlier in §5.1, in LiDAR-based perception in AV systems, the machine learning model output is post-processed (§2.1) before being turned into a list of perceived obstacles. To find an effective objective function, we study the post-processing steps to extract the key strategies for transforming model output into perceived obstacles, and formulate them into an objective function that reflects the attack goal. In addition, we find that our optimization problem cannot be effectively solved by directly using existing optimization-based methods. We analyze the loss surface and find that this inefficiency is caused by the nature of the problem. To address this challenge, we improve the methodology by combining global sampling with optimization. Details about the analysis methodology and results are in §7 and §8.

Driving decision case study. With the results from the previous analysis steps, we can generate adversarial 3D point clouds that inject spoofed obstacles at the LiDAR-based perception level. To understand their impact at the AV driving decision level, we construct and evaluate two attack scenarios as case studies. The evaluation methodology and results are detailed later in §9.

6. Input Perturbation Analysis

To generate adversarial examples by solving the optimization problem in Equation 1, we need to formulate the merging function and the input feature matrix spoofing capability in closed form. In this section, we first analyze the spoofing attack capability and then use it to formulate the input feature matrix spoofing capability.

6.1. Spoofing Attack Capability

Based on the attack reproduction experiments in §4, the observed attack capability can be described from two aspects:

Number of spoofed points. As described in §4, even though it is possible to spoof around 100 points after our setup improvement, we find that around 60 points can be reliably spoofed in our experiments. Thus, we consider 60 as the highest number of reliably spoofed points. Note that the maximum number of spoofed points could be increased if the attacker uses more advanced attack equipment. Here, we choose a set of devices that are more accessible (detailed in §4) and end up with the ability to reliably spoof around 60 points. In addition, considering that an attacker may use a slower laser or cruder focusing optics, such as in the setup by Shin et al. (Shin et al., 2017), we also consider 20 and 40 spoofed points in our analysis.

Location of spoofed points. Given the number of spoofed points, the observed attack capability in placing these points is described and modeled as follows:

  1. Modify the distance of the spoofed point from the LiDAR by changing the delay of the attack laser signal pulses in small intervals (nanosecond scale). From the perspective of the spoofed 3D point cloud, this can be modeled as moving the spoofed points nearer or farther by a given distance along the axis that connects the spoofed points and the LiDAR sensor (Fig. 7 (a)).

  2. Modify the altitude of a spoofed point within the vertical range of the LiDAR by changing the delay in intervals of 2.304 μs. From the perspective of the spoofed 3D point cloud, this can be modeled as moving the spoofed points from vertical line to vertical line to change their height (Fig. 7 (b)).

  3. Modify the azimuth of a spoofed point within a horizontal viewing angle of 8° by changing the delay in intervals of 55.296 μs. By moving the LiDAR spoofer to different locations around the LiDAR, it is possible to spoof at any horizontal angle. From the perspective of the spoofed 3D point cloud, this can be modeled as rotating the spoofed points by a given angle on the horizontal plane, with the LiDAR sensor as the pivot point (Fig. 7 (c)).

Therefore, we model the attack capability by applying these three modifications to the given spoofed 3D point cloud. Here the spoofed 3D point cloud is collected by reproducing the sensor spoofing attack. The number of points in the spoofed 3D point cloud can be 20, 40, or 60 to represent different attack capabilities, as mentioned before. In the next section, the attack capability modeled here is used to model the perturbation of the input feature matrix.
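The three modifications above can be sketched as simple operations on a point array. This is a minimal illustration, not the paper's implementation: the function names, the `(N, 4)` layout `[x, y, z, intensity]`, the sensor sitting at the origin, and modeling the height change as a multiplicative ratio on `z` are all assumptions made here.

```python
import numpy as np

def shift_distance(points, delta):
    """Move every point by `delta` meters along its ray from the sensor (origin)."""
    out = points.astype(float)  # astype returns a copy, so the input is untouched
    r = np.linalg.norm(out[:, :3], axis=1, keepdims=True)
    out[:, :3] *= (r + delta) / r  # assumes no point lies exactly at the origin
    return out

def scale_height(points, ratio):
    """Change point heights by scaling z (assumed stand-in for moving between vertical lines)."""
    out = points.astype(float)
    out[:, 2] *= ratio
    return out

def rotate_azimuth(points, angle_rad):
    """Rotate points on the horizontal plane around the sensor by `angle_rad`."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    out = points.astype(float)
    out[:, :2] = out[:, :2] @ rot.T
    return out
```

Composing the three functions on a collected spoofed trace gives one candidate perturbed trace; the intensity column is left unchanged in all three.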

Figure 7. Attack capability in perturbing 3D Point Cloud

6.2. Input Perturbation Modeling

After analyzing the spoofing attack capability, formulating the input feature matrix spoofing capability in Equation 1 requires two steps: (1) formulating the merging function; (2) modeling the input feature matrix spoofing capability based on the known spoofing attack capability. In this section, we first formulate the merging function by analyzing the pre-processing program. Then we model the input feature matrix spoofing capability by expressing the adversarial spoofed input feature matrix in terms of the spoofed input feature matrix as a differentiable function using global spatial transformations. Here, the spoofed input feature matrix can be obtained from a given spoofed 3D point cloud through the pre-processing function.

Formulating the merging function. To model the merging function, which operates in the domain of input feature matrices, we first need to analyze the pre-processing program that transforms the 3D point cloud X into the input feature matrix. As described in §2.1, the pre-processing consists of three sub-processes: coordinate transformation, ROI filtering, and input feature matrix extraction. The first two have little effect on the adversarial spoofed 3D point cloud generated by the spoofing attack we conducted in §6. The coordinate transformation has no effect because the adversarial spoofed 3D point cloud is transformed along with the 3D point cloud X. The ROI filtering process filters out points located outside of the road from a bird’s-eye view. Therefore, as long as we spoof points on the road, ROI filtering has no effect on the adversarial spoofed 3D point cloud. The feature extraction process, as mentioned in Section 2.1, extracts statistical features such as average height, average intensity, max height, and so on.

Because of such pre-processing, the spoofed input feature matrix cannot be directly added to the input feature matrix to attain the adversarial input feature matrix. To attain it, we express this “addition” operation as a differentiable function, shown below. Note that in this equation we do not include a few features in Table 1, such as direction and distance, since they are either constant or can be derived directly from the features included in the equation.

(2)
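To make the idea of merging in the feature domain concrete, the sketch below combines per-cell statistics of a pristine and a spoofed feature matrix. It is an illustrative assumption, not the paper's Equation 2: the feature names (`count`, `avg_height`, `avg_intensity`, `max_height`), the dictionary layout, and the count-weighted averaging rule are all hypothetical choices made here.

```python
import numpy as np

def merge_features(x, t):
    """Merge two per-cell feature dictionaries of equally shaped 2D arrays.

    Assumed rule: counts add, averages are count-weighted, max height is the
    larger of the two where both cells are occupied.
    """
    cnt = x["count"] + t["count"]
    safe = np.maximum(cnt, 1)  # avoid division by zero in empty cells
    merged = {"count": cnt}
    for key in ("avg_height", "avg_intensity"):
        merged[key] = (x[key] * x["count"] + t[key] * t["count"]) / safe
    merged["max_height"] = np.where(
        t["count"] == 0, x["max_height"],
        np.where(x["count"] == 0, t["max_height"],
                 np.maximum(x["max_height"], t["max_height"])))
    return merged
```

Every operation here is differentiable almost everywhere with respect to the spoofed features, which is the property the optimization in the later sections relies on.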

Modeling the input feature matrix spoofing capability. Modeling the input feature matrix spoofing capability is equivalent to representing the adversarial input feature matrix in terms of a known spoofed input feature matrix. We can use global spatial transformations, including rotation, translation, and scaling, under certain constraints to represent the input feature matrix spoofing capability. Here the translation and scaling transformations capture the attack capability of modifying the distance of the 3D point cloud from the LiDAR, while the rotation transformation captures the attack capability of modifying the azimuth of the 3D point cloud.

Specifically, we apply the global spatial transformation to a set of spoofed input feature matrices to formulate the input feature matrix spoofing capability and to represent the adversarial spoofed input feature matrix t’. Each spoofed input feature matrix is mapped from a corresponding spoofed 3D point cloud by the pre-processing function.

We denote the value at each position of the adversarial spoofed input feature matrix and use a 2D coordinate to denote its location. The adversarial spoofed input feature matrix is transformed from an arbitrary instance in the set of spoofed input feature matrices by applying a homography matrix. The location of each transformed position can be derived as follows:

(3)

Notice that here, the two translation components have a fixed ratio since the translation is performed along the axis shown in Fig. 7 (1). Since this ratio depends on the spoofed input feature matrix we provide for performing the transformation, we align the spoofed input feature matrix to a coordinate axis in advance so that one translation component vanishes. Therefore, we can optimize a single translation parameter alone. Also, this process is equivalent to scaling, so we remove the scaling parameter.

We use differentiable bilinear interpolation (Jaderberg et al., 2015) to calculate the value at each transformed position:

(4)

where the neighborhood consists of the 4-pixel neighbors (top-left, top-right, bottom-left, bottom-right) at the given location.
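The transformation and interpolation steps can be sketched with NumPy as below. This is a hedged illustration of the general technique (a rigid 2D homography followed by 4-neighbor bilinear sampling), not the paper's exact equations: the function names `homography` and `warp`, the use of a single feature channel, and the convention that `x` indexes columns and `y` indexes rows are assumptions made here. A gradient-capable framework would be needed to actually differentiate through it.

```python
import numpy as np

def homography(theta, tx, ty):
    """Rigid 2D homography: rotation by theta followed by translation (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]])

def warp(t, H):
    """Warp a 2D feature channel `t` by H using bilinear interpolation."""
    h, w = t.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    ones = np.ones_like(rows)
    target = np.stack([cols, rows, ones]).reshape(3, -1).astype(float)
    source = np.linalg.inv(H) @ target  # where each output cell reads from
    u, v = source[0], source[1]         # source column and row (continuous)
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    out = np.zeros(h * w)
    for du in (0, 1):                   # loop over the 4 pixel neighbors
        for dv in (0, 1):
            uu, vv = u0 + du, v0 + dv
            weight = (1 - np.abs(u - uu)) * (1 - np.abs(v - vv))
            valid = (uu >= 0) & (uu < w) & (vv >= 0) & (vv < h)
            out[valid] += weight[valid] * t[vv[valid], uu[valid]]
    return out.reshape(h, w)
```

Cells whose source location falls outside the matrix contribute zero, which matches the intuition that nothing was spoofed there.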

Further, we observe that the input feature matrix contains height information, as shown in Table 1. So we also optimize a global scalar that scales the height features when generating the adversarial spoofed input feature matrix. We define a height scaling function that multiplies the features containing height information by this scalar. Based on this transformation, Equation 4 is changed as follows. For simplicity, we denote the whole transformation process by a single global spatial transformation function, which represents the transformed adversarial spoofed input feature matrix given a spoofed input feature matrix and the transformation parameters.

(5)

7. Generating Adversarial Examples

Figure 8. Overview of the adversarial example generation process.

After modeling the input perturbation, in this section we design the objective function with an effective adversarial loss, and leverage an optimization method to find the attack transformation parameters that minimize this loss.

Designing the adversarial loss. Unlike in previous work, which performs the analysis only at the machine learning model level, there is no obvious objective function reflecting our attack goal of spoofing front-near obstacles. Yet, creating an effective objective function has been shown to be essential in generating effective adversarial examples (Carlini and Wagner, 2017b). To design an effective objective function, we analyze the post-processing step for the machine learning output. As shown in §2.1, in the clustering process, each cell of the model output is filtered by its objectness value. After the clustering process, candidate object clusters are filtered by their positiveness values. Based on this observation, we design the adversarial loss as follows:

(6)

where the extraction function returns the probabilities of a given attribute from the model when it is fed an adversarial example. The Gaussian mask is a standard Gaussian mask whose center coordinate corresponds to an attack target position chosen by the attacker. We obtain this center by mapping the attack target position in the real world onto the coordinates of the corresponding cell in the input feature matrix using the mapping function. The adversarial loss is then the summation, over all the cells in the input feature matrix, of the weighted value described above. Minimizing this adversarial loss is equivalent to increasing the probability that the machine learning model detects the adversarial spoofed 3D point cloud as an obstacle.
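A Gaussian-weighted loss of this kind can be sketched as follows. This is one plausible form under stated assumptions rather than the paper's Equation 6: the per-cell term `1 - objectness * positiveness`, the mask width `sigma`, and the function names are hypothetical choices made here, and the two probability maps are assumed to be already extracted from the model output.

```python
import numpy as np

def gaussian_mask(shape, center, sigma):
    """Unnormalized 2D Gaussian over a grid, peaking at `center` (row, col)."""
    rows, cols = np.indices(shape)
    d2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def adversarial_loss(objectness, positiveness, center, sigma=2.0):
    """Sum over cells of (1 - objectness * positiveness), weighted by the mask.

    The loss is small when both probabilities are high near the target cell.
    """
    mask = gaussian_mask(objectness.shape, center, sigma)
    return float(np.sum((1.0 - objectness * positiveness) * mask))
```

Because the mask decays away from the target cell, raising the probabilities at the attacker's chosen position lowers the loss far more than raising them elsewhere.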

Optimization algorithm and our improvement using sampling. With the design above, the optimization problem can be directly solved by using the Adam optimizer (Kingma and Ba, 2014) to obtain the transformation parameters and the height scaling scalar by minimizing the following objective function:

(7)

where can be obtained by Equation 5 and . In this paper, we call this direct solution vanilla optimization.

Figure 9. Loss surface over the transformation parameters for rotation and translation. Using a small step size (green line) traps the optimization process near a local extremum, while a large step size (red line) is less effective.

We visualize the loss surface against the transformation parameters in Fig. 9. During the vanilla optimization process, we observe that the loss surface over the transformation parameters is noisy at a small scale (green line) and quite flat at a large scale (red line). This makes it difficult to choose a proper step size for optimization-based methods. For example, a small step size traps the optimization process near a local minimum, while a large step size is less effective because the noisy local loss points in the wrong direction. Unlike Carlini et al. (Carlini and Wagner, 2017b), who directly choose multiple starting points to reduce the risk of being trapped in local minima, the optimization process in our setting easily gets stuck in bad local minima due to the hard constraints on the perturbations. We therefore propose to sample at a larger scale and optimize at a smaller scale. To start the optimization process at different positions, we first calculate the range of the transformation parameters for which the transformed spoofed 3D point cloud is located in the target area. Then we uniformly sample the rotation and translation parameters and combine these samples into starting points for the optimization.
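The sample-then-optimize idea can be sketched on a generic two-parameter loss. This is a simplified stand-in, not the paper's algorithm: it uses plain gradient descent with finite-difference gradients instead of Adam, and the function names, grid size, step size, and iteration count are all assumptions made here.

```python
import numpy as np

def local_descent(loss, start, lr=0.05, steps=200, eps=1e-3):
    """Plain gradient descent from `start`, using central-difference gradients."""
    p = np.array(start, dtype=float)
    for _ in range(steps):
        grad = np.array([
            (loss(p + eps * e) - loss(p - eps * e)) / (2 * eps)
            for e in np.eye(len(p))
        ])
        p -= lr * grad
    return p, loss(p)

def sample_then_optimize(loss, theta_range, tau_range, n=5):
    """Start a local descent from every point of a uniform n x n grid and
    keep the best result (global sampling plus local optimization)."""
    best = None
    for theta in np.linspace(theta_range[0], theta_range[1], n):
        for tau in np.linspace(tau_range[0], tau_range[1], n):
            p, value = local_descent(loss, (theta, tau))
            if best is None or value < best[1]:
                best = (p, value)
    return best
```

Since the grid endpoints are included, the result is never worse than a single descent started at the lower corner of the parameter ranges, and on a loss with many local minima it is usually much better.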

Generating adversarial spoofed 3D point cloud. To further construct the adversarial 3D point cloud, we need to construct the adversarial spoofed 3D point cloud. Using the transformation parameters, we can express the adversarial spoofed 3D point cloud that corresponds to the adversarial spoofed input feature matrix with a dual transformation function of the global spatial transformation on the input feature matrix. We denote the coordinate values and the intensity value of every point in the spoofed 3D point cloud. With the transformation parameters, we can express the coordinates of the transformed adversarial spoofed 3D point cloud as in Equation 8.

(8)

Therefore, the dual transformation function represents the transformed adversarial spoofed 3D point cloud given the spoofed 3D point cloud and the transformation parameters.

Overall adversarial example generation process. Fig. 8 provides an overview of the overall adversarial example generation process. Given the 3D point cloud X and a spoofed 3D point cloud (Fig. 8 (a)), we first map them via the pre-processing function to get the corresponding input feature matrix and spoofed input feature matrix. Then we apply the sampling algorithm to initialize the transformation parameters, as shown in Fig. 8 (b). After the initialization, we leverage the optimizer to further optimize the transformation parameters with respect to the adversarial loss function (Fig. 8 (c)). With the transformation parameters and the height scaling scalar, we apply the dual transformation function using Equation 8 to get the adversarial spoofed 3D point cloud. Finally, to obtain the adversarial 3D point cloud, we append the adversarial spoofed 3D point cloud to the 3D point cloud X (Fig. 8 (d)). The entire adversarial example generation algorithm, including the optimization parameters, is detailed in Appendix A.

8. Evaluation and Results

In this section, we evaluate our adversarial example generation method in terms of attack effectiveness and robustness.

Experiment Setup. We use the real-world LiDAR sensor data trace released by the Baidu Apollo team, recorded with a Velodyne HDL-64E S3 over 30 seconds on local roads in Sunnyvale, CA. We uniformly sample 300 3D point cloud frames from this trace in our evaluation. The attack goal is set as spoofing an obstacle that is 2-8 meters in front of the victim AV. The distance is measured from the front end of the victim AV to the rear end of the obstacle.

8.1. Attack Effectiveness

Fig. 10 shows the success rates of generating a spoofed obstacle with different attack capabilities using the vanilla optimization and our improved optimization with global sampling (detailed in §7). As shown, with our improvement using sampling, the success rates of spoofing front-near obstacles are increased from 18.9% to 43.3% on average, which is a 2.65× improvement. This shows that combining global sampling with optimization is effective in addressing the problem of being trapped in local minima described in §7.

Fig. 10 also shows that the success rates increase with more spoofed points, which is expected since the attack capability increases with more spoofed points. In particular, when the attacker can reliably inject 60 spoofed points, which is the attack capability observed in our experiments (§4), the attack success rate reaches around 75% using our improved optimization method.

In addition, we observe that the spoofed obstacles in all of the successful attacks are classified as vehicles after the LiDAR-based perception process, even though we do not specifically aim at spoofing vehicle-type obstacles in our problem formulation.

Figure 10. Attack success rate of spoofing a front-near obstacle with different numbers of spoofed points. V-opt refers to vanilla optimization, which directly uses the optimizer, and S-opt refers to sampling-based optimization. We choose Adam (Kingma and Ba, 2014) as the optimizer in both cases.

8.2. Robustness Analysis

In this section, we analyze the robustness of the generated adversarial spoofed 3D point cloud to variations in the 3D point cloud and the spoofed 3D point cloud. Such analysis is meaningful for generating adversarial spoofed 3D point clouds that have a high attack success rate in the real world. To launch the attack in the real world, there are two main variations that affect the results: variation in the spoofed points and variation in the position of the victim AV. 1) The imprecision of the attack devices contributes to the variation of the spoofed points. The attacker is able to stably spoof 60 points at a global position, as we state in §2.2. However, it is difficult to spoof points at precise positions. It is important to understand whether such imprecision affects the attack success rate. 2) The position of the victim AV is not controlled by the attacker and might vary from where the attacker collected the 3D point cloud. It is important to understand whether such a difference affects the attack success rate.

Robustness to variations in point cloud. To measure the robustness to variations in the 3D point cloud, we first select all the 3D point cloud frames that can generate successful adversarial spoofed 3D point cloud. For each of them, we apply its generated adversarial spoofed 3D point cloud to 15 consecutive frames (around 1.5 s) after it and calculate the success rates. Fig. 11 shows the analysis results. In this figure, the x-axis is the index for the 15 consecutive frames, and thus the larger the frame index is, the larger the variation is to the original 3D point cloud that generates the adversarial spoofed 3D point cloud. As shown, the robustness for attacks with more spoofed points is generally higher than that for attacks with fewer spoofed points, which shows that higher attack capability can increase the robustness. Particularly, with 60 spoofed points, the success rates are on average above 75% during the 15 subsequent frames, which demonstrates a high degree of robustness. This suggests that launching such attack does not necessarily require the victim AV to appear at the exact position that generates the adversarial example in order to have high success rates.

Figure 11. The robustness of the generated adversarial spoofed 3D point cloud to variations in the 3D point cloud. We quantify the variation in the 3D point cloud as the frame index difference between the evaluated 3D point cloud and the 3D point cloud used for generating the adversarial spoofed 3D point cloud.

Robustness to variations in spoofed 3D point cloud. To evaluate the robustness to variations in the spoofed 3D point cloud, for a given spoofed 3D point cloud, we first generate the corresponding adversarial spoofed 3D point cloud with a 3D point cloud. Next, we generate 5 more spoofed 3D point cloud traces using our LiDAR spoofing attack experiment setup. We then apply the same transformation that generated the adversarial spoofed 3D point cloud from the original spoofed 3D point cloud to each of the new traces, and combine each result with the 3D point cloud to launch the attack. Table 4 shows the average success rates with different attack capabilities. As shown, for all three attack capabilities we are able to achieve success rates of at least 82%. With 60 spoofed points, the success rate is as high as 90%. This suggests that launching such an attack does not require the LiDAR spoofing attack to be precise all the time in order to achieve high success rates.

Targeted position | # Spoofed points
                  | 20  | 40  | 60
2-8 meters        | 87% | 82% | 90%
Table 4. Robustness of the generated adversarial spoofed 3D point cloud to variation in the spoofed 3D point cloud. The robustness is measured by average attack success rates.

9. Driving Decision Case Study

To understand the impact of our attacks at the driving decision level, in this section we construct several attack scenarios and evaluate them on Baidu Apollo using simulation as case studies.

Experiment setup. We perform the case study using the simulation feature provided by Baidu Apollo, called Sim-control, which is designed to allow users to observe the AV system behavior at the driving decision level by replaying collected real-world sensor data traces. Sim-control does not include a physics engine to simulate the control of the vehicle. Instead, the AV behaves exactly as it plans. Although it cannot directly reflect the attack consequences in the physical world, it serves our purpose of understanding the impact of our attacks on AV driving decisions.

For each attack scenario in the case study, we simulate it in Sim-control using synthesized continuous frames of successful adversarial 3D point cloud identified in § 8 as input. The experiments are performed on Baidu Apollo 3.0.

Case study results. We construct and evaluate two attack scenarios in this case study (video demos can be found at http://tinyurl.com/advlidar):

(1) Emergency brake attack. In this attack, we generate an adversarial 3D point cloud that spoofs a front-near obstacle to a moving victim AV. We find that the AV makes a stop decision upon this attack. As illustrated in Fig. 12, the stop decision triggered by a spoofed front-near obstacle causes the victim AV to decrease its speed from 43 km/h to 0 km/h within 1 second. This stop decision will lead to a hard brake (har, 2005), which may hurt the passengers or result in rear-end collisions. Note that Apollo does implement driving decisions for overtaking. However, overtaking requires a minimum distance that depends on the relative speed of the obstacle. Therefore, with our near spoofed obstacle, the victim AV makes stop decisions instead of overtaking decisions.
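As a rough sanity check on why this counts as a hard brake, the average deceleration implied by the reported speed drop can be worked out directly. The only assumptions are that the stop takes the full second and that $g \approx 9.81\ \mathrm{m/s^2}$.

```latex
v_0 = 43\ \mathrm{km/h} = \frac{43}{3.6}\ \mathrm{m/s} \approx 11.9\ \mathrm{m/s},
\qquad
\bar{a} = \frac{v_0 - 0}{1\ \mathrm{s}} \approx 11.9\ \mathrm{m/s^2} \approx 1.2\,g
```

Since the stop happens within one second, this value is a lower bound on the magnitude of the average deceleration.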

Figure 12. Demonstration of the emergency brake attack. Due to the spoofed obstacle, the victim AV makes a sudden stop decision to drop its speed from 43 km/h to 0 km/h within a second, which may cause injuries to passengers or rear-end collisions.

Figure 13. Demonstration of the AV freezing attack. The traffic light has turned green but the victim AV is not moving due to the spoofed front-near obstacles.

(2) AV freezing attack. In this attack, we generate an adversarial 3D point cloud that spoofs an obstacle in front of a victim AV while it is waiting at a red traffic light. We simulate this scenario with the data trace at an intersection with traffic lights. As shown in Fig. 13, since the victim AV is static, the attacker can attack constantly and prevent it from moving even after the traffic signal turns green, which may be exploited to cause traffic jams. Note that Apollo does implement driving decisions for deviating around static obstacles. However, deviation or side passing requires a minimum distance (15 meters by default). Therefore, with our near spoofed obstacle, the victim AV makes stop decisions instead of side passing decisions.

10. Discussion

In this section, we discuss the limitations and generality of this study. We then discuss potential defense directions.

10.1. Limitations and Future Work

Limitations in the sensor attack. One major limitation is that our current results cannot directly demonstrate attack performance and practicality in the real world. For example, performing our attack on a real AV on the road requires dynamically aiming an attack device at the LiDAR on a victim car with high precision, the feasibility of which is difficult to prove without road tests in the physical world. In this work, our goal is to provide a new understanding of this research problem. Future research directions include conducting real-world testing. To demonstrate the attack in the real world, we plan to first conduct the sensor attack with a LiDAR on top of a real vehicle in outdoor settings. In this setting, the sensor attack could be enhanced by: 1) enlarging the laser spoofing area to solve the aiming problem; 2) adjusting the delay time so that the attacker could spoof points at different angles without moving the attack devices. Then we could apply our proposed methodology to conduct drive-by experiments in the different attack scenarios mentioned in §9.

Limitations in adversarial example generation. First, we construct adversarial sensor data by using a subset of the spoofing attack capability. Therefore, our analysis may not reveal the full potential of sensor attacks. Second, though we have performed the driving decision case study, we did not perform a comprehensive analysis of modules beyond the perception module. This means that the designed objective function can be further improved to more directly target specific abnormal AV driving decisions.

10.2. Generality on LiDAR-based AV Perception

Generality of the methodology. Attacking any LiDAR-based AV perception system with an adversarial sensor attack can be formulated as three components: (1) formulating the spoofed 3D point cloud capability, (2) generating adversarial examples, and (3) evaluating at the driving decision level. Even though our construction of these components might be specific to Baidu Apollo, our analysis methodology can be generalized to other LiDAR-based AV perception systems.

Generality of the results. The formulation of the 3D point cloud spoofing capability can be generalized, as it is independent of the AV system. The success of the attack may extend to other LiDAR-based AV perception systems due to the nature of the LiDAR sensor attack. The LiDAR spoofing attack introduces a spoofed 3D point cloud, which was not foreseen in the training process of the machine learning models used in the AV perception system. Therefore, other models are also likely to be vulnerable to such spoofing patterns.

10.3. Defense Discussion

This section discusses defense directions at AV system, sensor, and machine learning model levels.

10.3.1. AV System-Level Defenses

In our proposed attack, the attacker only needs to inject at most 60 points to spoof an obstacle, but the 3D point cloud of a detected real vehicle can have as many as a thousand points (as illustrated in Fig. 6). We look into the point cloud of a detected spoofed obstacle and find that it consists of points reflected from the ground in addition to the points spoofed by the attacker. For example, one of the successful adversarial spoofed 3D point clouds we generated with 20 spoofed points is detected as an obstacle containing 283 points.

Points from ground reflection are clustered into obstacles due to the information loss introduced in the pre-processing phase. More specifically, mapping a 3D point cloud into a 2D matrix results in height information loss. This vulnerability contributes to the success of the proposed attack. To mitigate the impacts of this problem, we propose two defenses at the AV system level: (1) filtering out the ground reflection in the pre-processing phase, and (2) either avoiding transforming 3D point cloud into input feature matrix or adding more features to reduce the information loss.
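The first defense can be illustrated with a deliberately simple height-threshold filter. This is a toy sketch under strong assumptions, not a recommendation for a production pipeline: it assumes locally flat ground at a known height `ground_z` in the sensor frame, an `(N, 4)` point layout `[x, y, z, intensity]`, and a hypothetical tolerance of 0.2 m. Real systems would more likely estimate the ground by plane fitting.

```python
import numpy as np

def remove_ground(points, ground_z, tolerance=0.2):
    """Drop points whose height is within `tolerance` meters above the assumed
    flat ground plane z = ground_z (or below it)."""
    keep = points[:, 2] > ground_z + tolerance
    return points[keep]
```

With ground returns removed before feature extraction, a cluster formed around a few dozen spoofed points would contain far fewer points than a real vehicle, which could make it easier to reject.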

10.3.2. Sensor-Level Defenses

Several defenses could be adopted against spoofing attacks on LiDAR sensors:

Detection techniques. Sensor fusion, which intelligently combines data from several sensors to detect anomalies and improve performance, could be adopted against LiDAR spoofing attacks. AV systems are often equipped with sensors beyond LiDAR. Cameras, radars, and ultrasonic sensors provide additional information and redundancy to detect and handle an attack on LiDAR.

Different sensor fusion algorithms have been proposed focusing on the security and safety aspects (Yang et al., 2018; Ivanov et al., 2014). However, the sensor fusion defense requires the majority of sensors to be functioning correctly. While not a perfect defense, sensor fusion approaches can significantly increase the effort required of an attacker.

Mitigation techniques. Another class of defenses aims to reduce the influence of the attack by modifying the internal sensing structure of the LiDAR. Different solutions include reducing the receiving angle and filtering unwanted light spectra to make LiDARs less susceptible to attacks (Petit et al., 2015; Shin et al., 2017). However, these techniques also reduce the capacity of the LiDAR to measure the reflected laser pulses, which limits the range and the sensitivity of the device.

Randomization techniques. Another defense is adding randomness to how the LiDAR fires laser pulses. The attacker cannot know when to fire or which laser pulses to fire if the LiDAR fires laser pulses in an unpredictable pattern. One solution could be firing a random grouping of laser pulses each cycle, so that an attacker would not know which reflections the LiDAR is expecting. Another alternative would be randomizing the laser pulse waveform. With sensitive equipment, it would be possible to accept only reflection waveforms that match randomized patterns uniquely produced by the LiDAR laser. Another solution, proposed by Shoukry et al. (Shoukry et al., 2015), consists of randomly turning off the transmitter to verify with the receiver whether there are any unexpected incoming signals. Adding randomness makes it difficult for an attacker to influence the measurements, but this approach also adds significant complexity to the overall system and trades off with performance.
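The random-grouping idea can be sketched as a per-cycle check on which lasers were actually fired. This is an illustrative toy model, not a description of any real LiDAR firmware: the function names, the per-laser firing probability, and the representation of echoes as a set of laser indices are assumptions made here.

```python
import random

def fire_pattern(num_lasers, fire_prob, rng):
    """Randomly choose which lasers fire this cycle (True means fired)."""
    return [rng.random() < fire_prob for _ in range(num_lasers)]

def flag_unexpected(pattern, returns):
    """Return the sorted laser indices that reported an echo without having
    fired in this cycle; such echoes cannot be genuine reflections."""
    return sorted(i for i in returns if not pattern[i])
```

An attacker who injects pulses without knowing the current pattern will, with some probability, hit a channel that did not fire, and that echo can then be flagged.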

10.3.3. Machine Learning Model-Level Defense

Various detection and defense methods against adversarial examples have also been explored in image classification (Ma et al., 2018; Madry et al., 2017; Carlini and Wagner, 2017a; Athalye et al., 2018). Among them, adversarial training (Goodfellow et al., 2014) and its variations (Tramèr et al., 2017; Madry et al., 2017) have been more successful at improving model robustness. The adversarial examples generated by our algorithm can likewise be combined with the original training data to conduct adversarial retraining and thus improve model robustness.
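A minimal sketch of the data preparation for such retraining follows. It assumes that each adversarial point cloud is paired with its ground-truth label (for a spoofed obstacle, that no real obstacle is present); the function name and the label strings are hypothetical.

```python
import random


def build_retraining_set(original_pairs, adversarial_clouds, true_label, seed=0):
    """Merge original (sample, label) pairs with adversarial samples and shuffle.

    Every adversarial sample receives `true_label`, the label the model
    should have predicted. This labeling choice is an assumption.
    """
    adversarial_pairs = [(cloud, true_label) for cloud in adversarial_clouds]
    combined = list(original_pairs) + adversarial_pairs
    random.Random(seed).shuffle(combined)  # seeded so the order is repeatable
    return combined


original = [("cloud_a", "car"), ("cloud_b", "no_obstacle")]
retraining_set = build_retraining_set(original, ["adv_1", "adv_2"], "no_obstacle")
```

The resulting list would then be fed to the usual training loop in place of the original data.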

11. Related Work

Vehicle systems security. Numerous previous works explore security problems in vehicle systems and have uncovered vulnerabilities in the in-vehicle networks of modern automobiles (Koscher et al., 2010; Checkoway et al., 2011; Cho and Shin, 2016), infotainment systems (Mazloom et al., 2016), and emerging connected vehicle-based systems (Chen et al., 2018; Feng et al., 2018; Wong et al., 2019). In comparison, our work focuses on vehicle systems with the emerging autonomous driving technology and specifically targets the security of LiDAR-based AV perception, which is an attack surface not present in traditional vehicle systems designed for human drivers.

Vehicle-related sensor attacks. The sensors commonly used in traditional vehicles have been shown to be vulnerable to attacks. Rouf et al. showed that tire pressure sensors are vulnerable to wireless jamming and spoofing attacks (Rouf et al., 2010). Shoukry et al. attacked the anti-lock braking system of a vehicle by spoofing the magnetic wheel speed sensor (Shoukry et al., 2013). As AVs have become popular, so have attacks against their perception sensors. Yan et al. used spoofing and jamming attacks against the ultrasonic sensors, radar, and camera on a Tesla Model S (Yan, 2016). Two prior works have also explored the vulnerability of LiDAR to spoofing and jamming attacks (Petit et al., 2015; Shin et al., 2017). In this work, we build on this prior work to show that LiDAR spoofing attacks can be used to attack the machine learning models used for LiDAR-based AV perception and affect the driving decision.

Adversarial example generation. Adversarial examples have been heavily explored in the image domain (Goodfellow et al., 2014; Xiao et al., 2018c; Carlini and Wagner, 2017b; Papernot et al., 2017). Xie et al. (2017) generated adversarial examples for segmentation and object detection, while Cisse et al. (2017) did so for segmentation and human pose estimation. Researchers have also applied adversarial examples in the physical world to fool machine learning models (Evtimov et al., 2017; Eykholt et al., 2018a; Athalye and Sutskever, 2018). Compared to this previous work on adversarial examples in the image domain, our work explores adversarial examples for LiDAR-based perception. An ongoing work (Xiang et al., 2018) studies the generation of 3D adversarial point clouds. However, such an attack focuses on the digital domain and cannot be directly applied in the context of AV systems. In comparison, our method generates adversarial examples based on the capability of sensor attacks in order to fool the LiDAR-based perception models in AV systems.

12. Conclusion

In this work, we perform the first security study of LiDAR-based perception in AV systems. We consider LiDAR spoofing attacks as the threat model and set the attack goal as spoofing front-near obstacles. We first reproduce the state-of-the-art LiDAR spoofing attack and find that blindly applying it is insufficient to achieve the attack goal due to the machine learning-based object detection process. We therefore study how to fool the machine learning model by formulating the attack task as an optimization problem. We first construct the input perturbation function using local attack experiments and global spatial transformation-based modeling, and then construct the objective function by studying the post-processing step. We also identify the inherent limitations of directly using optimization-based methods and design a new algorithm that increases the attack success rates by 2.65× on average. As a case study, we further construct and evaluate two attack scenarios that may compromise AV safety and mobility. We also discuss defense directions at the AV system, sensor, and machine learning model levels.

Acknowledgements.
We would like to thank Shengtuo Hu, Jiwon Joung, Yunhan Jack Jia, Yuru Shao, Yikai Lin, David Ke Hong, the anonymous reviewers, and our shepherd Zhe Zhou for providing valuable feedback on our work. This research was supported in part by an award from Mcity at the University of Michigan, by the National Science Foundation under grants CNS-1850533, CNS-1330142, CNS-1526455, and CCF-1628991, and by ONR under N00014-18-1-2020.

References

Appendix

Appendix A Algorithm Details and Experiment Settings

Algorithm 1 shows the detailed algorithm to generate adversarial examples. In our experiment, we select Adam (Kingma and Ba, 2014) as our optimizer with learning rate . means updating the parameters with respect to Loss function . We select TensorFlow (Abadi et al., 2016) as backbone. is set as while is set as the angle that generates -meter distance from the target position.

Input: target model …; 3D point cloud …; spoofed 3D point cloud …; optimizer …; max iteration …
Output: adversarial 3D point cloud …
…: …, …, …, …, …
/* Initiate parameters by sampling around the transformation parameters …, … that transform t to the target position of the attack */
for … to … do
    for … to … do
        /* Initialize parameters */
        …, …
        for … to … do
            /* Calculate adversarial loss (Eq. 7) */
            …
            /* Update the parameters based on optimizer and loss */
            …
            if … then
                …
        end for
    end for
end for
…
…
Return: …
Algorithm 1. Generating adversarial examples by leveraging global spatial transformation
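The overall structure of Algorithm 1, sampling starting points for the two transformation parameters and running an Adam optimization from each, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `toy_loss` is a hypothetical stand-in for the adversarial loss, a finite-difference gradient replaces TensorFlow's automatic differentiation, the Adam hyperparameters are common defaults rather than the paper's values, and keeping the lowest-loss result across all starts is an assumed selection rule.

```python
import math


def numeric_grad(loss, params, eps=1e-5):
    """Central-difference gradient (a stand-in for automatic differentiation)."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up) - loss(down)) / (2 * eps))
    return grads


def adam_minimize(loss, start, steps=200, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """Run Adam from one starting point; return the best parameters and loss seen."""
    params = list(start)
    m = [0.0] * len(params)
    v = [0.0] * len(params)
    best_params, best_loss = list(params), loss(params)
    for t in range(1, steps + 1):
        grads = numeric_grad(loss, params)
        for i, g in enumerate(grads):
            m[i] = b1 * m[i] + (1 - b1) * g
            v[i] = b2 * v[i] + (1 - b2) * g * g
            m_hat = m[i] / (1 - b1 ** t)
            v_hat = v[i] / (1 - b2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        current = loss(params)
        if current < best_loss:
            best_params, best_loss = list(params), current
    return best_params, best_loss


def sample_then_optimize(loss, theta_starts, tau_starts, steps=200):
    """Optimize from every sampled (theta, tau) start and keep the lowest loss."""
    best_params, best_loss = None, math.inf
    for theta0 in theta_starts:
        for tau0 in tau_starts:
            params, value = adam_minimize(loss, [theta0, tau0], steps=steps)
            if value < best_loss:
                best_params, best_loss = params, value
    return best_params, best_loss


def toy_loss(params):
    """Hypothetical non-convex loss with several local minima in theta."""
    theta, tau = params
    return math.sin(3 * theta) + 0.1 * theta ** 2 + (tau - 1.0) ** 2


single_params, single_loss = adam_minimize(toy_loss, [2.0, 0.0])
multi_params, multi_loss = sample_then_optimize(
    toy_loss, theta_starts=[-2.0, 0.0, 2.0], tau_starts=[0.0, 1.0, 2.0]
)
```

Because the sampled starts include the single start `[2.0, 0.0]`, the multi-start result can never be worse than the single run, which is why sampling helps when the optimizer alone may stall in a local minimum.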
Figure 14. Collected traces from the reproduced sensor attack. The points in the yellow circle are spoofed by the sensor attack.