Cyber-Physical Security and Safety of Autonomous Connected Vehicles: Optimal Control Meets Multi-Armed Bandit Learning

12/13/2018 ∙ by Aidin Ferdowsi, et al. ∙ University of Oulu, Rutgers University, Virginia Polytechnic Institute and State University

Autonomous connected vehicles (ACVs) rely on intra-vehicle sensors such as cameras and radar as well as inter-vehicle communication to operate effectively. This reliance on cyber components exposes ACVs to cyber and physical attacks in which an adversary can manipulate sensor readings and physically take control of an ACV. In this paper, a comprehensive framework is proposed to thwart cyber and physical attacks on ACV networks. First, an optimal safe controller for ACVs is derived to maximize the street traffic flow while minimizing the risk of accidents by optimizing ACV speed and inter-ACV spacing. It is proven that the proposed controller is robust to physical attacks that aim to make ACV systems unstable. Next, to improve the cyber-physical security of ACV systems, data injection attack (DIA) detection approaches are proposed to address cyber attacks on sensors and their physical impact on the ACV system. To comprehensively design the DIA detection approaches, ACV sensors are divided into two subsets based on the availability of a priori information about their data. For sensors with a priori information, a DIA detection approach is proposed and an optimal threshold level is derived for the difference between the actual and estimated values of sensor data, which enables an ACV to stay robust against cyber attacks. For sensors with no a priori information, a novel multi-armed bandit (MAB) algorithm is proposed to enable an ACV to securely control its motion. Simulation results show that the proposed optimal safe controller outperforms current state-of-the-art controllers by maximizing the robustness of ACVs to physical attacks. The results also show that the proposed DIA detection approaches, compared to Kalman filtering, can improve the security of ACV sensors against cyber attacks and ultimately improve the physical robustness of an ACV system.


I Introduction

Intelligent transportation systems (ITS) will encompass autonomous connected vehicles (ACVs), roadside smart sensors (RSSs), vehicular communications, and even drones [1, 2, 3, 4]. To operate autonomously in future ITS, ACVs must process a large volume of data collected via sensors and communication links. Maintaining the reliability of this data is crucial for road safety and smooth traffic flow [5, 6, 7, 8]. However, this reliance on communications and data processing renders ACVs highly susceptible to cyber-physical attacks. In particular, an attacker can interfere with the ACV data processing stage, inject faulty data, and ultimately induce accidents or compromise the road traffic flow [9]. As demonstrated in a real-world experiment on a Jeep Cherokee in [10], ACVs are largely vulnerable to cyber attacks that can control their critical systems, including braking and acceleration. Naturally, by taking control of ACVs, an adversary can not only impact the compromised ACV, but can also reduce the flow of other vehicles and cause non-optimal ITS operation. This, in turn, motivates a holistic study of the joint cyber and physical impacts of attacks on ACV systems.

Recently, a number of security solutions have been proposed for addressing intra-vehicle network and vehicular communication cyber security problems [11, 12, 13, 14, 15, 16, 17, 18, 19]. In [11], the authors showed that long-range wireless attacks on the current security protocols of ACVs can disrupt their controller area network (CAN). Furthermore, the work in [12] proposed a data analytics approach for the intrusion detection problem by applying a hidden Markov model. In [13], the security vulnerabilities of current vehicular communication architectures are identified. The work in [14] proposed the use of multi-source filters to secure a vehicular network against data injection attacks (DIAs). Furthermore, the authors in [15] introduced a new framework to improve the trustworthiness of beacons by combining two physical measurements (angle of arrival and Doppler effect) from received wireless signals. In [16], the authors designed a multi-antenna technique for improving the physical layer security of vehicular millimeter-wave communications. Moreover, in [17], the authors proposed a collaborative control strategy for vehicular platooning to address spoofing and denial of service attacks. The work in [18] developed a deep learning algorithm for authenticating sensor signals. Finally, an overview of current research on advanced intra-vehicle networks and the smart components of ITS is presented in [19].

In addition to cyber security in ITSs, physical safety and optimal control of ACVs have been studied in [20, 21, 22, 23, 24, 25, 26]. In [20], the authors identified the key vulnerabilities of a vehicle’s controller and secured them using intrusion detection algorithms. The work in [21] analyzed ACVs as cyber-physical systems and developed an optimal controller for their motion. The authors in [22] studied the safe operation of ACV networks in the presence of an adversary that tries to estimate the dynamics of ACVs from its own observations. The authors in [23] proposed centralized and decentralized safe cruise control approaches for ACV platoons. A learning-based approach is proposed in [24] to control the velocity of ACVs. Furthermore, in [25], the authors proposed a robust deep reinforcement learning (RL) algorithm which mitigates cyber attacks on ACV sensors and maintains the safety of the ACV system. In [26], the authors studied the essence of secure and safe codesign for ACV systems.

However, despite their importance, the architectures and solutions in [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26] do not take into account the interdependence between the cyber and physical layers of ACVs while designing their security solutions. Moreover, the prior art in [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26] does not provide solutions that can enhance the robustness of ACV motion control to malicious attacks. Nevertheless, designing an optimal and safe ITS requires robustness to attacks on intra-vehicle sensors as well as inter-vehicle communication. In addition, these existing works do not properly model the attacker’s action space and goal (physical disruption in ITS) while providing their security solutions. In this context, accounting for the cyber-physical interdependence of the attacker’s actions and goals will help provide better security solutions. Finally, the existing literature lacks a fundamental analysis of physical attacks as in the Jeep hijacking scenario [10], in which the attacker aims at disrupting the ITS operation by causing non-optimality in a compromised ACV’s speed.

The main contribution of this paper is, thus, a comprehensive study of joint cyber-physical security challenges and solutions in ACV networks which can be summarized as follows:

  • To address both safety and optimality of an ACV system, first an optimal safe controller is proposed so as to maximize the traffic flow and minimize the risk of accidents by optimizing the speed and spacing of ACVs. To the best of our knowledge, this work is the first to analyze the physical attack on an ACV network and to prove that the proposed controller can maximize the stability and robustness of ACV systems against physical attacks such as in the Jeep hijacking scenario [10].

  • To improve the cyber-physical security study of ACV systems, next, new DIA detection approaches are proposed to address cyber attacks on ACV sensors and to analyze the physical impact of DIAs on an ACV system. To efficiently design the DIA detection approaches, ACV sensors are characterized in two subsets based on the availability of a priori information about their readings.

  • For the first subset of sensors, which have a priori information, a DIA detection approach is proposed that derives an optimal threshold level for sensor errors, which enables an ACV to detect DIAs. For the second subset of sensors, which lack a priori information, a novel multi-armed bandit (MAB) algorithm is proposed to learn which sensors are attacked. The proposed MAB algorithm uses the so-called Mahalanobis distance between the sensor data and an a posteriori prediction to calculate a regret value and optimize the ACV’s sensor fusion process by applying an upper confidence bound (UCB) algorithm. The proposed detection approaches maximize both the cyber security and physical robustness of ACV systems against DIAs.

Simulation results show that the proposed optimal safe controller has higher safety, optimality, and robustness against physical attacks compared to other state-of-the-art approaches. In addition, our results show that the proposed DIA detection approaches yield an improved performance compared to Kalman filtering in mitigating the cyber attacks. Therefore, the proposed solutions improve the stability of ACV networks against DIAs.

The rest of the paper is organized as follows. Section II introduces our system model while Section III derives the optimal safe controller. Section IV proves that the proposed optimal safe controller is robust against physical attacks. Section V proposes approaches to mitigate cyber attacks on ACV while reducing the risk of accidents. Finally, simulation results are shown in Section VI and conclusions are drawn in Section VII.

II System Model

II-A ACV Physical Model

Consider an ACV, $i$, that follows a leading ACV, $i-1$, and tries to maintain a spacing $d_i(t)$ from ACV $i-1$ as shown in Fig. 1. Maintaining a spacing between ACVs is important to maximize the traffic flow and minimize the risk of accidents [9]. Let $v_i(t)$ be ACV $i$’s speed in m/s. Then, ACV $i$’s speed deviation can be written as $\dot{v}_i(t) = F_i(t)/m_i = u_i(t)$, where $F_i(t)$ is $i$’s engine force in Newtons (N), $m_i$ is $i$’s mass in kilograms (kg), and $u_i(t)$ is $i$’s physical controller input in N/kg. Moreover, letting $v_{i-1}(t)$ be $i-1$’s speed, the spacing between $i$ and $i-1$ can be written as $\dot{d}_i(t) = v_{i-1}(t) - v_i(t)$. Note that this model can be easily generalized to multiple ACVs by repeating the same set of equations for every pair of ACVs to capture any ACV network as shown in Fig. 1.

Figure 1: Illustration of the considered ACV system model.

Due to the discrete-time sensor readings in ACVs, we convert the aforementioned continuous system model to a discrete one using a linear transformation as follows [27]:

(1)

where is the sampling period of the sensors in seconds. The model can be summarized as:

(2)

where

(3)

To validate the practicality of the proposed system model, we need to show that can control the speed and spacing of , i.e., can take state vector to any desired state . The following remark shows that can control system (2).

Remark 1.

If , then the system in (2) is controllable. To see why, recall that the system in (2) is controllable if the rank of its controllability matrix is 2 (the number of state variables) [27]. Thus, we have:

(4)

Therefore, for any the columns of are linearly independent which implies that the rank of will be .
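The rank argument in Remark 1 can be checked numerically. The following is a minimal sketch, not the paper's own matrices: since equations (1) to (4) are not legible in this copy, it assumes a forward-Euler discretization with state [speed, spacing], sampling period `T`, and hypothetical helper names `acv_model` and `controllability_matrix`.

```python
import numpy as np


def controllability_matrix(A, B):
    """Stack [B, AB, ..., A^(n-1) B] column-wise for an n-state system."""
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)


def acv_model(T):
    """Assumed (not from the source) forward-Euler model, state = [v, d]:
        v(t+1) = v(t) + T * u(t)
        d(t+1) = d(t) + T * (v_lead(t) - v(t))
    The leader speed enters as an exogenous input and does not affect
    controllability with respect to u.
    """
    A = np.array([[1.0, 0.0], [-T, 1.0]])
    B = np.array([[T], [0.0]])
    return A, B


if __name__ == "__main__":
    for T in (0.1, 0.0):
        A, B = acv_model(T)
        rank = np.linalg.matrix_rank(controllability_matrix(A, B))
        print(f"T = {T}: controllability rank = {rank}")
```

Under this assumed model the controllability matrix is [[T, T], [0, -T^2]], whose determinant is -T^3, so the rank drops below 2 only when the sampling period is zero.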

II-B ACV Cyber Model

In order to navigate, as shown in Fig. 1, ACV relies on , , and sensors which measure , , and , respectively. For instance, multiple intra-vehicle inertial measurement units (IMUs) measure , multiple cameras, radars, and LiDAR can measure , and roadside sensors and ACV measure and transmit the measurements to ACV using vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication links. Thus, we model the sensor readings as follows:

(5)

where , , and are sensor vectors with , , and elements which measure , , and , respectively. Also, , , and are vectors with , , and elements equal to . Moreover, as assumed in [27] and [28], , , and are noise vectors that follow a white Gaussian distribution with zero mean and variance vectors , , and . Since the sensor readings in (5) are noisy, we need to optimally estimate , and from , , and by minimizing the estimation error. To find the optimal estimations , , and (the estimations of , and ), we use two types of estimators: a static estimator to estimate the variables at the initial state, , (the time step that starts following ) and a dynamic estimator to estimate and using the state equations in (1). We use a static estimator at the initial state because ACV does not follow the speed of ACV using (1) before and, hence, a dynamic estimator cannot be used due to lack of information about the dynamics of before . Moreover, for , we always use a static estimator since we do not have any information on the dynamics of .

For the static estimator, we define a least-square (LS) cost function for each variable as follows:

(6)
(7)
(8)

where , and are the measurement covariance matrices associated with the sensors that measure , , and , respectively. Moreover, since the sensors measure the three variables independently, their noise terms are uncorrelated and , and are diagonal. Therefore, an explicit solution for the optimal estimation can be derived as:

(9)
(10)
(11)
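As an illustration of the static estimator, the sketch below assumes the standard weighted least-squares setting described above: several sensors read the same scalar quantity with independent zero-mean noise, so the covariance matrix is diagonal and the minimizer reduces to an inverse-variance weighted mean. The function name `static_ls_estimate` and the sample numbers are hypothetical and do not come from the paper.

```python
import numpy as np


def static_ls_estimate(readings, variances):
    """Weighted least-squares estimate of one scalar from redundant sensors.

    Minimizes sum_j (z_j - x)^2 / sigma_j^2, which is the LS cost for a
    diagonal measurement covariance. The closed form is the
    inverse-variance weighted mean of the readings.
    """
    z = np.asarray(readings, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    return float(np.sum(weights * z) / np.sum(weights))


if __name__ == "__main__":
    # Two hypothetical speed sensors: the noisier one gets less weight.
    print(static_ls_estimate([10.0, 12.0], [1.0, 3.0]))
```

With equal variances the estimate is the plain average; a sensor with larger variance is pulled toward zero weight, which is why a single noisy sensor does not dominate the fused value.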

For the dynamic estimator, we use a Kalman filter which uses the state equation in the estimation process. To this end, we define an output equation as follows:

(12)

Note that we cannot apply the dynamic estimator on since we do not have a-priori information about the dynamics of . To dynamically estimate and , we use an a priori estimation derived from the state equations as well as a weighted residual of output error to correct the a priori estimation as follows[29]:

(13)

where is the Kalman gain. By defining an a posteriori error covariance matrix , where , we can find a to minimize . The solution for such can be given by[29]:

(14)
(15)
(16)
(17)
(18)

where is a block diagonal matrix. As can be seen from (15), (16), and (18), the update processes for and are independent of the states and controller. Thus, and will converge to constant matrices, and .
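The observation that the gain and covariance updates do not depend on the states or the controller can be illustrated with a standard Kalman covariance recursion. This is a sketch under assumptions: the state matrix is the forward-Euler model assumed for illustration (state [speed, spacing]), and the process noise `Q`, sensor noise `R`, sensor layout `H`, and initial covariance are made-up values, not the paper's.

```python
import numpy as np


def kalman_gain_sequence(A, H, Q, R, P0, steps):
    """Iterate the standard Kalman covariance recursion and return the gains.

    No state estimate or control input appears in this loop, so the gain
    sequence can be computed offline.
    """
    P = P0.copy()
    gains = []
    identity = np.eye(A.shape[0])
    for _ in range(steps):
        P_prior = A @ P @ A.T + Q                  # a priori covariance
        S = H @ P_prior @ H.T + R                  # residual covariance
        K = P_prior @ H.T @ np.linalg.inv(S)       # Kalman gain
        P = (identity - K @ H) @ P_prior           # a posteriori covariance
        gains.append(K)
    return gains


if __name__ == "__main__":
    T = 0.1                                        # assumed sampling period
    A = np.array([[1.0, 0.0], [-T, 1.0]])          # assumed Euler model
    H = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # 2 speed, 1 spacing sensor
    Q = 0.01 * np.eye(2)                           # assumed process noise
    R = np.diag([0.5, 0.5, 1.0])                   # assumed sensor variances
    gains = kalman_gain_sequence(A, H, Q, R, np.eye(2), steps=200)
    print(np.linalg.norm(gains[-1] - gains[-2]))   # change between last two gains
```

For these assumed values the difference between consecutive gains shrinks toward zero, consistent with the gain settling to a constant matrix.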

For the studied system, we now define the cyber-physical security problems for ACV systems that we will study next. We will address three main problems: 1) What is the optimal safe ACV controller for the system in (2) that minimizes the risk of accidents while maximizing the traffic flow on the roads? 2) Is the proposed optimal safe controller for ACV systems robust and stable against physical attacks? and 3) How to securely fuse the sensor readings to mitigate DIAs on ACVs and minimize the impact of such attacks on the control of ACVs? Addressing these problems is particularly important because the model in (2) identifies the microscopic characteristics of an ACV network and, thus, to achieve large-scale security and safety in ACV networks we must secure every ACV against cyber-physical attacks.

Addressing these three problems requires a comprehensive study of the interdependencies between the cyber and physical characteristics of ACV systems. Such an interdependent cyber-physical study helps to find the vulnerabilities of ACV systems against both cyber and physical attacks. Thus, we can derive an optimal controller that minimizes the risk of accidents and we can design cyber attack detection approaches that not only take into account the cyber characteristics of the ACV system, but also aim at minimizing the likelihood of collisions in ACV networks. Unlike the works in [11, 12, 13, 14, 15, 16, 17, 18, 19], we consider the physical characteristics of ACV systems while developing DIA detection approaches. Moreover, the combined optimal and safe ACV controller design has not been studied previously in [20, 21, 22, 23, 24, 25, 26].

III Optimal Safe ACV Controller

Our first task is thus to derive an optimal safe controller for ACV systems. To analyze ACV ’s optimal control input , we define an optimal safe spacing value as , where is ACV ’s maximum braking deceleration. This value is defined so as to guarantee that if the leading ACV stops suddenly, the following ACV will stop completely before hitting ACV as long as starts braking immediately after observing ’s braking process. This can be captured by the following energy equivalence condition:

(19)
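The energy argument behind the safe spacing can be made concrete with a short calculation. The sketch below assumes the usual reading of such a condition: the follower's kinetic energy, (1/2) m v^2, must be dissipated by the maximum braking force m b over the available spacing, which gives a spacing of v^2 / (2 b). The exact expression in (19) is not legible in this copy, so this form and the name `safe_spacing` are assumptions.

```python
def safe_spacing(speed, max_brake_decel):
    """Stopping distance from an energy balance (assumed form).

    (1/2) * m * v^2 = m * b * o  =>  o = v^2 / (2 * b)
    speed in m/s, max_brake_decel in m/s^2, result in metres.
    """
    if max_brake_decel <= 0:
        raise ValueError("maximum braking deceleration must be positive")
    return speed ** 2 / (2.0 * max_brake_decel)


if __name__ == "__main__":
    # Hypothetical numbers: 20 m/s with 5 m/s^2 of braking.
    print(safe_spacing(20.0, 5.0))
```

Note that the vehicle mass cancels out, so under this assumed form the safe spacing depends only on the follower's speed and braking capability.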

Our goal here is to maintain an optimal safe spacing between ACVs and . Thus, we define a physical regret as the square of difference between the optimal safe spacing and the actual spacing. This regret quantifies the safety and optimality of ACV motion by preventing any collisions and minimizing the spacing between the ACVs and can be written as follows:

(20)

In addition, each ACV only has access to estimates of , , and . Thus, ACV must design an input to minimize an estimate of the physical regret, which is defined as follows:

(21)

This problem is challenging to solve because is an independent parameter and ACV cannot be sure about the future values of . To solve this problem, we consider two scenarios: a) ACV has no prediction about ACV ’s future speed values (one-step ahead controller) and b) ACV has a predictor which can predict ACV ’s future speed value for time steps (-step ahead controller). Such predictors have attracted recent attention in the transportation literature (e.g., see [30] and [31]). Next, we propose an optimal controller for these two cases.

III-A One-step ahead controller

To solve the one-step ahead controller problem, we consider some physical limitations on the speed and the control input. We prohibit the speed from being greater than the free-flow speed of a road, . Moreover, due to the physical capabilities of the vehicle and for maintaining passengers’ comfort, we must have a limitation on the control input and speed deviation. Thus, the optimization problem of the ACV can be written as follows:

(22)
s.t. (23)
(24)
(25)

where and are the minimum and maximum allowable control input and is the maximum allowable change in the controller to yield a comfortable ride.

Theorem 1.

The one-step ahead optimal controller is:

(26)

where and .

Proof.

See Appendix -A. ∎

Theorem 1 derives the optimal controller for the ACV when it only optimizes its action for the next step without considering future actions. In the next subsection, we derive the ACV ’s optimal controller when it considers minimizing the regret for steps ahead.
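To give a feel for a one-step ahead controller of this kind, the sketch below minimizes a one-step regret under box constraints. It is not equation (26), which is not legible in this copy. It assumes a forward-Euler model, a safe spacing of v^2 / (2 b), non-negative speeds, and constraints of the form: speed at most the free-flow speed, input within [u_min, u_max], and input change at most du_max. All names and numbers are hypothetical.

```python
import math


def next_regret(v, d, v_lead, u, T, b):
    """Assumed one-step regret: (safe spacing - actual spacing)^2 at t+1."""
    v_next = v + T * u
    d_next = d + T * (v_lead - v)
    return (v_next ** 2 / (2.0 * b) - d_next) ** 2


def one_step_controller(v, d, v_lead, u_prev, T, b,
                        v_free, u_min, u_max, du_max):
    """Clip the unconstrained minimizer to the feasible input interval.

    Inputs are the ACV's estimates of its speed v, spacing d, and the
    leader speed v_lead. Assumes the feasible interval is non-empty and
    the vehicle keeps moving forward.
    """
    d_next = d + T * (v_lead - v)
    # Speed whose assumed safe spacing equals the next spacing.
    v_target = math.sqrt(max(0.0, 2.0 * b * d_next))
    u_free = (v_target - v) / T
    lower = max(u_min, u_prev - du_max)
    upper = min(u_max, u_prev + du_max, (v_free - v) / T)
    if lower > upper:
        raise ValueError("constraints leave no feasible input")
    return min(max(u_free, lower), upper)


if __name__ == "__main__":
    u = one_step_controller(v=20.0, d=30.0, v_lead=20.0, u_prev=0.0,
                            T=0.1, b=5.0, v_free=30.0,
                            u_min=-5.0, u_max=3.0, du_max=2.0)
    print(u)
```

Clipping is enough here because, for non-negative speeds, the assumed regret decreases up to the unconstrained minimizer and increases after it, so the closest feasible input is the best feasible input.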

III-B -step ahead controller

To find the -step ahead controller, first we define a discount factor which specifies the level of future physical regret for the decision taken at each time step. Thus, by defining the -step ahead total discounted physical regret the controller optimization problem can be written as follows:

(27)

where the conditions in (23), (24), and (25) hold true. Moreover , is the number of future steps which is taken into account in finding the optimal controller, and is the optimal controller at time step .

Theorem 2.

The solution of -step ahead controller is equivalent to the one-step ahead controller.

Proof.

To solve the problem in (27), we use a so-called indirect method. To this end, we start by defining an augmented physical regret using the following state equation:

(28)

where . Then, let the Hamiltonian function be defined as . Thus, we can write (28) as follows:

(29)

Thus, to find critical points (candidate minima) we must solve . First, we start by finding the differential and then we identify the derivatives as follows:

(30)

Thus, to have each of the terms in brackets must be equal to zero:

(31)
(32)
(33)
(34)
(35)

Now, using (33) we will have:

(36)

Moreover, from (34) we will have:

(37)

Since , then from (37), we derive:

(38)

By substituting in (36), we obtain . This process can continue until where we obtain and . Moreover, considering the constraints (23), (24), and (25), we obtain the optimal controller as defined in Theorem 1. Thus, the -step ahead optimal controller is equivalent to the one-step ahead optimal controller. ∎

From Theorem 2, we can observe that, if the ACV minimizes its immediate physical regret, it will also minimize its long-term physical regret. This result shows that the proposed optimal safe controller does not require any information about the future dynamics of ACV , such as the predictions provided in [30] and [31].

IV Physical Attack on ACV Systems

As derived in the previous section, the proposed optimal controller is a function of . This makes ACV vulnerable to a physical attack on ACV . Thus, we now analyze whether an attacker can cause unstable dynamics at ACV by controlling . Consider an adversary who takes control of ACV and tries to cause instability in ACV ’s speed, , and spacing . Using our derived optimal controller, by ensuring is not saturated (), and considering the estimated values to be close to the real values we will have:

(39)

where . The system in (39) is designed such that ACV will always maintain an optimal safe spacing with . However, from (39) we can see that the behavior of the system is a function of . Thus, next, we will analyze the physical attack scenario that is analogous to the Jeep hijacking case in [10].

IV-A Stability Analysis

To analyze the stability of (39), first, we define some useful concepts.

Definition 1.

[28] is said to be an equilibrium point for the system in (39) and a constant input , if .

An equilibrium indicates a point at which the states will not change. From Definition 1 we can derive the equilibrium point for (39) by considering a constant input and solving the following set of equations:

(40)

which results in: The derived value for shows that in order to reach an equilibrium, ACV must maintain the optimal safe spacing from ACV and its speed must equal . Thus, next, we show that our derived optimal controller is robust, i.e., under our controller, if an adversary hijacks ACV , it cannot cause instability in and .

Definition 2.

A system is called asymptotically stable around its equilibrium point if it satisfies the following two conditions[28]: 1) Given any and , such that if , then , and 2) such that if , then as .

The first condition requires the state trajectory to be confined to an arbitrarily small “ball” centered at the equilibrium point and of radius , when released from an arbitrary initial condition in a ball of sufficiently small (but positive) radius . This is called stability in the Lyapunov sense[28, 32]. It is possible to have Lyapunov stability without having asymptotic stability.

Next, we show how a Lyapunov function can help to analyze the stability of system (39)[28]. From [28], we know that if there exists a Lyapunov function for system (39), then is a stable equilibrium point in the sense of Lyapunov. In addition, if then is an asymptotically stable equilibrium point. We can prove that the system in (39) is asymptotically stable for the equilibrium point , as follows.

Proposition 1.

is a stable equilibrium point in the Lyapunov sense.

Proof.

Let . Then we will have . Moreover, we can show that . Now, to check if is a Lyapunov function we have:

(41)

Thus, is a stable equilibrium point in the sense of Lyapunov. ∎

From Proposition 1, we can see that, as long as follows using our proposed controller, its speed and spacing from will stay stable and will not be affected by the physical attack on . This shows that our proposed controller not only maximizes safety and optimality on ITS roads, but is also robust to physical attacks such as in the Jeep scenario [10].

However, as can be seen from Theorems 1 and 2, even though it is robust to physical attacks, the derived optimal controller is largely dependent on the estimated values , , and . Thus, an adversary can manipulate the sensor data to inject error into the estimation and ultimately increase ACV ’s physical regret. Analyzing such attacks requires a cyber-physical study of the ACV system to derive approaches that mitigate attacks on sensors and minimize the effect of such attacks on the physical regret. Thus, next, we analyze the cyber attack on the ACV system.

V Cyber Attack on ACV Systems

We now consider a cyber attacker that injects faulty data to sensor readings (cameras, LiDARs, radars, IMUs, and roadside sensors) such that the attacked sensor vector can be written as:

(42)

where , , and are data injection vectors on sensors and , , and are compromised sensor readings from , , and , respectively. As we discussed in the state estimation section, we have a priori information about sensors which measure or , but such information is lacking for sensors that collect data from . Thus, next, we consider cyber attacks on sensors a) with a priori information and b) without a priori information.

V-A Attack on sensors with a priori information

As discussed in Subsection II-B, we use Kalman filtering to estimate and . However, Kalman filtering is not robust to DIAs [33]. Thus, we propose a filtering mechanism that can limit the effect of the DIA on or . To this end, we use the a priori estimation at each time step to find an a priori sensor reading . Then, the attack detection filter checks the absolute value of the residual , where is the threshold vector. Any sensor that violates this inequality is considered compromised and is excluded from the Kalman filter update procedure. To find an optimal value for the threshold , we next characterize the stochastic behavior of the residual when the ACV is not under attack.
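The residual check described above can be sketched as a mask over the sensor vector. This is an illustrative sketch, assuming the test is an element-wise comparison of the absolute residual against a per-sensor threshold; the names `flag_compromised` and `trusted_measurements`, and the sample values, are hypothetical.

```python
import numpy as np


def flag_compromised(z, z_prior, eta):
    """Return a boolean mask: True where |z - z_prior| exceeds the threshold."""
    residual = np.abs(np.asarray(z, dtype=float) - np.asarray(z_prior, dtype=float))
    return residual > np.asarray(eta, dtype=float)


def trusted_measurements(z, H, R, z_prior, eta):
    """Drop flagged sensors so only trusted rows reach the Kalman update.

    z: sensor readings, H: output matrix, R: measurement covariance,
    z_prior: readings predicted from the a priori state estimate.
    """
    keep = ~flag_compromised(z, z_prior, eta)
    z = np.asarray(z, dtype=float)
    return z[keep], H[keep, :], R[np.ix_(keep, keep)]


if __name__ == "__main__":
    z = np.array([20.1, 25.0, 19.8])       # middle reading looks injected
    z_prior = np.array([20.0, 20.0, 20.0])
    eta = np.array([1.0, 1.0, 1.0])
    print(flag_compromised(z, z_prior, eta))
```

Removing the matching rows of the output matrix and the matching rows and columns of the covariance keeps the reduced update dimensionally consistent, at the cost of a temporarily less informative measurement.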

Theorem 3.

The residual follows a Gaussian distribution with zero mean and covariance matrix as follows:

(43)

and is the solution of the following discrete Riccati equation where and

Proof.

See Appendix -B. ∎

Theorem 3 derives the distribution of which we will use next to find an optimal value for the threshold level . Fig. 2 and Fig. 3 show a comparison between simulation and analytical results derived for the mean and covariance matrix of , , and . From Figs. 2 and 3 we can see that the analytical results match the simulation results, which validates Theorem 3.

Figure 2: Analytical and simulation results for the cumulative distribution function of and .
Figure 3: Analytical and simulation results for the mean and variance of .

We can now find the probability with which (element of ) remains below the threshold value as where is the cumulative distribution function of which follows a Gaussian distribution with zero mean and variance (element in -th row and -th column of ). Thus, we can derive the optimal value for by defining for every sensor. For instance, choosing values or will result in or . Even with the threshold value defined, the attacker might stay stealthy in some cases if it controls the amount of data injected into the sensors. Next, we find a relationship between the maximum value of DIA and the probability of staying stealthy.
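The link between a threshold and the probability of an attack-free residual staying below it can be computed directly for a zero-mean Gaussian. The sketch below uses the standard identity P(|r| <= eta) = erf(eta / (sigma * sqrt(2))) and its inverse. The specific threshold values used in the paper are not legible in this copy, so the two- and three-standard-deviation thresholds shown are only familiar illustrations, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def prob_below_threshold(eta, sigma):
    """P(|r| <= eta) for a residual r ~ N(0, sigma^2)."""
    return math.erf(eta / (sigma * math.sqrt(2.0)))


def threshold_for_probability(p, sigma):
    """Smallest eta with P(|r| <= eta) = p for r ~ N(0, sigma^2)."""
    return sigma * NormalDist().inv_cdf((1.0 + p) / 2.0)


if __name__ == "__main__":
    sigma = 1.0
    for k in (2.0, 3.0):
        print(k, round(prob_below_threshold(k * sigma, sigma), 4))
    print(round(threshold_for_probability(0.95, sigma), 3))
```

A larger threshold lowers the false-alarm rate on clean sensors but also leaves more room for a small injected bias to pass unnoticed, which is the trade-off the stealthy-attack analysis quantifies.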

Proposition 2.

The probability with which an attack vector will not trigger the -th element of attack detection filter, , (stealthy attack) will be given by:

(44)

where </