Evaluation of the Architecture Alternatives for Real-time Intrusion Detection Systems for Connected Vehicles

01/18/2022
by   Mubark B Jedh, et al.

Attackers have demonstrated the use of remote access to the in-vehicle network of connected vehicles to launch cyber-attacks and remotely take control of these vehicles. Machine-learning-based Intrusion Detection System (IDS) techniques have been proposed for the detection of such attacks. Evaluations of some of these IDSs demonstrated their accuracy in detecting message injections but were performed offline, which limits the confidence in their use for real-time protection scenarios. This paper evaluates four architecture designs for real-time IDS for connected vehicles using Controller Area Network (CAN) datasets collected from a moving vehicle under malicious speed reading message injections. The evaluation shows that a real-time IDS for a connected vehicle designed as two processes, one for CAN Bus monitoring and another for the anomaly detection engine, is reliable (no loss of messages) and could be used for real-time resilience mechanisms as a response to cyber-attacks.


I Introduction

A modern automobile contains 20 to 80 Electronic Control Units (ECUs) that control the functionalities of the vehicle, including the engine, power steering, brakes, air-conditioning, radio, seats, lane departure alert, and emergency braking, as depicted by Figure 1. These ECUs communicate using the Controller Area Network (CAN) bus protocol [bosh1991]. The CAN protocol was designed in the 1980s and is still widely used in automotive, aerospace, and many other industries because of its low cost, error detection capability, reliability, and high-speed transmission rate. The CAN protocol does not, however, include security measures such as authentication [OWMW2015].

Fig. 1: Example of connected vehicle [Othmane2015]

(a) Increase of cyber-attacks on connected vehicles. (b) Impacts of cyber-attacks on connected vehicles.

Fig. 2: Growth and distribution of cyber-attacks on connected vehicles between 2010 and 2019 [UpstreamAuto2020].

Several recent research works demonstrate that attackers can access the CAN Bus through a variety of interfaces, such as TPMS, Bluetooth, telematics, and OBD-II units, to inject messages into the CAN Bus. Hoppe et al. [10.1007/978-3-540-87698-4_21] were the first researchers to point out the security weaknesses of the CAN Bus protocol. Their findings were later confirmed by Koscher et al. [5504804]. Furthermore, Checkoway et al. [10.5555/2028067.2028073] performed a security analysis of attack surfaces, including physical, short-range, and long-range wireless communication, and demonstrated the exploitation of the flaws that they identified to take full control of the vehicle's systems. Recently, Upstream's research team identified 367 publicly reported incidents over a decade [UpstreamAuto2020]. The analysis of these incidents shows an exponential growth of attacks, as depicted by Figure 2. Among these attacks, 27% involved taking control of the car, as depicted by Figure 2.

Intrusion Detection Systems (IDSs) have been proposed as an alternative to attack prevention approaches for connected vehicles. Wu et al. [8688625] and Young et al. [8640808] provide comprehensive surveys on IDS for connected and autonomous cars. In previous work [9076852, 9490207], we developed Machine Learning (ML)-based IDSs for connected vehicles and evaluated them using CAN data extracted from a moving vehicle under malicious injections of RPM and speed reading messages into its in-vehicle network. Most ML-based IDSs for connected vehicles, including the ones that we designed, are evaluated offline using datasets of CAN logs, which limits the confidence in their use for real-time intrusion detection in vehicles.

| ID | Requirement |
|----|-------------|
| 1 | The response time of the IDS must be small enough to trigger reactive safety mechanisms, such as braking. |
| 2 | The IDS must not lose CAN data. |
| 3 | The IDS must run on an ECU that has limited capabilities in terms of processing speed and memory size. |
TABLE I: Major requirements for IDS for connected vehicles.

There are three requirements for IDSs for connected vehicles, which are listed in Table I. We describe in this paper four architecture alternatives for real-time IDS for connected vehicles. (Footnote 1: We focus on CAN message injection attacks. Other attacks on connected vehicles, e.g., on V2V, could be considered in the future.) Then, we assess how well they satisfy the requirements by evaluating their (1) anomaly evaluation time and (2) reliability in terms of losing CAN messages. The findings demonstrate that it is possible to deploy effective ML-based IDSs for connected vehicles.

This paper is organized as follows. Section II describes the related work, Section III describes the IDS used, Section IV describes the architecture alternatives of the real-time IDS for connected vehicles, Section V describes the evaluation of the architecture alternatives, and Section VI concludes the paper.

II Related work

Several preventive security countermeasures have been developed to defend and enhance in-vehicle network security against cyber-attacks, such as authentication protocols [8590911, 8939382, 7934878]. The main issue with these mechanisms is that they address only a subset of the attacks on connected vehicles and require modification of the protocol used by the ECUs of the given vehicle, which makes them unusable for aftermarket vehicles. In addition, most remote attacks exploit software vulnerabilities in the protection mechanisms, such as in [MiVa2015, hacking-tesla, BMW, Tesla_key].

IDSs have been proposed as an alternative to mechanisms that prevent attacks on connected vehicles. Wu et al. [8688625] and Young et al. [8640808] provide surveys on IDS for connected and autonomous cars. These mechanisms discriminate between groups of messages associated with attacks and those that are not, with acceptable accuracy and false positive rates [10.1371/journal.pone.0155781]. Neural Networks (NNs) have been the most commonly used ML-based approach for designing IDSs for the CAN bus, e.g., [Lokman2019, 8687274, 9262960, SGL2019]. In previous work [9076852, 9490207], we used machine learning techniques to develop IDSs for connected vehicles, including Hidden Markov Model (HMM), Long Short-Term Memory (LSTM), cosine graph-similarity, and change-point detection, and evaluated them using CAN data extracted from a moving vehicle under malicious injections of RPM and speed reading messages into its in-vehicle network. The cosine graph-similarity threshold technique reaches a detection accuracy of 97.32% and a detection speed of 2.5 milliseconds. Unlike most ML-based studies in the literature, the technique proposed in [9490207] depends neither on the make or model of the car nor on its proprietary information (i.e., CAN ID).

The common data sources used to assess the accuracy of ML-based IDS solutions include data from the owner devices [ChSh2016], synthetic/artificial data [TLJ20168], simulated data [LMKA2017], and data from a stationary/parked vehicle [SHH2018, 9235336, IRYM2020]. This limits the confidence in the results and threatens their validity [CROT2016]. To the best of our knowledge, our previous work [9076852, 9490207] and the work of Stachowski [SGL2019] are the only studies that used datasets collected from moving vehicles under message injection attacks.

Fig. 3: Similarity of two messages-sequence graphs at successive time-windows t (left) and t+1 (right). The labels of the nodes are the CAN IDs, and the label of an edge is the number of times a message with the CAN ID at the source of the edge is followed by a message with the CAN ID at the destination of the same edge during the time-window. For example, 33 messages and 56 messages with ID 344 followed messages with ID 342 at time-windows t and t+1, respectively, which indicates a possible change in the behavior of the vehicle [9490207].

Valasek and Miller are among the pioneers in proposing real-time IDS for connected vehicles [Chris&Miller]. They developed a small device that reads data from the CAN Bus through the OBD-II port, learns the traffic pattern to detect anomalies, and short-circuits the bus, disabling all CAN messages, when an anomaly is detected. Matsumoto et al. [6240294] proposed a real-time IDS that prevents unauthorized messages from reaching the receiver ECU. The system monitors the traffic of the CAN Bus and transmits an Error Frame to override unauthorized messages when it detects them. The technique requires, however, modification of the CAN Bus protocol.

Boddupalli and Ray assessed the requirements for real-time attack detection and mitigation in connected and autonomous vehicles and emphasized the importance of the basic safety of such a mechanism [SrSa2020, BSR2021]. They trained an NN-based IDS and proposed an architecture that addresses the requirements for the case of collision avoidance using vehicle-to-vehicle mechanisms. The main components of the architecture are: (1) a predictor of abnormal behavior that uses the trained IDS, (2) machine learning models for computing the application decision, trained from driving a vehicle in different weather conditions (windy, raining, snowing, and clear) and on different road types (city, suburban, and highway), which are used to estimate the response of the module, and (3) a plausibility check module that checks the safety of using the output of the response estimator. When the system detects an anomaly, it replaces the response computed by the collision application with the output of the response estimator if the plausibility check is positive; that is, the estimated response is safe. The system triggers service degradation if it detects an anomaly and the plausibility check of the estimated response is negative.

III Intrusion Detection System

ECUs collaborate to perform tasks such as increasing speed, braking, etc., by sending messages through the CAN bus [9076852]. Attackers take control of a connected vehicle by injecting messages into its CAN bus. To mitigate such attacks, we have developed an ML-based IDS that captures the pattern of the sequences of CAN messages and represents them with a directed graph, which we call a Messages-Sequence Graph (MSG), where the nodes represent the CAN IDs of the messages and the edges represent the sequence order of the messages, as depicted by Figure 3 [9490207].

To construct the MSG, we first create a dictionary of the CAN IDs exchanged in the CAN bus. Then, we label the nodes of the MSG with the CAN IDs and the edges with the value "0". Next, we loop over a batch of CAN messages of a fixed size (e.g., 1000 successive messages) that were exchanged from time t. For each of the messages, we increase the label of the edge that goes from the node representing the CAN ID of the previously processed message to the node representing the CAN ID of the current message. Equation 1 represents the distribution of the messages-sequences at time t.

$$D_t = \{\, e^{t}_{i,j} \mid v_i, v_j \in V \,\} \qquad (1)$$

where $v_i$ is for node $i$ and $e^{t}_{i,j}$ is for the edge from node $i$ to node $j$.
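The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from [9490207]: the function name `build_msg`, the use of a `Counter` keyed by (previous ID, current ID) pairs, and the sample CAN IDs are all assumptions made for the example. Edges that never occur implicitly keep the label 0.

```python
from collections import Counter

def build_msg(can_ids):
    """Build a messages-sequence graph (MSG) from one batch of CAN IDs.

    The graph is stored as a Counter that maps a directed edge
    (previous_id, current_id) to the number of times a message with
    current_id directly followed a message with previous_id.
    Missing edges read as 0, which matches the initial edge label.
    """
    edges = Counter()
    for previous_id, current_id in zip(can_ids, can_ids[1:]):
        edges[(previous_id, current_id)] += 1
    return edges

# Hypothetical batch of six messages (a real batch would hold e.g. 1000).
batch = ["342", "344", "342", "344", "1A0", "342"]
msg = build_msg(batch)
for (src, dst), count in sorted(msg.items()):
    print(f"{src} -> {dst}: {count}")
```

A batch of k messages always yields k-1 edge increments, which is a quick sanity check on any implementation of this step.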

We consider that an MSG representing the messages exchanged in a CAN bus at time t is similar to the MSG representing the CAN messages exchanged during the following time slot t+1 in the case of normal driving behavior, and that injection of messages into the CAN bus disrupts this pattern (see Figure 3 [9490207]). Equation 2 formulates the similarity concept for the IDS. That is, the similarity Sim at time t+1 is the similarity of the distributions of the messages-sequences at time t and at time t+1.

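$$Sim_{t+1} = \mathrm{sim}(D_t, D_{t+1}) \qquad (2)$$

where $D_t$ denotes the distribution of the messages-sequences at time $t$.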

We use cosine similarity to measure the similarity between the MSGs of two successive time steps t and t+1. The metric measures the cosine of the angle between two vectors, where the closer the value is to 1, the more similar the two vectors are. Then, we use a similarity threshold technique to identify CAN message injections. The technique provides an accuracy of 97.32%; see Ref. [9490207] for details about the construction and accuracy of the technique.
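As an illustration of the similarity threshold idea, the sketch below compares two MSGs stored as dictionaries of edge counts. It is a hedged example rather than the code of [9490207]: the function names, the edge-count dictionary representation, the sample edge counts, and in particular the threshold value of 0.9 are assumptions chosen for the example.

```python
import math

def cosine_similarity(msg_a, msg_b):
    """Cosine similarity of two MSGs given as {(src_id, dst_id): count}."""
    edges = set(msg_a) | set(msg_b)
    dot = sum(msg_a.get(e, 0) * msg_b.get(e, 0) for e in edges)
    norm_a = math.sqrt(sum(v * v for v in msg_a.values()))
    norm_b = math.sqrt(sum(v * v for v in msg_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty graph is treated as dissimilar to anything
    return dot / (norm_a * norm_b)

def is_anomalous(msg_prev, msg_cur, threshold=0.9):
    """Flag the current window when similarity drops below the threshold.

    The threshold of 0.9 is a placeholder, not a value from the paper.
    """
    return cosine_similarity(msg_prev, msg_cur) < threshold

# Hypothetical edge counts for two successive time windows.
window_t = {("342", "344"): 33, ("344", "342"): 30}
window_t1 = {("342", "344"): 56, ("344", "342"): 30}
injected = {("7FF", "7FF"): 400}

print(cosine_similarity(window_t, window_t1))
print(is_anomalous(window_t, injected))
```

Because edge counts are never negative, the similarity stays between 0 and 1, so a single threshold is enough to separate "similar" from "dissimilar" windows.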

Fig. 4: High-level architecture of the system.

We discuss in the following sections the architecture alternatives for monitoring the CAN Bus and using the similarity threshold technique to identify in real-time malicious injections of CAN messages.

| Constraint | Value |
|------------|-------|
| Recommended maximum rate of injection of CAN messages | 1908/sec |
| Rate of injection of CAN messages | 1000/sec [9076852] |
| Reaction time constraint | 2.5 sec |
| Detection speed of 1000 messages using the similarity threshold technique | 0.094 seconds |

  • The CAN bus is designed for a maximum signaling rate of 1 Mbit/s [TeIn2016], but 250 kbit/s is the rate recommended by the Society of Automotive Engineers (SAE) in the J1939 standards [J1939], which transports up to 1908 CAN frames per second: 1908 = 250000/(128+3), where the data payload is 8 bytes.

  • The brake reaction time is less than 2.5 seconds for 90% of drivers [breakreact].

  • We postulate that the IDS needs to process the CAN messages batch in less than 2.5 seconds, which is the upper bound of the brake reaction time [breakreact].

TABLE II: IDS architecture constraints.
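The arithmetic behind Table II can be checked with a short script. The constants come from the table and its notes; the variable names and the simple "batch fill time plus detection time" budget are assumptions made for this illustration.

```python
# Values taken from Table II and its notes.
BIT_RATE = 250_000        # bits per second, SAE J1939 recommended rate
FRAME_BITS = 128 + 3      # bits per CAN frame with an 8-byte payload
INJECTION_RATE = 1000     # CAN messages per second used in the datasets
BATCH_SIZE = 1000         # messages per batch
DETECTION_TIME_S = 0.094  # time to evaluate one batch
REACTION_LIMIT_S = 2.5    # upper bound of the brake reaction time

# Maximum number of whole frames the bus can carry per second.
max_frames_per_sec = BIT_RATE // FRAME_BITS

# Assumed budget: time to collect one batch plus time to evaluate it.
batch_fill_time_s = BATCH_SIZE / INJECTION_RATE
budget_s = batch_fill_time_s + DETECTION_TIME_S

print(max_frames_per_sec)
print(budget_s < REACTION_LIMIT_S)
```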

IV Real-time architecture alternatives of IDS for Connected Vehicles

In a typical IDS environment, a set of ECUs inject CAN messages into the CAN bus, a CAN Bus monitor captures the messages exchanged in the bus, and the anomaly detection engine analyzes these messages to identify malicious message injections, as depicted in Figure 4. The CAN bus monitor reads the messages available in the CAN bus continuously and sends them to the anomaly detection engine in batches of, e.g., 1000 messages. The anomaly detection engine constructs a MSG from the received batch of messages, applies the cosine similarity threshold attack detection technique, and outputs the results of the evaluation.

The IDS must address the three main requirements listed in Table I. First, the response time of the IDS must be less than the expected brake reaction time; satisfying this requirement makes the IDS a good candidate for, e.g., a safety resilience mechanism that activates the brakes when a cyber-attack is detected. Second, the loss of CAN messages is not allowed, which is important for the reliability of the IDS. Third, the IDS must run on a cheap ECU that has limited capabilities in terms of processor speed and memory size. In addition, Table II enumerates a set of constraints that the architecture shall satisfy. Since the goal of this paper is to assess architectures of real-time IDS, we use the existing anomaly detection module developed in previous work [9490207] and do not create new ones.

| Architecture scenario | Concurrency technique of the CAN Bus monitor | Concurrency technique of the anomaly detection agent | Use of a queue |
|-----------------------|----------------------------------------------|------------------------------------------------------|----------------|
| Scenario 1 - The two components run in a single process | main process | main process | no |
| Scenario 2 - The two components run in a single process that includes one sub-process | main process | child-process | no |
| Scenario 3 - The two components run in a single process with two threads | thread | thread | yes |
| Scenario 4 - The two components run in two processes | main process | main process | yes |
TABLE III: Practical concurrency scenarios of the IDS main components.
Fig. 5: IDS flowchart diagram.
Fig. 6: Flowchart diagram of the IDS architecture of scenario 2.

To satisfy the first requirement, we implement the CAN Bus monitor and the anomaly detection engine in the C language. The anomaly detection engine uses the Python/C API (PyObject) to call the data analysis module, passing it the CAN message batches as a parameter and parsing the analysis result.

Typically, each ECU services each message it reads from the CAN bus (processes or ignores it) at a rate higher than the message sending rate, to avoid loss of messages. In contrast, the IDS techniques process the CAN messages in batches, which takes much longer than one millisecond. Thus, the IDS would lose data if the CAN bus monitor and anomaly detection engine operate sequentially, as depicted in Figure 5, which violates the second requirement that we set for our real-time IDS: loss of CAN messages is not allowed. This problem could potentially be addressed by running the CAN Bus monitor and anomaly detection engine concurrently. The three concurrency techniques that could be used are separate processes, sub-processes, and threads. Table III shows the four potential architecture scenarios for applying concurrency techniques to the CAN Bus monitor and the anomaly detection engine components to address the second and third requirements discussed above. We discuss each of the architecture scenarios in the following.

Scenario 1 - The two components run in a single process. In this scenario, the CAN Bus monitor and anomaly detection engine run sequentially, as depicted by Figure 5. The CAN Bus monitor can potentially lose CAN messages that the ECUs send while the IDS is busy analyzing the batch of CAN messages it has received to identify potential attacks.

Scenario 2 - The two components run in a single process that includes one sub-process. The CAN bus monitor and the anomaly detection agent run in a single process. As depicted by Figure 6, the main process continuously reads the CAN messages from the CAN bus and, when there are enough messages for a batch, creates a sub-process that evaluates the batch for attacks. Note that the sub-processes become zombies at the end of their execution, and it is complicated to shut them down from the main process.

Fig. 7: Block diagram of the IDS architecture of scenario 3.

Scenario 3 - The two components run in a single process with two threads. The CAN bus monitor and the anomaly detection agent run in a single process but in separate threads, as depicted by Figure 7. We use a queue to pass data between the two components to prevent losses of CAN messages. The anomaly detection engine uses an Inter-Process Communication (IPC) technique to execute the data analysis module, which implements the technique discussed in Section III.
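The two-thread design with a queue can be sketched as follows. The paper's components are written in C; this Python version only illustrates the producer/consumer structure. The batch size of 4, the `None` sentinel, the function names, and the placeholder analysis (recording the batch length) are assumptions for the example.

```python
import queue
import threading

BATCH_SIZE = 4   # small value for illustration; the paper uses e.g. 1000
SENTINEL = None  # marks the end of the message stream

def monitor(source, out_queue):
    """Read CAN IDs from `source` and enqueue them in full batches."""
    batch = []
    for can_id in source:
        batch.append(can_id)
        if len(batch) == BATCH_SIZE:
            out_queue.put(batch)
            batch = []
    out_queue.put(SENTINEL)  # a trailing partial batch is dropped here

def detector(in_queue, results):
    """Dequeue batches and evaluate them (placeholder analysis)."""
    while True:
        batch = in_queue.get()
        if batch is SENTINEL:
            break
        results.append(len(batch))  # stand-in for the anomaly evaluation

def run(source):
    """Run the monitor and detector in two threads joined by a queue."""
    batches = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=monitor, args=(source, batches)),
        threading.Thread(target=detector, args=(batches, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

print(run(list(range(12))))
```

The queue decouples the two components: the monitor never waits for an evaluation to finish, so messages that arrive during an evaluation are buffered rather than dropped.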

For completeness, there are two other variants of using threads with a single process besides the one discussed above, which are: (1) implementing the CAN bus monitor in a thread and the anomaly detection engine in the main process and (2) implementing the anomaly detection engine in a thread and the CAN bus monitor in the main process. We do not report the evaluation of these variants because they use the same concurrency technique and do not outperform the two threads variant.

Fig. 8: Block diagram of the IDS architecture of scenario 4.

Scenario 4 - The two components run in two processes. The CAN bus monitor and the anomaly detection agent run in separate independent processes, as depicted by Figure 8. We also use a queue to pass data between the two components to prevent losses of CAN messages.
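A rough Python analogue of the two-process design is shown below. It is a sketch under several assumptions: the detector is started as a child of the monitor process so that the example is self-contained, the "fork" start method is requested explicitly (available on Linux, which the Raspberry Pi runs, but not on Windows), and the names, batch size, and placeholder analysis are hypothetical.

```python
import multiprocessing as mp

BATCH_SIZE = 4  # small value for illustration; the paper uses e.g. 1000

def monitor(source, out_queue):
    """Read CAN IDs from `source` and enqueue them in full batches."""
    batch = []
    for can_id in source:
        batch.append(can_id)
        if len(batch) == BATCH_SIZE:
            out_queue.put(batch)
            batch = []
    out_queue.put(None)  # end-of-stream marker

def detector(in_queue, result_queue):
    """Runs in its own process: evaluate batches until the marker arrives."""
    while True:
        batch = in_queue.get()
        if batch is None:
            break
        result_queue.put(len(batch))  # stand-in for the anomaly evaluation
    result_queue.put(None)

def run(source):
    """Monitor in this process, detector in a second process."""
    ctx = mp.get_context("fork")  # assumption: a POSIX system with fork
    batches = ctx.Queue()
    results = ctx.Queue()
    worker = ctx.Process(target=detector, args=(batches, results))
    worker.start()
    monitor(source, batches)
    collected = []
    while True:
        item = results.get()
        if item is None:
            break
        collected.append(item)
    worker.join()
    return collected

if __name__ == "__main__":
    print(run(list(range(12))))
```

Compared with threads, separate processes have their own address spaces, so the queue is the only coupling between the monitor and the detector.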

V Evaluation of the four real-time IDS architecture scenarios

This section describes the evaluation environment and datasets, compares the anomaly evaluation times of the four architecture scenarios, and analyses the impact of CAN message injection speed on the message loss ratio of the four architecture scenarios.

V-A Evaluation environment and datasets

We implemented the four architecture scenarios and deployed them to a Raspberry Pi that runs Raspbian 10, with a quad-core 1.2 GHz processor and 1 GB of memory. Figure 9 shows the evaluation environment of the architecture scenarios. The environment uses an ECUs emulator that simulates car ECUs and periodically sends messages into a Linux virtual CAN bus.

Fig. 9: Block diagram of the evaluation environment of the real-time IDS architecture scenarios.
| No | Description | # of CAN messages |
|----|-------------|-------------------|
| 1 | CAN Data for no injection of fabricated messages | 23,963 |
| 2 | CAN Data with injection of "FFF" as the speed reading | 88,492 |
| 3 | CAN Data with injection of "FFFF" as the RPM reading | 30,308 |
TABLE IV: CAN messages datasets
| | Normal messages | Messages with injection of speed readings | Messages with injection of RPM readings |
|---|---|---|---|
| Time to send 1000 CAN messages in seconds | | | |
| Min | 0.946 | 0.925 | 1.015 |
| Max | 1.120 | 1.217 | 1.056 |
| Average | 0.992 | 1.008 | 1.023 |
| Time to evaluate 1000 CAN messages in seconds | | | |
| Min | 0.115 | 0.116 | 0.131 |
| Max | 0.271 | 0.274 | 0.180 |
| Average | 0.154 | 0.143 | 0.151 |
TABLE V: Speed of simulating injection of CAN messages and evaluating them for attacks using architecture scenario 1.

In this evaluation, we use datasets [othmane2020b] of CAN bus messages for (1) normal driving behavior, (2) injection of fabricated speed reading messages onto the CAN bus, and (3) injection of fabricated RPM reading messages onto the CAN bus, all collected from a 2017 Ford Transit 500 in motion [9076852]. Table IV lists the datasets and the number of CAN messages in each of them.

The ECUs emulator reads the CAN messages stored in the dataset files and sends them through the virtual CAN bus. The messages are processed by the CAN bus monitor and anomaly detection engine in the four architecture scenarios. Table V provides the time that the ECUs emulator takes to send 1000 messages into the CAN Bus and the time that the IDS takes to evaluate 1000 CAN messages for the normal messages dataset, the dataset with injection of speed readings, and the dataset with injection of RPM readings. We do not observe a big difference between processing a batch of normal CAN messages, messages with the injection of speed readings, and messages with the injection of RPM readings. We observe that the IDS takes on average 149 milliseconds to evaluate a batch of 1000 messages, while the time to send 1000 messages into the CAN bus is about 1.007 seconds. The IDS would lose about 149 CAN messages from each batch; ignoring 14.9% of the messages can impact the attack detection rate.
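The loss estimate above can be reproduced from the averages in Table V. The sketch assumes the nominal rate of 1000 messages per second (one message per millisecond) and that every message sent during an evaluation is lost in the sequential design; the variable names are illustrative.

```python
# Average evaluation times per 1000-message batch from Table V (seconds).
eval_times_s = [0.154, 0.143, 0.151]
NOMINAL_RATE = 1000  # messages per second, i.e. one message per millisecond
BATCH_SIZE = 1000

avg_eval_s = sum(eval_times_s) / len(eval_times_s)

# Messages that arrive while the single process is busy evaluating a batch.
lost_per_batch = avg_eval_s * NOMINAL_RATE
loss_pct = 100 * lost_per_batch / BATCH_SIZE

print(round(avg_eval_s * 1000), "ms average evaluation time")
print(round(lost_per_batch), "messages lost per batch")
print(round(loss_pct, 1), "% of a batch")
```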

Fig. 10: Anomaly evaluation time of the four architecture scenarios.
| Architecture | Average time of sending 1000 CAN messages | Average evaluation time | Response time |
|--------------|-------------------------------------------|-------------------------|---------------|
| Scenario 1 - Single process with no threads | 998 ms | 152 ms | 1.15 sec |
| Scenario 2 - Single process with one sub-process | 944 ms | 865 ms | 1.809 sec |
| Scenario 3 - Single process with two threads | 950 ms | 90 ms | 1.04 sec |
| Scenario 4 - Two processes | 945 ms | 81 ms | 1.026 sec |
TABLE VI: Anomaly evaluation times and response times for the four architecture scenarios.

V-B Analysis of the anomaly evaluation time of the four architecture scenarios

We configured the ECUs emulator to send the dataset with injection of "FFF" as the speed reading (dataset 2 in Table IV) through the virtual CAN Bus at a rate of about 1000 messages per second. Table VI shows the average time of sending a batch of CAN messages, the average evaluation time, and the average response time (the time from collecting the first CAN message of the batch to outputting the result of the evaluation of the batch) of the four IDS architecture scenarios. We consider the anomaly evaluation time to be the difference between the end of evaluating the message batch for attacks and the end of reading 1000 CAN messages from the virtual CAN Bus. The data shows that the anomaly evaluation time is below the batch sending time in all four scenarios. Figure 10 shows the anomaly evaluation time of the four selected real-time IDS architecture scenarios. The diagram shows that the anomaly evaluation times of a real-time IDS designed as a single process, as a single process with two threads, and as two processes are below the brake reaction time (0.5 to 2.5 seconds). In addition, the anomaly evaluation times of the IDS designed as a single process with two threads and as two processes are very close.

We conclude that the response time of a real-time IDS designed as a single process with no threads, as a single process with two threads, or as two processes is below the brake reaction time (assuming that the rate of sending messages through the CAN Bus is about 1000 messages/second), which makes these designs good candidates for real-time IDS for connected vehicles.
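The response times in Table VI are the sum of the batch sending time and the evaluation time, which can be verified directly. The dictionary layout and names below are assumptions for the example; the numbers are those of Table VI, and the limit is the 2.5-second upper bound from Table II.

```python
BRAKE_REACTION_LIMIT_MS = 2500  # upper bound of the brake reaction time

# (average sending time, average evaluation time) in milliseconds, Table VI.
table_vi = {
    "Scenario 1": (998, 152),
    "Scenario 2": (944, 865),
    "Scenario 3": (950, 90),
    "Scenario 4": (945, 81),
}

# Response time: from the first message of the batch to the evaluation result.
response_ms = {name: send + evaluate
               for name, (send, evaluate) in table_vi.items()}
fastest = min(response_ms, key=response_ms.get)

for name, total in response_ms.items():
    status = "within" if total < BRAKE_REACTION_LIMIT_MS else "over"
    print(f"{name}: {total} ms ({status} the 2.5 s bound)")
print("Fastest:", fastest)
```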

Fig. 11: Ratio of messages losses vs speed of sending 1000 CAN messages through the CAN bus in the four architecture scenarios.

V-C Analysis of the reliability of the four architecture scenarios in terms of CAN message losses

The message loss ratio is the ratio of CAN messages that are sent through the CAN Bus by the ECUs emulator but are not received and processed by the anomaly detection engine. Theoretically, the CAN messages that the ECUs emulator sends through the CAN Bus while the single-process IDS architecture is busy analyzing a batch of previously received messages are lost. Figure 11 shows the relationship between the message loss ratio and the speed of sending messages into the virtual CAN bus by the ECUs emulator. The figure shows that the message loss ratio decreases as the speed of sending CAN messages increases and reaches 0% for architecture scenarios 1 and 2, and that there are no message losses for the IDS designed as two processes, because the servicing time is smaller than the time to send the message batch into the CAN Bus.

VI Conclusion

Machine-learning-based IDS techniques have been proposed for the detection of malicious injection of messages into the in-vehicle network of connected vehicles. Evaluations of such IDSs have been performed offline, which limits the confidence in their use for real-time protection scenarios. We evaluated in this paper four architecture designs for real-time IDS for connected vehicles using an anomaly detection engine that uses similarities of graphs representing the sequencing of CAN messages during a given period and a CAN dataset collected from a moving vehicle under malicious speed reading message injections. The evaluation shows that a real-time IDS for a connected vehicle designed as two processes is reliable (no loss of messages) and has a low anomaly evaluation time, which makes it a good candidate for real-time resilience mechanisms as a response to cyber-attacks.

VII Acknowledgement

The authors thank Arun Somani from Iowa State University for the thorough discussions about the research. This research is partly funded by Iowa State University’s Regents Innovation Fund (RIF).

References