Connected and Autonomous Vehicles (CAVs) will play a pivotal role in our everyday lives in the future. CAVs, fully integrated with sensors, will be able to provide awareness of the surrounding environment and coordinate with fixed network nodes and nearby vehicles [qosRequirements]. To do so, they will need to exchange a vast amount of data over various communication frameworks operated in a heterogeneous fashion [hetNet, TassimmWave]. The broadcast and public nature of these communication links makes them susceptible to cyber-security threats, which in turn affect the passengers’ safety. The vulnerability of automotive systems was recently demonstrated by the Chrysler Jeep hack, where the attackers exploited breaches in the vehicle’s internet-connected entertainment system (see https://www.bbc.co.uk/news/technology-33650491).
Responding effectively to cyber-security incidents requires a number of different technological, procedural and organizational elements to be in place and work effectively. In many ways, the cyber-security requirements of CAVs are like those of any other business-critical or safety-related system. Commercial Off-The-Shelf (COTS) equipment usually provides its information in the form of events and alarms. Correlation can be used to turn this data into more meaningful information. Machine Learning and Artificial Intelligence could be used to detect anomalies that could affect the system [machineLearning]. New devices (such as roadside devices) could pass information (data and security oriented) to a centrally managed system [fogArchitecture, fogComputing]. The authenticity and integrity of that information are of paramount importance. Being empowered by this knowledge, and having the capability of detecting when the system is compromised, allows system administrators to determine the right course of action [anomalyDetectionActions].
In this paper, we focus on the pressing concern of empowering cloud-based services to detect incidents and respond to them by defining an agile strategy for data collection. In particular, we propose a novel data offloading strategy allowing CAVs to share their sensory information with the Fog Computing Infrastructure and then with cloud-based services. We envisage a system where our network incorporates Fog Computing capabilities, giving us the leverage to use powerful processing nodes one hop away from each Road-Side Unit (RSU). This will significantly reduce the latency of the data processing, minimizing the time required to identify potential threats.
The data offloading from a CAV to an RSU takes place when the CAV is within range of one or more RSUs. The sensor data exchange happens in a broadcast fashion with the receivers and is organized as streams of information. However, the heterogeneity in driving behaviors within a city and the sparsity of the RSUs lead to intermittent communications, so reconciling the data at the infrastructure level becomes highly inefficient. Employing network coding techniques for achieving secure systems has attracted the interest of the research community lately, especially for wireless broadcast applications [networkCoding] and secure vehicular networks [networkCodingVehicular]. Inspired by that, and taking into account the fragmentary nature of the data offloading, we propose a new framework that uses Random Linear Network Coding (RLNC) to achieve agile data offloading in Intelligent Transportation Systems (ITSs).
More specifically, our idea is based on the existing ITS-G5 Dedicated Short Range Communication (DSRC) protocol stack [etsiStandard]. We propose an amendment as well as an RLNC version of the Cooperative Awareness Message (CAM), which is expected to be used for broadcasting sensor data. The amendment takes place at the Facilities layer. Later in this work, we describe the proposed sublayer and its functionality. The coded packets generated by this sublayer are then mapped onto the proposed RLNC-CAM and broadcast to the network. When an RLNC-CAM is received by an RSU, the Fog Computing Infrastructure and the RLNC decoder are responsible for decoding the packet, which can then be forwarded to other cloud-based services.
The rest of the paper is organized as follows. Sec. II describes our system model. The proposed Fog Computing implementation and Cloud Computing capabilities are presented first. Then, the need for RLNC and how it could operate within the context of a vehicular network are introduced. We then present the proposed extension to the ITS-G5 stack (Sec. III). More specifically, we start by describing the new RLNC-CAM message and the information encapsulated in it. In addition, we outline the design of the RLNC-Facility sublayer responsible for the generation of RLNC-CAMs. In Sec. IV, we present our numerical results, which confirm the feasibility of our proposal. We focused our performance investigation on real-world data traces captured during an extensive experimental campaign that took place in the City of Bristol, UK. Finally, we summarize our findings in the concluding section and present a critical analysis of our results.
II System Model
We consider a city-scale network providing wireless connectivity to CAVs. In our system model, each CAV wishes to offload the data acquired by its sensors onto a cloud storage facility by means of RSUs connected to a network of Fog Computing infrastructure nodes, called Fog Orchestrators (FOs).
II-A Fog Computing Infrastructure for Cyber-Secure ITSs
We assume that our network is clustered into different management areas called Fog Areas. Each Fog Area consists of a number of RSUs, all connected to a particular FO. We also assume that one FO serves each Fog Area. The FOs represent the logical entities encapsulating the core components of our system. Each FO ensures the programmability of the system, simplifying the deployment of new functionalities that rely upon a network of sensors and actuators.
In this particular system model, the FO is responsible for collecting the transmitted RLNC-CAMs, removing duplicated messages that may arise when a CAV is in range of two or more RSUs, recovering the encapsulated messages, and passing them to cloud services. There, threats can be effectively detected and cyber-security decisions can be taken.
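The duplicate-removal step performed by the FO could be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how duplicates are identified, so the key used here (transmitter ID, source message ID, coding seed) and the class and method names are hypothetical.

```python
class FogOrchestrator:
    """Hypothetical FO front end that drops RLNC-CAMs already seen via another RSU."""

    def __init__(self):
        self._seen = set()      # keys of RLNC-CAMs already collected
        self.forwarded = []     # messages passed on towards the cloud services

    def receive(self, rsu_id, cav_id, msg_id, seed, coded_packet):
        """Return True if the message is new and was forwarded, False if duplicated."""
        # Assumption: the same broadcast heard by two RSUs carries the same
        # (transmitter ID, source message ID, coding seed) triple.
        key = (cav_id, msg_id, seed)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.forwarded.append((cav_id, msg_id, seed, coded_packet))
        return True
```

With this sketch, the same broadcast relayed by two different RSUs is forwarded only once, regardless of which RSU delivers it first.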
We assume that a cloud-based data controller is responsible for the processing of all the sensor data transmitted by each CAV. Ultimately, RSUs are solely responsible for relaying messages to and from the CAVs. As shown in Fig. 1, this solution interacts with a cloud-based city-wide connection. In particular, the cloud-based service will only be in charge of recording city-scale data, interconnecting the different Fog Areas and enforcing city-scale policies.
This system model provides the necessary abstraction for a next-generation ITS. It also gives us the leverage to deliver cyber-secure ITSs. In fact, as CAV technologies come to play an ever-increasing role in the safe and smooth running of road networks, network operators will increasingly need to implement effective cyber-security measures. These measures need to ensure that road users’ personal data is not compromised but, just as importantly (arguably more importantly), they must ensure that the systems which CAVs rely on work as designed. An attacker could take more focused actions, which could result in an even greater impact on journeys. For example, an attacker may choose to spoof communications to systems, which could then have an impact on safety.
II-B RLNC for Agile Data Offloading
For the data offloading to happen, each CAV broadcasts its sensor data as it gets within range of one or more RSUs. However, as the CAV leaves the coverage area of an RSU, the data offloading process may be interrupted, to be resumed as the CAV reaches the next coverage region. This particularly applies in sparsely deployed RSU networks. For this reason, traditional handover procedures cannot be put in place, which makes the reconciliation of data received by multiple RSUs very inefficient and not scalable as the number of CAVs increases. In this regard, RLNC promises to overcome these issues in a seamless fashion.
Consider the case of a CAV wishing to transmit a source message to one or multiple RSUs over a broadcast channel. In our system, a source message represents a stream of sensor information packets. Transmissions experience a certain packet error probability. According to the RLNC principle, the CAV segments the source message into $K$ source packets $s_1, \ldots, s_K$, where each $s_i$ is made of $L$ elements from a finite field $\mathbb{F}_q$ of size $q$. The CAV also linearly combines the source packets at random to obtain $N \geq K$ coded packets $c_1, \ldots, c_N$ for transmission. Each coded packet consists of $L$ elements from $\mathbb{F}_q$ and is obtained as:
$$c_j = \sum_{i=1}^{K} g_{i,j}\, s_i, \qquad j = 1, \ldots, N,$$
where the coding coefficient $g_{i,j} \in \mathbb{F}_q$ is drawn uniformly at random. Besides, let $\mathbf{G}$ be a $K \times N$ random matrix over $\mathbb{F}_q$ whose $(i,j)$-th element is defined by $g_{i,j}$; the relation $[c_1, \ldots, c_N] = [s_1, \ldots, s_K]\,\mathbf{G}$ holds.
Let $\mathcal{C}_u$ be the set of coded packets pertaining to the same CAV and successfully received by the $u$-th RSU, for $u = 1, \ldots, U$. In turn, each RSU forwards the received coded packets to the FO and then to a cloud-based service. Hence, the cloud-based service can populate a $K \times n$ decoding matrix $\mathbf{D}$, where the columns of $\mathbf{D}$ are the columns of $\mathbf{G}$ associated with the coded packets in $\bigcup_{u} \mathcal{C}_u$, and $n$ is the number of distinct coded packets collected. The source message is recovered as soon as the rank of $\mathbf{D}$ becomes equal to $K$. In particular, the probability of a source message of $K$ source packets being recovered, as a function of $n \geq K$, can be expressed as follows:
$$P(n) = \prod_{i=0}^{K-1} \left[ 1 - \frac{1}{q^{\,n-i}} \right].$$
In our system, the transmission of each source message takes place in an unacknowledged fashion. After the $N$-th coded packet associated with a source message has been transmitted by a CAV, the broadcasting of the next source message begins. In the following section, we describe how a sensor data stream can be partitioned into a sequence of source messages and how the RLNC principle can be integrated into ETSI’s ITS-G5 communication stack.
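To make the coding step more concrete, the following minimal Python sketch encodes a source message over the binary field and evaluates the probability that a random coding matrix with uniformly drawn coefficients has full rank. It is an illustration under stated assumptions, not the implementation used in this work: the field size is fixed to 2 in the encoder, packets are modelled as integers used as bit vectors, and the function names are hypothetical.

```python
import random


def rlnc_encode(source_packets, n_coded, rng):
    """Generate n_coded random linear combinations of the source packets over GF(2).

    Each source packet is an int used as a bit vector, so addition in GF(2) is XOR.
    Returns the list of coding vectors and the list of coded packets.
    """
    k = len(source_packets)
    vectors, coded = [], []
    for _ in range(n_coded):
        # Coding coefficients drawn uniformly at random from {0, 1}.
        g = [rng.randrange(2) for _ in range(k)]
        c = 0
        for coeff, s in zip(g, source_packets):
            if coeff:
                c ^= s
        vectors.append(g)
        coded.append(c)
    return vectors, coded


def recovery_probability(k, n, q=2):
    """Probability that a k x n matrix with i.i.d. uniform entries over GF(q) has rank k."""
    if n < k:
        return 0.0
    p = 1.0
    for i in range(k):
        p *= 1.0 - q ** (i - n)
    return p
```

For instance, `recovery_probability(2, 2)` evaluates to 0.375 for the binary field, and the value grows towards 1 as more coded packets are collected, which is why transmitting more coded packets than source packets helps over a lossy broadcast channel.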
III Proposed Extension to the ITS-G5 Stack
In our system, we assume that CAVs and RSUs communicate using ETSI’s ITS-G5 standard [etsiStandard]. The ITS-G5 standard has been derived from the US Wireless Access in Vehicular Environment (WAVE), and both of them build upon the IEEE 802.11p DSRC physical layer. As shown in Fig. 2, the Medium Access Control (MAC) layer of the ITS-G5 protocol stack employs the Decentralized Congestion Control (DCC) protocol to dynamically optimize the channel load. This is done by adapting the transmission power, the rate and the sensitivity of a transceiver. Then, a simplified Logical Link Control (LLC) layer acts as an adaptation layer between the DCC and the upper layers. With regard to the Networking and Transport Layer, packets can be forwarded in a multi-hop fashion by the GeoNetworking protocol, which employs the Contention Based Forwarding (CBF) algorithm. In particular, on the basis of geolocation information, the CBF algorithm elects as multi-point relay node the CAV at the greatest distance from the source node. On the other hand, point-to-point communications are handled by the Basic Transport Protocol (BTP), which is responsible for multiplexing/demultiplexing packets originated by the Facility layer.
At the level of the Facility layer, CAMs and Decentralized Environmental Messages (DENMs) are transmitted/received. In particular, CAMs are periodically broadcast by each CAV and convey information pertaining to the position of a CAV, its engine status, etc. On the other hand, DENMs are transmitted in response to an event and are used to alert road users of a danger ahead, adverse weather conditions, etc. As per Sec. II, in order to enable cloud-based services to detect cyber-security threats efficiently, CAVs are expected to share their sensor information periodically. As such, building upon the standard definition of a CAM, we propose the RLNC-CAM version that each CAV is expected to use to broadcast sensor information. In particular, as shown in Fig. 2, when a CAV wishes to employ RLNC-CAMs, sensor information is provided as an input to the RLNC-Facility sublayer, which is responsible for the following:
Segmenting the sensor data stream into a sequence of source packets with the same bit length.
Organizing the sequence of source packets into source messages, each defined by a fixed number $K$ of consecutive source packets.
Assigning an ID to each source message.
Generating a sequence of coded packets for each source message according to the RLNC principle (see Sec. II-B).
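The first three steps of the list above (segmentation, grouping and ID assignment) could be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how an incomplete trailing packet or message is handled, so zero padding is assumed here, source message IDs are assumed to be incremental integers starting from 0, and the function name is hypothetical.

```python
def segment_stream(data: bytes, packet_len: int, k: int) -> dict:
    """Split a sensor data stream into source messages of k equal-length source packets.

    Returns a dict mapping a source message ID to its list of k source packets.
    """
    # Step 1: source packets of identical length (last one zero-padded, an assumption).
    packets = [
        data[i:i + packet_len].ljust(packet_len, b"\x00")
        for i in range(0, len(data), packet_len)
    ]
    # Assumption: the last source message is completed with all-zero packets.
    while len(packets) % k:
        packets.append(bytes(packet_len))
    # Steps 2 and 3: group k consecutive packets and assign an incremental ID.
    return {
        msg_id: packets[start:start + k]
        for msg_id, start in enumerate(range(0, len(packets), k))
    }
```

Each list of source packets returned by this sketch would then be handed to the RLNC encoder, which produces the coded packets for that source message ID.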
Each coded packet generated by the RLNC-Facility sublayer is mapped into an RLNC-CAM having the structure shown in Fig. 3. Each RLNC-CAM comprises the same fields as a standard CAM [etsiCam]. In addition, the RLNC fields are added to the message. More specifically, the header of an RLNC-CAM includes the protocol version in use, a general CAM ID and a generation timestamp. The body hosts information pertaining to the CAV transmitting the message, such as a transmitter ID, the nature of the CAV (mobile, public authority, private, etc.), its position (latitude, longitude, elevation, and heading) and a set of optional attributes.
The RLNC fields that follow the optional CAM attributes are defined as follows:
Source Message ID: Contains the ID of the source message that is being transmitted. Similarly to the Station ID field, this field is 32 bits long, and it is defined as an integer value ranging between $0$ and $2^{32}-1$.
Finite Field Size: Encodes the value of $q$ that is being used by the RLNC-Facility sublayer to generate each coded packet. In an embedded application, finite fields with a characteristic equal to 2 are generally preferred because of the high level of optimization that can be achieved in the implementation of the RLNC encoder/decoder. Thus, for practical application, values of $q$ are restricted to powers of 2. We propose that Finite Field Size field represents the value . Thus, if we limit the value of to , it is sufficient for the field to be bits long to encode the values .
Coding Seed: Contains the seed used to initialize the pseudo-random number generator (PRNG) used to generate the coding vector associated with a coded packet. As such, provided the receiving end is equipped with the same PRNG, each coding vector can be precisely recovered. This solution is more efficient than including the entire coding vector, represented as an array of integer values, in each RLNC-CAM: only a single integer representing the seed is included. By following the same reasoning as in, we suggest the Coding Seed field to be bits long, thus capable of expressing integer values ranging between and .
Coded Packet: Contains the actual coded packet generated by the RLNC-Facility sublayer. Its bit length is variable and is specified by the RLNC-Facility sublayer.
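The RLNC fields listed above, and the way a receiver could regenerate a coding vector from the seed, might be sketched as follows. The byte layout is purely illustrative and is an assumption: a 32-bit source message ID, one byte carrying a field-size exponent `m` (with $q = 2^m$), and a 32-bit seed. Python's `random.Random` stands in for whatever PRNG both ends would agree on, and all function names are hypothetical.

```python
import random
import struct

# Hypothetical layout: source message ID (32 bit), field-size exponent (8 bit), seed (32 bit).
RLNC_FIELDS = struct.Struct(">IBI")


def coding_vector(seed: int, k: int, q: int) -> list:
    """Regenerate the k coding coefficients in {0, ..., q-1} from the seed."""
    rng = random.Random(seed)
    return [rng.randrange(q) for _ in range(k)]


def pack_rlnc_fields(msg_id: int, m: int, seed: int, coded_packet: bytes) -> bytes:
    """Serialize the RLNC fields followed by the variable-length coded packet."""
    return RLNC_FIELDS.pack(msg_id, m, seed) + coded_packet


def unpack_rlnc_fields(blob: bytes):
    """Parse the RLNC fields and return (msg_id, m, seed, coded_packet)."""
    msg_id, m, seed = RLNC_FIELDS.unpack_from(blob)
    return msg_id, m, seed, blob[RLNC_FIELDS.size:]
```

Because the transmitter and the receiver seed the same generator with the same value, both obtain the same coding vector, so only the seed needs to travel in the message rather than the full vector of coefficients.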
As soon as an RSU receives an RLNC-CAM, it is forwarded to the FO and then to the cloud. Then, for any CAV and any source message with a given ID, a cloud-based service is responsible for checking whether a sufficient number of RLNC-CAMs carrying linearly independent coded packets (as many as the source packets in the message) have been received. If so, that particular source message can be recovered and forwarded to the other cloud-based services.
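The rank check and recovery step could look like the following minimal sketch, which performs Gaussian elimination over the binary field. It is an illustration under stated assumptions rather than the decoder used in this work: the field size is fixed to 2, coded packets are modelled as integers used as bit vectors, coding vectors are given as lists of 0/1 coefficients, and the function name is hypothetical.

```python
def rlnc_decode(coding_vectors, coded_packets, k):
    """Recover the k source packets over GF(2), or return None if the rank is below k."""
    pivots = {}  # pivot column -> (coefficient bitmask, payload), kept fully reduced
    for g, payload in zip(coding_vectors, coded_packets):
        # Pack the coding vector into a bitmask: bit i is the coefficient of source packet i.
        mask = 0
        for i, coeff in enumerate(g):
            if coeff:
                mask |= 1 << i
        # Cancel the contribution of every pivot already found.
        for col, (p_mask, p_payload) in pivots.items():
            if (mask >> col) & 1:
                mask ^= p_mask
                payload ^= p_payload
        if mask == 0:
            continue  # linearly dependent on what was already received
        col = (mask & -mask).bit_length() - 1  # lowest non-zero coefficient
        # Remove the new pivot column from the rows stored so far.
        for other in list(pivots):
            o_mask, o_payload = pivots[other]
            if (o_mask >> col) & 1:
                pivots[other] = (o_mask ^ mask, o_payload ^ payload)
        pivots[col] = (mask, payload)
    if len(pivots) < k:
        return None  # not enough linearly independent coded packets yet
    return [pivots[i][1] for i in range(k)]
```

A `None` result means the service should keep waiting for further RLNC-CAMs of that source message; duplicated or linearly dependent coded packets are simply skipped.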