Protecting Actuators in Safety-Critical IoT Systems from Control Spoofing Attacks

In this paper, we propose a framework called Contego-TEE to secure Internet-of-Things (IoT) edge devices with timing requirements from control spoofing attacks, in which an adversary sends malicious control signals to the actuators. We use a trusted computing base available in commodity processors (such as ARM TrustZone) and propose an invariant checking mechanism to ensure the security and safety of the physical system. A working prototype of Contego-TEE was developed using an embedded Linux kernel. We demonstrate the feasibility of our approach for a robotic vehicle running on an ARM-based platform.


1. Introduction

Today’s embedded and cyber-physical systems are ubiquitous. A large number of critical cyber-physical systems (e.g., autonomous cars, drones, manufacturing systems, power grids, industrial control systems, etc.) have real-time (RT) properties (e.g., strict timing and safety requirements). The current trend is to connect embedded RT devices to the Internet (e.g., remote surveillance over wired/wireless network, connected vehicles through cellular wireless networks, etc.) and this gives rise to the real-time Internet-of-Things (RT-IoT) (mhasan_rtiot_sensors19). RT-IoT systems are intended to provide better user experience through stronger connectivity and better use of next-generation embedded devices, albeit with safety-critical properties. RT-IoT systems are also increasingly becoming targets for cyber-attacks. A number of high-profile attacks on RT-IoT systems, e.g., denial-of-service (DoS) attacks mounted from IoT devices (ddos_iot_camera), Stuxnet (stuxnet), attack demonstrations by researchers on medical devices (security_medical) and automobiles (checkoway2011comprehensive) have shown that the threat is real. Successful cyber attacks against such systems could lead to problems more serious than just loss of data or availability because of their critical nature (mhasan_rtiot_sensors19). Enabling security in RT-IoT, however, is often more challenging than generic IoT due to additional timing/safety constraints imposed by RT-enabled systems. Since RT-IoT systems are largely based on sensing and actuation, any false/spoofed command to the actuators can disrupt the normal operation of the physical plant. Commonly used open-source RT-IoT development stacks (such as Linux) do not provide explicit control over actuation signals. For instance, if the application task obtains permission (say, root or other privileged user access) to the peripheral interface (e.g., I2C (i2c)), it is possible to send arbitrary signals to the actuators. 
Let us consider an industrial robotic arm (running an embedded variant of Linux on an ARM Cortex-A53 platform (rpi3)) that periodically opens and closes its grip to drop off and pick up objects in an assembly line. The movement of the grip is controlled by a servo. We use an open-source implementation (robot_arm_src) for this robotic arm, where each operation is represented by a pulse value (one value for grip_open() and another for grip_close()) and each pulse sends four byte command sequences to the servo registers. An example of a spoofing attack on this robotic arm is presented in Fig. 1 (the x-axis is the servo access sequence number and the y-axis is the corresponding pulse value). Without any validation of actuation commands, it is possible to send arbitrary (high) pulses to the servo registers that prevent the grip from picking up/dropping objects (shown in the shaded region of the top figure). This is not possible when our scheme (called Contego-TEE, see §3 for details) is enabled (bottom figure).

Figure 1. Demonstration of a control spoofing attack on a robotic control arm running embedded Linux.

Our proposed framework, Contego-TEE, prevents the sending of malicious/undesired commands to physical actuators and ensures safety of the system. Specifically, we use the concept of Trusted Execution Environments (TEEs) (7345265_tee) available in commodity processors (e.g., ARM TrustZone (trustzone_survey), Intel SGX (intel_sgx)) to ensure that our protection mechanisms cannot be disabled even if the host OS is compromised. We develop a rule-based invariant checking and access control mechanism as well as design-time (schedulability) tests to ensure that the timing and safety requirements of the system are met. Contego-TEE is specifically designed for legacy systems developed with Commercial-Off-The-Shelf (COTS) components and does not require any modification to the application code/logic. In this paper we present the following contributions.

  • A new framework called Contego-TEE to secure COTS-based RT-IoT systems against attacks that spoof control signals (§3.1).

  • A runtime, rule-based invariant checking mechanism as well as design-time analysis to ensure security (and safety) of the physical plant (§3.2).

  • An open source implementation and patch to the (embedded) Linux kernel that includes the Contego-TEE functionality (§4.1).

We use ARM TrustZone as a TEE and implement our solution on an ARM Cortex-A53 board (i.e., Raspberry Pi (rpi3)). We also demonstrate the viability of our approach using a COTS rover platform (§4.2).

2. System and Adversary Model

In the following, we first present background on RT-IoT systems and give an overview of a TEE-based architecture (e.g., ARM TrustZone). We then introduce our system model (§2.2) and describe our assumptions on the adversarial capabilities (§2.3).

2.1. Preliminaries

2.1.1. RT-IoT Systems:

RT-IoT systems comprise IoT edge devices with RT capabilities. RT systems are those that, apart from a requirement for functional correctness, require that temporal properties be met as well. These temporal properties are often presented in the form of deadlines. The usefulness of results (say the performance of the actuators) produced by the system drops after the passage of a deadline. Some of the common properties and assumptions related to RT systems include (mhasan_rtiot_sensors19): (i) periodic/sporadic execution of a set of tasks (a 'task' in RT-IoT systems can trivially be mapped to the concept of a process or thread in general-purpose OSes), (ii) strict timing and safety requirements, (iii) well-characterized execution time (e.g., execution times in the worst-case are known for all loops), (iv) limited resources (e.g., memory, processing power and energy). RT-IoT systems are often designed based on the periodic task model (Liu_n_Layland1973), i.e., each task $\tau_i$ is characterized by a tuple $(C_i, T_i, D_i)$, where $C_i$ is the Worst Case Execution Time (WCET), $T_i$ is the period (e.g., inter-invocation time) and $D_i$ is the deadline. Schedulability tests (res_time_rts; bini2004schedulability) are used to determine if all tasks in the system meet their respective deadlines. If they do, then the taskset is deemed to be 'schedulable' and the system is considered safe.

2.1.2. TEE and ARM TrustZone:

A TEE is a set of hardware and software-based security extensions where the processor maintains a separate subsystem in addition to the traditional OS components. TEE technology has been implemented on commercial secure hardware such as ARM TrustZone (trustzone_survey) and Intel SGX (intel_sgx). In this work we consider TrustZone as the building block of Contego-TEE due to the wide adoption of ARM processors in embedded IoT systems, although our framework can be ported to other TEE platforms without loss of generality. ARM partnered with GlobalPlatform and has defined new TEE APIs (globalplatform_tee_api). TrustZone encompasses the following major features (liu2018alidrone): (a) safe and secure boot (to ensure all software components are in a trusted state before launching the OS); (b) isolated execution of critical applications (i.e., in a secure enclave) and (c) protection for trusted application data (in terms of integrity and confidentiality). ARM TrustZone contains two different privilege blocks: (i) Normal World (NW) and (ii) Secure World (SW). The NW is the untrusted environment running a commodity untrusted OS, whereas the SW is a protected computing block that only runs privileged instructions. The SW in TrustZone defines the memory regions that can only be accessed by privileged instructions, and the code that runs in the SW has higher privilege than the code in the NW. Hardware logic ensures that the resources in the secure world cannot be accessed from the normal world (e.g., if the code running in the NW tries to access protected memory regions, TrustZone throws a hardware exception). The SW instructions are triggered when a specific flag in the processor, e.g., the Non-secure (NS) bit in the Secure Configuration Register (SCR), is not set. The two worlds are bridged by a software module referred to as the Secure Monitor. The context switch between the NW and SW is performed through a Secure Monitor Call (SMC).
In this work we use the TrustZone functionality to prevent malicious commands from being sent to the actuators (see §3 for details). We now present our system and adversary model.

2.2. System Model

Figure 2. High-level schematic of a RT control system.

In Fig. 2 we present a high-level illustration of a RT control system. We consider a set of $N$ periodic RT control tasks $\{\tau_1, \ldots, \tau_N\}$ that execute on a single processor (the majority of RT-IoT edge devices still use single core chips due to their simplicity and determinism). The physical system consists of a set of $M$ actuators (e.g., servo, motor, buzzer): $\{\pi_1, \ldots, \pi_M\}$. RT tasks periodically issue commands to the actuators to control physical entities (e.g., wheel, propeller, alarm, robotic grip, etc.). We assume that each task is allowed to access a subset of peripherals. We represent this access permission as an $N \times M$ Boolean matrix $\mathcal{A}$, where $\mathcal{A}[i][j] = 1$ represents that task $\tau_i$ can send commands to actuator $\pi_j$. We also assume that the RT tasks finish computation before their deadline, i.e., the tasks are schedulable (in the Appendix we present formal expressions to determine schedulability of the tasks).
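
The access permission matrix described above can be pictured with a small sketch. The task names, actuator names and matrix contents below are hypothetical and chosen only for illustration; the deny-by-default handling of unknown names is also an assumption, not something the system model specifies.

```python
from typing import List

# Hypothetical example: 3 tasks and 4 actuators (names are illustrative only).
TASKS = ["navigation", "alarm_monitor", "camera_control"]
ACTUATORS = ["motor", "buzzer", "pan_servo", "tilt_servo"]

# ACCESS[i][j] is True iff task i may send commands to actuator j.
ACCESS: List[List[bool]] = [
    [True,  False, False, False],   # navigation     -> motor only
    [False, True,  False, False],   # alarm_monitor  -> buzzer only
    [False, False, True,  True],    # camera_control -> both servos
]


def may_actuate(task: str, actuator: str) -> bool:
    """Return True only if the design-time matrix grants the permission."""
    try:
        i = TASKS.index(task)
        j = ACTUATORS.index(actuator)
    except ValueError:
        return False  # assumption: unknown task or actuator is denied
    return ACCESS[i][j]
```

For instance, `may_actuate("navigation", "buzzer")` is false with this matrix, so a compromised navigation task could not sound the alarm even with root access in the Normal World.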

Figure 3. Overview of Contego-TEE system design (left) and high-level control flow of RT tasks in Contego-TEE (right).

2.3. Adversary Model

We consider the following adversarial capabilities: (a) Integrity Violation: an adversary may insert a malicious task (that respects the RT guarantees) and/or modify existing control logic to manipulate actuator commands and control system behavior in undesirable ways; (b) Denial of Service (DoS): the attacker may take control of the RT task(s) and destabilize the physical plant, e.g., by sending multiple control requests in a burst that may result in a malfunctioning actuator or, worse, damage the actual hardware/actuator and even threaten the safety of the system. The attacker can gain privileged (e.g., root) access to perform adversarial actions (e.g., to spoof control signals). We do not make any assumptions as to how the compromised tasks enter the device. For instance, bad software engineering practices leave vulnerabilities in the systems (loi2017systematically). When the system is developed using a multi-vendor model (sg2) (where its components are manufactured and integrated by different vendors), malicious code may be injected (say by a less-trusted vendor) during deployment. The adversary may also induce end-users to download modified source code, say by using social engineering tactics (securecore_syscal). We also assume that the attackers do not have any physical access (e.g., they cannot physically control/turn off/damage the actuators).

3. Actuation Monitoring Framework

In the following we first introduce the Contego-TEE framework (§3.1). We then present mechanisms to detect any abnormal control commands issued by (rogue) tasks and analyze schedulability conditions that ensure our (invariant) checking techniques can be enforced at runtime (§3.2).

3.1. Overview and Architecture

As mentioned earlier, to secure RT-IoT platforms we propose a TEE-based architecture that monitors actuation commands sent to the physical entities. At a high level, our design is based on the Simplex architecture (sha2001using). Researchers use Simplex-based architectures for time-critical cyber-physical systems to provide fault-tolerance (liu2008ortega; l1_simplex) and, more recently, security (mohan_s3a; mhasan_resecure16; securecore). A Simplex system consists of the following main components: (a) under normal operating conditions a High-Performance (complex) Controller actuates the physical plant (such a controller may be unverifiable due to its complexity, yet it must actuate a safety-critical system); (b) if, during operation, the system state becomes unstable (e.g., it is in danger of violating a safety condition), a Safety Controller takes over and (c) the exact switching behavior is implemented by a Decision Module that decides which controller output will drive the plant. In our context, we use a trusted (and verified) computing module (analogous to the safety controller) that executes in a secure enclave (viz., SW) and ensures that even if the (potentially untrusted) NW RT tasks are compromised, an adversary cannot send false signals to the physical actuators. In Fig. 3 we illustrate the high-level overview of the Contego-TEE design and the control flow of the RT tasks. Contego-TEE contains the following essential components: (a) a TEE-enabled SoC (System-on-Chip) such as those supported by ARM TrustZone (trustzone_survey) (block 2⃝ in the figure); (b) an Enclave Client (block 7⃝) that is used to communicate between NW and SW and (c) an Invariant Checker (block 8⃝) that is used to monitor (and validate) the actuation commands. The physical plant (1⃝) is connected with sensors (2⃝) and actuators (3⃝) and controlled by the (potentially vulnerable) RT tasks (5⃝).
RT tasks execute in the untrusted NW and issue system calls (e.g., read(), write(), ioctl()) to access the sensors/actuators using a specific interface such as I2C (i2c) and/or SPI (spi). Contego-TEE ensures that RT tasks cannot directly send any actuation commands (e.g., it breaks the bridge between 6⃝, 2⃝ and 4⃝). We do this by placing a dispatcher (i.e., the enclave client) between the peripheral subsystem and the actual hardware. As a result, before any command is issued to the physical actuators, it is validated by our trusted application (i.e., the invariant checker) running inside the secure enclave (i.e., in the SW). In particular, when a RT task sends an actuation command to any peripheral, the enclave client traps that request and forwards the command to the invariant checker using an SMC. Depending on the access permission matrix and the current system state, the invariant checker then decides whether the given command can be issued to the actuator (refer to §3.2 for details). In Contego-TEE, both the enclave client and the invariant checker operate in privileged mode (e.g., kernel space) so that they can directly control low-level hardware. By using the enclave client (to invoke context switching) and invariant checking mechanisms, Contego-TEE ensures that even if the NW RT tasks are compromised, an adversary cannot send false signals to the actuators. We note that, unlike NW RT tasks that may perform other computation, the invariant checker consists of a small, verified code block that is used only to monitor actuation requests. We also note that Contego-TEE does not require any application-level modifications, i.e., developers can execute unmodified, existing legacy RT tasks using our Contego-TEE enabled OS kernel (refer to §4.1 for implementation/porting details).

| Platform | Application Domain | Actuators | Possible Invariant Conditions* | Response | Remarks |
| --- | --- | --- | --- | --- | --- |
| Water/air monitoring system | Home/industrial automation | Buzzer, display | (a) Send high pulse to buzzer only if water level is high, air quality is abnormal or smoke is detected; (b) do not display alert if the system state is normal | IGNORE | Ignore all commands that fail invariant checking |
| Surveillance system | Home/industrial automation | Servo, buzzer | (a) Trigger alarm only if there is an impact/object detected in camera; (b) rotate camera (using servos) only within allowable pan/tilt angle | IGNORE | Ignore all commands that fail invariant checking |
| Infusion/syringe pump | Health-care | Motor, display | (a) Drive the motor only to allowable positions/rates; (b) display only the amount of fluid infused (e.g., obtained from motor encoders) | IGNORE | Ignore actuation when the task tries to infuse wrong amount of fluid |
| Robotic arm | Manufacturing | Servo, buzzer | (a) Check that the servo pulse sequence matches the desired (design-time) sequence; (b) do not raise alarm if the pulse sequence is normal | IGNORE, FAIL-SAFE | If mismatch, use the predefined sequence; ignore other pulses using rate-control rule |
| Robotic vehicle (aerial/ground) | Manufacturing, surveillance, agriculture | Servo, motor | (a) Check if the robot is following the mission; (b) allow only predefined number of actuation commands per period | IGNORE, FAIL-SAFE | Ignore command using rate-control rule. If it deviates from the mission, use predefined command and/or state-observations |

*We omit mathematical expressions for readability.

Table 1. Applicability of Contego-TEE for Various RT-IoT Platforms

3.2. Invariant Checking and Timing Analysis

3.2.1. Invariant Checking:

In order to validate each actuation command invoked by the RT tasks, Contego-TEE performs several checks. One obvious access control mechanism is to ensure that a task can access a given actuator only if the task has the required permission (i.e., the corresponding entry of the access matrix is set). Contego-TEE therefore denies all actuation commands from a task if the corresponding access flag is zero. However, if the attacker can compromise a task with legitimate access (to a given set of actuators), then the (victim) task may send arbitrary commands to those actuators. Therefore, in addition to checking the access matrix, Contego-TEE also checks system invariants and monitors the number of actuation commands in a given time interval, as we discuss below.

State Invariant Checking:

Invariant checking (adepu2017design) is useful to detect control spoofing attacks. For a given RT-IoT platform, we assume the availability of an invariant checking function that predicts the actuation signal, and we allow access only if the output of the function matches the requested command. In particular, if a task sends an actuation command to a peripheral and the task has the required permission (i.e., the corresponding access flag is set), the checker first obtains the current system state by observing a set of signals and then decides whether the command is valid for that state. For example, consider a warehouse water monitoring system where an alarm is triggered only if the water level of the tank (measured by a level sensor) is higher than a predefined threshold and/or the water temperature is not in the expected range. We represent this as an invariant rule of the following form: send the high pulse to the alarm only if (water level > threshold) OR (water temperature outside the expected range). That is, Contego-TEE will allow the sending of the high pulse to the alarm system (say a buzzer) only if the invariant conditions are satisfied. We note that since Contego-TEE operates at the kernel level, it can directly access raw signals without any interaction with NW RT tasks or other (user space) libraries.
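
The water monitoring invariant can be sketched as a small predicate. This is only an illustration: the numeric limits, the `TankState` type and the function names are hypothetical, and it is not the checker shipped with the prototype (which runs in the Secure World rather than as user-level Python).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TankState:
    water_level: float  # raw reading from the level sensor
    water_temp: float   # raw reading from the temperature sensor


# Hypothetical design-time limits; the source gives no numeric values.
LEVEL_THRESHOLD = 80.0
TEMP_RANGE = (5.0, 40.0)

HIGH, LOW = 1, 0


def expected_alarm(state: TankState) -> int:
    """Predict the alarm signal that is valid for the observed state."""
    level_too_high = state.water_level > LEVEL_THRESHOLD
    temp_abnormal = not (TEMP_RANGE[0] <= state.water_temp <= TEMP_RANGE[1])
    return HIGH if (level_too_high or temp_abnormal) else LOW


def command_allowed(requested: int, state: TankState) -> bool:
    """Allow the command only if it matches the predicted signal."""
    return requested == expected_alarm(state)
```

With these assumed limits, a request to raise the alarm while the tank reads a level of 50 and a temperature of 20 is rejected, because the predicted signal for that state is LOW.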

Rate Control:

Note that since RT systems are deterministic by design, the (worst-case) number of actuation requests can be bounded at design time (wcrt_survey). Therefore, if a task tries to access actuator(s) more often than expected within a given time interval, it may be an indication of a possible attack. In such cases Contego-TEE will limit subsequent access requests from that task and prevent the sending of actuation commands to the hardware. We enforce rate control using an invariant rule of the following form: Contego-TEE ignores further actuation commands if the number of requests from any job of the task within the (relative) time window exceeds a design-time threshold. Such a rate control mechanism is especially useful to defend against DoS attacks where an attacker sends multiple actuation commands in a burst (say to quickly change the speed of wheels/propellers in robotic ground/aerial vehicles, abruptly move robotic arms, falsely toggle buzzers, etc.) to disrupt normal operations of the system.
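
A minimal sketch of the rate control rule follows. It assumes a fixed window that restarts on the first request after the previous window has elapsed; the paper defines the window relative to a job, so this windowing policy, the class name and the millisecond timestamps are all simplifying assumptions.

```python
from typing import Optional


class RateControl:
    """Allow at most `threshold` actuation requests per time window."""

    def __init__(self, threshold: int, window_ms: int) -> None:
        self.threshold = threshold
        self.window_ms = window_ms
        self.window_start: Optional[int] = None
        self.count = 0

    def allow(self, now_ms: int) -> bool:
        # Start a new window on the first request or once the old one expires.
        if self.window_start is None or now_ms - self.window_start >= self.window_ms:
            self.window_start = now_ms
            self.count = 0
        self.count += 1
        # Requests beyond the design-time threshold are ignored.
        return self.count <= self.threshold
```

For example, with `RateControl(threshold=1, window_ms=100)`, a request at 0 ms is allowed, a burst request at 10 ms is ignored, and the next request at 100 ms is allowed again.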

3.2.2. Response Mechanisms:

When there exists a mismatch between the output of the invariant checking function and the requested actuation command, Contego-TEE makes use of the following strategies to keep the physical system safe. IGNORE: this strategy prevents the execution of any actuation commands requested by RT tasks. Hence, actuators will not receive any signals from Contego-TEE and will continue to operate using the last known (uncompromised) commands. Contego-TEE will also ignore commands if the task makes multiple requests in a short time window (e.g., by using the rate control rule). FAIL-SAFE: while the IGNORE strategy ensures that actuators will not get any abnormal signals, ignoring actuation commands (for a long time) may not be acceptable for highly dynamic systems such as unmanned ground/aerial vehicles (e.g., the vehicle may crash). Therefore, Contego-TEE also allows operation in a FAIL-SAFE mode, i.e., if it finds any mismatch, it ignores the requests from RT tasks and sends predetermined commands (and/or commands based on the output of the invariant checking function) to keep the system safe/operational. As an example, if there is a sudden change in the propeller speed of a UAV, the FAIL-SAFE strategy sets a safe, predefined speed based on the current state of the UAV. Depending on the target system, both of the above strategies may be required to keep the physical system operational. We note that invariant checking and response mechanisms are application dependent. Contego-TEE provides flexibility for system engineers to develop appropriate mechanisms depending on the application requirements. In Table 1 we summarize possible invariant conditions and response mechanisms that are applicable for various RT-IoT platforms; this is by no means meant to be an exhaustive list.
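
The two response strategies can be summarized in a short sketch. The `Response` enum and `resolve` function are hypothetical names, and using the checker's predicted command as the fail-safe value is one of the options the text mentions, not necessarily what a given deployment would choose.

```python
from enum import Enum
from typing import Optional


class Response(Enum):
    IGNORE = "ignore"
    FAIL_SAFE = "fail-safe"


def resolve(requested: int, expected: int, policy: Response) -> Optional[int]:
    """Return the command to forward to the actuator, or None to send nothing."""
    if requested == expected:
        return requested          # command passes invariant checking
    if policy is Response.FAIL_SAFE:
        return expected           # substitute the predicted (safe) command
    return None                   # IGNORE: actuator keeps its last valid command
```

Under IGNORE a mismatched request yields `None`, so nothing reaches the hardware; under FAIL-SAFE the same request is replaced by the expected command so that a dynamic system keeps receiving valid actuation.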

3.2.3. Schedulability Analysis:

In order to perform invariant checking and execute the response mechanisms at runtime, we need to ensure that our framework does not cause RT tasks to violate their timing requirements (i.e., they still complete execution before their deadlines). We therefore develop design-time schedulability tests that ensure the taskset is schedulable (refer to the Appendix for details). For instance, the RT task $\tau_i$ is schedulable in Contego-TEE if its Worst Case Response Time (WCRT) $R_i$ does not exceed its deadline $D_i$, i.e., $R_i = C_i + I_i \leq D_i$, where $C_i$ is the task WCET (including the time for world switching and invariant checking) and $I_i$ is the interference from other tasks (in RT scheduling theory, the term 'interference' refers to the amount of time, from release to deadline, during which the task is ready but cannot be scheduled due to execution of other tasks). The taskset is referred to as schedulable if all the tasks are schedulable, viz., $R_i \leq D_i$ for every task $\tau_i$.
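
To make the condition concrete, the following sketch applies the classical fixed-priority response-time iteration with a per-job checking overhead added to every WCET. The exact test used by Contego-TEE is given in the Appendix and may differ; the preemptive fixed-priority assumption, the single checked actuation per job, the integer time units and the example taskset are all assumptions made here.

```python
from typing import List, Optional, Tuple

Task = Tuple[int, int, int]  # (WCET C, period T, deadline D), integer time units


def wcrt(tasks: List[Task], i: int, overhead: int) -> Optional[int]:
    """Response time of tasks[i], or None if it exceeds the deadline.

    `tasks` is sorted from highest to lowest priority; `overhead` models the
    world switch plus invariant check, charged once per job (an assumption).
    """
    c_i = tasks[i][0] + overhead
    d_i = tasks[i][2]
    r = c_i
    while True:
        # Interference from higher-priority tasks released within [0, r).
        interference = sum(-(-r // t_j) * (c_j + overhead)
                           for c_j, t_j, _ in tasks[:i])
        r_next = c_i + interference
        if r_next > d_i:
            return None
        if r_next == r:
            return r
        r = r_next


def is_schedulable(tasks: List[Task], overhead: int) -> bool:
    return all(wcrt(tasks, i, overhead) is not None for i in range(len(tasks)))


# Hypothetical taskset used only to exercise the test.
EXAMPLE = [(1, 4, 4), (2, 6, 6), (3, 12, 12)]
```

With the hypothetical taskset, `is_schedulable(EXAMPLE, 0)` holds, while an overhead of one time unit per job makes the second task miss its deadline. This is why the checking cost has to be accounted for at design time.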

4. Evaluation

In this section we first present the implementation details of Contego-TEE (§4.1) and then show the viability of our approach using a case-study on a robotic vehicle (§4.2). Table 2 summarizes the system configurations and implementation details.

| Artifact | Configuration |
| --- | --- |
| Platform | Broadcom BCM2837 (Raspberry Pi 3) |
| CPU | 1.2 GHz 64-bit ARM Cortex-A53 |
| Memory | 1 Gigabyte |
| Operating System | Linux (NW), OP-TEE (SW) |
| Kernel version | Linux kernel 4.16.56, OP-TEE core 3.4 |
| Peripheral interface | I2C |
| Boot parameters | dtparam=i2c_arm=on, dtparam=spi=on |
Table 2. Summary of the Implementation Platform

4.1. System Implementation

We implemented a proof-of-concept prototype of Contego-TEE on a Raspberry Pi 3 (RPi3) Model B (rpi3) (equipped with a 1.2 GHz 64-bit ARMv8 CPU and 1 GB RAM). We selected the RPi3 as our implementation platform since (a) it supports ARM TrustZone and (b) previous research has shown the feasibility of deploying multiple IoT-specific applications on the RPi3 (securecore_syscal; cheng2017orpheus; virtsense_liu2018; protc_liu2017; liu2018alidrone). We developed Contego-TEE using the Open-Portable Trusted Execution Environment (OP-TEE) (optee) software stack that uses GlobalPlatform TEE APIs (globalplatform_tee_api) to provide TrustZone functionality. OP-TEE provides a minimal secure kernel (called OP-TEE core) that can be run in parallel with the NW OS (e.g., Linux). In particular, we used an Ubuntu 18.04 filesystem with a 64-bit Linux kernel (version 4.16.56) as the NW OS, and our invariant checker runs on the OP-TEE secure kernel (version 3.4). The enclave client was statically built with the Linux kernel. In order to implement the enclave client, we extended the Linux TEE interface (/linux/drivers/tee/) and enabled SMC from Linux kernel space (since the GlobalPlatform APIs only support SMC from user space). We implemented the invariant checker as an OP-TEE kernel-level trusted application, known as a PTA (Pseudo Trusted Application) in OP-TEE terminology (e.g., in /optee_os/core/arch/arm/pta/). In our current implementation Contego-TEE supports actuators that are controlled via the I2C interface. Specifically, we modified the built-in structure i2cdev_fops (e.g., in /linux/drivers/i2c/i2c-dev.c) to use our enclave client functions, which then switch control to the invariant checker (e.g., by using SMC). Our implementation code is available in a public repository (contego_tee_impl).

4.2. Case-Study: Robotic Vehicle

We implemented Contego-TEE in a COTS rover (named GoPiGo2, manufactured by Dexter Industries (gpg2)) that can be used in multiple IoT-specific applications such as remote surveillance, agriculture, manufacturing, etc. (guo2018roboads). The rover is equipped with two optical encoders that are connected to the motors (e.g., actuator in this setup): it can turn left by switching off the right encoder and vice-versa. The detailed specifications of the rover are available on the vendor website (gpg2).

4.2.1. Results

We first demonstrate how Contego-TEE can be used to protect such systems from actuation attacks and then measure the performance overheads.

Security Analysis:

For the following experiments, we conducted a line following mission where the robot steered from an initial location to a target location by following a line. The controller task was running as a NW Linux application and executed vendor-provided PID (Proportional–Integral–Derivative) closed-loop control (gpg2_lf) to track the planned path using the data received from sensors. The rover used four commands, for moving forward, turning left, turning right and setting the speed, where each command sent a 5-byte value to the actuator registers (e.g., wheel encoders/motors) using the I2C interface. For this mission we defined three invariant conditions that were used to monitor control signals (we manually inspected the vendor-provided control code and translated it into invariant conditions). The conditions are expressed in terms of the readings from the line sensor, a vendor-provided threshold (e.g., to follow the line) and the values used to set the speed of the rover. We show the case where the access flag is set, since Contego-TEE will trivially deny requests if the corresponding flag is zero. In our experiments we used both the FAIL-SAFE and IGNORE (to enforce rate control) strategies. For each actuation signal, our invariant checker compares it with the desired signal and chooses the appropriate strategy, as we present in the following.

Figure 4. Illustration of Contego-TEE under (a) control spoofing and (b) DoS attacks. Contego-TEE prevents the sending of malicious commands to the motors and ensures that the rover moves at a steady speed.

In Fig. 4(a) we illustrate our invariant checking mechanism with the FAIL-SAFE strategy. The x-axis of the figure shows the time (e.g., count of the controller job) and the y-axis is the total distance travelled by the rover (e.g., readings from the encoders). In order to demonstrate malicious behavior, we followed a strategy similar to that considered in prior work (choi2018detecting; securecore; guo2018roboads; mhasan_resecure16). In particular, during program execution, we injected a logic bomb (during the shaded region in Fig. 4(a)) and sent erroneous commands to the controller. In this case, during the control spoofing attack, the rover deviated from the mission (e.g., PID control loop) and falsely sent commands to turn off one of the motors. As a result, when Contego-TEE was not active, the rover was not following the line and the encoder readings (i.e., traversed distance) remained the same (see the maroon line in the figure). We next executed the same code with Contego-TEE enabled (green curve in the figure). In this case, when each control command was issued, our checker applied the invariant conditions and sent the desired commands to the motors (and hence the rover moved as expected). We next show the effect of our rate control mechanism (Fig. 4(b)). In this experiment, when the DoS logic bomb was triggered (shaded region in the figure), it sent multiple requests to increase the speed of the rover. When Contego-TEE was not enabled, this caused the rover to move faster and hence there was a rapid increase in the encoder readings (e.g., maroon line, shaded region in the figure). In contrast, when Contego-TEE was active (green line), it disallowed multiple increase-speed requests per period (e.g., according to the IGNORE strategy) and hence the rover followed the line at a steady speed.

Figure 5. Runtime of rover control tasks with and without Contego-TEE: (a) for 99th percentile and (b) worst-case. Contego-TEE increases the execution time by up to 43.47 ms (worst-case) and 23.31 ms (99th percentile).

Overhead Analysis:

To measure the runtime overheads we conducted experiments with the vendor-provided control tasks (gpg2) as a benchmark (Fig. 5). In this setup our invariant checker followed a rate control policy and ignored more than one actuation request per period. The x-axis of Fig. 5 shows the control tasks and the y-axis represents the execution time (a) when Contego-TEE is not enabled (dark bar) and (b) with Contego-TEE enabled (light bar). We present the timing results for the 99th percentile (Fig. 5(a)) and the worst case (Fig. 5(b)). The timing values were measured using the Linux clock_gettime() system call with the CLOCK_MONOTONIC clock parameter, and we present data collected over multiple trials. As we see from the figure, Contego-TEE increases the execution time; this is expected due to (world) context switching as well as invariant checking. From our experiments we found that Contego-TEE increases execution times by (i) 34.11 to 43.47 ms (worst-case), (ii) 22.87 to 23.31 ms (99th percentile) and (iii) 19.55 to 19.60 ms (average-case) for the various control tasks, and hence can be used with controllers whose period accommodates this overhead (for this setup). This extra overhead is the price of increased security, and we expect it to be acceptable for various RT-IoT platforms.

5. Related Work

Enhancing security in time-critical cyber-physical systems is an active research area (see the related survey (mhasan_rtiot_sensors19)). Perhaps the closest line of work to ours is PROTC (protc_liu2017), where a monitor in the SW enforces a secure access control policy (given by the control center) for some peripherals of the drone and ensures that only authorized applications can access certain peripherals. Unlike our scheme, PROTC is limited to specific applications (e.g., aerial robotic vehicles) and requires a centralized control center to validate and enforce security policies. In earlier work we proposed mechanisms to secure legacy time-critical systems (mhasan_ecrts17; mhasan_rtss16; mhasan_date18). Researchers have also proposed anomaly detection approaches for robotic vehicles (guo2018roboads; choi2018detecting; fei2018cross). However, these prior approaches do not provide any response mechanism and are vulnerable if the adversary can compromise the host OS. There exist various hardware/software-based mechanisms and architectural frameworks (mohan_s3a; securecore; securecore_memory; securecore_syscal; mhasan_resecure16; mhasan_resecure_iccps) to secure RT-IoT systems. However, those frameworks are not designed to protect against control-specific attacks and may not be suitable for systems developed with COTS components. There is also a large body of research on generic IoT systems, as well as on the use of TrustZone to secure traditional embedded/mobile applications (too many to enumerate here; refer to the related surveys (yang2017survey; ammar2018internet; trustzone_survey; trustzone_survey_2)). However, the consideration of the time-critical and control-centric aspects of RT-IoT applications distinguishes Contego-TEE from other research.

6. Conclusion

In this paper we presented a new framework named Contego-TEE that enhances the security and safety of RT-IoT systems. We use a combination of trusted hardware, the intrinsic real-time nature and the domain-specific characteristics of such systems to detect control intrusions and prevent the physical plant from misbehaving under attack. We believe our framework is tangential and can be incorporated into multiple RT-IoT and cyber-physical domains.

References

Appendix

Response Time Analysis for RT Tasks

Our schedulability test is based on the fixed-priority response time analysis proposed in the RT literature (res_time_rts). Let be the number of actuation requests for and be the additional computation time due to world switching and invariant checking. Then the WCET of can be represented as . Since our enclave client and invariant checker can serve only one actuation request at a time (e.g., as an atomic process), may be delayed while requests from lower-priority tasks are processed. Let denote the ‘blocking’ factor from tasks that have a lower priority than (denoted as ). We note that the maximum computational demand for a given task in any interval of length can be no more than the maximum execution time required by one job of multiplied by the maximum number of jobs of that can execute in that interval (res_time_rts; bini2004schedulability). The maximum interference experienced by from other tasks for an interval can be expressed as: where denotes the set of tasks with a priority higher than . Therefore, we can calculate the response time of (denoted as ) as follows:

(1)

The WCRT can then be obtained by solving this recurrence using an iterative fixed-point search, e.g., for some iteration with initial condition . The iteration is guaranteed to converge if we assume that the total processor utilization (i.e., ) is less than  (joseph1986finding). The taskset is considered ‘unschedulable’ if there exists an such that . Such an unschedulability result signals that the designers should update parameters (e.g., periods, number of tasks, invariant checking policies) to incorporate the Contego-TEE framework into the target system.
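As a minimal sketch of such a fixed-point search, the following Python function iterates the standard fixed-priority response-time recurrence with a blocking term. The `Task` fields, the `inflated_wcet` helper (which assumes the overhead is charged once per actuation request) and the choice to start the iteration from WCET plus blocking are illustrative assumptions; they are not necessarily the notation or the exact formulation used in this appendix.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    wcet: float            # WCET including any checking overhead
    period: float          # minimum inter-arrival time
    deadline: float        # relative deadline
    blocking: float = 0.0  # delay caused by lower-priority requests


def inflated_wcet(base_wcet, n_requests, overhead):
    """Assumption: each actuation request adds a fixed overhead."""
    return base_wcet + n_requests * overhead


def response_time(i, tasks, max_iter=1000):
    """WCRT of tasks[i], or None if it exceeds the deadline.

    `tasks` must be sorted by priority, index 0 being the highest.
    """
    task = tasks[i]
    higher = tasks[:i]
    r = task.wcet + task.blocking  # assumed initial condition
    for _ in range(max_iter):
        # Each higher-priority task can release ceil(r / period) jobs in r.
        interference = sum(math.ceil(r / h.period) * h.wcet for h in higher)
        nxt = task.wcet + task.blocking + interference
        if nxt > task.deadline:
            return None  # deadline miss: task is unschedulable
        if nxt == r:
            return r     # fixed point reached
        r = nxt
    return None          # no convergence within max_iter


def schedulable(tasks):
    """True if every task meets its deadline under this test."""
    return all(response_time(i, tasks) is not None for i in range(len(tasks)))


# Hypothetical taskset (time units are arbitrary).
tasks = [Task(1, 4, 4), Task(2, 6, 6), Task(3, 12, 12)]
wcrts = [response_time(i, tasks) for i in range(len(tasks))]
# wcrts == [1, 3, 10]
```

If `schedulable` returns `False`, a designer could lengthen periods, reduce the number of tasks or relax the checking policy and rerun the test.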