SoK: Hardware Security Support for Trustworthy Execution

10/11/2019 ∙ by Lianying Zhao, et al. ∙ 0

In recent years, many new hardware mechanisms have emerged for improving the security of computer systems. Hardware offers many advantages over pure software approaches: immutability of mechanisms to software attacks, better execution and power efficiency, and a smaller interface that allows it to better maintain secrets. This has given birth to a plethora of hardware mechanisms providing trusted execution environments (TEEs), support for integrity checking and memory safety, and widespread use of hardware roots of trust. In this paper, we systematize these approaches through the lens of abstraction. Abstraction is key to computing systems, and the interface between hardware and software contains many abstractions. We find that these abstractions, when poorly designed, can both obscure information that is needed for security enforcement and reveal information that needs to be kept secret, leading to vulnerabilities. We summarize such vulnerabilities and discuss several research trends in this area.


I Introduction

‡ The work was started when the first author was at the University of Toronto.

The trustworthiness of a computer system entails that the system can correctly fulfill tasks as intended by the user. This can include user tasks (e.g., applications) and system tasks (e.g., a system integrity monitor). Enabling techniques for trustworthy execution can be implemented in software, hardware or both. For instance, there exist proposals for both software-based [17] and hardware-based [26] solutions for Control-Flow Integrity (CFI); and likewise, isolation for code can be achieved with either software (e.g., containerization and sandboxing) or hardware components [109].

Nonetheless, hardware is commonly believed to have advantages over software in the following respects:

  1. Relative immutability. Tampering with hardware is non-trivial and requires physical access. This makes hardware support a necessity in mitigation techniques under a strong adversarial model, e.g., rootkit-level threats [76] and hypervisor threats [66].

  2. Efficiency. Unlike software, which introduces a layer of architectural abstraction, a direct hardware implementation can save redundant processing cycles (e.g., instruction decode). This allows for better efficiency for certain iterative/repetitive operations (e.g., consider how DSPs outperform regular CPUs for signal processing). It can further contribute to lowering power consumption, which is critical for resource-constrained devices.

Furthermore, hardware is the Root of Trust (RoT) [48], as it bridges the physical world (where human users reside) and the digital world (where tasks run as software). To securely perform a task or store a secret, the user trusts at least part of the computer hardware.

Dedicated hardware security support has proliferated since the early days of computers. It can take the straightforward form of discrete components that assist the CPU, ranging from the industrial-grade tamper-responding IBM Cryptocards (e.g., 4758 [37]) and Apple’s proprietary secure enclave processor (SEP [84]) for consumer electronics to the more open Trusted Platform Module (TPM), smart cards and other security tokens. Meanwhile, architectural enhancements are being introduced to the CPU, as well as its chipset, such as Trusted Execution Environments (TEEs, see Section III), the NX-bit, Intel MPX, MPK, etc.

Driven by demand over the past few decades, hardware is being “patched” cumulatively, as reflected in the growth of x86 ISA extensions (according to Baumann [10], hardware is becoming the new software). The added complexity and interaction between hardware features have led to numerous problems, which has motivated us to reflect on the role of hardware in security and on what key features make a good hardware security mechanism.

Most hardware security features achieve their purpose by hiding information or access from an entity, and/or exposing a new/reduced interface to it. For instance, memory protection between processes is implemented by virtual memory, which prevents processes from accessing each other’s memory by presenting each process with its own virtual address space. The design of virtual memory can be considered an instance of “abstraction” [22, 72], which 1) removes or attenuates information (by hiding the true implementation of memory) and 2) creates new semantics (such as page tables), which are also called “abstractions”. Hence there is a relationship between security and abstraction, which is particularly important considering the heavy use of abstractions in computer hardware to separate its implementation from that of the software it runs.

We propose to use abstraction (Section II-A) as a lens through which the workings of hardware security support and various related issues are examined. Specifically, we find that: 1) beyond the commonly perceived immutability and efficiency, the properties of hardware security support can be explained as mechanisms that hide/expose something; 2) certain low-level attacks can be attributed to insufficient (Section IV) or excess (Section V) abstraction; and 3) many proposals seek to leverage abstractions to increase security (Section VI) or propose new abstractions to increase the flow of information between hardware and software for better security (Section VII).

While Trusted Computing [91] traditionally emphasizes only the protection of initial integrity and isolation during execution, we expand the scope to be more holistic and to ensure the eventual correct execution outcome. This includes run-time dynamic aspects such as control/data flow integrity of the running program and secure input/output. We also include firmware in the processor, system chipset and peripheral devices, which, while it does not execute trusted application software, is nonetheless part of the trusted computing base of much of that software. We examine how hardware-enforced abstraction affects execution within this entire scope.

It is not our intention to focus on implementations that fail to properly implement a specification. However, a hardware-related fault may not be clearly attributable to either a design flaw or an implementation error, so we do not try to draw a boundary between the two (which might be difficult to do in any case).

Contributions.

  1. We develop a general model for hardware security support based on the notion of abstraction, and examine various Trusted Execution Environments (TEEs) along several key properties of the abstractions the TEEs provide.

  2. We associate the extent of abstraction (excessive/insufficient) with the root cause of a series of vulnerabilities, represented by side channels on one hand and insufficient information flow on the other. Namely, when the abstraction specification does not hide all that is supposed to be hidden, leaked information leads to attacks. On the other hand, when such leakage is inevitable, too much hiding across components also hinders effective reference monitoring, which in turn leads to compromises.

  3. We systematize the application of state-of-the-art hardware security features and the proposal of new hardware enhancements in the literature, based on their properties from the perspective of abstraction.

II Model of Trustworthy Execution

In this section, we explain our model of trustworthy execution, using abstraction as a way to analyze and understand the various hardware support towards trustworthy execution that has been implemented over the years. We begin by defining abstraction in this context and show how it maps to the operation of hardware. We then discuss its role in improving hardware support for trustworthy execution.

II-A Abstraction

In computer science, an abstraction can be thought of as a model of a concrete artifact that retains the artifact’s essential properties while eliding details of its implementation [72]. Abstraction is used heavily across the interface of components to reduce the complexity of interactions across those components. Thus, it is quite natural that abstraction is heavily used at the interface between hardware and software, one of the most complex interfaces in modern computing. Some key properties of an abstraction, which is the interface created when one abstracts a concrete implementation, are 1) the attenuation of information and 2) the creation of new semantics. For example, consider the implementation of modern processor caches:

  1. Attenuation of information: Explicit access to and control of cache memories is hidden, only revealing faster code execution.

  2. New semantics: Caches expose the abstraction of a cache line, the size of the cache memory, a mapping function between cache lines and memory and certain instructions (e.g., WBINVD/PREFETCH/CLFLUSH), which influence cache behavior.

From this example, we can see that abstraction both hides and exposes information. Depending on the components involved, what can be hidden is the existence of resources or metadata. Other examples of hardware-software abstractions include the processor ISA, which exposes the semantics of an instruction for software to utilize hardware resources, but hides implementation details such as pipelining, out-of-order-execution and speculation.
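The "mapping function between cache lines and memory" mentioned in the cache example can be made concrete with a small sketch. The geometry below (64-byte lines, 64 sets) and the plain modulo indexing are illustrative assumptions, not the behavior of any particular processor; real designs may use hashed or sliced indexing. The function names are hypothetical.

```python
# Illustrative geometry; these values are assumptions, not those of a specific CPU.
LINE_SIZE = 64   # bytes per cache line
NUM_SETS = 64    # number of cache sets


def line_number(addr: int) -> int:
    """Return the index of the memory line (block) that contains addr."""
    return addr // LINE_SIZE


def set_index(addr: int) -> int:
    """Return the cache set addr maps to, assuming simple modulo indexing."""
    return line_number(addr) % NUM_SETS


def share_line(a: int, b: int) -> bool:
    """True if both addresses fall into the same cache line."""
    return line_number(a) == line_number(b)


def conflict(a: int, b: int) -> bool:
    """True if the addresses are on different lines that compete for one set."""
    return not share_line(a, b) and set_index(a) == set_index(b)
```

Under these assumptions, addresses 0 and 4096 land in the same set while belonging to different lines. This is the kind of exposed semantics that software can observe even though direct control of the cache is hidden.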

We distinguish abstraction from isolation because, while both reduce the flow of information, total isolation as classically defined by Lampson [74] means that there is no information flow at all, whereas abstraction merely attenuates it, still allowing information to flow over the new semantics it defines. Similarly, virtualization is a particular instance of abstraction, which seeks to maintain one abstraction with a vastly different implementation (e.g., backing RAM with disk).

The security concern with abstraction is that as hardware has become a critical piece of security for software, hardware abstractions have been thrust into a role for which they were not explicitly designed, which is to provide access control between mutually distrustful, actively malicious entities. Failures of these abstractions in this role can have disastrous consequences. For example, the Meltdown/Spectre vulnerabilities can be attributed to a failure in the ISA abstraction to properly restrict memory accesses by one process to another (or to the kernel), which we discuss in Section IV. In another example, the hiding of information via the SMM abstraction prevents key information about the SMM region of memory from flowing to reference monitors in the chipset and CPU caches, allowing unauthorized access to SMM state, which we discuss in Section V. From these examples, we can see that the correct design of hardware abstractions is crucial to providing trustworthy execution environments, and it is through abstraction that we will analyze and model TEEs in this paper.

II-B Execution Environments

While the ISA abstraction describes the abstraction over which software can use hardware resources, the more general interface that describes all interactions between software and hardware is the Execution Environment (EE), which includes not just the ISA, but other abstractions, such as concurrency and level of confinement.

We call an EE that implements trustworthy execution a TEE. What is trustworthy execution? Trustworthy execution is an execution system whose goal is the correct execution of a task (as opposed to endpoint security, which concerns the whole computing device). As such, the goal of trustworthy execution can be subdivided into two sub-goals:

  • Initial integrity: If the task is started wrong, no correct execution can be expected. Related to this, one TEE may invoke another TEE and propagate trust to it using a chain of trust.

  • Run-time security: Once started, the task is subject to both attacks from the outside and misbehavior of internal code. Such attacks may seek to directly subvert the task or indirectly exploit a defect in the task to subvert it. If such attacks are successful, then the task will not execute correctly.

Several properties of TEEs are relevant to their ability to provide trustworthy execution. As such, these properties can form a basis for evaluating and comparing TEEs to see if they are “fit for purpose”. The following are major security properties we consider for TEEs:

  1. Initial integrity assurance. This describes how the TEE provides initial integrity, and in general, comes in one of two forms: static image integrity and launch integrity. The former checks only the image at installation (or update) time, applying mostly to firmware update mechanisms, while the latter checks both the image and initial inputs right before execution.

  2. Addressing/memory protection. This describes both what memory the TEE can access, and whether its memory can be accessed by other EEs. This determines the advantage/disadvantage of a TEE, i.e., if defense code is hosted inside, to what extent it can be protected from being manipulated (integrity) or even seen (confidentiality) from outside, and at the same time, what ability it has to monitor or control the execution of other EEs.

  3. Scope. This determines whether the TEE is per-logical processor (LP), per-core, per-CPU or per-system. For example, it is possible to see whether a task is in an SGX enclave or not at the granularity of an LP, as the CREG CR_ENCLAVE_MODE which indicates this exists for each LP [23]. In contrast, TrustZone is per-system as a single NS bit, which is propagated from the main bus to the APB (peripheral) bridge [97], exists for the entire system.

  4. Developer access. This describes whether the TEE is designed to run third party, possibly untrusted or malicious code.
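The difference between the two forms of initial integrity in property 1 can be illustrated with a minimal sketch. The class names, the use of SHA-256, and the way the image and inputs are combined into one measurement are all assumptions made for illustration; they do not describe any specific TEE.

```python
import hashlib


def measure(image: bytes, inputs: bytes = b"") -> str:
    """Hash the image and its initial inputs into one measurement (assumed scheme)."""
    h = hashlib.sha256()
    h.update(hashlib.sha256(image).digest())
    h.update(hashlib.sha256(inputs).digest())
    return h.hexdigest()


class StaticIntegrityEE:
    """Hypothetical EE that checks the image only at installation/update time."""

    def __init__(self, trusted_measurement: str):
        self.trusted = trusted_measurement
        self.stored_image = None

    def update(self, image: bytes) -> None:
        if measure(image) != self.trusted:
            raise ValueError("image rejected at update time")
        self.stored_image = image

    def run(self) -> bytes:
        # No re-check here: changes made to storage after update go unnoticed.
        return self.stored_image


class LaunchIntegrityEE:
    """Hypothetical EE that checks image and initial inputs right before execution."""

    def __init__(self, trusted_measurement: str):
        self.trusted = trusted_measurement

    def launch(self, image: bytes, inputs: bytes) -> bytes:
        if measure(image, inputs) != self.trusted:
            raise ValueError("measurement mismatch at launch")
        return image
```

In this sketch, tampering with `stored_image` after a successful `update` is not detected by `StaticIntegrityEE.run`, whereas `LaunchIntegrityEE.launch` rejects any image or input that differs from what was expected.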

Note that we explicitly exclude the ISA-defined notion of privilege level (i.e., the current privilege level (CPL) or “Ring level” in x86) from these properties. This is because, with regard to security, the traditional hierarchy of ISA privilege has become less relevant: it is not absolute (i.e., Ring 0 provides different protection levels depending on whether it is between a kernel and a user process, or a kernel and a VMM), and it is no longer a strict hierarchy (SGX enclaves run at Ring 3 but are still resistant to Ring-0 code, and ARM privileges are orthogonal to the secure/non-secure status).

There are a number of ISA extensions that serve to hide or restrict information flow, but do not form a complete TEE as they do not provide all the abstractions necessary to execute code. For example, Intel SMEP/SMAP [59] (or ARM PXN [2]) can only be used to restrict the addressing capabilities of privileged code, while Intel MKTME [61] (cf. TME) lacks proper addressing isolation from outside code they do not protect. Thus, these can be viewed as extensions to existing EEs.

TEE | Type | Scope | Init integrity | Accessible by | Dev
❶ Microcode | F | CPU | Launch | | N
❷ ME / PSP (-3) | F | System | Static | | N
❸ S3 Boot Script¹ | F | System | Static | ❶❷❹❺❻❼❽ | N
❹ SMM (-2) | F | CPU | Static | | N
❺ TXT / SVM | A | System | Launch | ❶❸❹ | Y
❻ VMM (-1) | C | System | X | ❶❷❸❹❺ | Y
❼ SEV-ES (0) | A | N/A | Launch | ❶❷ | Y
❽ OS (0) | C | N/A | X | ❶❷❸❹❺❻ | Y
❾ Application (3) | C | LP | X | ❶❷❸❹❺❻❼❽ | Y
❿ SGX (3) | A | LP | Launch | | Y

¹ We use the S3 boot script as an example to demonstrate the typical protection level for BIOS/UEFI firmware.

TABLE I: Properties of example execution environments on x86. C=Chained TEEs, F=Firmware TEEs, and A=Attested TEEs.

III Trustworthy Execution with TEEs

In this section, we examine several well known TEEs and formally categorize them into Attested Trusted Execution Environments, firmware TEEs and Chained TEEs. The TEEs we examine and their properties are summarized in Table I. The “Type” column indicates which of the 3 categories each TEE belongs to, and the other 4 columns correspond to the 4 properties of TEEs discussed in Section II-B. The “Accessible By” column gives addressing/memory protection as a relationship between the TEEs by indicating which TEE can access which ones according to their intended design (i.e. not taking into account vulnerabilities). The access here refers to read/write access to the execution memory of the TEE in question, and for this reason, we include only the x86 TEEs for comparability (e.g., excluding ARM TrustZone).
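The “Accessible By” relationship can be read as a lookup from a target TEE to the set of TEEs that may access its execution memory by design. The sketch below encodes only the rows whose entries are listed in Table I as reproduced here; the dictionary layout and the function name `can_access` are hypothetical choices for illustration.

```python
# Target TEE -> TEEs that can access its execution memory by design.
# Only rows with entries listed in Table I as reproduced here are included.
ACCESSIBLE_BY = {
    "S3 Boot Script": {"Microcode", "ME/PSP", "SMM", "TXT/SVM", "VMM", "SEV-ES", "OS"},
    "TXT/SVM": {"Microcode", "S3 Boot Script", "SMM"},
    "VMM": {"Microcode", "ME/PSP", "S3 Boot Script", "SMM", "TXT/SVM"},
    "SEV-ES": {"Microcode", "ME/PSP"},
    "OS": {"Microcode", "ME/PSP", "S3 Boot Script", "SMM", "TXT/SVM", "VMM"},
    "Application": {"Microcode", "ME/PSP", "S3 Boot Script", "SMM", "TXT/SVM",
                    "VMM", "SEV-ES", "OS"},
}


def can_access(accessor: str, target: str) -> bool:
    """Return True if accessor may read/write target's memory per the table."""
    if target not in ACCESSIBLE_BY:
        raise KeyError(f"no access entry recorded for {target!r}")
    return accessor in ACCESSIBLE_BY[target]
```

For instance, `can_access("SMM", "TXT/SVM")` holds in this encoding, while `can_access("OS", "VMM")` does not. The relation reflects intended design only and ignores vulnerabilities.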

III-A Attested Trusted Execution Environments

We define Attested Trusted Execution Environments to include environments that are traditionally thought of as TEEs, such as SGX, TXT, or TrustZone. Today’s processor architectures usually support one or multiple attested TEEs (even on MCU-like platforms, e.g., TrustZone profile M [98, 123]). The defining features that separate attested TEEs from other TEEs are that attested TEEs provide launch initial integrity and attestation capabilities.

Initial Integrity. Attested TEEs implement launch integrity through explicit measurements upon code loading and data sealing. Moreover, the code loading process also measures the integrity of the environment of the attested TEE, including any other software loaded up to that point. TXT/SVM achieves this by collaborating with the TPM chip (firmware-based, integrated or discrete) as secure storage. The program (including the SINIT module) being loaded is first measured by the CPU, and the measurement is stored in TPM’s volatile memory (Platform Configuration Registers) in the form of hash values. If the hash values do not match with the preset values in TPM’s non-volatile memory (policies in NVRAM indices), execution will be aborted. SGX has this functionality implemented as part of the microarchitecture extension (i.e., inside the CPU), without relying on TPM (in the MRENCLAVE field of the protected memory instead, but non-volatile secrets reside in the SPI chip on the motherboard). Data sealing ensures that a piece of data can only be retrieved on a specific platform when a specific program is running. This is implemented by encrypting and decrypting the sealed data with a key derived from secrets in the hardware and program measurements.
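The measure-then-compare flow and data sealing described above can be sketched as follows. This is a toy model under stated assumptions: the `extend` function follows the conventional TPM-style accumulation (new value = hash of old value concatenated with the new digest), SHA-256 is assumed throughout, and the XOR keystream cipher stands in for a real authenticated cipher only because the Python standard library has none. It should not be used to protect real data, and all names are hypothetical.

```python
import hashlib
import hmac


def extend(pcr: bytes, component: bytes) -> bytes:
    """Fold a component's hash into the running measurement (TPM-style extend)."""
    return hashlib.sha256(pcr + hashlib.sha256(component).digest()).digest()


def measure_chain(components) -> bytes:
    """Measure components in load order, starting from an all-zero register."""
    pcr = bytes(32)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


def launch_permitted(pcr: bytes, policy: bytes) -> bool:
    """Compare the measurement against the preset policy value."""
    return hmac.compare_digest(pcr, policy)


def sealing_key(hw_secret: bytes, measurement: bytes) -> bytes:
    """Derive a key bound to both the platform secret and the program measurement."""
    return hmac.new(hw_secret, measurement, hashlib.sha256).digest()


def _keystream(key: bytes, n: int) -> bytes:
    # Toy keystream: SHA-256 in counter mode (illustration only).
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def seal(data: bytes, key: bytes):
    """Encrypt data and attach an integrity tag (toy construction)."""
    ct = bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))
    return ct, hmac.new(key, ct, hashlib.sha256).digest()


def unseal(ct: bytes, tag: bytes, key: bytes) -> bytes:
    """Recover data only if the key (platform + program) matches."""
    if not hmac.compare_digest(tag, hmac.new(key, ct, hashlib.sha256).digest()):
        raise ValueError("unsealing refused: wrong platform or program")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, len(ct))))
```

Because the sealing key depends on the measurement, a modified program (or the same program on another platform with a different hardware secret) derives a different key and `unseal` fails. The accumulation is also order-sensitive, so loading the same components in a different order yields a different measurement.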

Memory Protection. Attested TEEs also have stronger memory protection. The way memory protection is implemented depends on whether the attested TEE is exclusive or concurrent. A concurrent TEE coexists with the unprotected portion of the system (e.g., in a time-sliced fashion), while an exclusive TEE preempts other code and then destroys the execution context before allowing other code to run.

Concurrent attested TEEs, such as SGX, have special processor mechanisms to provide strong isolation from other EEs, such as both privileged and unprivileged EEs on the same system. SGX has its enclave memory allocated in the Enclave Page Cache (EPC) which is part of the Processor Reserved Memory range (PRM). The PRM’s protection is enforced by the CPU against access from any other software (including SMM) and DMA. A unique feature of SGX is that the enclave memory is fully encrypted when exposed outside the CPU with the MEE (Memory Encryption Engine, part of the CPU uncore), immune to various (physical) memory attacks, e.g., the cold-boot attack [49]. Thus even in the case of exposure from the EPC (e.g., when EPC pages are evicted into regular DRAM), memory content is only seen as ciphertext.

In contrast, exclusive TEEs do not need to defend against concurrent software threats, and as a result, for these Attested TEEs, the main focus of memory protection is to defend against DMA access from peripherals and physical memory attacks. An Attested TEE such as TXT relies on Intel’s IOMMU technology VT-d to protect its MLE from being accessed via DMA, by including the memory ranges in the DMA Protected Range (DPR) and Protected Memory Regions (PMRs) [58]. SVM also has AMD’s Device Exclusion Vector (DEV) support [3] for the same purpose. Both TXT and SVM are vulnerable to the cold-boot attack and accessible by SMM, unlike SGX. This can be an example of low-privilege TEEs having stronger protection than perceived from their assigned privilege.

Scope. Attested TEEs vary in scope. Many of the earlier attested TEEs, such as TXT, SVM and TrustZone encompassed the entire system, while SGX is per logical processor. This reflects SGX’s goal to be lighter-weight while the other attested TEEs were seen to be an entire virtual machine, complete with its own OS. However, system-wide TEEs, especially TrustZone, take advantage of their system-wide property by being able to have trusted and possibly exclusive access to hardware peripherals. Most system-wide TEEs are exclusive, in that they cannot execute concurrently with another OS. However, TrustZone [103, 97] (the traditional profile A) is an exception, thanks to its I/O partitioning capability. With the TrustZone Address Space Controller (TZASC) [97], memory-mapped devices can be dynamically partitioned [79], e.g., a part of the screen dedicated to the secure world, allowing concurrent access by both the TrustZone OS and an untrusted-OS running outside TrustZone. We note that in the context of Attested TEEs, system-wide TEEs are thought of as privileged TEEs (pTEE), because they can run privileged code, while local-process TEEs, such as SGX, are thought of as unprivileged TEEs (uTEEs).

Developer Access. Universally, Attested TEEs are meant to be open environments that allow arbitrary developer code to be executed in them. As such, Attested TEEs also provide attestation, where the Attested TEE can sign the code measurements taken at launch with a trusted key to assert to another party the identity of the code executing within the Attested TEE. For system-wide Attested TEEs, this comes in the form of remote attestation, as it attests the identity of the entire system to the remote party. LP-scoped Attested TEEs like SGX, are capable of both local attestation to other code on the same system, as well as remote attestation to code on other systems.
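A minimal sketch of the attestation exchange follows. Two simplifications are assumptions of this sketch and not part of the text above: a keyed MAC (HMAC) stands in for the signature made with the trusted key, since the Python standard library offers no asymmetric signatures, and a verifier-chosen nonce is added for freshness. In a real deployment the verifier would hold only a public key. The function names are hypothetical.

```python
import hashlib
import hmac


def make_quote(attestation_key: bytes, measurement: bytes, nonce: bytes) -> bytes:
    """TEE side: bind the launch measurement to the verifier's nonce.

    HMAC is a stand-in for a real signature (assumption of this sketch).
    """
    return hmac.new(attestation_key, nonce + measurement, hashlib.sha256).digest()


def verify_quote(attestation_key: bytes, quote: bytes,
                 expected_measurement: bytes, nonce: bytes) -> bool:
    """Verifier side: accept only if the quote matches the expected code identity."""
    expected = hmac.new(attestation_key, nonce + expected_measurement,
                        hashlib.sha256).digest()
    return hmac.compare_digest(quote, expected)
```

The verifier accepts only when the reported measurement equals the identity of the code it expects, which is what lets another party decide whether to trust the code running inside the Attested TEE.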

III-B Firmware Trusted Execution Environments

Firmware is software that implements functionality that is logically part of the hardware. As a result, while it is software, it is implicitly (axiomatically) trusted just like real hardware. Unlike Attested TEEs which can be considered alternative EEs that implement trustworthy execution for software that requires it, firmware is part of the trusted computing base (TCB) of all software, as it is responsible for critical system operations.

Firmware exists in various components of a computer system, which has an effect on where the firmware executes. System firmware (sFW) executes on the main CPU. This generally refers to BIOS/UEFI firmware, which performs the early (but complex) initialization before bootloaders [18] and the OS. However, sFW is not confined to only boot time—SMI handlers (in System Management Mode) and UEFI Boot Scripts (upon S3 wakeup), continue to run even after system boot is complete. sFW also includes CPU firmware, which is used to implement various instructions that may be called by software. Chipset firmware (cFW) executes on a separate dedicated processor, often microcontrollers in a system’s I/O subsystem, or in co-processors on the system board or system-on-chip (SoC). Examples of cFW include the firmware for the Intel Management Engine (ME), AMD Secure Processor (previously PSP) and the baseband processor on mobile phones. Finally, Device firmware (dFW) executes on a dedicated processor as part of a peripheral. The distinction from cFW here is that these devices are even more isolated from the host CPU, often accessed over a well-defined bus interface, which narrows the possible interactions between the firmware and untrusted code on the host CPU, e.g., SATA, PCI Express and USB. An example of dFW could be the firmware on a hard drive or network card. Another distinction is that the sFW and cFW are usually stored in non-volatile RAM that is shared with other sFW or cFW, or may even be loaded off disk and inserted by the BIOS or OS. On the other hand, dFW is almost universally stored on the peripheral device.

There are a large number of firmware TEEs with diverse properties, and we do not discuss each for the sake of space, but highlight some notable firmware TEEs for each property.

Initial Integrity. One property common to all firmware TEEs is that they provide static integrity as opposed to launch integrity like attested TEEs. This means that the integrity of their code is only checked when it is updated and not when they are executed. There are a few exceptions to this rule. Both S-CRTM (Static Code Root of Trust for Measurement) [47] and UEFI Secure Boot [121] will measure certain firmware such as Option ROMs (which are device-specific firmware executed during boot to initialize a peripheral) during boot. S-CRTM stores these measurements in the PCRs on a TPM, making them available for remote attestation. UEFI secure boot, on the other hand, verifies the ROMs against some specified policy. Finally, CPU microcode is loaded from the BIOS and verified on each boot.

Memory Protection. Each firmware TEE has its own custom memory protection mechanism. The mechanisms may also permit access by some TEEs while denying access by others. Table I lists which TEEs are able to access other TEEs. For instance, CPU microcode is able to access all other TEEs as it helps enforce the ISA abstraction, while its own internal state (e.g., the SRAM) is invisible (abstracted away) to the rest of the system. Intel ME as chipset firmware is not accessible by other TEEs (with shared memory encrypted). For management purposes, it has access to most parts of the system with the exception of SMM and TXT. SMRAM is protected by the ISA (see Section IV for attacks) and only exposed to microcode. The S3 boot script is a special case among firmware TEEs but is representative of certain other UEFI modules: it is granted access to the whole system memory and I/O (to resume the pre-sleep machine state), but the memory where it resides is open to any privileged code (if UEFI LockBox is not implemented).

Scope. In view of the special nature of firmware TEEs, we discuss their scope as follows. CPU Microcode runs underneath (and helps create) the ISA abstraction. Its scope can be considered the entire CPU. Intel ME (or similar chipset technologies) has one instance and coordinates the whole computer system, regardless of other code on the main processor(s). Therefore we consider ME’s scope to be System. Certain UEFI/BIOS modules handle specific system stages (e.g., self-check, power state switching, and device initialization), during the inactivity of other tasks on the main processor(s). Hence the S3 boot script’s scope is assigned System.

Proper scope can ensure that a firmware TEE, if used to host defense code, has sufficient coverage for enforcement. For example, mechanisms based on Intel ME are able to oversee code of various privileges on different processors in the system. On the other hand, mechanisms using SMM in a multi-processor environment must consider interaction with other processors, as SMI processing is per-processor [27].

Developer Access. Another distinguishing property between firmware TEEs and Attested TEEs is that Attested TEEs were inherently designed to host developer code, while firmware TEEs are designed for the exclusive use of code belonging to the device or platform manufacturer, and as such, are not open execution environments. As a result, firmware TEEs lack features like remote attestation, and even the specific instances where their measurements are included in attestations were bolted on as afterthoughts to increase the trustworthiness of those attestations by encompassing more of the TCB of the host. There have been some academic proposals to inject code via non-officially supported methods into a firmware TEE to improve host security overall (in particular SMM). We discuss these in Section VI.

III-C Chained Trusted Execution Environments

Some EEs do not inherently provide trustworthy execution, mainly because they do not provide initial integrity and are not designed to provide run-time security. However, such EEs can be turned into TEEs by chaining trust and adding secure functionality. For example, kernel mode execution is not innately a TEE, but if we boot a verified secure OS (such as seL4 [67]), and chain trust to it by attesting it using a lower-level TEE such as TXT or a TPM, then the kernel mode EE becomes a TEE. We call TEEs created in this manner “chained TEEs”.

Initial Integrity. The initial TEE to execute is called the Root of Trust (RoT) as it is trusted by fiat. It can then chain that trust to other EEs to make them into TEEs using the following sequence of rough steps:

  1. Establish a mechanism to check and protect the next EE.

  2. Ensure that the EE’s coverage is sufficient for the intended task.

  3. Transfer execution to the checked/protected EE and optionally repeat by going to step 1.

For example, an initial TEE (e.g., UEFI, as the Root-of-Trust TEE) boots the OS as the next TEE with Secure Boot or Boot Guard, and then the OS can check and run an appropriate anti-virus tool. Most Attested TEEs can act as the Root-of-Trust TEEs. In “late launch” technologies (Intel TXT and AMD SVM), the privileged TEE backed by hardware directly bootstraps an OS/VMM without relying on UEFI/BIOS; furthermore, an unprivileged TEE (uTEE, e.g., Intel SGX) can securely bootstrap the ultimate user task skipping also the OS/VMM. Note that in both cases above, legacy firmware still remains part of the TCB to a certain extent, but is not the Root-of-Trust TEE for the chain of trust establishment.
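The three chaining steps can be sketched as a loop over boot stages. The `Stage` record, its field names, and the use of a SHA-256 digest as the checking mechanism are assumptions made for illustration; real platforms use signatures, measurements, or hardware-enforced policies for step 1.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stage:
    """Hypothetical description of one EE in the chain."""
    name: str
    image: bytes
    next_digest: Optional[str] = None  # expected SHA-256 of the next stage's image
    covers_task: bool = True           # whether its coverage suffices for the task


def boot_chain(stages: List[Stage]) -> List[str]:
    """Walk the chain; return the names of the stages that were launched."""
    current = stages[0]                # the Root of Trust is trusted by fiat
    launched = [current.name]
    for nxt in stages[1:]:
        # Step 1: check the next EE before handing over control.
        if current.next_digest != hashlib.sha256(nxt.image).hexdigest():
            raise RuntimeError(f"{nxt.name}: integrity check failed")
        # Step 2: ensure the EE's coverage is sufficient for the intended task.
        if not nxt.covers_task:
            raise RuntimeError(f"{nxt.name}: insufficient coverage")
        # Step 3: transfer execution, then repeat from step 1.
        current = nxt
        launched.append(current.name)
    return launched
```

In the UEFI to OS to anti-virus example, each stage would carry the expected digest of its successor. A modified OS image breaks the chain at the second link, so nothing after it is launched as trusted.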

Memory Protection, Scope and Developer Access. Such chained TEEs are generally the VMM, OS or application code (if trust is chained from the privilege layers below). As such, their memory protection and scope are those of the VMM, OS or application code, depending on the program being executed and its configuration. We do not consider the scope of the OS (and correspondingly SEV-ES), as one OS may span processors/LPs but there can be multiple instances in the case of virtual machine guests (under a VMM). In general, these systems derive memory protection through the processor MMU and are system-wide for the VMM, and limited to a logical processor for an application. All are open EEs and support the execution of developer-specified code.

← more integration | more offloading/discreteness →

EE | Category | ARM | x86 | System Z
dFW | Stand-alone entities | | | HMC, SE, CSS, CU, Director/Switch
 | Characterizable processors | | | zIIP, IFL, ICF
cFW | Apps on co-processors | | AMT, PAVP, fTPM | i390 code (on SAPs), CFCC (on ICF)
 | Co-processors | Baseband, Apple SEP, SCP/MCP | Intel ME, AMD PSP | SAP, IFP
sFW | | Various | SMM | PR/SM
Privilege system | | EL/PL0–3 | Ring0–3 | 16 keys x 2 states
pTEE | | TrustZone | Intel TXT, AMD SVM, AMD SEV |
uTEE | | Planned² | Intel SGX |

² As of this writing, no official documentation or public information is available for ARM Bowmore, a technology for isolating individual workloads. We note it here for completeness.

  • Refer to Appendix for the glossary and further explanation.

TABLE II: Distribution of EEs across different platforms (examples).

III-D TEEs across architectures

In light of the importance of TEEs in establishing (the chain of) trust for a computing platform, we review the presence and positioning of TEEs across mainstream architectures (refer to Table II). ARM (most mobile platforms), x86 (most PCs and servers) and mainframes (aka, System Z, e.g., in data centers) are included.

Observations. We notice that the positioning of TEEs reflects the purpose of the corresponding platform. ARM/x86 aim to be tightly integrated, migrating towards a few co-processors that are either on the board or even on a “system on chip”, while System Z aims to be highly modular and contains many discrete components. Compared to commodity platforms like ARM and x86, System Z trades off cost for greater security, availability, reliability and performance. To reduce the chance of a single point of failure and to offload processing to task-specific components, mainframes like System Z have the following distinctions in terms of TEEs:

  • Offloading to co-processors. System Z is a coprocessor-rich platform [122]. There are basically two types of co-processors: 1. Unconfigured generic processors. Physical processor units (PUs) are shipped generically and must be configured (characterized [53]) by the customer for a purpose, such as zIIP (for Java and database workloads) and IFL (for Linux workloads). 2. Configured processors. Certain PUs are configured with a default purpose by the vendor (corresponding to co-processors on x86/ARM), e.g., IFP runs firmware that is specific to certain PCIe features. These co-processors contribute to the performance and availability of System Z.

  • Stand-alone components. Components that are usually integrated on PCs take the form of one or multiple stand-alone devices on System Z, e.g., BMC (on-board) vs. the Hardware Management Console and the Support Element (a separate laptop).

  • Full-stack implementation. Furthermore, compared to regular containers (LXC/Docker) which share the underlying abstraction layers up to the OS, IBM Secure Service Container (SSC [54]) has its dedicated hardware, firmware and OS in a physical box.

  • Fine-grained privilege levels. There are 16 storage access keys (combined with the 2 states/privileges), assigned to different workloads individually. This forms an architectural support for fine-grained privileges.
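The text does not spell out how a storage key gates an access. As a rough illustration only, the sketch below models key-controlled protection in its commonly described form, where key 0 acts as a master key and a per-block fetch-protection flag decides whether mismatched keys may still read. The class and function names are hypothetical, and the two states (privileges) mentioned above are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageBlock:
    access_key: int        # 4-bit key assigned to this block of storage (0..15)
    fetch_protected: bool  # when False, reads are allowed regardless of key


def access_allowed(psw_key: int, block: StorageBlock, is_store: bool) -> bool:
    """Simplified key check: key 0 or a matching key grants access."""
    for key in (psw_key, block.access_key):
        if not 0 <= key <= 15:
            raise ValueError("storage keys are 4-bit values (0..15)")
    if psw_key == 0 or psw_key == block.access_key:
        return True
    # Mismatched key: stores are refused; fetches only if not fetch-protected.
    return not is_store and not block.fetch_protected
```

With 16 keys, each workload can be given its own key so that a store by one workload into another workload's block is refused by the hardware check rather than by software.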

To maximize security among co-located entities (a result of integration for portability), ARM/x86 tend to have rich support for Attested TEEs, which are intended for isolation (enhancing abstraction) in a shared computing environment.

IV Abstraction Underdone

As an important form of trust establishment, TEEs are expected to abstract information away sufficiently from designated entities (hence achieving protection). However, even if the hardware implementation complies with the abstraction specification (e.g., an ISA), information can still be leaked in one way or another. We now examine how insufficient abstraction opens the door for attacks.

Side channels. The term side-channel attack originated from cryptography [69]. It has been used to refer to secret extraction from unintended channels such as timing, power, electromagnetic and acoustic channels. We generalize the side channel and define it as an unauthorized communication channel caused by implementation details that are not specified by the abstraction specification, such as algorithm, protocol, architecture and interface standard.

A TEE may have been properly implemented regarding what should be hidden and what should be exposed, but it has been shown in multiple incidents that the intended abstraction is insufficient.

Synchronousness. As side channels imply the presence of another entity (the adversary), whether that entity simultaneously and actively extracts information from the victim code determines the synchronousness.

Exclusive attested TEEs (with the scope of System) are naturally immune to synchronous side-channel attacks, as they do not execute in the presence of any other software. Conversely, almost all concurrent TEEs suffer from side-channel attacks to a certain extent (see Section IV-B). We do not consider physical side channels here (e.g., power analysis attacks [62]).

As for exclusive firmware TEEs, direct run-time side-channel attacks are not applicable. However, attacks caused by improper protection of their memory (when inactive) are common, as exemplified by the attacks discussed in Section IV-C.

IV-A Micro-architectural side channels

The processor architecture (e.g., the ISA) forms an abstraction layer between software and hardware. When the ISA specification is complied with, side channels caused by the underlying implementation are micro-architectural.

Most micro-architectural side channels are timing-based, i.e., extracting information by measuring temporal characteristics of architectural artifacts. This is because while the ISA abstraction specifies instructions and visible architectural resources like registers, it does not provide an abstraction for timing, instead allowing the underlying implementation to determine operational delays in the interest of maximizing performance.

There are two aspects to such side channels:

  • Information extraction. Certain (mis)behaviors effectively convert micro-architectural state into architecturally observable data, which can then be extracted. Two subcategories exist: 1. Memory content. A wide variety of cache-based side channels exploit the sharing of different cache levels to derive secrets from cache accesses (hits or misses), as represented by Prime+Probe [101] and Flush+Reload [130]. In addition to regular caches, the Translation Lookaside Buffer (TLB), as the cache for the Memory Management Unit (MMU), also leaks information (e.g., TLBleed [43]). This points to an abstraction flaw: cache behavior lies beyond what the ISA specifies. 2. Execution metadata. The CPU’s Execution Unit, if not properly abstracted, can leak diverse metadata about ongoing execution. PortSmash [1] times the contention latency of execution engine ports and infers instruction traces from differences in port assignment. Nemesis [117] learns which instruction is currently executing from interrupt handling latency. Note that memory content can be further extracted based on the execution metadata.

  • Channel control. Incomplete abstraction can also enable the attacker to better control what is leaked over the side channel. For example, vulnerabilities like Meltdown [82] and Spectre [68] exploit the incomplete abstraction of speculative execution to select what information is leaked with greater probability over cache-based channels (for the actual information extraction).
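To make the information-extraction aspect concrete, the following is a toy simulation in the spirit of Flush+Reload. It is not an attack on real hardware: the cache is a Python set, and the latencies, threshold, and probe addresses are made-up values chosen only to show how a hit/miss timing difference reveals which line the victim touched.

```python
HIT_CYCLES, MISS_CYCLES = 40, 200  # hypothetical latencies, not measured values
THRESHOLD = 100                    # anything faster is treated as a cache hit


class ToyCache:
    """A cache reduced to the one property that matters here: hit or miss."""

    def __init__(self) -> None:
        self.lines = set()

    def flush(self, addr: int) -> None:
        self.lines.discard(addr)

    def access(self, addr: int) -> int:
        """Return the simulated latency of the access and fill the line."""
        latency = HIT_CYCLES if addr in self.lines else MISS_CYCLES
        self.lines.add(addr)
        return latency


def victim(cache: ToyCache, secret_bit: int, probe_addrs) -> None:
    # The victim's memory access pattern depends on its secret.
    cache.access(probe_addrs[secret_bit])


def flush_reload(cache: ToyCache, secret_bit: int, probe_addrs=(0x1000, 0x2000)) -> int:
    for addr in probe_addrs:            # 1. flush the shared lines
        cache.flush(addr)
    victim(cache, secret_bit, probe_addrs)  # 2. let the victim run
    timings = [cache.access(a) for a in probe_addrs]  # 3. reload and time
    hits = [i for i, t in enumerate(timings) if t < THRESHOLD]
    return hits[0]                      # index of the line the victim touched
```

Nothing in the simulated "ISA" (the `access` return value aside) exposes the secret; the leak exists purely because latency is left to the implementation.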

IV-B (In)security of concurrent Attested TEEs

Most attested TEE side channels are also micro-architectural side channels with certain adaptation specific to the Attested TEE.

SGX suffers heavily from side-channel attacks due to its concurrent but unprivileged nature, ranging from branch shadowing [78] and cache attacks [12] to SgxPectre [19] and the fatal Foreshadow attack [14]. There also exist mature SGX-specific tools that facilitate attacks, e.g., SGX-Step [116], which single-steps enclave code. All in all, regular side channels might be alleviated by programmer diligence and by involving another root anchor (e.g., Intel TSX [108, 20]), and Foreshadow is patchable (although its long-term influence is an open question). TrustZone, as a concurrent attested TEE, is also vulnerable despite its ability to run privileged OS code; side channels have been identified in TruSpy [140] and TruSense [141]. Nevertheless, AutoLock [44] demonstrates that such attacks might be more difficult than expected. SEV, which enhances VM guests with memory encryption, is vulnerable to secret extraction attacks [94, 93], due to a malicious VMM being able to execute concurrently with a victim VM.

IV-C Abstraction for firmware TEEs

Compared to the abstractions used to construct Attested TEEs, firmware TEEs have comparatively simple abstractions as they tend to execute exclusively and tend to have very little interaction with application code in the host CPU. Instead, the abstraction failures tend to lead to lapses in access control that allow adversaries to corrupt firmware TEEs, leading to compromises of the initial integrity of firmware TEEs. We survey several documented instances of such lapses below.

System Management Mode (SMM): System Management Interrupt (SMI) handlers are stored in a RAM region called SMRAM, which is protected by the CPU at all times (as in the abstraction specification). The content of the SMRAM determines the initial integrity of the SMM TEE (after being loaded from the SPI flash).

In early motherboards, SMRAM could be accessed from any kernel-privileged code, because the BIOS did not set the D_LOCK bit in the SMRAM Control Register (SMRAMC). (Footnote 3: Once D_LOCK is set, SMRAMC becomes read-only irreversibly until reboot, locking down D_OPEN, which makes SMRAM visible.) Later, SMI handler compromise could take several forms (in addition to the SPI reflash attacks [107, 128, 64]): 1. through the memory reclaiming mechanism (e.g., intended for saving space wasted by MMIO) [106] (patched on certain machines). Note that remapping is locked in Intel TXT mode [55]. 2. cache poisoning [127, 34]: access to improperly cached SMRAM content (fixed with the SMRR register). 3. SMM callout vulnerabilities [5]: the SMI handler branches outside of SMRAM (fixable with SMM_Code_Chk_En). 4. attacking argument passing to SMI handlers [11]: tricking the SMI handler into overwriting SMRAM. See Section V-A for attack details. All of these concern (previously) unspecified aspects between the SMM TEE and the rest of the system.
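The lock semantics behind the first problem can be illustrated with a small model. This is a sketch under stated assumptions, not a register-accurate description: the class name and methods are hypothetical, and the choice that setting D_LOCK also closes SMRAM (clears D_OPEN) is an assumption of the model.

```python
class Smramc:
    """Toy model of the SMRAM Control Register's D_OPEN / D_LOCK bits."""

    def __init__(self) -> None:
        self.d_open = False
        self.d_lock = False

    def write(self, d_open: bool = False, d_lock: bool = False) -> bool:
        """Attempt a register write; returns False if the register is locked."""
        if self.d_lock:
            return False            # read-only until the next reset
        self.d_open = d_open
        if d_lock:
            self.d_lock = True
            self.d_open = False     # assumption: locking also closes SMRAM
        return True

    def reset(self) -> None:
        """A platform reset is the only way to clear D_LOCK in this model."""
        self.d_open = False
        self.d_lock = False


def smram_accessible(reg: Smramc, cpu_in_smm: bool) -> bool:
    # SMRAM is reachable from SMM, or from anywhere while D_OPEN is set.
    return cpu_in_smm or reg.d_open
```

If the BIOS never performs the locking write, kernel-privileged code can call the equivalent of `write(d_open=True)` and read or modify the SMI handlers; once locked, the same write is ignored.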

UEFI Boot Script (which is run when the system wakes up from S3 sleep) can also be altered maliciously [124], thus allowing arbitrary code execution. The root cause is that the EFI variables or their copies (in this case a pointer to the Boot Script) are not properly protected. An attack on UEFI Secure Boot [65] was based on a similar approach (i.e., modifying an EFI variable storing the boot policy).

Microcode: x86 microcode updates are initiated by writing to model-specific registers (MSRs) and accepted after certain cryptographic verification. (Footnote 4: Microcode patches are not persistent and are reloaded during the early boot process, e.g., from CPUCODE.BIN in the SPI flash.) An early documented attempt appears in an anonymous report [4], which illustrates abstraction underdone: even if only vendor-verified updates are allowed, an attacker in control of the update process can still choose to patch the microcode lines that facilitate the attack (there are multiple slots available).

cFW TEEs. Chipset FW TEEs, on the other hand, can suffer from inadequate abstraction either at run-time (synchronous) or when inactive (asynchronous), due to their concurrency on another processor while sharing the code storage with the main processor.

Intel Management Engine (ME): ME’s intended functionality for full-control out-of-band management requires bulk data transfer capability with the main processor (in addition to HECI for signalling or small amounts of data). Therefore, DMA is constantly active via a mechanism called UMA (Unified Memory Architecture), which is used between the GPU and the CPU. Due to the limited memory space on the ME processor, it uses the UMA region (like stealing part of the host memory) as its execution RAM [110]. This opens up an attack vector, which researchers began exploring in the early days of ME (see [114] by Tereshkin and Wojtczuk). Basically, the approach was similar to that of the SMM compromise: remapping the ME UMA region (otherwise protected) to be accessible by the main CPU. Fortunately (or unfortunately for the defense-purpose community), Intel introduced UMA protection for both integrity and confidentiality (using encryption) [105].

dFW TEEs. Contrary to sFW and cFW, dFW does not share the processor or the code storage, and is exposed to the main processor only through a (limited) peripheral interface. Therefore, dFW’s security rests less on abstractions in the processor and more on those at their interface.

A recent analysis of the (in)security of today’s Self-Encrypting Drives (SEDs) identified several vulnerabilities [88] (similar to regular SSDs). Worryingly, not all attacks require physical access: software-only reflashing attacks using undocumented vendor-specific commands (VSCs) could potentially be performed after a privilege escalation. (Footnote 5: These VSCs are like regular commands sent through the SATA or NVMe interface.) The attacker’s code would remain on the peripheral even if the operating system is wiped and re-installed.

Indeed, dFW has been a battlefront for attacks for decades. Examples are not rare, e.g., hard-drive backdoor [135], network cards [36] and video cards. Zhang et al. proposed IOCheck [138] using SMM to monitor and verify the integrity of the firmware of various devices. Hendricks and van Doorn [51] gave a very high-level description of how device firmware can be verified in a trustworthy manner. However, the implementation of their proposal remains an open problem, largely due to the heterogeneity of the devices.

V Abstraction Overdone

From the discussion in Section IV, we see that abstractions can still allow information flow in unintended ways, resulting in side channels across shared (though logically separated) hardware resources (e.g., processors sharing the same last-level cache, firmware TEEs sharing the same SPI flash chip, etc.). Conversely, we also find that abstractions, in their goal to attenuate information flows, sometimes go too far and hide information that needs to flow. In many cases, this results in a hardware reference monitor enforcing an incomplete policy due to incomplete information. We call this “Abstraction Overdone”, and analyze some examples of this phenomenon in this section.

V-A Insufficient information flow for enforcement

By design, the CPU maintains a number of registers (or state information in other forms) that are internal, and the chipset also has its own internal state, e.g., in the case of the PCH (Platform Controller Hub). Some of the previous attacks exploited such asymmetric information, e.g., different views between the CPU and the chipset.

With Intel, there is a mechanism to reclaim the memory lost to MMIO below 4GB, exposing two registers, REMAPBASE and REMAPLIMIT, in the Memory Controller Hub (MCH, aka the NorthBridge) [55]. At the same time, critical regions such as the SMRAM region should not be remapped and exposed to any software running on the CPU. However, the abstraction of SMM says that this mechanism should not be exposed to any components outside of the CPU (or even software on the CPU), and as a result, the chipset is not aware of this restriction (e.g., remap requests should be checked against SMBASE, a Model-Specific Register). Rutkowska and Wojtczuk [106] found that this omitted information flow allows SMRAM to be remapped using the reclaiming mechanism and made accessible to code on the CPU, which is a violation of SMM’s security guarantees. A strikingly similar vulnerability was found against the seemingly more powerful Intel ME, by remapping part of the Unified Memory Architecture (UMA) using the same reclaiming mechanism (see [114]). (Footnote 6: ME requires UMA for run-time storage, due to the limited memory capacity of its own microcontroller.) Even though Intel later introduced UMA protection so that this region is fully encrypted [38], this does not address the root cause, which is that abstractions prevent the reclaiming mechanism from checking remapping requests against some global list of sensitive memory ranges. AMD is no exception: the Input Output Remap Registers (IORRs) can be used to achieve the same purpose [142].
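The missing check described above can be sketched as a simple range comparison. This is an illustration of the idea of a "global list of sensitive memory ranges", not a description of any chipset's logic; the `Range` type, the `remap_allowed` function, and the way a remap request is represented (a window plus the base of the memory it redirects to) are all assumptions made for the example.

```python
from typing import Iterable, NamedTuple


class Range(NamedTuple):
    base: int
    limit: int  # inclusive upper bound

    def overlaps(self, other: "Range") -> bool:
        return self.base <= other.limit and other.base <= self.limit


def remap_allowed(window: Range, target_base: int, protected: Iterable[Range]) -> bool:
    """Reject a remap if its window or its target touches a protected range."""
    size = window.limit - window.base
    target = Range(target_base, target_base + size)
    return not any(r.overlaps(window) or r.overlaps(target) for r in protected)
```

The check itself is trivial; the point of the section is that the component performing the remap never receives the `protected` list, because the ranges (e.g., the one derived from SMBASE) are hidden behind the CPU-side abstraction.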

Caching is another abstraction where insufficient information flow has resulted in vulnerabilities. Code running on the CPU can determine whether/how memory regions can be cached, through the Memory-Type Range Registers (MTRRs). However, since the MCH does not see more than what is specified by MTRRs, it is unaware of memory access restrictions, similar to the reclaiming mechanism. For example, a cache poisoning attack, proposed by Duflot et al. [34] and again by Wojtczuk and Rutkowska [127], showed that an adversary could modify the “exposed” SMI handler while it resided in the cache, without accessing the SMRAM directly. (Footnote 7: We see some indication [42, 31] that Intel was also likely aware to some extent of these vulnerabilities before their public disclosure.) On the next invocation, the modified SMI handler will run. Duflot et al. also discussed a more efficient scheme to make the attack persistent and not confined by the cache size. Both assumed that the original SMI handler did not flush caches before the RSM instruction. The fix for this problem is the System-Management Range Register (SMRR) [59], which complements the MTRRs by allowing the chipset to specify memory ranges that should not be cached (and can only be modified while the processor is in SMM).

If we examine how SMM attacks and defenses have evolved over the past decade (Figure 1), we see that it has been an arms race. Namely, SMRAM is supposed to be protected, but which defenses were actually put in place has depended on what the community discovered, evolving gradually in an attack-driven manner. While the MTRRs allowed information about the SMM memory range to cross the privileged-code EE abstraction to the OS kernel, the abstracted interface between the chipset and the CPU did not transmit this information until the SMRR was added. In every instance, the root cause can be attributed to an overly strong abstraction preventing information about an access policy from flowing to the appropriate reference monitor.

Fig. 1: The arduous journey of SMM defences.

VI Proposed TEE-based security approaches

Defending against malicious code from outside (as defined in Section II-B) is usually the primary goal of security solutions (while avoiding internal code misbehavior can be the next step). This largely depends on TEEs that abstract away information/access from malicious code, e.g., in the form of memory protection or isolation. In this section, we look at proposals that use TEEs to improve the security of systems.

Usage 1: Exclusive pTEEs for initial integrity with a lightweight mechanism for run-time protection. A number of solutions use privileged TEEs (pTEEs) to securely bootstrap the system into a good state, then hook critical operations together with metadata with a run-time mechanism. The rationale is as follows: 1. The pTEE’s exclusiveness and load-time integrity measurement are used to bootstrap trust. 2. Exclusiveness means that no monitored code can run in parallel, and switching back and forth with the monitored code imposes significant overhead, which is common for Attested TEEs. Note that hosting the whole solution/system inside the pTEE is also infeasible due to the bloated TCB. 3. Therefore, for run-time protection, a common practice is to choose a hardened but lightweight mechanism. For example, SMRAM’s memory protection (inaccessible when SMM is not active) ensures continued integrity, and SMI’s non-maskability enables it to preempt and monitor the rest of the system in a way that cannot be disabled.

Hypervisor integrity: HyperSafe [120] makes use of tboot (which is based on Intel TXT) to bootstrap a solution for hypervisor integrity protection. It achieves a hardware-based memory lockdown for hypervisor pages by first protecting the pages with W⊕X, then trapping any writes to page tables with the WP bit in CR0. The enforcement logic (e.g., unlocking) is implemented in the page fault handler. Here the initial integrity enforced by TXT is critical for both the WP bit and other logic such as the page fault handler.

HyperSentry [7] proposes hypervisor-unaware integrity checks (to prevent a scrubbing attack where a compromised hypervisor hides the compromise from the monitor). The triggering logic is located in SMRAM as SMI handlers, but the core checking logic runs outside. To address the checking logic’s lack of access to VM state information, the authors came up with a fallback mechanism with two consecutive SMIs that ensures landing in the VMX root mode. The Intelligent Platform Management Interface (IPMI) is used as out-of-band signalling to trigger SMIs. HyperSentry assumes proper SMRAM protection from BIOS, in the form of trusted boot (S-CRTM) with TPM alone, instead of DRTM with TXT/SVM.

HyperGuard [106] and HyperCheck [119] are two other examples of using SMM for hypervisor integrity. The difference is that HyperCheck outsources the core logic to the network (a remote machine), with the network card driver also in SMRAM; whereas HyperGuard collaborates with a chipset-based mechanism DeepWatch [15] (a trustlet in Intel ME). Both assume proper firmware protection (e.g., initial SMRAM integrity).

Guest OS integrity: CloudVisor [139] protects the integrity of VM guests under the threat of compromised hypervisors. It also relies on TXT/SVM to ensure clean initialization. Then it provides a tiny security monitor using nested virtualization (equivalent to a lower-level VMM) to enforce isolation and protection of resources used by each VM guest. Sensitive operations such as NPT/EPT faults, critical instructions and I/O are trapped and examined by the tiny security monitor.

Usage 2: Monitor directly from concurrent pTEE. A concurrent pTEE can also be used to host a monitor, which directly monitors the OS/VMM without the help of an intermediary like SMM. Such a combined solution is not possible on x86, since the Attested TEEs on x86 are either unprivileged (uTEEs like SGX), and thus cannot monitor an OS/VMM, or too expensive to switch into (exclusive pTEEs like TXT/SVM). The advantage of ARM TrustZone is that it is privileged and concurrent (with low switch overhead as a result) at the same time. Examples of this usage are TZ-RKP [6] (which led to Samsung KNOX) and SPROBES [41], which protect kernel integrity (in the normal world) by handling critical kernel events in the secure world. In both systems, kernel binary rewriting is needed to replace sensitive instructions with invocations of the Secure Monitor Call (SMC, an instruction for switching to the secure world), which invokes the TrustZone monitor.

Usage 3: Containerization/isolation (TrustZone, SGX and SMM). Thanks to TEEs’ addressing restrictions and memory protection (see Table I), they are good candidates for containerization or isolation enforcement. While low-cost software sandboxes exist [132, 9], a hardware-backed TEE can be “root-secure”, meaning that it can withstand an adversary who has privileged code access (i.e., control of the OS or hypervisor).

Considering that there currently exist no fine-grained (i.e., unprivileged) secure application environments in the normal world on ARM, Sun et al. propose TrustICE [112]. Multiple isolated computing environments (ICEs, similar to Intel SGX enclaves) can be created in the normal world, managed by a trusted domain controller (TDC) in the TrustZone secure world.

In an effort to peel the host OS or hypervisor off the TCB, but accommodate secure workloads in parallel, SICE [8] employs SMM to create and manage enclave-like isolated computing environments (aka. ICEs). It supports two modes: time-sharing (legacy host and ICEs run alternately) and multi-core (dedicated cores for ICEs). The latter takes advantage of SMM’s per-core exclusiveness. Note that SICE only uses SMRAM as a shelter to store the ICEs (up to 4GB); SMM as a mode only prepares and enters the isolated environment, with the ICE code running in regular protected mode. As with the use of TrustZone, in SICE the legacy host (e.g., the hypervisor) is still required to add an interface that invokes an SMI to trigger SICE.

Scotch [75] combines a uTEE (i.e., SGX) and SMM to achieve reliable hypervisor resource accounting, using both SGX and SMM as isolation mechanisms. The strong preemptiveness of SMIs is used to forward all necessary events (interrupts and hypercalls) to SMM, where the secure accounting code runs; the results can then be communicated to and stored in the guest-side SGX space.

We can also consider lifting privileged code into uTEEs for finer granularity, as is done by Richter et al. [104] in porting certain OS components to SGX. In addition to containerization for protection granularity, SGX’s strengths over pTEEs are also important for the OS, e.g., immunity to SMM attacks and encryption against memory attacks. To showcase the benefits and feasibility, they adapted the encryption module dm-crypt and moved it into SGX enclaves. The limited number of eligible components and the large performance loss may hinder adoption.

Apart from the TEEs above, researchers also pay attention to underused x86 privileges (currently only Ring 0 for kernel space and Ring 3 for userspace) to form intermediate protection for sensitive user-space tasks. LOTRx86 [77] is proposed for such a purpose. It defines a new mode (PrivUser) run in Ring2-x32 and uses Ring 1 as a Gate mode in and out of the PrivUser mode. Sensitive per-application operations (e.g., operations related to memory safety) can be harnessed at a privilege higher than the rest of the application but still lower than the OS, hence ensuring mutual security.

Usage 4: Secure user-machine interaction. As mentioned in Section I, trustworthy execution also involves secure data exchange with the human user or peripherals, in addition to defending against external malicious code and internal buggy code. If data input/output is intercepted, the correctness of the execution logic alone no longer suffices for trustworthiness.

Exclusive pTEE without I/O partitioning: If the attested TEE cannot be assured that untrusted code does not have access to peripherals, it has to be exclusive to achieve secure user interaction. Secure user input is usually considered together with what the user sees/perceives (i.e., secure output). UTP [39] hooks pre-configured user data entry events/transactions, such as confirming an online purchase. It redirects the user to an attested TEE session (TXT/SVM reusing the Flicker [86] framework) to see a simplified display of the transaction, takes user input (confirmation) from the keyboard, and sends the two attested pieces of information to the server. The OS is resumed after such a session. In this case, both the physical keyboard and display are occupied by TXT/SVM because of its exclusiveness, and hence considered secure. Very similarly, Bumpy [87] also makes use of TXT/SVM with Flicker to protect user keyboard/mouse inputs. It involves more components for better usability and security: a USB interposer (an ARM board that could later be integrated into the keyboard/mouse) and a Trusted Monitor (a smartphone). A keystroke is encrypted by the USB interposer, processed and verified by the Flicker session, user-confirmed on the Trusted Monitor and sent to the server.

TrustLogin [137] employs SMM to secure password-based login with a novel method of short-circuiting the OS, i.e., intercepting keyboard activities (source) and NIC packets (sink). Considering trusted display alone, Yu et al. propose an attested TEE+GPU approach [134] that introduces a microhypervisor-managed GPU separation kernel to serve the unmodified OS/applications and the secure applications at the same time. They employ XMHF [118] as the microhypervisor with the TrustVisor [85] extension, which uses TXT/SVM for initial integrity.

With I/O partitioning: On ARM platforms, secure user interaction seems to have attracted more attention, possibly due to the rich sensor environment and highly personal data storage. As mentioned in Section III-A, TrustZone has I/O partitioning capability with its Protection Controller (TZPC) or TrustZone Address Space Controller (TZASC). This provides two important advantages: 1. It ensures exclusive resource access. Because of partitioning, even outside TrustZone, the normal world still cannot access the protected resource. 2. The allocation of I/O resources between the secure and normal worlds can be changed dynamically.
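The two advantages can be pictured with a minimal model of a partition controller. This is only a sketch of the policy, not of the TZPC or TZASC programming interface; the class, method names, and peripheral names are hypothetical.

```python
class PartitionController:
    """Toy model: each peripheral is either shared or secure-world only."""

    def __init__(self, peripherals) -> None:
        self._secure_only = {name: False for name in peripherals}

    def assign(self, peripheral: str, secure_only: bool, caller_is_secure: bool) -> None:
        # Advantage 2: the allocation can change at run time,
        # but only on request of the secure world.
        if not caller_is_secure:
            raise PermissionError("only the secure world may repartition I/O")
        self._secure_only[peripheral] = secure_only

    def access_allowed(self, peripheral: str, caller_is_secure: bool) -> bool:
        # Advantage 1: the normal world cannot reach a protected resource.
        return caller_is_secure or not self._secure_only[peripheral]
```

For instance, a secure-world application could claim a (hypothetical) `"touchscreen"` peripheral for the duration of a password prompt and release it afterwards, so that the normal-world OS regains access without a reboot.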

  • User interaction. TruZ-Droid [133] proposes to move the sensitive UI interaction into the TrustZone secure world, while still maintaining the binding between the UI interaction and normal-world app code through indirect references (server-side). The user enters sensitive data and confirms transactions only with the Trusted Applications (TAs) in the secure world, discernible with a hardware Indicator LED. A similar approach is applied by TrustUI [80], which also provides a trusted path between the user and the mobile device using input and display randomization. Both proposals rely on the TZPC.

  • Peripheral/sensor management. SeCloak [79] is designed to securely and verifiably place the device in a user-approved state (on/off). It leverages both TZASC (as secure memory of the s-kernel) and Central Security Unit (CSU, a custom TZPC).

It is obvious that since secure input/output concerns physical I/O operations (privileged in almost all architectures), the involved TEE/EE must also be privileged to perform any checks.

VII Exposing More for Security

TEEs assume a model mainly with external threats, which are addressed by information hiding (abstraction). TEE-based security solutions achieve trustworthy execution by ensuring that no code or data tampering from outside can slip through. However, except for certain simple programs that can be formally verified, software bugs are an almost inevitable byproduct of the pursuit of higher performance and lower software development costs. If the programs are not written in type-safe languages [90], attackers may feed malicious inputs to exploit some of the dangerous bugs to corrupt or subvert the programs [113]. Another complicating factor is that software per se is not monolithic; user-level programs may load third-party libraries, similar to operating systems loading drivers. The code available for inspection may end up being only a small portion of what runs, and loading a buggy component may result in the entire program being vulnerable. Consequently, TEEs may not be able to defend against such internal misbehavior.

What is worse, software misbehavior is hard to address efficiently in either a software-only or a hardware-only manner. Due to the growing complexity of software, instrumented code introduces significant performance overhead, e.g., as in EffectiveSan [32] and WPBOUND [131] (instrumentation is necessary to expose execution metadata). On the other hand, it is almost impossible for hardware alone to distinguish correct execution from misbehaved execution. Software semantics such as buffer bounds, data types, or dynamic allocations are hidden above the software-hardware interface (i.e., the ISA). This situation is very similar to the abstraction overdone discussed in Section V, and can be alleviated by exposing more across the interface (decreasing abstraction).

Exposing more across the ISA interface effectively leads to software-hardware collaboration in addressing intra-EE misbehavior. We note that many of these solutions are not TEE specific, and can be applied to any EE, trustworthy or not. However, they can be applied to TEEs to significantly increase the assurance that the software in the TEE is free of internal misbehavior. Corresponding solutions usually fall into two categories based on the information flow direction:

  1. Hardware-to-software (execution metadata). The hardware feature passively collects information from the execution trace of the monitored code, and sends it to a monitoring component to detect misbehavior.

  2. Software-to-hardware (application semantics). The hardware feature learns application semantics from the monitored code through certain hints (explicitly like new instructions or registers or implicitly as in CFI [52]), and stops the execution when specified rules are violated.

Control/data-flow attacks and other memory safety problems have been systematically studied [113] in the literature. We do not repeat the discussion (e.g., the taxonomy of memory attacks) and consider them all as intra-EE misbehavior.

Software-hardware collaboration usually involves a monitoring component that assists the local hardware for policy/metadata management. Its implementation can range from local software for simplicity, a remote server for flexibility, to dedicated hardware for performance. For instance, LiteHAX [28] is a remote monitoring and attestation scheme against both control-flow and data-only attacks on embedded devices, which uses a remote computer for analysis. In the case of local software monitoring component, we assume proper isolation/protection from the monitored code, as is addressed by TEE-based approaches.

VII-1 Hardware-to-software

Hardware passively collects execution metadata in collaboration with its monitoring component. Note that hardware involvement is required here as the monitored code is not trusted to provide such metadata and other software has no access to it. As the monitored code does not need to cooperate, an advantage of the hardware-to-software approach is backward compatibility, allowing uninstrumented original code to be monitored.

Coarse-grained control flow integrity with existing hardware. The ISA typically provides dedicated instructions to perform indirect control flow changes (e.g., call or ret), and thus hardware can infer control flow information directly, assuming that most programs use these instructions in the intended way. For example, the Last Branch Record (LBR) registers store a limited amount of trace information, which is used by ROPecker [21] and kBouncer [102] against Return Oriented Programming (ROP) attacks. Both rely on patterns of control flow change as the attack signature, and also look for suspicious short code sequences as an artifact of ROP. Similarly, Intel Processor Trace (PT) also provides activity traces but with more detail and control. GRIFFIN [40] uses Intel PT to enforce both forward-edge and backward-edge control-flow integrity.
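The short-code-sequence signature mentioned above can be illustrated with a minimal sketch. It is not the algorithm of ROPecker or kBouncer: the record layout, the thresholds `MAX_GADGET_LEN` and `MIN_CHAIN_LEN`, and the function names are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BranchRecord:
    """One LBR-style entry: an indirect branch taken from `src` to `dst`."""
    src: int
    dst: int


# Hypothetical thresholds, not taken from any cited system.
MAX_GADGET_LEN = 20  # bytes executed between landing and the next branch
MIN_CHAIN_LEN = 4    # consecutive short sequences treated as suspicious


def longest_gadget_chain(trace: List[BranchRecord],
                         max_len: int = MAX_GADGET_LEN) -> int:
    """Return the longest run of consecutive short code sequences in a trace."""
    longest = current = 0
    for prev, nxt in zip(trace, trace[1:]):
        # The code executed between two recorded branches spans
        # from the previous destination to the next source.
        run = nxt.src - prev.dst
        if 0 <= run <= max_len:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def looks_like_rop(trace: List[BranchRecord]) -> bool:
    """Flag a trace whose chain of short sequences reaches the threshold."""
    return longest_gadget_chain(trace) >= MIN_CHAIN_LEN
```

Because the monitor only reads the branch records, the monitored code needs no instrumentation, which is the backward-compatibility property noted earlier. The price is a heuristic signature with no knowledge of the intended control flow.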

Limitations of execution metadata without application semantics. Metadata other than control flow is usually not retrievable with the existing ISA, e.g., data access information. While modified hardware can catch such metadata, it is still insufficient for data-oriented attack mitigation without application semantics. For example, ARMOR [45] (an approach to ensure data accesses within the allocated ranges) cannot confine each access to the intended object due to lack of semantic information.

Monitoring privileged software. Critical data structures in the OS/VMM tend to be more deterministic, so hardware can monitor them without per-application semantics. Kernel integrity monitors are typical solutions of this type, e.g., Copilot [63] (periodic scanning of the entire memory with a PCI card), Vigilare [92] (bus traffic snooping), KI-Mon [76] (monitoring with a co-processor) and MGuard [83] (using both a modified DRAM module and a co-processor).
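As a rough illustration of the periodic-scanning style of monitor, the sketch below hashes fixed memory regions and compares them against a known-good baseline. It models only the comparison step; the region names, the flat `bytes` view of memory, and the use of SHA-256 are assumptions and are not drawn from Copilot or the other cited systems.

```python
import hashlib
from typing import Dict, List, Tuple

Regions = Dict[str, Tuple[int, int]]  # name -> (start, end), end exclusive


def measure(memory: bytes, regions: Regions) -> Dict[str, str]:
    """Hash each monitored region of a memory snapshot."""
    return {
        name: hashlib.sha256(memory[start:end]).hexdigest()
        for name, (start, end) in regions.items()
    }


def modified_regions(baseline: Dict[str, str],
                     current: Dict[str, str]) -> List[str]:
    """Return the names of regions whose hash differs from the baseline."""
    return sorted(name for name in baseline if current.get(name) != baseline[name])
```

A monitor of this kind would take the baseline once from a trusted state and call `measure` again on each scan. Anything that changes legitimately at run time has to be excluded from the monitored regions.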

VII-2 Software-to-hardware

While hardware-to-software abstraction reduction preserves backward compatibility, limited application semantics becomes the major hurdle when better coverage or precision is needed. Therefore, exposing more semantic information from software to hardware becomes helpful. Rather than coming directly from the monitored code, this information is usually supplied by a dedicated software component (e.g., a compiler) that instruments the monitored code to expose its semantics (although programmer annotation is also seen). The ISA is augmented with new instructions and/or registers.

Precise control flow integrity with application semantics. With valid code pointers marked by the monitored code, misbehaving execution that jumps to an unexpected address can be more easily identified. Intel CET [60] is a complete CFI defense for all privilege levels. For forward-edge protection, CET introduces a new instruction ENDBRANCH to mark valid indirect control transfer destinations, and the processor uses state machines to ensure proper indirect branch landing. A traditional shadow stack is used for backward-edge protection. HAFIX [25] is based on the observation that a function cannot return if it is not called yet, targeting coarse-grained back-edge control flow protection. Each function is assigned a unique label (marking start and end) and tracked with dedicated label state memory. New instructions are introduced to update the label state.
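The two checks described for CET can be modeled in a few lines. This is a toy software model rather than Intel's specification: the class `ToyCetCpu`, its methods, and the representation of marked landing sites as a set of addresses are all assumptions made for illustration.

```python
class ControlFlowViolation(Exception):
    """Raised when a modeled control transfer breaks the policy."""


class ToyCetCpu:
    """Toy model of landing-site marking plus a shadow stack."""

    def __init__(self, endbranch_addrs):
        # Addresses that the (hypothetical) compiler marked with ENDBRANCH.
        self.endbranch_addrs = frozenset(endbranch_addrs)
        self.shadow_stack = []

    def indirect_branch(self, target: int) -> None:
        # Forward edge: an indirect call/jump must land on a marked address.
        if target not in self.endbranch_addrs:
            raise ControlFlowViolation(f"no landing marker at {target:#x}")

    def call(self, return_addr: int) -> None:
        # A call records the return address on the protected shadow stack.
        self.shadow_stack.append(return_addr)

    def ret(self, return_addr_on_stack: int) -> None:
        # Backward edge: the address taken from the normal stack must
        # match the copy kept on the shadow stack.
        if not self.shadow_stack or self.shadow_stack.pop() != return_addr_on_stack:
            raise ControlFlowViolation("return address mismatch")
```

The forward-edge check is only as precise as the marking: in this model, any marked address is an acceptable target for any indirect branch.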

Memory access control using an application policy. In hardware-to-software solutions to data-oriented attacks, memory requests are coarsely categorized as instruction fetch, load, and store, and bad requests can slip through if the page table entry permits them. With an augmented ISA, software can now express its semantics in a way that hardware understands, e.g., object bounds and types. Hardware can use specialized logic to accelerate costly metadata maintenance and policy checks. Hardbound [29] is based on fat pointers for memory safety. Pointers in the monitored code are extended with a base and bound address pair, allowing hardware to perform bounds checks upon each memory access. The compiler and run-time library instrument the monitored code with new instructions to manage bounds. Shakti-T [89] introduces another level of indirection to reduce the overhead of fat pointer bound loading. Instead of carrying pointer bounds with pointers, they carry indices of pointer bounds, and all pointers having the same index share a pointer bound. HardScope [99] shifts from object bounds to enforcing language variable scoping. A rule stack is maintained by hardware and the compiler instruments the monitored code with new instructions to manage rules. Similar approaches are also applied on COTS computers, e.g., Intel MPX [100], MPK [115] and ARM Pointer Authentication (which PARTS [81] is built on).
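A minimal sketch of the fat-pointer idea, and of the shared-bounds indirection, is given below. It is a software model under stated assumptions and not the Hardbound or Shakti-T design: the names `FatPointer`, `IndexedPointer`, `checked_load`, and the choice of an exclusive upper bound are illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple


class BoundsViolation(Exception):
    """Raised when an access falls outside the pointer's bounds."""


@dataclass(frozen=True)
class FatPointer:
    addr: int
    base: int
    bound: int  # assumed exclusive: one past the last valid byte

    def offset(self, delta: int) -> "FatPointer":
        # Pointer arithmetic changes the address but keeps the bounds.
        return FatPointer(self.addr + delta, self.base, self.bound)


def checked_load(memory: bytearray, ptr: FatPointer, size: int = 1) -> bytes:
    """Perform the bounds check that hardware would do on each access."""
    if ptr.addr < ptr.base or ptr.addr + size > ptr.bound:
        raise BoundsViolation(f"access at {ptr.addr:#x} outside object")
    return bytes(memory[ptr.addr:ptr.addr + size])


@dataclass(frozen=True)
class IndexedPointer:
    addr: int
    bounds_index: int  # index into a shared table instead of inline bounds


def checked_load_indexed(memory: bytearray,
                         bounds_table: List[Tuple[int, int]],
                         ptr: IndexedPointer, size: int = 1) -> bytes:
    """Same check, with bounds looked up through one level of indirection."""
    base, bound = bounds_table[ptr.bounds_index]
    return checked_load(memory, FatPointer(ptr.addr, base, bound), size)
```

In the indexed variant, every pointer to the same object carries only a small index, so the bounds are stored once. The cost is an extra table lookup on each check.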

Information Flow Tracking with tagged memory. Metadata tags can be associated with memory locations to track dynamic information flow. Tags are propagated by hardware at run-time according to configured policies, and the reaction to anomalies is likewise determined by policies. Such policies can be specified statically as in Raksha [24] and SDMP [30], e.g., a control-flow graph (CFG) passed in at compile time. Also, new instructions/registers can be introduced to the ISA so that policies can be passed in at run-time with better accuracy and granularity, as in HDFI [111]. Loki [136] is a similar approach but aims to offload access control enforcement to hardware for OS kernels. Security labels of kernel objects are translated into tags by a small trusted monitor and checks are enforced by hardware on each memory access.
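Tag propagation and policy checking can be sketched as follows. The model uses a single-bit taint tag and one fixed policy ("never jump through tainted data"). Both choices, along with the class and method names, are assumptions for illustration and do not reproduce the policy languages of Raksha, SDMP, HDFI or Loki.

```python
class TagViolation(Exception):
    """Raised when an operation violates the configured tag policy."""


class TaggedMachine:
    """Toy memory in which every location carries a one-bit taint tag."""

    def __init__(self):
        self.mem = {}
        self.tags = {}

    def store_const(self, addr: int, value: int) -> None:
        # Program-generated constants are untainted.
        self.mem[addr] = value
        self.tags[addr] = False

    def store_input(self, addr: int, value: int) -> None:
        # Data arriving from an untrusted source is tainted.
        self.mem[addr] = value
        self.tags[addr] = True

    def add(self, dst: int, a: int, b: int) -> None:
        # Propagation rule: the result is tainted if either operand is.
        self.mem[dst] = self.mem[a] + self.mem[b]
        self.tags[dst] = self.tags[a] or self.tags[b]

    def indirect_jump(self, addr: int) -> int:
        # Check rule: a jump target must not be derived from tainted data.
        if self.tags[addr]:
            raise TagViolation(f"tainted jump target at {addr:#x}")
        return self.mem[addr]
```

In a hardware realization the propagation and check rules run alongside each instruction, so the monitored code does not see or manipulate the tags directly.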

VII-3 Software participation in metadata management

While exposing application semantics to hardware is sufficient in most cases, the isolation between the monitored code and hardware-maintained metadata is a double-edged sword. On one hand, it ensures the integrity of metadata even after compromise. On the other hand, hardware must implement all metadata maintenance operations, which can be costly in terms of performance or chip area for hardware. Moreover, certain high-level security policies (e.g. data freshness or confidentiality) may require software support to enforce.

For such participation, dedicated logic is added to the monitored code, ranging from compiler-instrumented instructions to full-fledged run-time libraries exposing developer APIs. In this case, extra care is needed for metadata protection, in particular how it is isolated from potentially vulnerable code. Watchdog [95] uses a traditional fat pointer scheme for spatial memory safety, and a lock-and-key approach for temporal memory safety, where each object is associated with a lock, and only pointers with a matching key can dereference the object. Checks and metadata propagation are performed by hardware (there is also the software-only WatchdogLite [96] using compiler instrumentation). Low-fat Pointers [73] is a fat pointer scheme that uses compressed pointer bound encoding to make it fit into the unused portion of a pointer address. It requires software to place allocated objects properly so that pointer bounds can be efficiently encoded, and encoded pointer bounds are naturally part of pointer values, both in registers and in memory. This design avoids the extra memory overhead of pointer bounds, and allows hardware to perform bounds checks in parallel during pointer dereference.
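The lock-and-key idea for temporal safety can be sketched as below. This is a simplified software model, not Watchdog's hardware design: the class `LockAndKeyHeap`, the tuple used as pointer metadata, and the use of key 0 as the invalid value are assumptions.

```python
class TemporalViolation(Exception):
    """Raised when a pointer is used after its object has been freed."""


class LockAndKeyHeap:
    """Toy allocator in which each object has a lock and each pointer a key."""

    INVALID_KEY = 0

    def __init__(self):
        self._locks = {}      # lock location -> key currently stored there
        self._next_lock = 0
        self._next_key = 1    # keys are unique and never reused

    def malloc(self):
        lock, key = self._next_lock, self._next_key
        self._next_lock += 1
        self._next_key += 1
        self._locks[lock] = key
        return (lock, key)    # metadata carried along with the pointer

    def check(self, ptr) -> None:
        # Dereference check: the key held by the pointer must match the lock.
        lock, key = ptr
        if self._locks.get(lock) != key:
            raise TemporalViolation("dangling pointer dereference")

    def free(self, ptr) -> None:
        # Freeing invalidates the lock, so every copy of the pointer fails.
        self.check(ptr)
        self._locks[ptr[0]] = self.INVALID_KEY
```

A single write to the lock at deallocation invalidates all outstanding copies of the pointer at once, without having to locate them.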

VIII Discussion

Relationship between Firmware TEEs. One observation we have made is that many firmware TEEs are co-dependent, yet poorly isolated from each other. While the abstractions between hardware and software create well-defined interfaces, there are no such abstractions between hardware components, a category under which firmware components are also often arbitrarily swept. To illustrate this complexity, consider how the Intel ME [105] is blocked from accessing both SMRAM and Intel TXT, as the ME environment has a large TCB and many vectors over which it can be attacked.

Also, the TCBs of most attested TEEs include the firmware TEEs. The original positioning of DRTM (Intel TXT and AMD SVM) was to remove the need for a long chain of trust in SRTM (which included the BIOS, for example). DRTM alleviates this by allowing the system to start fresh with a minimal TCB after boot (the CPU and optionally a tiny authorized module), which can measure and launch a new VMM/OS, hence the name “late launch”.
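The difference between the two chains can be made concrete with a hash-chain sketch in the style of a TPM extend operation. The component names, the use of SHA-256, and the all-zero initial register are assumptions for illustration; real platforms define which registers are reset and what is measured into them.

```python
import hashlib
from typing import Iterable


def extend(register: bytes, component: bytes) -> bytes:
    """Fold the digest of one component into the running measurement."""
    return hashlib.sha256(register + hashlib.sha256(component).digest()).digest()


def measure_chain(components: Iterable[bytes]) -> bytes:
    """Measure components in launch order, starting from a reset register."""
    register = bytes(32)
    for component in components:
        register = extend(register, component)
    return register


# SRTM-style chain: everything from the BIOS onward is folded in.
srtm = measure_chain([b"bios", b"bootloader", b"vmm"])

# DRTM-style chain: a late launch starts from a reset register and
# measures only the authorized module and the VMM being launched.
drtm = measure_chain([b"authorized-module", b"vmm"])
```

In this model the final DRTM value does not change when the BIOS or bootloader changes. That also means components left out of the chain are trusted without being measured.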

However, DRTM’s TCB does include other firmware (see TCG’s DRTM Architecture [46]). For example, SMM is not measured by DRTM for initial integrity, which is why SMI handlers compromised before entering TXT [126] could evade DRTM detection and thus take over the whole system (STM [129] is intended to address this). One of Intel SGX’s design motivations for running only unprivileged code again was TCB reduction [57], i.e. to exclude BIOS/UEFI as well as the OS/VMM. However, even SGX still depends on CPU microcode, another firmware TEE that is a potentially subvertable component. Furthermore, certain SGX components reside in ME as a trustlet, e.g., the later-added monotonic counter [105]. There have been attempts to add interfaces between firmware components to compartmentalize them. One example of this is the SMI Transfer Monitor (STM) [56], which limits the trust that other components must have in the SMI handlers. While STM cannot be considered a full abstraction and still requires trust in other firmware TEEs, it at least reduces the attack surface for those firmware TEEs.

Towards greater openness. A significant difference between firmware environments and Attested TEEs is that Attested TEEs are intended for users to deploy custom code, while firmware TEEs are not exposed to users or even software developers (refer to column Dev in Table I). This means that the firmware TEEs are largely undocumented, though this naturally does not mean they are any more inherently secure than open TEEs. This leads to opportunities for research.

Some of this research leads to changes to improve security, such as SMRR [59]. As another example, the group of Koppe and Kollenda have used reverse engineered microcode to change the behavior of earlier CPU models (specifically AMD K8 and K10) [71, 70]. An interesting direction for this line of work is the standardization of firmware access, which could lead to greater customization and modification of hardware [70].

IX Concluding Remarks

We explain the perceived advantages/disadvantages of hardware security features, and the root cause of the notorious low-level attacks (firmware or side-channel), using the concept of abstraction, which is ubiquitous in computing. We find that abstraction either underdone or overdone can lead to vulnerabilities, many of which have commonalities across various TEEs. We draw a lesson that future researchers and secure hardware designers may do well to examine their abstractions, not just for ease of programmability, maintenance and performance, but for whether they are fulfilling their security requirements in terms of the information flows they allow.

References

  • [1] A. C. Aldaya, B. B. Brumley, S. ul Hassan, C. P. García, and N. Tuveri, “Port contention for fun and profit.” IACR Cryptology ePrint Archive, vol. 2018, p. 1060, 2018.
  • [2] AMD, ARM Architecture Reference Manual: ARMv7-A and ARMv7-R edition, May 2014.
  • [3] ——, AMD64 Architecture Programmer’s Manual Volume 2: System Programming, September 2018.
  • [4] Anonymous, “Opteron exposed: Reverse engineering AMD K8 microcode updates,” July 2004, available: https://securiteam.com/securityreviews/5FP0M1PDFO/ [Accessed Sept.30,2019].
  • [5] ——, “Numerous system management mode (SMM) privilege escalation vulnerabilities in ASUS motherboards including Eee PC series,” Aug 2009, available: https://dl.packetstormsecurity.net/0908-advisories/smm-escalate.txt [Accessed Sept.30,2019].
  • [6] A. M. Azab, P. Ning, J. Shah, Q. Chen, R. Bhutkar, G. Ganesh, J. Ma, and W. Shen, “Hypervision across worlds: Real-time kernel protection from the ARM TrustZone secure world,” in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS’14, Scottsdale, Arizona, USA, 2014, pp. 90–102.
  • [7] A. M. Azab, P. Ning, Z. Wang, X. Jiang, X. Zhang, and N. C. Skalsky, “Hypersentry: Enabling stealthy in-context measurement of hypervisor integrity,” in Proceedings of the 17th ACM Conference on Computer and Communications Security, ser. CCS’10, Chicago, Illinois, USA, 2010, pp. 38–49.
  • [8] A. M. Azab, P. Ning, and X. Zhang, “SICE: A hardware-level strongly isolated computing environment for x86 multi-core platforms,” in Proceedings of the 18th ACM Conference on Computer and Communications Security, ser. CCS’11, Chicago, Illinois, USA, 2011, pp. 375–388.
  • [9] M. Backes, S. Bugiel, C. Hammer, O. Schranz, and P. von Styp-Rekowsky, “Boxify: Full-fledged app sandboxing for stock android,” in 24th USENIX Security Symposium (USENIX Security 15), Washington, D.C., Aug. 2015, pp. 691–706.
  • [10] A. Baumann, “Hardware is the new software,” in Proceedings of the 16th Workshop on Hot Topics in Operating Systems, ser. HotOS ’17, Whistler, BC, Canada, 2017, pp. 132–137.
  • [11] O. Bazhaniuk, Y. Bulygin, A. Furtak, M. Gorobets, J. Loucaides, A. Matrosov, and M. Shkatov, “A new class of vulnerabilities in SMI handlers,” in Proceedings of CanSecWest Applied Security Conference (CanSecWest 2015), 2015, available: https://cansecwest.com/slides/2015/A%20New%20Class%20of%20Vulnin%20SMI%20-%20Andrew%20Furtak.pdf [Accessed Sept.30,2019].
  • [12] F. Brasser, U. Müller, A. Dmitrienko, K. Kostiainen, S. Capkun, and A.-R. Sadeghi, “Software grand exposure: SGX cache attacks are practical,” in 11th USENIX Workshop on Offensive Technologies (WOOT 17), Vancouver, BC, 2017.
  • [13] BSDaemon, coideloko, and D0nand0n, “System management mode hacks: Using SMM for ’other purposes’,” Nov 2008, available: http://phrack.org/issues/65/7.html [Accessed Sept.30,2019].
  • [14] J. V. Bulck, M. Minkin, O. Weisse, D. Genkin, B. Kasikci, F. Piessens, M. Silberstein, T. F. Wenisch, Y. Yarom, and R. Strackx, “Foreshadow: Extracting the keys to the Intel SGX kingdom with transient out-of-order execution,” in USENIX Security Symposium, Baltimore, MD, USA, 2018, pp. 991–1008.
  • [15] Y. Bulygin and D. Samyde, “Chipset based approach to detect virtualization malware a.k.a. deepwatch,” Blackhat USA, 2008, available: http://www.hakim.ws/BHUSA08/speakers/Bulygin_Detection_of_Rootkits/bh-us-08-bulygin_Chip_Based_Approach_to_Detect_Rootkits.pdf [Accessed Sept.30,2019].
  • [16] Y. Bulygin, J. Loucaides, A. Furtak, O. Bazhaniuk, and A. Matrosov, “Summary of attacks against BIOS and secure boot,” Proceedings of the DefCon, 2014, available: http://www.c7zero.info/stuff/DEFCON22-BIOSAttacks.pdf [Accessed Sept.30,2019].
  • [17] N. Burow, S. A. Carr, J. Nash, P. Larsen, M. Franz, S. Brunthaler, and M. Payer, “Control-flow integrity: Precision, security, and performance,” ACM Computing Surveys (CSUR), vol. 50, no. 1, p. 16, 2017.
  • [18] R. “.bx” Shapiro, “Types for the chain of trust: No (loader) write left behind,” Ph.D. dissertation, Dartmouth College, April 2018.
  • [19] G. Chen, S. Chen, Y. Xiao, Y. Zhang, Z. Lin, and T. H. Lai, “Sgxpectre: Stealing intel secrets from SGX enclaves via speculative execution,” in 2019 IEEE European Symposium on Security and Privacy (EuroS&P), June 2019, pp. 142–157.
  • [20] S. Chen, X. Zhang, M. K. Reiter, and Y. Zhang, “Detecting privileged side-channel attacks in shielded execution with DéJà Vu,” in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, ser. ASIA CCS ’17, Abu Dhabi, United Arab Emirates, 2017, pp. 7–18.
  • [21] Y. Cheng, Z. Zhou, M. Yu, X. Ding, and R. H. Deng, “Ropecker: A generic and practical approach for defending against ROP attacks,” in 21st Annual Network and Distributed System Security Symposium, NDSS 2014, San Diego, California, USA, February 23-26, 2014, 2014.
  • [22] T. Colburn and G. Shute, “Abstraction in computer science,” Minds and Machines, vol. 17, no. 2, pp. 169–184, 2007.
  • [23] V. Costan and S. Devadas, “Intel SGX explained.” IACR Cryptology ePrint Archive, vol. 2016, no. 086, pp. 1–118, 2016.
  • [24] M. Dalton, H. Kannan, and C. Kozyrakis, “Raksha: a flexible information flow architecture for software security,” ACM SIGARCH Computer Architecture News, vol. 35, no. 2, pp. 482–493, 2007.
  • [25] L. Davi, M. Hanreich, D. Paul, A.-R. Sadeghi, P. Koeberl, D. Sullivan, O. Arias, and Y. Jin, “Hafix: hardware-assisted flow integrity extension,” in Proceedings of the 52nd Annual Design Automation Conference.   ACM, 2015, p. 74.
  • [26] R. de Clercq and I. Verbauwhede, “A survey of hardware-based control flow integrity (CFI),” arXiv preprint arXiv:1706.07257, 2017.
  • [27] B. Delgado and K. L. Karavanic, “Performance implications of system management mode,” in 2013 IEEE International Symposium on Workload Characterization (IISWC).   IEEE, 2013, pp. 163–173.
  • [28] G. Dessouky, T. Abera, A. Ibrahim, and A. Sadeghi, “Litehax: lightweight hardware-assisted attestation of program execution,” in Proceedings of the International Conference on Computer-Aided Design, ICCAD 2018, San Diego, CA, USA, November 05-08, 2018, I. Bahar, Ed.   ACM, 2018, p. 106.
  • [29] J. Devietti, C. Blundell, M. M. K. Martin, and S. Zdancewic, “Hardbound: Architectural support for spatial safety of the C programming language,” in Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS’08, Seattle, WA, USA, 2008, pp. 103–114.
  • [30] U. Dhawan, C. Hritcu, R. Rubin, N. Vasilakis, S. Chiricescu, J. M. Smith, T. F. Knight, Jr., B. C. Pierce, and A. DeHon, “Architectural support for software-defined metadata processing,” in Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS’15, Istanbul, Turkey, 2015, pp. 487–502.
  • [31] M. G. Dixon, D. A. Koufaty, C. B. Rust, H. W. Gartler, and F. Binns, “Steering system management code region accesses,” 2014, filed in 2005.
  • [32] G. J. Duck and R. H. C. Yap, “EffectiveSan: Type and Memory Error Detection Using Dynamically Typed C/C++,” in Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, ser. PLDI 2018.   New York, NY, USA: ACM, 2018, [Accessed Sept.30,2019]. [Online]. Available: http://doi.acm.org/10.1145/3192366.3192388
  • [33] L. Duflot, D. Etiemble, and O. Grumelard, “Using CPU system management mode to circumvent operating system security functions,” CanSecWest/core06, 2006, available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.512.2730&rep=rep1&type=pdf [Accessed Sept.30,2019].
  • [34] L. Duflot, O. Levillain, B. Morin, and O. Grumelard, “Getting into the SMRAM: SMM reloaded,” CanSecWest, Vancouver, Canada, 2009, available: https://cansecwest.com/csw09/csw09-duflot.pdf [Accessed Sept.30,2019].
  • [35] ——, “System management mode design and security issues,” IT Defense, 2010, available: https://www.ssi.gouv.fr/uploads/IMG/pdf/IT_Defense_2010_final.pdf [Accessed Sept.30,2019].
  • [36] L. Duflot, Y.-A. Perez, and B. Morin, “What if you can’t trust your network card?” in Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection, ser. RAID’11, Menlo Park, CA, 2011, pp. 378–397.
  • [37] J. G. Dyer, M. Lindemann, R. Perez, R. Sailer, L. van Doorn, and S. W. Smith, “Building the IBM 4758 secure coprocessor,” Computer, vol. 34, no. 10, pp. 57–66, Oct 2001.
  • [38] M. Ermolov and M. Goryachy, “How to hack a turned-off computer, or running unsigned code in Intel management engine,” Blackhat Europe 2017.
  • [39] A. Filyanov, J. M. McCune, A. Sadeghi, and M. Winandy, “Uni-directional trusted path: Transaction confirmation on just one device,” in Proceedings of the 2011 IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2011, Hong Kong, China, June 27-30 2011, 2011, pp. 1–12.
  • [40] X. Ge, W. Cui, and T. Jaeger, “Griffin: Guarding control flows using Intel processor trace,” in Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’17, Xi’an, China, 2017, pp. 585–598.
  • [41] X. Ge, H. Vijayakumar, and T. Jaeger, “Sprobes: Enforcing kernel code integrity on the TrustZone architecture,” arXiv preprint arXiv:1410.7747, 2014.
  • [42] S. D. Ghetie, “Protecting system management mode (SMM) spaces against cache attacks,” 2010, filed in 2007.
  • [43] B. Gras, K. Razavi, H. Bos, and C. Giuffrida, “Translation leak-aside buffer: Defeating cache side-channel protections with TLB attacks,” in 27th USENIX Security Symposium (USENIX Security 18), 2018, pp. 955–972.
  • [44] M. Green, L. Rodrigues-Lima, A. Zankl, G. Irazoqui, J. Heyszl, and T. Eisenbarth, “Autolock: Why cache attacks on ARM are harder than you think,” in 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, 2017, pp. 1075–1091.
  • [45] A. Grieve, M. Davies, P. H. Jones, and J. Zambreno, “Armor: A recompilation and instrumentation-free monitoring architecture for detecting memory exploits,” IEEE Transactions on Computers, vol. 67, no. 8, pp. 1092–1104, Aug 2018.
  • [46] T. C. Group, TCG D-RTM Architecture, June 2013.
  • [47] ——, TCG Glossary, May 2017.
  • [48] ——, TCG Roots of Trust Specification, July 2018.
  • [49] J. A. Halderman, S. D. Schoen, N. Heninger, W. Clarkson, W. Paul, J. A. Calandrino, A. J. Feldman, J. Appelbaum, and E. W. Felten, “Lest we remember: Cold boot attacks on encryption keys,” in USENIX Security Symp., Boston, MA, USA, 2008.
  • [50] L. Harrison and K. Li, “Arms race: the story of (in)-secure bootloaders,” Shmoocon, Tech. Rep., 2014.
  • [51] J. Hendricks and L. van Doorn, “Secure bootstrap is not enough: Shoring up the trusted computing base,” in Proceedings of the 11th Workshop on ACM SIGOPS European Workshop, ser. EW 11, Leuven, Belgium, 2004.
  • [52] H. Hu, C. Qian, C. Yagemann, S. P. H. Chung, W. R. Harris, T. Kim, and W. Lee, “Enforcing unique code target property for control-flow integrity,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’18, Toronto, Canada, 2018, pp. 1470–1486.
  • [53] IBM, Mainframe concepts, 2010.
  • [54] ——, IBM Z Systems Secure Service Container User’s Guide, 2017.
  • [55] Intel, Intel Xeon Processor E3-1200v4 Product Family Datasheet – Volume 2 of 2, June 2015.
  • [56] ——, “SMI transfer monitor (STM),” 2015, available: https://firmware.intel.com/content/smi-transfer-monitor-stm [Accessed Sept.30,2019].
  • [57] ——, Intel Software Guard Extensions Developer Guide, June 2017.
  • [58] ——, Intel Trusted Execution Technology: Software Development Guide, November 2017.
  • [59] ——, Intel 64 and IA-32 Architectures Software Developer’s Manual, November 2018.
  • [60] ——, Control-flow enforcement technology preview, May 2019.
  • [61] ——, Intel Architecture Memory Encryption Technologies Specification, April 2019.
  • [62] M. A. Islam and S. Ren, “Ohm’s law in data centers: A voltage side channel for timing power attacks,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security.   ACM, 2018, pp. 146–162.
  • [63] N. L. P. Jr., T. Fraser, J. Molina, and W. A. Arbaugh, “Copilot - a coprocessor-based kernel runtime integrity monitor,” in Proceedings of the 13th USENIX Security Symposium, August 9-13, 2004, San Diego, CA, USA, 2004, pp. 179–194.
  • [64] C. Kallenberg, J. Butterworth, X. Kovah, and C. Cornwell, “Defeating signed bios enforcement,” EkoParty, Buenos Aires, 2013.
  • [65] C. Kallenberg, S. Cornwell, X. Kovah, and J. Butterworth, “Setup for failure: defeating secure boot,” in The Symposium on Security for Asia Network (SyScan)(April 2014), 2014.
  • [66] S. T. King and P. M. Chen, “Subvirt: Implementing malware with virtual machines,” in 2006 IEEE Symposium on Security and Privacy (S&P’06).   IEEE, 2006, pp. 14–pp.
  • [67] G. Klein, M. Norrish, T. Sewell, H. Tuch, S. Winwood, K. Elphinstone, G. Heiser, J. Andronick, D. Cock, P. Derrin, D. Elkaduwe, K. Engelhardt, and R. Kolanski, “seL4: formal verification of an OS kernel.”   ACM Press, 2009, p. 207, [Accessed Sept.30,2019]. [Online]. Available: http://portal.acm.org/citation.cfm?doid=1629575.1629596
  • [68] P. Kocher, D. Genkin, D. Gruss, W. Haas, M. Hamburg, M. Lipp, S. Mangard, T. Prescher, M. Schwarz, and Y. Yarom, “Spectre attacks: Exploiting speculative execution,” arXiv preprint arXiv:1801.01203, 2018.
  • [69] P. C. Kocher, “Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems,” in Proceedings of the 16th Annual International Cryptology Conference on Advances in Cryptology (CRYPTO’96), August 1996, pp. 104–113.
  • [70] B. Kollenda, P. Koppe, M. Fyrbiak, C. Kison, C. Paar, and T. Holz, “An exploratory analysis of microcode as a building block for system defenses,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’18, Toronto, Canada, 2018, pp. 1649–1666.
  • [71] P. Koppe, B. Kollenda, M. Fyrbiak, C. Kison, R. Gawlik, C. Paar, and T. Holz, “Reverse engineering x86 processor microcode,” in 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, 2017, pp. 1163–1180.
  • [72] J. Kramer, “Is abstraction the key to computing?” Commun. ACM, vol. 50, no. 4, pp. 36–42, Apr. 2007.
  • [73] A. Kwon, U. Dhawan, J. M. Smith, T. F. Knight, Jr., and A. DeHon, “Low-fat pointers: Compact encoding and efficient gate-level implementation of fat pointers for spatial safety and capability-based security,” in Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, ser. CCS ’13, Berlin, Germany, 2013, pp. 721–732.
  • [74] B. Lampson, “A note on the confinement problem.”   Association for Computing Machinery, Inc., Jan. 1973, [Accessed Sept.30,2019]. [Online]. Available: https://www.microsoft.com/en-us/research/publication/a-note-on-the-confinement-problem/
  • [75] K. Leach, F. Zhang, and W. Weimer, “Scotch: Combining software guard extensions and system management mode to monitor cloud resource usage,” in Research in Attacks, Intrusions, and Defenses, Atlanta, Georgia, USA, 2017, pp. 403–424.
  • [76] H. Lee, H. Moon, D. Jang, K. Kim, J. Lee, Y. Paek, and B. B. Kang, “Ki-mon: A hardware-assisted event-triggered monitoring platform for mutable kernel object,” in Presented as part of the 22nd USENIX Security Symposium (USENIX Security 13), Washington, D.C., 2013, pp. 511–526.
  • [77] H. Lee, C. Song, and B. B. Kang, “Lord of the x86 rings: A portable user mode privilege separation architecture on x86,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS’18, Toronto, Canada, 2018, pp. 1441–1454.
  • [78] S. Lee, M.-W. Shih, P. Gera, T. Kim, H. Kim, and M. Peinado, “Inferring fine-grained control flow inside SGX enclaves with branch shadowing,” in 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, 2017, pp. 557–574.
  • [79] M. Lentz, R. Sen, P. Druschel, and B. Bhattacharjee, “Secloak: Arm TrustZone-based mobile peripheral control,” in Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services.   ACM, 2018, pp. 1–13.
  • [80] W. Li, M. Ma, J. Han, Y. Xia, B. Zang, C.-K. Chu, and T. Li, “Building trusted path on untrusted device drivers for mobile devices,” in Proceedings of 5th Asia-Pacific Workshop on Systems.   ACM, 2014, p. 8.
  • [81] H. Liljestrand, T. Nyman, K. Wang, C. C. Perez, J.-E. Ekberg, and N. Asokan, “PAC it up: Towards pointer integrity using ARM pointer authentication,” in 28th USENIX Security Symposium (USENIX Security 19), 2019, pp. 177–194.
  • [82] M. Lipp, M. Schwarz, D. Gruss, T. Prescher, W. Haas, S. Mangard, P. Kocher, D. Genkin, Y. Yarom, and M. Hamburg, “Meltdown,” arXiv preprint arXiv:1801.01207, 2018.
  • [83] Z. Liu, J. Lee, J. Zeng, Y. Wen, Z. Lin, and W. Shi, “CPU transparent protection of OS kernel and hypervisor integrity with programmable DRAM,” in Proceedings of the 40th Annual International Symposium on Computer Architecture, ser. ISCA’13, Tel-Aviv, Israel, 2013, pp. 392–403.
  • [84] T. Mandt, M. Solnik, and D. Wang, “Demystifying the secure enclave processor,” Azimuth Security and OffCell Research, Tech. Rep., 2016.
  • [85] J. M. McCune, Y. Li, N. Qu, Z. Zhou, A. Datta, V. D. Gligor, and A. Perrig, “TrustVisor: Efficient TCB reduction and attestation,” in 31st IEEE Symposium on Security and Privacy, S&P 2010, 16-19 May 2010, Berkeley/Oakland, California, USA, 2010, pp. 143–158.
  • [86] J. M. McCune, B. J. Parno, A. Perrig, M. K. Reiter, and H. Isozaki, “Flicker: An execution infrastructure for TCB minimization,” in EuroSys’08, Glasgow, Scotland, Apr. 2008.
  • [87] J. M. McCune, A. Perrig, and M. K. Reiter, “Safe passage for passwords and other sensitive data,” in NDSS.   The Internet Society, 2009.
  • [88] C. Meijer and B. van Gastel, “Self-encrypting deception: weaknesses in the encryption of solid state drives (SSDs),” Radboud University and Open University of the Netherlands, Tech. Rep., 2018, https://www.ru.nl/publish/pages/909282/draft-paper.pdf [Accessed Sept.30,2019].
  • [89] A. Menon, S. Murugan, C. Rebeiro, N. Gala, and K. Veezhinathan, “Shakti-T: A RISC-V processor with light weight security extensions,” in Proceedings of the Hardware and Architectural Support for Security and Privacy.   ACM, 2017, p. 2.
  • [90] R. Milner, “A theory of type polymorphism in programming,” Journal of Computer and System Sciences, vol. 17, no. 3, pp. 348 – 375, 1978, [Accessed Sept.30,2019]. [Online]. Available: http://www.sciencedirect.com/science/article/pii/0022000078900144
  • [91] C. Mitchell, Trusted computing.   Iet, 2005, vol. 6.
  • [92] H. Moon, H. Lee, J. Lee, K. Kim, Y. Paek, and B. B. Kang, “Vigilare: Toward snoop-based kernel integrity monitor,” in Proceedings of the 2012 ACM Conference on Computer and Communications Security, ser. CCS’12, Raleigh, North Carolina, USA, 2012, pp. 28–37.
  • [93] M. Morbitzer, M. Huber, and J. Horsch, “Extracting secrets from encrypted virtual machines,” in Proceedings of the Ninth ACM Conference on Data and Application Security and Privacy.   ACM, 2019, pp. 221–230.
  • [94] M. Morbitzer, M. Huber, J. Horsch, and S. Wessel, “Severed: Subverting AMD’s virtual machine encryption,” in Proceedings of the 11th European Workshop on Systems Security, EuroSec@EuroSys 2018, Porto, Portugal, April 23, 2018, 2018, pp. 1:1–1:6.
  • [95] S. Nagarakatte, M. M. K. Martin, and S. Zdancewic, “Watchdog: Hardware for safe and secure manual memory management and full memory safety,” in 39th International Symposium on Computer Architecture (ISCA 2012), June 9-13, 2012, Portland, OR, USA, 2012, pp. 189–200.
  • [96] ——, “Watchdoglite: Hardware-accelerated compiler-based pointer checking,” in Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, ser. CGO ’14, Orlando, FL, USA, 2014, pp. 175:175–175:184.
  • [97] B. Ngabonziza, D. Martin, A. Bailey, H. Cho, and S. Martin, “TrustZone explained: Architectural features and use cases,” in 2nd IEEE International Conference on Collaboration and Internet Computing, CIC 2016, Pittsburgh, PA, USA, November 1-3, 2016, 2016, pp. 445–451.
  • [98] Nuvoton.com, “NuMicro M2351 series – a TrustZone empowered microcontroller series focusing on IoT security,” product website (December, 2018). https://m2351.nuvoton.com/secure-microcontroller-platform/ [Accessed Sept.30,2019].
  • [99] T. Nyman, G. Dessouky, S. Zeitouni, A. Lehikoinen, A. Paverd, N. Asokan, and A.-R. Sadeghi, “HardScope: Thwarting DOP with hardware-assisted run-time scope enforcement,” arXiv preprint arXiv:1705.10295, 2017.
  • [100] O. Oleksenko, D. Kuvaiskii, P. Bhatotia, P. Felber, and C. Fetzer, “Intel MPX explained: A cross-layer analysis of the Intel MPX system stack,” Proc. ACM Meas. Anal. Comput. Syst., vol. 2, no. 2, pp. 28:1–28:30, Jun. 2018.
  • [101] D. A. Osvik, A. Shamir, and E. Tromer, “Cache attacks and countermeasures: the case of AES,” in Cryptographers’ track at the RSA conference.   Springer, 2006, pp. 1–20.
  • [102] V. Pappas, M. Polychronakis, and A. D. Keromytis, “Transparent ROP exploit mitigation using indirect branch tracing,” in Presented as part of the 22nd USENIX Security Symposium (USENIX Security 13), Washington, D.C., 2013, pp. 447–462.
  • [103] S. Pinto and N. Santos, “Demystifying Arm TrustZone: A comprehensive survey,” ACM Computing Surveys, vol. 51, pp. 1–36, 01 2019.
  • [104] L. Richter, J. Götzfried, and T. Müller, “Isolating operating system components with Intel SGX,” in Proceedings of the 1st Workshop on System Software for Trusted Execution, ser. SysTEX ’16, Trento, Italy, 2016, pp. 8:1–8:6.
  • [105] X. Ruan, Platform Embedded Security Technology Revealed: Safeguarding the Future of Computing with Intel Embedded Security and Management Engine.   Apress, 2014.
  • [106] J. Rutkowska and R. Wojtczuk, “Preventing and detecting Xen hypervisor subversions,” Blackhat Briefings USA, 2008, available: https://invisiblethingslab.com/resources/bh08/part2-full.pdf [Accessed Sept.30,2019].
  • [107] A. L. Sacco and A. A. Ortega, “Persistent BIOS infection,” in CanSecWest Applied Security Conference, 2009, available: http://phrack.org/issues/66/7.html#article.
  • [108] M. Shih, S. Lee, T. Kim, and M. Peinado, “T-SGX: eradicating controlled-channel attacks against enclave programs,” in 24th Annual Network and Distributed System Security Symposium, NDSS 2017, San Diego, California, USA, February 26 - March 1, 2017, 2017.
  • [109] R. Shu, P. Wang, S. A. Gorski III, B. Andow, A. Nadkarni, L. Deshotels, J. Gionta, W. Enck, and X. Gu, “A study of security isolation techniques,” ACM Comput. Surv., vol. 49, no. 3, pp. 50:1–50:37, Oct. 2016.
  • [110] I. Skochinsky, “Intel ME secrets,” 2014, available: https://recon.cx/2014/slides/Recon%202014%20Skochinsky.pdf [Accessed Sept.30,2019].
  • [111] C. Song, H. Moon, M. Alam, I. Yun, B. Lee, T. Kim, W. Lee, and Y. Paek, “HDFI: Hardware-assisted data-flow isolation,” in 2016 IEEE Symposium on Security and Privacy (SP), May 2016, pp. 1–17.
  • [112] H. Sun, K. Sun, Y. Wang, J. Jing, and H. Wang, “TrustICE: Hardware-assisted isolated computing environments on mobile devices,” in 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, June 2015, pp. 367–378.
  • [113] L. Szekeres, M. Payer, T. Wei, and D. Song, “SoK: Eternal war in memory,” in 2013 IEEE Symposium on Security and Privacy, May 2013, pp. 48–62.
  • [114] A. Tereshkin and R. Wojtczuk, “Introducing ring -3 rootkits,” July 2009, Invisible Things Lab. Available: https://invisiblethingslab.com/resources/bh09usa/Ring%20-3%20Rootkits.pdf [Accessed Sept.30,2019].
  • [115] A. Vahldiek-Oberwagner, E. Elnikety, N. O. Duarte, M. Sammler, P. Druschel, and D. Garg, “ERIM: Secure, efficient in-process isolation with protection keys (MPK),” in 28th USENIX Security Symposium (USENIX Security 19), Santa Clara, CA, Aug. 2019, pp. 1221–1238.
  • [116] J. Van Bulck, F. Piessens, and R. Strackx, “SGX-Step: A practical attack framework for precise enclave execution control,” in Proceedings of the 2nd Workshop on System Software for Trusted Execution, ser. SysTEX’17, Shanghai, China, 2017, pp. 4:1–4:6.
  • [117] ——, “Nemesis: Studying microarchitectural timing leaks in rudimentary CPU interrupt logic,” in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security.   ACM, 2018, pp. 178–195.
  • [118] A. Vasudevan, S. Chaki, L. Jia, J. McCune, J. Newsome, and A. Datta, “Design, implementation and verification of an eXtensible and modular hypervisor framework,” in IEEE Symposium on Security and Privacy, Berkeley, CA, USA, 2013.
  • [119] J. Wang, A. Stavrou, and A. Ghosh, “HyperCheck: A hardware-assisted integrity monitor,” in Recent Advances in Intrusion Detection, S. Jha, R. Sommer, and C. Kreibich, Eds.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 158–177.
  • [120] Z. Wang and X. Jiang, “HyperSafe: A lightweight approach to provide lifetime hypervisor control-flow integrity,” in 2010 IEEE Symposium on Security and Privacy, May 2010, pp. 380–395.
  • [121] R. Wilkins and B. Richardson, “UEFI secure boot in modern computer security solutions,” UEFI.org, Tech. Rep., September 2013.
  • [122] K. Winnard, L. Fadel, R. Hunt, J. Johnston, and A. Salla, IBM Mainframe Bits: Understanding the Platform Hardware, 2016.
  • [123] A. Winning, “First look at Nordic’s “cellular made easy” nRF91 low-power solution,” news article (January, 2018). http://www.eenewsembedded.com/news/first-look-nordics-cellular-made-easy-nrf91-low-power-solution [Accessed Sept.30,2019].
  • [124] R. Wojtczuk and C. Kallenberg, “Attacking UEFI boot script,” 2015, available: https://bromiumlabs.files.wordpress.com/2015/01/venamis_whitepaper.pdf [Accessed Sept.30,2019].
  • [125] ——, “Attacks on UEFI security,” in Proc. 15th Annu. CanSecWest Conf.(CanSecWest), 2015, available: https://repo.zenk-security.com/Techniques%20d.attaques%20%20.%20%20Failles/Attacks-on-UEFI-security.pdf [Accessed Sept.30,2019].
  • [126] R. Wojtczuk and J. Rutkowska, “Attacking Intel Trusted Execution Technology,” Black Hat DC (Feb. 2009). https://invisiblethingslab.com/resources/bh09dc/Attacking%20Intel%20TXT%20-%20paper.pdf.
  • [127] ——, “Attacking SMM memory via Intel CPU cache poisoning,” Invisible Things Lab, Tech. Rep., 03 2009, available: https://invisiblethingslab.com/resources/misc09/smm_cache_fun.pdf [Accessed Sept.30,2019].
  • [128] R. Wojtczuk and A. Tereshkin, “Attacking Intel BIOS,” BlackHat, Las Vegas, USA, 2009, available: https://www.blackhat.com/presentations/bh-usa-09/WOJTCZUK/BHUSA09-Wojtczuk-AtkIntelBios-SLIDES.pdf [Accessed Sept.30,2019].
  • [129] J. Yao and V. J. Zimmer, “White paper: A tour beyond BIOS launching a STM to monitor SMM in EFI Developer Kit II,” Intel Corporation, Tech. Rep., 2015.
  • [130] Y. Yarom and K. Falkner, “FLUSH+RELOAD: a high resolution, low noise, L3 cache side-channel attack,” in 23rd USENIX Security Symposium (USENIX Security 14), 2014, pp. 719–732.
  • [131] D. Ye, Y. Su, Y. Sui, and J. Xue, “WPBOUND: Enforcing Spatial Memory Safety Efficiently at Runtime with Weakest Preconditions,” in 2014 IEEE 25th International Symposium on Software Reliability Engineering, Nov. 2014.
  • [132] B. Yee, D. Sehr, G. Dardyk, J. B. Chen, R. Muth, T. Ormandy, S. Okasaka, N. Narula, and N. Fullagar, “Native Client: A sandbox for portable, untrusted x86 native code,” in 2009 30th IEEE Symposium on Security and Privacy, May 2009, pp. 79–93.
  • [133] K. Ying, A. Ahlawat, B. Alsharifi, Y. Jiang, P. Thavai, and W. Du, “TruZ-Droid: Integrating TrustZone with mobile operating system,” in Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, MobiSys 2018, Munich, Germany, June 10-15, 2018, 2018, pp. 14–27.
  • [134] M. Yu, V. D. Gligor, and Z. Zhou, “Trusted display on untrusted commodity platforms,” in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, ser. CCS’15, Denver, Colorado, USA, 2015, pp. 989–1003.
  • [135] J. Zaddach, A. Kurmus, D. Balzarotti, E.-O. Blass, A. Francillon, T. Goodspeed, M. Gupta, and I. Koltsidas, “Implementation and implications of a stealth hard-drive backdoor,” in Proceedings of the 29th Annual Computer Security Applications Conference, ser. ACSAC ’13, New Orleans, Louisiana, USA, 2013, pp. 279–288.
  • [136] N. Zeldovich, H. Kannan, M. Dalton, and C. Kozyrakis, “Hardware enforcement of application security policies using tagged memory,” in 8th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2008, December 8-10, 2008, San Diego, California, USA, Proceedings, 2008, pp. 225–240.
  • [137] F. Zhang, K. Leach, H. Wang, and A. Stavrou, “TrustLogin: Securing password-login on commodity operating systems,” in Proceedings of the 10th ACM Symposium on Information, Computer and Communications Security.   ACM, 2015, pp. 333–344.
  • [138] F. Zhang, H. Wang, K. Leach, and A. Stavrou, “A framework to secure peripherals at runtime,” in Computer Security - ESORICS 2014, M. Kutyłowski and J. Vaidya, Eds.   Cham: Springer International Publishing, 2014, pp. 219–238.
  • [139] F. Zhang, J. Chen, H. Chen, and B. Zang, “CloudVisor: retrofitting protection of virtual machines in multi-tenant cloud with nested virtualization,” in Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles.   ACM, 2011, pp. 203–216.
  • [140] N. Zhang, K. Sun, D. Shands, W. Lou, and Y. T. Hou, “TruSpy: Cache side-channel information leakage from the secure world on ARM devices,” IACR Cryptology ePrint Archive, vol. 2016, p. 980, 2016.
  • [141] ——, “TruSense: Information leakage from TrustZone,” in 2018 IEEE Conference on Computer Communications, INFOCOM 2018, Honolulu, HI, USA, April 16-19, 2018, 2018, pp. 1097–1105.
  • [142] N. Zhang, R. Zhang, K. Sun, W. Lou, Y. T. Hou, and S. Jajodia, “Memory forensic challenges under misused architectural features,” IEEE Transactions on Information Forensics and Security, vol. 13, no. 9, pp. 2345–2358, 2018.