Stockade: Hardware Hardening for Distributed Trusted Sandboxes

08/26/2021 ∙ by Joongun Park, et al.

The widening availability of hardware-based trusted execution environments (TEEs) has been accelerating the adoption of new applications using TEEs. Recent studies showed that a cloud application consists of multiple distributed software modules provided by mutually distrustful parties. Such applications use multiple TEEs (enclaves) communicating through software-encrypted memory channels. Such an execution model requires bi-directional protection: protecting the rest of the system from the enclave module with sandboxing, and protecting the enclave module from third-party modules and the operating system. However, current TEE models, such as Intel SGX, cannot efficiently represent such distributed sandbox applications. To overcome the lack of hardware support for sandboxed TEEs, this paper proposes an extended enclave model called Stockade, which supports distributed sandboxes hardened by hardware. Stockade proposes three new key techniques. First, it extends the hardware-based memory isolation in SGX to confine a user software module within its enclave. Second, it proposes a trusted monitor enclave that filters and validates system calls from enclaves. Finally, it allows hardware-protected memory sharing between a pair of enclaves for efficient protected communication without software-based encryption. Using an emulated SGX platform with the proposed extensions, this paper shows that distributed sandbox applications can be effectively supported with small changes to the SGX hardware.

1 Introduction

Hardware-based trusted execution environments (TEEs) have enabled strong isolation of execution contexts in remote clouds, even when the servers are exposed to potential vulnerabilities in privileged software and to physical attacks. Among recent TEE implementations, Intel Software Guard Extensions (SGX), a commercial incarnation of TEEs, provides isolated execution environments called enclaves, which are protected by the CPU hardware. The CPU hardware isolates each enclave from the operating system. An enclave's code and data are encrypted and integrity-verified while they reside in external DRAM.

The introduction of commercially available TEEs has been accelerating the exploration of application scenarios utilizing their strong isolation capability. One important cloud-oriented scenario is to provide function-as-a-service or software-as-a-service on clouds, running a function or software in each enclave [openlambda, amazon_lambda, google_app_engine]. In such applications, it is critical not only to protect user-provided functions from the potentially vulnerable cloud system but also to secure the hosting cloud system by sandboxing the user-provided functions or software, as they cannot be fully trusted from the perspective of the hosting system. Besides, an application task is composed of multiple functions communicating with each other [ryoan]. Figure 1 presents such a distributed sandboxed application. Such an application consists of software modules from multiple software providers, which may not entirely trust the other providers. With multiple participants, each module must be protected from other modules and from the hosting system, and modules must also be confined to prevent any exploitation of system vulnerabilities. A key software technique for such distributed secure applications is software sandboxing, which prevents the code in an enclave from accessing memory beyond the protected enclave memory and validates system calls.

The distributed sandboxed applications reveal the limitations of the current SGX model. First, the code inside an enclave can freely access the remaining untrusted memory of the process. Such uni-directional protection can endanger the rest of the system if the enclave code is malicious. To address this vulnerability, prior studies proposed running a heavy software sandbox together with the user code inside an enclave [ryoan, occlum, enclavedom, Chancel, AccTEE]. Second, enclaves need to use operating system services via system calls, and this interaction must be secured. Not only must system call requests be verified to protect the hosting system [SGXJail], but return values must also be checked to prevent Iago attacks against the enclave [emilia, PANOPLY, graphene-sgx, scone]. Finally, the hardware provides no communication channel among enclaves. For secure inter-enclave communication, a pair of enclaves must share an untrusted memory region, and each message must be encrypted and integrity-protected by the software running inside the enclaves. Such software-based encrypted communication not only increases the communication latency but can also introduce vulnerabilities [TOCTOU, PANOPLY].

To overcome the limitations of the current TEE model, this study proposes an extension of the enclave model, called Stockade. Stockade provides efficient hardware-supported solutions for the three limitations. First, instead of using software-based sandboxing, Stockade blocks accesses from enclaves to the untrusted world. We call such a sandboxed enclave a bi-enclave. By simply extending the pre-existing memory validation mechanism in SGX hardware, a bi-enclave is not only protected from the untrusted world but also prevented from accessing the untrusted context. Such bi-directional isolation enables solid sandboxing support for each bi-enclave without any extra software layer.

The second mechanism provides a hardened interaction between a bi-enclave and the operating system. The interaction between the bi-enclave and the operating system is forced to go through a monitor enclave, which processes system calls only if they are valid. The key difference from the prior approaches [ryoan, occlum, enclavedom, SGXJail, Chancel, AccTEE] is that the monitor is isolated both from the bi-enclave and from the operating system, which provides stronger protection for system call verification and return value validation. The code running in the monitor enclave is attested by both the bi-enclave and the operating system, so both entities can verify the monitoring operations. With the neutral monitor enclave, Stockade can provide a tamper-proof accounting service for system resources such as file I/O and network usage, as both the cloud users and providers can trust the monitor enclave.

Figure 1: A distributed sandbox application with SGX. Gray boxes are enclaves running modules from different providers

The final mechanism allows sharing of trusted memory pages between two enclaves. The hardware provides an interface for sharing the protected pages between two enclaves, and the memory isolation mechanism is extended to allow two enclaves to access the shared pages. By sharing the hardware-protected pages, the communication between the enclaves does not require costly software-based encryption and integrity protection.

To show the effectiveness of the new enclave extensions, we ported several application scenarios to an emulated SGX runtime with the extended interface. The experimental results show that minor hardware extensions can improve the efficiency and security of distributed sandbox applications on clouds. Compared to the prior SW-based sandboxing, it provides 1.4-19.5% performance improvements. This study hardens the distributed sandbox applications with hardware extensions. To the best of our knowledge, it is the first study to extend the execution model and hardware for bi-directional protection with the protected system call monitor. The new contributions of this paper are as follows:

  • It proposes bi-directional isolation between an enclave and its untrusted environment. The design shows that a simple extension of the existing memory access control mechanism in SGX can provide efficient isolation for both ways.

  • It proposes a hardware-protected monitoring mechanism for handling system call filtering and accounting operations for each enclave.

  • It proposes a shared trusted memory between two enclaves. With a careful design, a designated part of the protected memory of an enclave can be shared with the other enclave.

The rest of the paper is organized as follows. Section 2 presents the background of distributed sandbox applications. Section 3 discusses the motivations of three extensions, and Section 3.4 discusses the related works. Section 4 presents the proposed hardware extensions. Section 5 presents the security analysis, and Section 6 provides four application scenarios using bi-enclave and their performance on an emulated SGX runtime. Section 7 concludes the paper.

2 Background

2.1 Intel Software Guard Extensions (SGX)

Intel SGX provides a user-level trusted execution environment called an enclave. The context of an enclave is protected by the hardware mechanism. The protected memory region of enclaves is created in Enclave Page Cache (EPC). Part of physical memory, Processor Reserved Memory (PRM), is reserved for SGX and is protected by the hardware memory encryption engine (MEE). PRM contains the EPC pages in addition to other security meta-data for SGX. Although EPC pages are in the external DRAM, their confidentiality and integrity are guaranteed under direct physical attacks on DRAM and system interconnection components. The attestation support allows a user to verify the identity and measured digest of an enclave and platform setting where the enclave runs.

The memory isolation for each enclave is done during the address translation step for each memory access. A mode transition between the enclave mode and untrusted mode requires flushing Translation Lookaside Buffers (TLBs). For each TLB miss, the validity of access is verified by the CPU hardware logic. A key internal data structure for verification is Enclave Page Cache Map (EPCM) which is stored in PRM. An EPCM entry has information about a physical page that belongs to the EPC region. It contains the owner’s enclave ID and its virtual address in the enclave memory space, in addition to other status information. Even though page tables are still managed and updated by the operating system, the EPCM table is accessible only by the hardware, and the page table entry for EPC can be verified using EPCM. The crucial invariant for the correctness of memory isolation is that TLB must contain only verified translations.
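The TLB-miss check described above can be illustrated with a small model. The following Python sketch is a simplification under stated assumptions, not SGX's actual microcode: `EpcmEntry`, `check_translation`, and the dictionary-based EPCM are hypothetical names, real EPCM entries carry further status bits, and the ELRANGE checks are omitted.

```python
from dataclasses import dataclass
from typing import Dict, Optional

PAGE = 4096  # page size in bytes


@dataclass(frozen=True)
class EpcmEntry:
    owner_id: int  # enclave that owns this EPC page
    vaddr: int     # page-aligned virtual address the page must be mapped at


def check_translation(epcm: Dict[int, EpcmEntry], enclave_id: Optional[int],
                      vaddr: int, paddr: int) -> bool:
    """Decide whether the OS-provided translation vaddr -> paddr may enter the TLB.

    epcm maps a physical page number to its entry (EPC pages only).
    enclave_id is None when the CPU runs in untrusted (non-enclave) mode.
    """
    entry = epcm.get(paddr // PAGE)
    if entry is None:
        # Not an EPC page: ordinary memory (simplified, ELRANGE checks omitted).
        return True
    if enclave_id is None:
        # Untrusted code must never obtain a translation to an EPC page.
        return False
    # The page must belong to the running enclave and be mapped at the
    # virtual address recorded when the page was added to the enclave.
    page_vaddr = (vaddr // PAGE) * PAGE
    return entry.owner_id == enclave_id and entry.vaddr == page_vaddr
```

Because the operating system still controls the page tables, the second comparison matters as much as the first: it rejects a page table that remaps a legitimately owned EPC page to a different virtual address inside the same enclave.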

SGX controls enclaves through a set of instructions. After an enclave is created, EINIT initializes it to be ready for protected execution. The virtual address of the protected memory region for an enclave is fixed during the initialization of the enclave. The virtual address range for an enclave must be a single contiguous region called the Enclave Linear Address Range (ELRANGE). The context information of an enclave is stored in its SGX Enclave Control Structure (SECS). The SECS is allocated in an EPC page to protect it from a malicious operating system. SGX includes instructions for switching modes between the enclave context and the unprotected context: EENTER to enter enclave mode, and EEXIT to exit enclave mode.

2.2 Sandboxing

Sandboxing confines an application in its own environment. By isolating untrusted applications, sandboxing protects the kernel and host environment against potential attacks from the applications. Sandboxing is widely adopted for runtime protection against third-party applications, such as web browsers running plugins written by unauthorized developers [chrome_sandbox, mozilla_sandbox], and testbeds for third-party developers migrating their applications to the production system [google_app_engine, microsoft_sandbox, amazon_sandbox, yahoo_sandbox, paypal_sandbox_1, paypal_sandbox_2, ebay_sandbox].

An application running in a sandbox must not be allowed to directly access the memory outside of the sandbox. In addition, the application control should never reach beyond the designated sandbox, neither directly nor indirectly during its runtime. To provide sandboxing, fault isolation confines control transfer and data access within a sandbox, and system call filtering validates system call requests from sandbox applications.

Fault Isolation: Fault isolation provides a logically isolated compartment by enforcing its confinement policy on memory and control transfer. Software-based fault isolation provides such confinement by binary instrumentation or compiler support [SFI, fastSFI, NaCl, adaptingSFI, webAssembly]. Using binary instrumentation, Google Native Client (NaCl) restricts memory accesses from untrusted applications, by masking target addresses with memory boundary before the binary execution. Such software-based isolation needs to execute extra instructions for access validation, adding performance overheads. In addition, the instruction-based bound checking is potentially vulnerable to the Spectre attacks [swivel, spectre, meltdown, systematic, spectrereturns]. Other fault isolation techniques rely on CPU hardware supports for confinement [ARMlock, flexdroid, SGXJail, enclavedom]. With hardware supports such as Intel Memory Protection Keys (MPK) [libmpk], or ARM Domain [armdomain], they provide sandboxes to separate modules from each other. However, the current MPK uses page tables to track memory domains, and thus the domain information can be changed by OS.
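The address-masking idea behind software-based fault isolation can be sketched as follows. This is an illustration of the general technique only, not NaCl's actual instrumentation; the sandbox base, size, and the `mask_address` helper are hypothetical values and names chosen for the example.

```python
# Hypothetical sandbox layout: the size is a power of two and the base is
# aligned to the size, so a single AND plus OR confines any address.
SANDBOX_BASE = 0x1000_0000
SANDBOX_SIZE = 0x0100_0000  # 16 MiB


def mask_address(addr: int) -> int:
    """Force addr into [SANDBOX_BASE, SANDBOX_BASE + SANDBOX_SIZE).

    An instrumented binary would execute the equivalent of this before
    every store or indirect jump, which is the source of the extra
    instructions and overhead mentioned in the text.
    """
    return SANDBOX_BASE | (addr & (SANDBOX_SIZE - 1))


def sandboxed_store(memory: dict, addr: int, value: int) -> int:
    """Store value at the masked address and return the address used."""
    target = mask_address(addr)
    memory[target] = value
    return target
```

An in-range address passes through unchanged, while an out-of-range address is silently redirected into the sandbox rather than rejected, which is why masking needs no branch.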

System Call Monitoring: In addition to the memory access control, the interaction with the operating system must also be regulated by sandboxing. Although the operating system is protected with privilege separation and system call interfaces, system vulnerabilities via system calls have been continuously reported [CVE-2019-2054, CVE-2020-17087, shellshock, CVE-2019-3969]. A naive way to alleviate this problem is to disallow any system calls from untrusted applications. However, many real-world applications rely on system call interfaces such as POSIX for network support and file management. Therefore, the sandbox must provide controlled system functionalities by verifying system calls from the untrusted application. Seccomp-bpf [seccomp] interposes on system call requests and filters them by system call ID and arguments. Filtering requests is not sufficient, however: by manipulating the return values of system calls, a malicious operating system can leak the application's secrets or break its execution integrity, which is known as the Iago attack [Iago]. To prevent Iago attacks, return values also need to be validated [Sego, Inktag].
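The two checks discussed above, filtering requests by ID and arguments and validating return values, can be sketched together. The profile format and the function names below are assumptions made for illustration; they do not reproduce the seccomp-bpf interface, and the return-value rule covers only `read` and `write`.

```python
from typing import Callable, Dict, Tuple

Args = Tuple[int, ...]

# Hypothetical profile: syscall name -> predicate over (fd, buf, count).
PROFILE: Dict[str, Callable[[Args], bool]] = {
    # Reads are limited to 1 MiB per call on any non-negative descriptor.
    "read": lambda a: len(a) == 3 and a[0] >= 0 and 0 <= a[2] <= 1 << 20,
    # Writes are allowed only to stdout (1) and stderr (2).
    "write": lambda a: len(a) == 3 and a[0] in (1, 2) and a[2] >= 0,
}


def allow_request(name: str, args: Args) -> bool:
    """Request filtering: unknown syscalls and bad arguments are rejected."""
    check = PROFILE.get(name)
    return check is not None and check(args)


def valid_return(name: str, args: Args, ret: int) -> bool:
    """Return-value validation against Iago-style manipulation.

    For read/write, a well-behaved Linux kernel returns either a negative
    errno (-4095..-1) or a byte count between 0 and the requested count.
    """
    if name in ("read", "write"):
        return -4095 <= ret <= args[2]
    return True
```

A kernel that claims to have read more bytes than requested, for example, fails `valid_return` and the caller can abort before using an out-of-bounds length.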

2.3 Cloud Applications and Trusted Execution

Cloud services have evolved to use a more complex task model, where many different software modules are interacting with each other. A single cloud application may rely on multiple modules from different parties. As shown in Figure 1, a recent advancement of function-as-a-service or software-as-a-service has enabled cloud applications to be composed of small functions or modules. Each module is implemented by a different party, and thus, its trustworthiness is not fully guaranteed from the perspective of the other module providers. In addition, the cloud provider must protect the system from modules and clients. To support such new application scenarios, the trusted execution model needs to evolve.

Recent studies investigated applying trusted computing to such distributed cloud applications consisting of multiple modules from different providers [ryoan, sfaas, clemmys, multi-domains, awslambda]. They proposed running each module in an enclave and additionally combining a sandbox with the module inside the enclave. The sandbox library blocks access to untrusted memory from an enclave, which confines the access boundary of an enclave to its own EPC region. With such software sandboxing, they attempt to prevent the functions in an enclave from exploiting potential vulnerabilities of the system. In addition, they use software-encrypted communication via shared memory between enclaves to allow the coordination and data transfer of multiple modules.

Such cloud applications require extensions of the current SGX model: First, TEE not only needs to protect the context in an enclave but also must confine accesses from enclaves, if necessary. Second, a software module in an enclave often needs to access the system resource via system calls. How to control the system call access must also be considered in the SGX model. Third, multiple software modules must efficiently interact with each other. However, the current SGX is not designed for facilitating inter-enclave communication.

3 Motivation

3.1 Bi-directional Isolation with Enclave

In distributed sandboxed applications, an enclave execution must be protected, but it must also be prevented from accessing memory beyond its own EPC region. In the current SGX model, the in-enclave execution is freely allowed to access the rest of its process memory, which is confined only by the operating system. There are two different ways of providing confinement supports for the current SGX enclave model.

Figure 2: Three confinement approaches: SW, HW+OS (MPK), and HW-only (Bi-enclave)

The software-based confinement approach includes the instrumented application binary and the sandbox library code together in each enclave. In such an approach, the sandbox library, as well as the application binary, must be trusted. Figure 2 (a) describes the software-based approach (SW). Combining the sandbox runtime with the application binary in a single enclave increases the Trusted Computing Base (TCB) of the enclave. When a vulnerability exists in the sandbox library [CVE-2019-2054], the application code can potentially exploit the vulnerability to bypass the confinement. In addition to the increased TCB, each memory access occurring in the enclave must be verified by instrumented instructions, causing extra performance costs. A recent study [Chancel] reports that software-based confinement incurs an average slowdown of 12.43%, and up to 24.89%, compared to native execution because it requires 23.52% more instructions.

Recent hardware support for memory confinement, such as Intel Memory Protection Keys (MPK), can mitigate the weaknesses of software-only approaches. However, the current hardware-assisted mechanism relies on page tables for tracking the isolated memory domains. Since page tables can be modified by any privileged access, the confinement assumes that the operating system is trusted. Figure 2 (b) shows the hardware-assisted approach with MPK (HW+OS). A limitation of MPK is that the PKRU register, which defines the permissions for each domain, is user-accessible. Therefore, the user application in an enclave must be verified not to update the PKRU register. In addition, the current HW+OS approach relies on the security of page tables. However, recent studies showed that page tables can be vulnerable to various attacks including rowhammer attacks [Cross_VM_Row_Hammer_Attack, Seaborn, Cheng2018StillHA, Drammer, Rowhammer.js].

To overcome the limitations of the current software-only and hardware-assisted sandbox designs, this paper proposes to extend the SGX memory access control mechanism. Unlike MPK, it does not use page tables to store critical domain information. Figure 2 (c) shows the pure hardware-enforced approach of Stockade (HW-only).

3.2 OS Interactions with Bi-enclave

Figure 3: Three syscall monitor approaches: in-enclave, exo-enclave, and neutral enclave

A bi-enclave is prohibited from accessing memory outside of its own EPC, but it should be allowed to issue system call requests if the requests can be verified to be safe. Therefore, to provide fully functioning sandbox enclaves, it is necessary to support a safe mechanism that verifies system call requests and forwards the filtered requests to the operating system. In addition, the value returned from the untrusted operating system may need to be checked. In contrast, in the prior distributed sandbox approach [ryoan], a confined module is not allowed to use any system call directly.

There are two different approaches to providing system call services to enclave execution: system call emulation and system call delegation. First, the system call emulation approach imports an entire library OS [drawbridge, graphene, lkl] and C standard library [musl] into an enclave [Haven, graphene-sgx, scone, occlum, enclavedom, sgx-lkl]. With the intra-enclave libOS, the effort of porting existing applications to SGX is minimized. In addition, a carefully designed shim layer helps to minimize the exposed interfaces between the host and the enclave, which are the main attack surface of the enclave [CoinAttack]. However, this approach adds the entire software stack to the enclave, increasing the TCB of the enclave significantly. Figure 3 (a) shows the in-enclave monitor approach. It assumes that the monitor code can be completely isolated from the application module by a software confinement layer. However, as discussed in the prior subsection, vulnerabilities in SW-only confinement mean that the protection of the monitor may not be guaranteed.

Second, the system call delegation approach relies on the underlying OS itself, thereby reducing TCB drastically [SGXJail, PANOPLY]. Rather than including a large libOS stack within each enclave, it includes a much smaller codebase for system call interposition and verification after execution. It delegates system calls to the non-enclave mode and performs its system call. The returned results from the system call are verified inside the enclave to prevent malicious intervention. In both cases, the enclave must be able to interact with the untrusted context for system calls. Figure 3 (b) shows the exo-enclave monitor approach. It assumes that the monitor code exists as a process in unprotected environments accessible from the OS kernel. Therefore, this approach is vulnerable to attacks like privilege escalation [privilege_escalation_1, shellshock, Process-injection].

Related work              Confinement              Syscall filter           Protected channel
                          Type     Method          Type         Method
Ryoan [ryoan]             SW       Native Client   In-enclave   Libc        Inter-enclave (SW)
Chancel [Chancel]         SW       LLVM            In-enclave   Libc        Intra-enclave
AccTEE [AccTEE]           SW       WebAssembly     In-enclave   LibOS       -
Occlum [occlum]           HW+OS    MPX (3)         In-enclave   LibOS       Intra-enclave
EnclaveDom [enclavedom]   HW+OS    MPK (2)         In-enclave   LibOS       Intra-enclave
SGXJail [SGXJail]         HW+OS    MPK (2)         Exo-enclave  Seccomp     -
Stockade                  HW-only  SGX             Neutral      Monitor     Inter-enclave (HW)

(2) HW modification required  (3) Deprecated [mpx, intelmpx]
Further columns: Protection against (SW sandbox vulnerability, privilege escalation, PTE corruption, Iago attack) and Multi-module support (3rd-party module), each marked ✓ (considered / secure) or ✗ (not considered / vulnerable).
Table 1: Comparing Stockade to prior work

To allow controlled interaction between the sandbox enclave and the operating system, we propose to add a monitor enclave that can be coupled with one or more bi-enclaves. A monitor enclave is a conventional enclave, and thus it can jump to the system call function in the untrusted region. Between the bi-enclave and the monitor enclave, a protected memory channel is created, and the bi-enclave sends its requests to the monitor enclave. As shown in Figure 3 (c), Stockade takes a different approach from the others: it places the monitor in a position-neutral enclave. In this approach, the monitor is protected both from the user enclave and from the OS kernel.

Sandboxes using system call delegation have to consider races between sandboxes [ostia]. In addition, a prior study [emilia] observed that an Iago attack can occur across multiple components, and thus checking the return value within each enclave individually is not enough to prevent such an attack. In Stockade, the monitor can track global states among bi-enclaves and thus can prevent Iago attacks against connected bi-enclaves. In addition, the design relieves developers of adding Iago attack protection to every bi-enclave.

Tamper-proof resource accounting: One of the requirements for the trusted cloud service is tamper-proof resource accounting [HRA, T-lease, AccTEE]. For each user, the system resource usages must be securely tracked and reported. To support such tamper-proof accounting which can be trusted by both cloud users and service providers, it is necessary to track system resource usages by a mutually trusted entity. As a monitor enclave can be isolated from both users and OS, it can act as a neutral accountant, recording the file and network I/Os.

3.3 Interactions between Enclaves

Figure 4: Comparing inter-enclave communication between SGX and Stockade

Intel SGX supports function-like interfaces, ecall and ocall, between an enclave and the unprotected side for context switching. However, SGX does not provide APIs for inter-enclave communication. The prior work used a software-based reliable messaging mechanism with shared untrusted memory between enclaves [ryoan]. The messages are encrypted with a shared key. Message authentication codes and counters ensure the integrity and freshness of the messages. The protection mechanism can be further improved by padding or truncating the messages to make it hard for an attacker to guess the original message size. The software module for cryptography must be trusted and verified, as it is included in the enclave binary. However, Panoply [PANOPLY] showed that an attacker can silently abort the application and make the application misjudge its state by dropping messages, even when the communication channel is encrypted.
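The integrity and freshness part of such a software channel can be sketched with a message authentication code and a counter. This is a minimal sketch under stated assumptions, not Ryoan's implementation: the `Sender` and `Receiver` classes and the wire layout are hypothetical, HMAC-SHA256 stands in for whatever MAC the prior work uses, and payload encryption is deliberately omitted to keep the example within the standard library.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # HMAC-SHA256 digest size in bytes
CTR_LEN = 8   # 64-bit big-endian message counter


class Sender:
    def __init__(self, key: bytes) -> None:
        self.key = key
        self.counter = 0

    def seal(self, payload: bytes) -> bytes:
        """Return counter || tag || payload, with the tag covering counter and payload."""
        header = struct.pack(">Q", self.counter)
        tag = hmac.new(self.key, header + payload, hashlib.sha256).digest()
        self.counter += 1
        return header + tag + payload


class Receiver:
    def __init__(self, key: bytes) -> None:
        self.key = key
        self.expected = 0

    def open(self, message: bytes) -> bytes:
        """Verify integrity first, then freshness, and return the payload."""
        header = message[:CTR_LEN]
        tag = message[CTR_LEN:CTR_LEN + TAG_LEN]
        payload = message[CTR_LEN + TAG_LEN:]
        good = hmac.new(self.key, header + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, good):
            raise ValueError("integrity check failed")
        (counter,) = struct.unpack(">Q", header)
        if counter != self.expected:
            raise ValueError("stale or out-of-order message")
        self.expected += 1
        return payload
```

The counter lets the receiver detect replayed or reordered messages, but a message that is dropped and never followed by another one goes unnoticed, which is the kind of silent loss the text attributes to Panoply's observation.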

In this study, we use a similar message-based interaction model, but propose to extend the hardware memory access control to allow a protected memory channel. We propose a new secure memory sharing model between two enclaves, which uses a designated part of the EPC memory region. The memory region is accessible only by the two enclaves, and the hardware memory access control and memory encryption engine protect it. It eliminates the need for software-based memory encryption, which increases the latency and may cause potential vulnerabilities such as those shown by Panoply. Figure 4 compares the communication latency and bandwidth of the current SGX and the proposed bi-enclave communication. The SGX model uses software-based encryption (AES-GCM) via untrusted shared memory, while the bi-enclave configuration directly shares the hardware-protected memory. As shown in the figure, the latency for transferring data is much higher with SGX than with Stockade, and the difference increases as the chunk size increases. In addition, Stockade communication exhibits much higher bandwidth than the software-encrypted channel.

3.4 Comparison to the Prior Work

There have been several prior works on sandboxing within a TEE. Minibox [Minibox] presents the first two-way sandbox for native x86 code, providing secure file I/O and Iago attack protection. Ryoan [ryoan] modifies Native Client [NaCl] for its software sandboxing. Similar to Stockade, Ryoan decomposes a cloud application into distributed enclaves, but with software sandboxing and SW-encrypted channels. Chancel [Chancel] proposes multi-client software fault isolation through binary instrumentation and read-only shared memory between threads. It supports multiple isolated threads within an enclave. Several studies employ hardware features (Intel MPX or MPK) under software control for multi-domain SFI schemes [occlum, mdSFI, enclavedom, Chancel]. Occlum [occlum] and EnclaveDom [enclavedom] provide isolated compartments within a sandboxed enclave using Intel MPX or MPK. SGXJail [SGXJail] isolates an enclave instance for each process, and the system call filtering is provided by the seccomp filters in each process.

Table 1 presents the confinement mechanism and monitor locations of the prior works compared to Stockade. Only Stockade can provide comprehensive protection against four attack types, SW sandbox vulnerability, privilege escalation, rowhammer on PTE, and Iago attack. In addition, Stockade allows secure hardware-protected inter-enclave communication.

Other related work: Several studies [Haven, scone, graphene-sgx, occlum] provide trusted library OSes running in the enclave and enable unmodified applications to execute in the enclave. Panoply [PANOPLY] reduces the TCB by delegating system calls to the OS and verifying the results afterward. Nested Enclave [NestedEnclave] presents statically shared enclaves and communication via the outer enclave. Stockade provides dynamic EPC sharing with page granularity.

3.5 Threat model

Stockade shares the basic threat model and trusted computing base (TCB) of SGX. The SGX-enabled processor package is trusted. Privileged software such as the operating system and hypervisor can be compromised by its vulnerability or any person who obtains the privilege permission. Moreover, attackers can wield direct physical attacks on on-board interconnections and external DRAM.

A different assumption of this work over SGX is in the trustworthiness of software modules running in enclaves. In SGX, a software module running in an enclave is trusted, and potential attacks from the module itself to the rest of the system are not considered. We assume that each module does not fully trust the other modules, even when they are used together to build an application. In our model, the code running in the monitor enclave is trusted, and the monitor enclave and a bi-enclave are mutually protected from each other.

Out of scope: Architecture defects [Foreshadow, foreshadow-ng, spectre, meltdown, tlbleed], side channel attacks [telling, inferring, sgxcacheattack, controlled, guard, survey, microscope], and availability are not considered in this work. For such attacks, prior patches [foreshadow_patch] and protections [tsgx, preventing, raccoon, thwarting, obfuscuro, sgx-shield, CoinAttack] can be used as orthogonal measures. Stockade does not provide resiliency against code reuse attacks [guard, hacking] or arbitrary API invocation (e.g., the COIN attack [CoinAttack]).

4 Architecture

4.1 Overview

Figure 5: Stockade architecture. The control flow from the untrusted context to bi-enclave is only allowed at launching

Figure 5 presents Stockade’s distributed sandboxing model. The Stockade architecture has two types of enclaves, the bi-enclave and the monitor enclave, as shown in (a). A bi-enclave inherits all the properties of an SGX enclave, and additionally, a bi-enclave blocks all accesses to the non-enclave memory with hardware support. The code running in a bi-enclave is not allowed to read, write, or execute contents outside the bi-enclave memory. In addition, the control of a bi-enclave cannot be directly transferred to the non-enclave context; it must go through the monitor enclave to interact with the rest of the system. Stockade allows the monitor enclave to communicate with the operating system (OS).

The monitor enclave works as a proxy for communicating with the operating system. To interact with the operating system, a bi-enclave has to establish a secure shared memory channel to a monitor enclave. For communication with other enclaves, the same secure shared memory channel is supported. As shown in Figure 5 (b), a bi-enclave delegates its system calls to the OS through the monitor enclave attached to it. A monitor enclave verifies system calls based on a given profile and validates the return values of system calls to prevent known Iago attacks. In addition, the monitor enclave can track system call usage records in a mutually trusted way.
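The proxy role of the monitor enclave can be sketched as a small request handler that verifies, forwards, validates, and accounts. The `Monitor` class, its profile format (syscall name mapped to an expected argument count), and the `os_call` callback standing in for the untrusted OS are all assumptions made for illustration; the paper's monitor and its profile format may differ.

```python
from collections import Counter
from typing import Callable, Dict, Tuple

Args = Tuple[int, ...]


class Monitor:
    """Hypothetical model of a monitor enclave serving one bi-enclave."""

    def __init__(self, allowed: Dict[str, int],
                 os_call: Callable[[str, Args], int]) -> None:
        self.allowed = allowed   # profile: syscall name -> argument count
        self.os_call = os_call   # stand-in for the untrusted operating system
        self.usage = Counter()   # accounting records kept inside the monitor

    def handle(self, name: str, args: Args) -> int:
        # 1. Verify the request against the profile before it leaves.
        if self.allowed.get(name) != len(args):
            raise PermissionError(f"syscall {name!r} rejected by profile")
        # 2. Forward the request to the OS on behalf of the bi-enclave.
        ret = self.os_call(name, args)
        # 3. Validate the return value (args[2] is the byte count for
        #    read/write; -4095..-1 are the Linux error codes).
        if name in ("read", "write") and not (-4095 <= ret <= args[2]):
            raise RuntimeError("implausible return value (possible Iago attack)")
        # 4. Record the completed call for resource accounting.
        self.usage[name] += 1
        return ret
```

Keeping the usage counter inside the monitor is what makes the record trustworthy to both sides in this model: neither the bi-enclave nor the OS can write to it directly.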

Application model: In Stockade model, a service consists of one or more mutually distrustful modules. Each module is enclosed in a bi-enclave, and mutually distrustful modules do not reside in the same bi-enclave. With the protection boundary, both bi-enclave and monitor can have multiple threads.

Target | Kind | Description
SECS | Data structure | New field (1 bit) for bi-enclave flag (4.1)
EPCM entry | Data structure | New field (52 bits) for physical address of co-owner’s SECS (4.2)
EINIT | Instruction | Set bi-enclave flag in SECS on initialization (4.1)
EEXIT | Instruction | Abort EEXIT when bi-enclave flag is set in SECS (4.2)
ESADD | Instruction | Establish shared EPC to target enclave (Ring 3) (4.3)
ESACCEPT | Instruction | Accept shareable EPC from ESADD (Ring 3) (4.3)
TLB | Access checks on TLB miss | Fill entry on co-owner access / abort on forbidden access (4.2)
Table 2: Summary of hardware changes

Hardware changes: The SGX security features are mostly implemented in microcode, which incurs much less implementation overhead than CPU circuitry [sgx-explained]. Therefore, the majority of Stockade’s modifications to data structures, instructions, and access control can be made via minor microcode changes. Table 2 shows the hardware changes required for Stockade and the related subsections. First, to support the new enclave type, the bi-enclave, the SECS and EPCM entries are modified. Second, new instructions for the secure channel, ESADD and ESACCEPT, are added, while existing instructions (EINIT and EEXIT) are modified. Finally, the TLB miss handler is changed to support the permission checks in which Stockade differs from SGX. Note that access validation is done only when a TLB miss occurs. Stockade does not change any other components in the processor or the cache hierarchy.

4.2 Memory Protection for Stockade

Figure 6: Access control flow for Stockade. Modifications to the original SGX flow [sgx-explained] are shown in blue.

Access Validation: Stockade leverages SGX memory protection features to enable bi-directional memory protection. On top of SGX’s original memory isolation, Stockade provides additional memory protection in the opposite direction: it protects the non-enclave memory context by preventing address translation from a bi-enclave. Figure 6 is the hardware flowchart for Stockade’s address translation. (1) in the figure indicates the additional memory protection added for Stockade. When an enclave is not in bi-enclave mode, memory access to a non-ELRANGE virtual address is allowed, so Stockade inserts a new entry into the TLB in the same way as the original SGX. However, when the enclave is in bi-enclave mode, the sandboxed code must not be allowed to translate addresses to outside memory. In that case, Stockade inserts an abort page so that resolving the non-ELRANGE virtual address to a physical address fails. As shown in the figure, the extra access control for bi-enclaves does not require any significant hardware changes; Stockade needs only a minor extra condition check while handling a TLB miss.
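The extra check at (1) can be illustrated with a minimal Python sketch. This is only a software model of the decision described above, not the microcode itself: the names `Secs`, `ABORT_PAGE`, and `translate_non_elrange` are hypothetical, the page table is modeled as a plain dictionary, and the sketch omits the other checks of the original SGX flow (for example, that a non-ELRANGE address must not map into PRM).

```python
from dataclasses import dataclass

ABORT_PAGE = "abort_page"  # hypothetical stand-in for the abort-page translation
PAGE_SHIFT = 12            # 4 KB pages


@dataclass
class Secs:
    base: int         # start of ELRANGE (virtual address)
    size: int         # length of ELRANGE in bytes
    bi_enclave: bool  # the 1-bit flag that Stockade adds to SECS (Table 2)


def in_elrange(secs, vaddr):
    return secs.base <= vaddr < secs.base + secs.size


def translate_non_elrange(secs, vaddr, page_table):
    """Model a TLB miss in enclave mode for an address outside ELRANGE."""
    if in_elrange(secs, vaddr):
        raise ValueError("ELRANGE addresses take the EPCM path instead")
    if secs.bi_enclave:
        # Stockade: a bi-enclave must never reach non-enclave memory.
        return ABORT_PAGE
    # Original SGX behavior: a normal enclave may access non-enclave memory.
    return page_table[vaddr >> PAGE_SHIFT]
```

With the same page table, a normal enclave obtains a real frame number while a bi-enclave obtains the abort page, which is the only behavioral difference the sketch is meant to show.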

Control transition: By calling ocall, an enclave transfers control from the enclave to the non-enclave context. To perform an ocall, a normal enclave saves its state in the protected memory, cleanses all internal CPU states to prevent security leaks, and switches into non-enclave mode with EEXIT. Unlike normal enclaves, a bi-enclave is isolated by disabling the EEXIT instruction. The hardware modification for EEXIT is minor, since the CPU only checks whether the current context is in bi-enclave mode by reading the flag in the SECS structure. However, Asynchronous Enclave Exit (AEX) is still allowed even for a bi-enclave, because the operating system must handle exceptions such as page faults and interrupts. When an AEX occurs, the execution context is securely saved in the enclave memory, and Stockade switches its execution mode to handle the exit event. The event is handled by designated hardware exception handlers in the processor. During AEX, the hardware erases any context (secrets) that may exist in the execution state [sgx-explained]. Therefore, the software in a bi-enclave cannot exploit AEX to escape the sandbox. The saved context is restored during the next EENTER or ERESUME. Stockade does not modify the AEX flow of the original SGX.
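The two exit paths can be contrasted in a short Python sketch. It is an illustrative model under simplifying assumptions: `eexit`, `aex`, and `EexitAborted` are hypothetical names, registers are a dictionary, and the state save area is a list.

```python
class EexitAborted(Exception):
    """Raised when the modeled EEXIT is refused."""


def eexit(is_bi_enclave):
    """Synchronous exit: refused when the SECS bi-enclave flag is set."""
    if is_bi_enclave:
        raise EexitAborted("EEXIT is disabled in bi-enclave mode")
    return "non-enclave"


def aex(registers, save_area):
    """Asynchronous exit: allowed for every enclave type, but the context
    is saved inside enclave memory and scrubbed before leaving."""
    save_area.append(dict(registers))  # saved copy stays in enclave memory
    for name in registers:
        registers[name] = 0            # nothing secret remains in CPU state
    return "non-enclave"
```

The point of the sketch is that the only software-initiated exit is blocked for a bi-enclave, while the hardware-initiated exit leaves no register contents for the sandboxed code to pass outward.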

Advantages: Compared to the prior SW and HW+OS confinement approaches [ryoan, Chancel, AccTEE, SGXJail], Stockade can provide hardware-supported strong isolation efficiently. Stockade does not require any extra SW layers or compiler-based validations for the confinement, unlike the prior SW approaches. Furthermore, Stockade is more robust against Spectre-like attacks which attempt to bypass protection boundary checks of the SW approaches because an unauthorized speculative access will incur a TLB miss and the address translation fails. Compared to the HW+OS approaches which keep domain IDs in vulnerable page tables, Stockade keeps the critical meta-data in the secure memory region (PRM). Therefore, the meta-data is protected from OS and from DRAM attacks including rowhammer because any bitflips in PRM are detected by the integrity validation of the hardware engine.

4.3 Pairwise Secure Shared Memory

For efficient communication between enclaves, Stockade introduces a new secure channel using protected shared memory. The communication channel is a small piece of SGX memory shared exclusively between two enclaves. To share an EPC page, the two enclaves have to agree on the memory sharing. In Stockade, establishing the shared channel is done by hardware. Once the channel is established, software-based encryption is no longer necessary for inter-enclave communication. Moreover, if the contents fit in the CPU caches, data exchange can avoid even the hardware memory encryption.

The EPCM in SGX contains the EPC mapping information used to validate translations. To support shared memory, the EPCM entry is modified to include the sharer information. In addition to the owner enclave’s SECS address, a shared EPCM entry includes the SECS address of a single co-owner enclave. The owner’s SECS address is represented as a physical page number relative to the start address of the EPC [sgx-explained], and Stockade extends the EPCM in the same way for the co-owner. In Stockade, a single EPC page can be shared only between two enclaves. Figure 6 shows the hardware extension for ownership checking during a TLB miss on the EPC. In (2) of the figure, when a memory access to the EPC occurs, Stockade checks the corresponding EPCM entry and verifies the owner enclave. The EPCM entry has at most two enclaves as its owner and co-owner, and thus only two different enclave contexts can access the EPC page.

Sharing EPC memory: Stockade provides a new user-level instruction, ESADD, to share an EPC page with a co-owner enclave. The instruction takes the EPC page address of the owner enclave and the ID of the co-owner enclave as inputs. When ESADD is invoked, the SGX hardware zeros the page and blocks any access to the page until the corresponding co-owner invokes ESACCEPT. ESACCEPT performs TLB synchronization to remove old mappings from the TLB and records the co-owner in the corresponding EPCM entry. This step is similar to the dynamic EPC expansion instructions in SGX2 [sgx2].
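The ownership check and the ESADD/ESACCEPT handshake can be modeled as a small state machine. The Python sketch below is an assumption-laden illustration: `EpcPage`, `esadd`, `esaccept`, and `can_access` are hypothetical names, enclaves are identified by plain integers standing in for SECS addresses, and the TLB synchronization performed by ESACCEPT is only noted in a comment.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096


@dataclass
class EpcPage:
    owner: int                         # stands in for the owner's SECS address
    data: bytearray
    co_owner: Optional[int] = None     # recorded only after ESACCEPT
    pending_for: Optional[int] = None  # target named by ESADD, not yet accepted


def esadd(page, caller, target):
    """Owner offers the page to exactly one target enclave."""
    if caller != page.owner:
        raise PermissionError("only the owner may share the page")
    if page.co_owner is not None or page.pending_for is not None:
        raise ValueError("a page can be shared with one co-owner only")
    page.data[:] = bytes(PAGE_SIZE)    # the hardware zeros the page
    page.pending_for = target          # access is blocked until accepted


def esaccept(page, caller):
    """Target accepts; real hardware would also synchronize TLBs here."""
    if page.pending_for != caller:
        raise PermissionError("page was not offered to this enclave")
    page.co_owner = caller
    page.pending_for = None


def can_access(page, caller):
    """EPCM check on a TLB miss: owner or co-owner, and not mid-handshake."""
    if page.pending_for is not None:
        return False
    return caller == page.owner or (
        page.co_owner is not None and caller == page.co_owner
    )
```

In this model a third enclave can never pass `can_access`, because the entry holds at most one owner and one co-owner.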

Once the shared memory is established, the two enclaves initiate the local attestation step. Each enclave’s digest (MRENCLAVE) is passed through the newly established shared memory. The digest uniquely identifies each enclave because it records all the enclave contents (page contents, relative position, security flags) [sgx-explained]. If both enclaves verify each other successfully, they finalize the channel establishment. Otherwise, Stockade destroys the channel.
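The mutual check can be sketched as follows. This is a simplified illustration, not the SGX attestation protocol: the real MRENCLAVE is computed by hardware while the enclave is built, whereas the hypothetical `measure` function below merely hashes a list of (offset, flags, content) tuples with SHA-256, and `finalize_channel` is an assumed name for the final decision.

```python
import hashlib
import hmac


def measure(pages):
    """Stand-in for an enclave digest over page contents, position, and flags."""
    h = hashlib.sha256()
    for offset, flags, content in pages:
        h.update(offset.to_bytes(8, "little"))
        h.update(flags.to_bytes(8, "little"))
        h.update(content)
    return h.digest()


def finalize_channel(digest_a, expected_a, digest_b, expected_b):
    """Keep the channel only if both sides match the digest the peer expects."""
    ok = hmac.compare_digest(digest_a, expected_a) and hmac.compare_digest(
        digest_b, expected_b
    )
    return "established" if ok else "destroyed"
```

Changing a single byte of content, the offset, or the flags of any page yields a different digest in this model, so a substituted module fails the check and the channel is torn down.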

Communication via APIs: The communication API is similar to ecall and ocall, but all the arguments are secured. When an enclave module invokes the API, the module sanitizes and marshals all the parameters into a structure allocated in the shared EPC. After that, Stockade copies the parameters into the callee’s private memory to prevent possible time-of-check-to-time-of-use (TOCTOU) attacks [TOCTOU] and de-marshals them to execute the callee’s function. Because the caller and callee belong to different enclaves, a CPU state flush is not necessary.
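The copy-before-check step is the part that defeats TOCTOU, and a minimal Python sketch makes the ordering explicit. The wire layout (a function ID and a length, followed by the payload) and the names `marshal` and `dispatch` are assumptions made for illustration; the paper does not specify the structure used in the shared EPC.

```python
import struct

HEADER = struct.Struct("<II")  # assumed layout: function id, payload length


def marshal(shared, func_id, payload):
    """Caller side: write the call into the shared page."""
    if HEADER.size + len(payload) > len(shared):
        raise ValueError("payload does not fit in the shared page")
    HEADER.pack_into(shared, 0, func_id, len(payload))
    shared[HEADER.size:HEADER.size + len(payload)] = payload


def dispatch(shared, handlers, max_len):
    """Callee side: snapshot first, then validate and use only the snapshot."""
    private = bytes(shared)  # one copy into callee-private memory
    func_id, length = HEADER.unpack_from(private, 0)
    if length > max_len or HEADER.size + length > len(private):
        raise ValueError("invalid payload length")
    if func_id not in handlers:
        raise KeyError("unknown function id")
    return handlers[func_id](private[HEADER.size:HEADER.size + length])
```

Because validation and use both read `private`, a caller that rewrites the shared page after the check cannot change what the callee executes on.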

4.4 Sandbox Monitor

A monitor enclave executes a software reference monitor which verifies system calls and their return values. For legitimate system calls, the monitor enclave executes them on behalf of bi-enclaves. The monitor can issue system calls to the kernel or can leverage Intel-provided C standard libraries [developer_guide].

Policy Loading: The monitor enclave reads a policy definition file for the bi-enclave, which is mutually agreed on and shared in advance by the application module provider and the cloud provider. To verify that the policy file is correctly loaded, both a bi-enclave and the OS can query the monitor enclave to obtain the digest of the file. A bi-enclave issues the query using the key created during local attestation, and checks that the digest matches its own policy digest to make sure the file has not been manipulated by the OS. The monitor enclave opens an upcall interface only for serving the digest query from the OS. This is a similar approach to Intel’s attestation for verifying an enclave’s identity.

SYS_NUM ACTION
0        0      // read     ALLOW
1        2      // write    NOTIFY
2        1      // open     LOG
42       5      // connect  KILL
43       3      // accept   TRAP
BLACKLIST   0  "/path/to/top/secret*"
WHITELIST   2  "/path/to/no/secret/[a-z_\-\s0-9\.]"
BLACKLIST  43  "112.233.0.0/16"
Listing 1: Example policy file

Monitor as mediator: Listing 1 shows an example policy definition file supported in our system. Similar to seccomp-bpf [seccomp], the monitor enclave filters each system call by its system call ID and arguments, granting fine-grained privileges to system resources through a blacklist and a whitelist. To speed up syscall filtering, the sandbox monitor can adopt an action-based policy. If the action is KILL, the monitor enclave sends a request to the kernel to terminate the bi-enclave. On NOTIFY, the monitor sends a notification to the kernel and continues execution. When the action is LOG, it writes encrypted logs of the system call to a file. On TRAP, the monitor runs customized logic (e.g., sends a message to the module provider).
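As a rough illustration of how a monitor might load such a file, the following Python sketch parses the text of Listing 1. It is not the authors’ parser: `parse_policy` is a hypothetical name, and the mapping from numeric action codes to names is inferred only from the comments in the listing (0 ALLOW, 1 LOG, 2 NOTIFY, 3 TRAP, 5 KILL), so any other code is rejected. The pattern strings are stored verbatim because the listing does not define their matching semantics.

```python
# Action codes inferred from the comments in Listing 1 (an assumption).
ACTIONS = {0: "ALLOW", 1: "LOG", 2: "NOTIFY", 3: "TRAP", 5: "KILL"}

POLICY = r'''SYS_NUM ACTION
0        0      // read     ALLOW
1        2      // write    NOTIFY
2        1      // open     LOG
42       5      // connect  KILL
43       3      // accept   TRAP
BLACKLIST   0  "/path/to/top/secret*"
WHITELIST   2  "/path/to/no/secret/[a-z_\-\s0-9\.]"
BLACKLIST  43  "112.233.0.0/16"
'''


def parse_policy(text):
    """Return ({syscall number: action name}, [(list kind, syscall, pattern)])."""
    actions, rules = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("SYS_NUM"):
            continue  # skip blank lines and the header row
        if line.startswith(("BLACKLIST", "WHITELIST")):
            kind, num, pattern = line.split(None, 2)
            rules.append((kind, int(num), pattern.strip('"')))
        else:
            num, code = line.split("//")[0].split()
            actions[int(num)] = ACTIONS[int(code)]  # KeyError on unknown code
    return actions, rules


actions, rules = parse_policy(POLICY)
```

A monitor built on this would look up `actions` by system call number first and consult `rules` only for calls whose arguments need checking, which is one way to read the action-based speedup mentioned above.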

When a system call returns, the monitor enclave verifies the return value from the kernel to prevent Iago attacks. For example, since most system calls return a boolean or integer type [PANOPLY], the monitor can check whether the return value falls within the proper range. Stockade also checks that futexes, locks, and semaphores are not shared between a bi-enclave and the untrusted world. For a system call that returns a descriptor or reference (e.g., open, socket), the monitor enclave keeps it in its own memory so that the returned descriptor cannot be substituted or reused. In addition, Stockade is resilient to pointer misuse since the reference is not accessible by a bi-enclave.
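Both checks can be sketched in a few lines of Python. The sketch assumes POSIX `read` semantics (a return of -1 on error, otherwise between 0 and the requested byte count) and assumes the monitor hands the bi-enclave an opaque handle instead of the raw descriptor; `check_read_return` and `DescriptorTable` are hypothetical names, and the handle scheme is one possible design rather than the paper’s implementation.

```python
def check_read_return(ret, requested):
    """A read() result must be -1 (error) or a byte count within the request."""
    return -1 <= ret <= requested


class DescriptorTable:
    """Keeps kernel-returned descriptors in monitor memory only."""

    def __init__(self):
        self._host_fds = {}  # opaque handle -> descriptor returned by the OS
        self._next_handle = 0

    def register(self, host_fd):
        if host_fd < 0:
            raise ValueError("kernel reported failure")
        if host_fd in self._host_fds.values():
            # A live descriptor returned twice suggests substitution or reuse.
            raise ValueError("descriptor already in use")
        handle = self._next_handle
        self._next_handle += 1
        self._host_fds[handle] = host_fd
        return handle  # only the handle is given to the bi-enclave

    def resolve(self, handle):
        return self._host_fds[handle]

    def close(self, handle):
        del self._host_fds[handle]
```

A kernel that claims `read` returned more bytes than were requested, or that returns a descriptor the monitor already tracks as open, is rejected before the value reaches the bi-enclave.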

Trusted accounting: As discussed, the monitor enclave builds mutual trust. The monitor enclave is isolated from both the user-provided bi-enclave and the host operating system, so it works in a neutral area that both the bi-enclave and the kernel can trust. This model enables functionality beyond system call monitoring to be deployed in the monitor enclave. One use case is a trusted resource accounting system for function-as-a-service. Cloud resource usage accounting needs to be verifiable by both users and the provider [HRA]. For example, the monitor enclave can record tamper-proof evidence (e.g., a log file) of network and file usage, because all accesses (system calls) to these resources must pass through the monitor enclave. Therefore, the monitor enclave can log resource requests from each bi-enclave, and neither a bi-enclave nor the host OS can modify the log contents.
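One way to make such a log tamper-evident is to chain each record to the previous one with a hash. The Python sketch below uses that technique purely as an illustration; the paper does not state how its log is protected beyond keeping it inside the monitor enclave, and `AccountingLog` and its record fields are hypothetical.

```python
import hashlib
import json

GENESIS = b"\x00" * 32  # fixed starting value for the chain


class AccountingLog:
    def __init__(self):
        self.entries = []  # list of (serialized record, chained digest)
        self.head = GENESIS

    def record(self, enclave_id, syscall, nbytes):
        body = json.dumps(
            {"enclave": enclave_id, "syscall": syscall, "bytes": nbytes},
            sort_keys=True,
        ).encode()
        self.head = hashlib.sha256(self.head + body).digest()
        self.entries.append((body, self.head))

    def verify(self):
        """Recompute the chain; any edited or dropped record breaks it."""
        h = GENESIS
        for body, digest in self.entries:
            h = hashlib.sha256(h + body).digest()
            if h != digest:
                return False
        return h == self.head
```

If the final digest is reported through attestation, a verifier holding the exported records can recompute the chain and detect any modification made outside the monitor enclave.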

Benchmark | SGX-enabled library | # of allowed interfaces | Modified LOC
NBench | None | 0 | 8
SSL Server | OpenSSL [sgx-openssl] | 18 | 8
File I/O bench | Protected FS [protected_fs] | 19 | 12
YCSB (SQL) | SQLite [sgx-sqlite] | 12 | 56
ML benchmark | LibSVM [libsvm] | 7 | 8
FTPS Server | OpenSSL & Protected FS | 37 | 20
Table 3: Benchmarks for evaluation.

Type | Attacker | Target | Section
Read / Write / Execute | OS | Bi-enclave, Monitor | 4.2
Read / Write / Execute | Bi-enclave | Other Bi-enclave, Monitor | 4.2
Read / Write / Execute | Monitor | Bi-enclave | 4.2
Read / Write / Execute | Bi-enclave | Outside sandbox | 4.2
Transfer control | Bi-enclave | Other Bi-enclave, Monitor | 4.2
Transfer control | Bi-enclave | Outside sandbox | 4.2
Establish a connection | OS | Bi-enclave, Monitor | 4.3
Eavesdrop / Modify | OS | Shared Channel | 4.3
Known Iago attacks | OS | Bi-enclave, Monitor | 4.4
Table 4: Summary of security analysis of Stockade

5 Discussion

5.1 Development in Stockade

Like other SGX-based sandboxing systems [ryoan], applications using Stockade are compartmentalized based on protection domains. Stockade executes each protection domain in a separate bi-enclave. Module providers should specify a policy for their modules as discussed in Section 4.4. To communicate among bi-enclaves, Stockade provides APIs in the SGX SDK. Application developers simply replace the existing communication APIs (e.g., ecall/ocall) with Stockade’s APIs, so module providers do not need to learn new interfaces. Table 3 lists the benchmarks with their allowed interfaces and modified LOC. When an application has already been ported to SGX, porting it to Stockade requires only a few lines for initial setup. Most of the porting is done in the Makefile, which would be provided by the cloud provider. If the application involves mutually untrusted modules written by multiple parties, the developer has to map each untrusted module to a bi-enclave.

Porting effort is required when an application’s SGX module is written to communicate with non-enclave code (e.g., an ocall to untrusted libraries in non-enclave mode). Because Stockade does not allow such communication, developers must port the non-enclave code to run inside another bi-enclave. In addition, each bi-enclave must not access memory beyond its sandbox boundary, since such accesses resolve to the abort page.

5.2 Security Analysis

Table 4 summarizes the security analysis of Stockade. An attacker may try to break the isolation of a bi-enclave by compromising or launching a bi-enclave. However, the attacker cannot run or access the untrusted context even with ret, jmp, and EEXIT, as described in Section 4.2. The compromised bi-enclave cannot transfer control to the monitor enclave or another bi-enclave because they are separate enclaves. Also, the compromised bi-enclave cannot create a shared memory with an arbitrary bi-enclave because every shared memory establishment is verified with attestation. On the other hand, malicious system software cannot eavesdrop on or hijack communication to mount man-in-the-middle attacks [PANOPLY]. Stockade allows the enclave-to-enclave communication channel only via SGX-protected memory, so, unlike in the original SGX, the OS or hypervisor is not able to access the communication channel. The remaining attack surface is the syscall interface of the monitor enclave. We implemented known Iago attack protections by checking the file descriptors returned from syscalls [Overshadow, Inktag] and POSIX semaphore invocations [Sego].

Limitation: An attacker may subvert the entire application by gathering code-reuse gadgets one by one across multiple modules and exploiting vulnerable APIs between them. As mentioned in Section 3.5, we leave this as a limitation.

6 Evaluation

Benchmark | Hardware mode | Simulation mode
SGX-NBench (geomean) [sgx-nbench] | 6.0 | 6.4
SGX ecall / ocall (switchless) | 2283.8 / 3748.5 | 3110.3 / 3783.7
Stockade inter-enclave call | - | 4930.5
Table 5: Hardware and simulation mode performance comparison (1000 iterations/sec)
Figure 7: Evaluation scenarios

Figure 8: Comparison of execution times between applications run on several secure systems including Stockade

6.1 Methodology

Environment: We evaluate Stockade on servers with an Intel i7-7700 CPU, 64GB of DDR4 DRAM, and Ubuntu 16.04 with Linux kernel 4.13.0. To add new hardware features, we use the simulation mode of the Intel SGX driver and SDK version 2.2. The simulation mode supports SGX APIs, trusted libraries, and emulation of SGX instructions [sgx-simulation]. Table 5 compares the performance of the hardware mode and the simulation mode. To capture the effect of TLB shootdown, Stockade sends an ioctl to the SGX driver. The driver runs mov cr3, cr3, which flushes the TLB of the process.

Stockade features: Stockade hardware features are implemented mostly in the SDK and the driver. Disabling ocall from a bi-enclave is done by modifying the emulated EEXIT instruction as described in Section 4.2. We modified Edger8r in the SDK to generate APIs from the Enclave Definition Language (EDL) format. Based on the format, Edger8r generates a per-API data structure for type and boundary checking. For example, pointer arrays require the number of elements to be passed for marshaling. The generated APIs are linked to enclave modules at compile time.

6.2 Sandbox Overhead

In this section, we measure sandboxing overhead of Stockade compared to software-based approaches that use process isolation and binary instrumentation. Figure 7 (a) shows the evaluation scenario.

Comparison to process-based filtering: SGXJail is a sandbox that isolates an enclave instance in a separate process confined by seccomp filters [seccomp]. Because SGXJail is not open-sourced, we emulate SGXJail on our platform. To reproduce SGXJail’s performance, we use Firejail [firejail], which leverages Linux namespaces and seccomp to provide system call interposition. We compare Stockade with four control groups: no protection from either SGX or a sandbox (No Protection), a software-based sandbox (Firejail), an SGX enclave (SGX), and an SGX enclave with the software sandbox (SGXJail).

Figure 8 shows the normalized execution time of SQLite and the ML service. We evaluate SQLite as an I/O-intensive benchmark. SQLite runs sets of queries generated by YCSB [YCSB]. Each set contains 10,000 INSERT, SELECT, and UPDATE queries in different ratios, with keys drawn from a uniform distribution. A higher ratio of SELECT queries degrades performance more than INSERT/UPDATE, since SELECT generates more syscalls than the others to traverse the database. SGXJail shows the slowest performance: it is 1.83× slower on average than SGX due to frequent system call monitoring by Firejail. Meanwhile, Stockade is only 1.49× slower on average than SGX, as Stockade passes syscall requests without costly IPC. As a result, Stockade shows 18.6% better performance than SGXJail while providing stronger hardware-based isolation. In the ML service, SGX, SGXJail, and Stockade differ in speed by less than 3%, as ML inference is CPU-intensive and seldom invokes syscalls.
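The 18.6% figure follows from the two average slowdowns if the improvement is read as the reduction in execution time relative to SGXJail. The short Python check below makes that reading explicit; the interpretation is an assumption, and `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(slower, faster):
    """Fraction of execution time saved when moving from `slower` to `faster`."""
    return (slower - faster) / slower


SGXJAIL_SLOWDOWN = 1.83   # average slowdown over SGX, from the text
STOCKADE_SLOWDOWN = 1.49  # average slowdown over SGX, from the text

improvement_pct = round(
    relative_reduction(SGXJAIL_SLOWDOWN, STOCKADE_SLOWDOWN) * 100, 1
)
print(improvement_pct)  # 18.6
```

Under this reading, (1.83 - 1.49) / 1.83 is about 0.186, which matches the reported value.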

Figure 9: Normalized performance of NBench.

Comparison to binary instrumentation: We compare Stockade with Chancel [Chancel], a software-based binary instrumentation approach that, like Stockade, provides bi-directional isolation. We run NBench, which consists of ten benchmarks exercising CPU, FPU, and memory capabilities [nbench]. For a fair comparison, we build Stockade with clang-4.0 and the -O0 option, the same configuration as Chancel. Figure 9 shows the normalized performance degradation. We normalize performance to the non-confined baseline. In NBench, Chancel shows a 12.3% performance degradation on average over its baseline. The overhead comes from Chancel’s binary instrumentation, which adds additional instructions (+23.5%). In contrast, Stockade runs NBench in hardware-confined areas and does not degrade performance compared to the baseline, because NBench does not communicate with other modules and therefore incurs neither IPC nor system call monitoring overhead in the monitor enclave.

6.3 Communication cost reduction with Stockade

Figure 10: Execution time breakdown in file I/O scenario

In this section, we compare the communication cost of Stockade (HW) to that of a SW-based approach. Figure 7 (b) represents the scenario. We develop Protected FS, which provides integrity and confidentiality protection for files. One module runs the Protected FS, and a communicating module uses the Protected FS to secure its file I/O. The communication is done via Stockade’s secure channel, so the file contents and messages are secured. For the same guarantee, we implement Baseline, which uses software encryption for secure communication but does not confine modules.

Figure 10 presents the normalized execution time for various chunk sizes. The execution time is normalized to that of Baseline with a 64B chunk size. To observe the communication cost clearly, we perform file I/O on a mounted tmpfs (DRAM backend). We break down the performance into three factors. File I/O indicates the time taken by file APIs, and Communication shows the execution time of all communication between modules, including message serialization and encryption. Monitor is the overhead of the Stockade monitor. Stockade’s HW-based communication effectively reduces the communication cost compared with SW-based approaches. As a consequence, Stockade is up to 1.38× faster in read and 1.16× faster in write than Baseline when the chunk size is 1KB. This speedup comes from eliminating costly software encryption and decryption. When the chunk size is 64B, Stockade is up to 1.89× faster in read and 1.92× faster in write than Baseline. Fine-grained file I/O operations cause more frequent communication between modules, and thus stress the communication cost.

6.4 Tamper-proof Accounting System

Figure 11: Latency distribution of FTPS requests

In a cloud system, multiple tenants compete for resources on limited hardware. Trusted tamper-proof accounting systems provide a useful way to securely manage resources across tenants [HRA, T-lease, AccTEE]. Stockade can provide such a system using a monitor enclave, as shown in Figure 7 (c). Because the monitor enclave intercepts all system calls from bi-enclaves, it can account for every system call statistic (e.g., file I/O accesses, network requests, memory consumption). Stockade provides trustworthy accounting reports that the service provider can verify through attestation.

To demonstrate the scenario, we implement the following accounting system in the monitor enclave. We spawn 512 clients, each of which sends a request for a 1MB file to a secure FTP server. The server takes each request and sends the corresponding file back to the client. While handling the requests, the monitor enclave logs per-request resource consumption in its secure memory. Figure 11 shows the latency distribution of the requests for three different systems: SGX, Stockade, and Stockade + Accounting. The median latency of Stockade is only 2% higher than that of SGX, and Stockade + Accounting adds negligible overhead in median and tail latencies. This implies that mutually trusted secure accounting can be achieved via Stockade without large overhead.

6.5 Secure Services with Multiple Modules

To evaluate the combined benefits of Stockade, we build a secure query server containing DB and ML services. Figure 7 (d) describes the system; each service consists of multiple modules, each isolated in its own bi-enclave. The modules first attest to each other and establish secure communication channels. The monitor enclave reads a system call policy that specifies the least privilege of each module. For example, SQLite is not allowed to use network-related system calls such as connect or send. Each service uses confined third-party modules for secure network communication (SSL Server) and safe file management (Protected FS). Whenever a client sends security-sensitive data to the DB service via SSL Server, the SQLite module asks the protected file system to store or load the data. Protected FS has its own encryption key, which is not accessible from the SSL Server or SQLite module. In addition, the LibSVM module handles a prediction with the inference model stored in the local file system and encrypted by Protected FS. Even if SSL Server is compromised, the inference model cannot be stolen due to Stockade’s isolation.

Figure 12: Normalized execution time of distributed query server scenarios

Figure 12 shows the performance of the two services. SGX indicates that each module runs in an enclave and communicates with the others via an unprotected channel without encryption. The enclaves are not confined, so they can perform any system call. In SGXJail and Stockade, however, every system call is verified. The sandbox overhead of Stockade incurs a slowdown as shown in Section 6.2, but the efficient hardware-based encryption amortizes the performance degradation. As a result, for I/O-intensive SQLite, Stockade shows 38.9% overhead compared to SGX and performs 19.5% better than SGXJail on average. Despite the communication overhead from adding the monitor enclave, Stockade outperforms SGXJail by leveraging hardware encryption. For the ML service, the performance of the three models is similar, but some overhead is added in SGXJail and Stockade due to secured communication.

7 Conclusion

This paper explores Stockade, a new extension model for SGX that supports distributed sandboxing. With a minor change to SGX, Stockade provides strong sandboxing. In addition, it places a mutually trusted monitor enclave between the user bi-enclave and the operating system, which filters system call requests and validates return values. The performance results show the viability of Stockade. In the multi-module scenario, Stockade shows an average 19.5% speedup for SQLite and a 1.4% speedup for the ML service over the SW sandbox approach.

Acknowledgements

This work was supported by Institute for Information & communications Technology Promotion (IITP2017-0-00466). The grant is funded by the Ministry of Science and ICT, Korea. This work was also partly supported by Samsung Electronics Co., Ltd. (IO201209-07864-01).

References