BesFS: Mechanized Proof of an Iago-Safe Filesystem for Enclaves

07/02/2018 — Shweta Shinde et al., National University of Singapore

New trusted computing primitives such as Intel SGX have shown the feasibility of running user-level applications in enclaves on a commodity trusted processor without trusting a large OS. However, the OS can compromise the integrity of the applications via the system call interface by tampering with the return values. This class of attacks (commonly referred to as Iago attacks) has been shown to be powerful enough to execute arbitrary logic in enclave programs. To this end, we present BesFS -- a formal and provably Iago-safe API specification for the filesystem subset of the POSIX interface. We prove 118 lemmas and 2 key theorems in 3676 lines of Coq proof scripts, which directly prove the safety properties of the BesFS implementation. The BesFS API is expressive enough to support the 17 real applications we test, and this principled approach eliminates several bugs. BesFS integrates into existing SGX-enabled applications with minimal impact to TCB (less than 750 LOC), and it can serve as a concrete test oracle for other hand-coded Iago-safety checks.


1. Introduction

Existing computer systems encompass millions of lines of complex operating system (OS) code that is highly susceptible to vulnerabilities and yet is trusted by all user-level applications. In the last decade, a line of research has established that trusting an OS implementation is not necessary. Specifically, new trusted computing primitives (e.g., Intel SGX (McKeen et al., 2013), Sanctum (Costan et al.), PodArch (Shinde et al.), Bastion (Champagne and Lee, 2010)) have shown the feasibility of running user-level applications on a commodity trusted processor without trusting a large OS. These are called enclaved execution primitives, using the parlance introduced by Intel SGX — a widely shipping feature in commodity Intel processors today. Applications on such systems run isolated from the OS in a region of CPU-protected memory called an enclave; the adversary model defeated by individual designs varies (see Costan and Devadas, 2016; Maas et al., 2013).

The promise of enclaving systems is to minimize the trusted code base (TCB) of a security-critical application. Ideally, the TCB can be made boilerplate and small enough to be formally verified to be free of vulnerabilities. Towards this vision, recent works have formally specified and checked the interfaces between the enclave and the CPU (Subramanyan et al., 2017; Ferraiuolo et al., 2017), as well as verified confidentiality properties of an application (Sinha et al.). One critical gap remains unaddressed: verifying that the integrity of the application is protected from a hostile OS. Applications are increasingly becoming easier to port to enclaves (Shinde et al., 2017; Baumann et al., 2014; Tsai et al., 2017); however, these legacy applications optimistically assume that the OS is benign. A hostile OS, however, can behave arbitrarily, violating assumptions inherent in the basic abstractions of a process or files, and exchange malicious data with the application. This threat is well known, originally identified by Ports and Garfinkel as system call tampering (Ports and Garfinkel, 2008), and more recently discussed as Iago attacks (Checkoway and Shacham, 2013).

A number of enclave execution platforms have recognized this channel of attack, but left specifying the necessary checks out of scope. For instance, systems built on Intel SGX such as Haven (Baumann et al., 2014), Panoply (Shinde et al., 2017), Graphene-SGX (Tsai et al., 2017), and Scone (Arnautov et al., 2016) have alluded to syscall tampering defense as an important challenge; however, none of these systems claims a guaranteed defense. One of the reasons is that a hostile OS can deviate from the intended behavior in so many ways that reasoning about a complete set of checks which suffices to capture all attacks is difficult.

In this work, we take a step towards a formally verified TCB to protect the integrity of enclaves against a hostile OS. To maximize the eliminated attack surface and compatibility with existing OSes, we propose to safeguard at the POSIX system call interface. We scope this work to the filesystem subset of the POSIX API. Our main contribution is BesFS — a POSIX-compliant filesystem specification with formal guarantees of integrity, and a machine-checked proof of its implementation. Client applications running in SGX enclaves interact with a commodity (e.g., Linux) OS via our implementation, running as a library (see Figure 1). Applications use the POSIX filesystem API transparently (see Table 1), requiring minimal integration changes. Being formally verified, the BesFS specification and implementation can further be used to test implementations of existing platforms based on SGX and similar primitives.

Challenges & Approach.

The main set of challenges in developing BesFS is two-fold. The first challenge is in establishing the “right” specification of the filesystem interface, such that it is both safe (captures well-known attacks) and admits common benign functionality. To show safety, we outline several known syscall tampering attacks and prove that the BesFS interface specification defeats at least these attacks by its very design. The attacks defeated are not limited to the list identified here — in fact, any deviation from the defined behavior of the interface is treated as a violation, aborting the client program safely. To address compatibility, we empirically test a number of real-world applications and benchmarks with a BesFS-enhanced system for running SGX applications. These tests show no impact on compatibility, which bolsters our claim that the specification is rich enough to run practical applications on commodity OS implementations. The BesFS API has only a small set of core operations (Table 1). However, it is accompanied crucially by a composition theorem that safeguards the chaining of all combinations of operations, making extensions to higher-level APIs (e.g., fprintf) easy.

The second challenge is in the execution of the proof of the implementation itself. Our proof turns out to be challenging because the properties require higher-order logic (hence the need for Coq) and reasoning about arbitrary behavior at the points at which the OS is invoked. Specifically, the filesystem is modeled as a state-transition system where each filesystem operation transitions from one state to another. A number of design challenges arise (Section 4) in handling a stateful implementation in the stateless proof system of Coq, and in uncovering inductive proof strategies for the recursive data structures used in the implementation. These proof strategies are more involved than those applied automatically by Coq.

Results.

Our proof comprises 2 key theorems and 118 lemmas in 3676 lines of Coq, while our hand-coded C implementation of BesFS is 863 LOC in size (Table 3). We add a modest amount of additional LOC for application stubs and compatibility with enclave code. We demonstrate the expressiveness of BesFS by supporting 17 applications. We show that BesFS is compatible with state-of-the-art filesystems. It is fully compatible with the large array of benchmarks we tested. It also aids in finding implementation mistakes. We hope BesFS serves as a specification against which future optimizations and hand-coded implementations can be tested.

Contributions.

We make the following contributions:

  • We formally model the class of attacks that the OS can launch against SGX enclaves via the filesystem API; and develop a complete set of specifications to disable them.

  • We present BesFS — a formally verified set of API implementations which are machine-checked for soundness w.r.t. the API specifications. Our auto-generated run-time monitoring mechanism ensures that the runs of the concrete filesystem stay within the envelope of our specification.

  • We prove 118 lemmas and 2 key theorems in 3676 lines of Coq proof scripts and evaluate the correctness, compatibility, and expressiveness of BesFS over a set of 17 applications drawn from real-world programs in SPEC CINT 2006 and filesystem benchmarks, eliminating several bugs in the process.

2. Problem

There has been long-standing research on protecting the OS from user-level applications (Johnson and Wagner, 2004). In this work, the threat model is reversed; the applications demand protection from the OS kernel. We briefly review the specifics of Intel SGX, on which our system is built, and highlight the need for a formal approach to safety.

2.1. Background & Setup

Intel SGX provides a set of CPU instructions which can protect selected parts of user-level application logic from an untrusted operating system. Specifically, the developer can encapsulate sensitive logic inside an enclave. When the hardware starts to execute the enclave, it creates a protected virtual address space for the enclave. The CPU allocates protected physical memory from Enclave Page Cache (EPC) that backs the enclave main memory; and its content is encrypted in the main memory (RAM). Only the owner enclave can access its EPC pages at any point during execution. The hardware does not allow any other process or the OS to access or modify code and data inside the enclave’s boundary. Interested readers can refer to  (Costan and Devadas, 2016) for full details.

Due to the strict memory protection, unprotected instructions such as syscall are illegal inside the enclave. However, the application can use out calls (ocalls) to execute system calls outside the enclave. The enclave code copies the parameters to the untrusted partition of the application, which in turn invokes the OS system call, collects the return values, and passes them back to the enclave. When the control returns to the enclave, the enclave wrapper code copies the syscall return values, such as buffers, from the untrusted memory to the protected enclave memory. This mechanism facilitates interactions between the enclave and the non-enclave logic of an application. Almost all enclave applications need to dispatch ocalls, either for standard APIs such as syscalls or for application-specific operations. To save developer time, Intel's Software Development Kit for SGX (SGX SDK (int, 2018)) provides boilerplate code and tools to generate the wrapper code. The developer provides the type signature for the function call, and the SDK generates the wrapper code and switch-case from type-based templates (using the Edger8r tool (edg, 2018)).

Enclaves ensure that control returning from untrusted execution enters the enclave only at the right entry points via the ENCLU[EENTER] and ENCLU[ERESUME] instructions (McKeen et al., 2013). This safety mechanism prevents the OS from resuming the enclave at arbitrary points and executing arbitrary sequences of logic inside the enclave. All ocalls have to use the ENCLU[ERESUME] instruction to re-enter the enclave. For simplicity, the SGX SDK consolidates all entry and exit points of the enclave into just a few selected locations. The SGX SDK internally creates a large switch-case statement for handling the exits and entries of the different ocalls. To this end, the SGX SDK provides a helper function sgx_ocall which wraps the usage of the ENCLU[ERESUME] instruction. It takes the number of the ocall as a parameter and uses it to select the right switch case. When the ocall returns, the same number is used to un-marshal the return values.

Figure 1. Overview. Thick black and dotted outlines represent trusted and untrusted components, respectively.

Syscall Parameter Tampering.

This is a broad class of attacks and has been inspected in various aspects by Ports and Garfinkel (Ports and Garfinkel, 2008); a specific subclass of it is called Iago attacks (Checkoway and Shacham, 2013). Ports and Garfinkel first showed system call tampering attacks on various subsystems such as the filesystem, IPC, process management, time, randomness, and I/O. For file content and metadata tampering attacks, their paper suggested defenses such as maintaining protection metadata (for example, a secure hash for each page of the file, protected by a MAC and a freshness counter, stored in the untrusted guest filesystem). For file namespace management, they proposed using a trusted, protected daemon to maintain a secure namespace which maps a file's pathname to the associated protection metadata. This way, verifying whether OS return values are correctly computed would be easier than undertaking to compute them. An added benefit is that the TCB of such a trusted monitoring mechanism for the untrusted kernel is smaller. The recent work on Iago attacks shows a subclass of concrete attacks on these interfaces, thus highlighting that verification of return values is non-trivial for complex kernel tasks such as managing virtual memory. Iago attacks demonstrate that verifying return values may require the supervisor to have a complete understanding of a kernel's memory management algorithms and data structures. In this paper, our focus is on the filesystem subset. Further, we concentrate mainly on enclave-like systems, but our work applies equally well to other systems (Chen et al., 2008; Hofmann et al., 2013).

Threat to Existing Systems.

Note that all systems such as Haven (Baumann et al., 2014), Scone (Arnautov et al., 2016), Panoply (Shinde et al., 2017), and Graphene-SGX (Tsai et al., 2017), which use either the SGX SDK or hand-coded wrappers, must address syscall parameter tampering attacks. All of these systems are upfront in acknowledging this gap and employ ad-hoc checks for each API to address a subset of the attacks. See Appendix A.1 for the informal claims made by prior works. Integrity-preserving filesystems (Amani et al., 2016) and formal testing of whether a filesystem abides by POSIX semantics (Ridge et al.) are stepping stones towards our goal, but their designs do not reason about intentional deviations by an untrusted OS.

2.2. Attacks

We demonstrate two representative attack capabilities on state-of-the-art enclave systems to motivate why a provable implementation (down to the details) is important: (a) executing arbitrary code inside the enclave using low-level memory exploits (Lee et al., 2017a; Checkoway and Shacham, 2013) (b) subverting the integrity of the enclave operation by violating the high-level syscall semantics (Ports and Garfinkel, 2008).

//enclave.c
char* buf = malloc(sizeof(char) * 100);
int status = ocall_fread(buf, 100, 1, fd);
//update buf
status = ocall_fwrite(buf, 100, 1, fd);

//ocall-helper.c
sgx_status_t SGX_CDECL ocall_fread(size_t* retval, void* ptr, size_t size, size_t nmemb, FILE ..) {
        
        ms->ms_size = size;
        ms->ms_nmemb = nmemb;
        ms->ms_FILESTREAM = FILESTREAM;
        status = sgx_ocall(FREAD, ms); //FREAD is the index for fread in the switch case
        if (retval) *retval = ms->ms_retval;
        if (ptr) memcpy((void*)ptr, ms->ms_ptr, _len_ptr);
        sgx_ocfree();
        return status;
}
Listing 1: Intel SGX SDK's enclave ocall mechanism.

Low-level Attacks.

Listing 1 shows an example of the ocall mechanism for the fread call generated by the Intel SGX SDK (int, 2018). The enclave wrapper code calls the untrusted fread function, which executes outside the enclave. The results generated by the fread call are copied into the enclave by the memcpy in ocall_fread. The onus of checking the buffer sizes of such untrusted return values lies with the enclave wrapper code. In our example, this memcpy can cause a buffer overflow during the read call because the SDK, by design, leaves such checks to be implemented by the client application. As recently demonstrated, this buffer overflow can be used to corrupt the enclave stack and launch expressive attacks such as ROP on the enclave logic (Lee et al., 2017a). The OS can also leverage more sophisticated attacks such as data corruption (Hu et al., 2015) to overwrite the ocall number inside the enclave memory during un-marshaling. With such a corruption, the OS can fake a return from a different system call. Once the OS jumps to the corresponding return site inside the enclave, the enclave starts executing the logic following the wrong return. In fact, the OS can chain enough such return gadgets to program arbitrary logic, depending on the client logic (Hu et al., 2016; Lee et al., 2017a).

High-level Attacks.

Consider an application where the enclave is executing an anti-virus scan which white-lists user files. Listing 2 shows a code snippet of such an enclave function. It reads in the names of the suspicious files and opens each file. The function inspect then checks the signature of the file against a white-list and returns 0 if the file is benign and a non-zero value otherwise. The enclave then creates a new report file for logging the results of each scanned file. If the file is marked benign, the enclave writes safe to the corresponding .log file; otherwise, it writes malicious. The enclave is supposed to protect the signature files, ensure complete inspection of the suspicious files, and safely log the scan results. However, a malware-infected OS might deviate from the expected filesystem semantics and cause the malware file to be falsely white-listed in the following ways:

char list[MAXBUFSIZ], logname[BUFSIZ];
FILE *l, *fd, *log;
l = fopen ("list_of_suspicious_files", "r");
int err = fread (l, list, ...);
if (!err) {
        //process each new line entry in the list
        for (f = getline(list)) {
                fd = fopen (f);
                result = inspect(fd);
                strcpy (logname, f);
                strcat (logname, ".log");
                log = fopen (logname);
                if (result == 0) fwrite (log, "safe");
                else fwrite (log, "malicious");
        }
}
else
        //report that scan was successful
Listing 2: Code snippet of client enclave logic for anti-virus scanner to whitelist user files.

(A1) Paths & File Descriptor Mismatch.

The OS can use wrong file paths, wrong permissions, or wrong file names. In our example, the OS can trick the enclave into believing that it is scanning a different file, by opening, say, a safe file file4956 instead of the malicious file4444. This tricks the enclave into writing the scan results of the wrong file to the log, and the OS succeeds in marking the malicious file4444 as safe. Alternatively, the OS can swap the file descriptors returned by fopen in order to redirect all file operations to a file of its choice. So, instead of writing "safe" and "malicious" to the logs for file4956 and file4444 respectively, the enclave writes to swapped descriptors and thus ends up marking the file file4444 as safe.

(A2) Size Mismatch.

The OS can violate the size requested in the operations by increasing or decreasing the size of the buffers. For example, instead of returning ["file4956", "file4345", "file1538"], the fread call returns only ["file4956", "file4345"], thereby bypassing the checks for "file1538".

(A3) Iago Attacks on File Content.

Apart from simple parameter tampering, the OS can mount subtle attacks at the memory-mapping layer for file content. This includes (a) mapping multiple file blocks of the same or different files to a single physical block, (b) reading/writing content from/to the wrong offset or block, and (c) misaligning the sequence of file blocks in a file. In our example, the OS can mark any file with any tag it wants by manipulating the file-to-block mapping in the above ways. If the last file to be scanned is file4956 and it is safe, then the enclave is about to write the tag safe to the file file4956.log. At this point, the OS can map the blocks of all the .log files to a single physical block. Thus, when the enclave writes to file4956.log, the safe tag is written to all the .log files. The OS can perform similar file-to-block manipulation attacks as and when it desires to achieve arbitrary effects.

(A4) Error Code Manipulation.

The OS can change the error codes returned by the filesystem and force the enclave to take a different control-flow path in its execution. In our example, the enclave checks whether there was an error while reading the list of suspicious files that it wants to scan. If the enclave encounters an error, it simply reports that the scan succeeded, with zero malicious-file warnings. The OS can intentionally return the error code indicating that the file does not exist and thus bypass the file-inspection loop entirely. Note that this attack is more than just denial of service, because the OS does return an error value (so it does not deny the service) but misrepresents the filesystem state.

We do not claim to be the first to showcase these attacks. Further, our list of attacks is not exhaustive; they are merely representative of the intractably large number of ways the OS can cheat, depending on the logic of the client application. This motivates a strong defense which not only strictly defines acceptable behavior but also flags all violations as potentially dangerous.

3. Design

All the classes of filesystem API attacks covered in Section 2.2 stem from the fact that the OS can deviate from its expected semantics. This, in turn, leads to exploitable behavior inside the enclave.

3.1. Approach

Attacks on an enclaved application can arise at multiple layers of the filesystem stack (Appendix A.2). Our choice of the API layer to formally proof-check is guided by the observation that the higher the layer we safeguard, the larger the attack surface we can eliminate, and the more implementation-agnostic the API becomes. One could include all the layers, including the disk kernel driver where content is finally mapped to persistent storage, in the enclave. Enforcing safety at this interface would simply require encrypting/decrypting disk blocks with correct handling of block positions (Kwon et al., 2016). Alternatively, one could include a virtual filesystem management layer, which maps file abstractions to disk blocks and physical page allocations, in the enclave — as done in several LibraryOS systems like Graphene-SGX (Tsai et al., 2017; Baumann et al., 2014). To ensure safety at this layer, the model needs to reason about simple operations (reads, writes, sync, and metadata management). Further up, one could protect at the system call layer, leaving all of the logic for a filesystem (e.g., journaling, physical page management, user management, and so on) outside the enclave. However, this still includes the entire C library code which manages the virtual memory of the user-level process (heap management, allocation of user-level pages to buffers and file-backed pages); this is on the order of a million LOC in glibc and tens of thousands of LOC in musl-libc, for instance. Once we include such a TCB inside the enclave, we either need to prove its implementation safe or trust it with blind faith. We therefore decide to model our API above all of these layers, excluding them from the TCB.

BesFS models the POSIX standard for the file subsystem. POSIX is a documented standard, with small variations across implementations on various OSes (Ridge et al.); in contrast, many of the other layers do not have such defined and stable interfaces. At the POSIX layer, BesFS models the file/directory path structures, file content layouts, access rights, and state metadata (file handles, position cursors, and so on). Specifically, BesFS ensures safety without the need to model virtual-to-physical memory management, storage, or the specifics of kernel data structures for namespace management (e.g., Linux inodes, user groups). BesFS is thus generic and compatible with different underlying filesystem implementations (NFS, ext4, and so on). Further, this choice of API reduces the complexity of the proofs, as they are dispatched over simpler data structures.

Solution Overview.

BesFS is an abstract filesystem interface which ensures that the OS follows the semantics of a benign filesystem, i.e., that the OS exhibits a behavior which is observationally equivalent to that of a good OS. This way, instead of enumerating a potentially infinite set of attacks, we define a good OS, and any deviation from it is categorized as an attack from a compromised or potentially malicious OS. Specifically, our definition of a good OS includes POSIX compliance and a set of safety properties expected from the underlying filesystem implementation. We design a set of core filesystem APIs along with a safety specification. Table 1 shows this POSIX-compliant interface, which can be invoked by an external client program running in an SGX enclave. It has a set of methods, states, and safety properties (SP1-SP5 and TP1-TP13) defined in Section 3.2. Each method operates on a starting state (implicitly) and client program inputs. The safety properties capture our definition of benign OS behavior. Empirically, we show in Section 6 that real OS implementations, when benign, satisfy the safety properties — the application executes with the BesFS interface as it does with direct calls to the OS. Further, the safety properties reject any deviations from benign behavior, which at least includes all the attacks outlined in Section 2.2.

The safety provided is proven to be compositional. First, the state safety properties (SP1-SP5) ensure that if we invoke a core API operation in a good (safe) state, we are guaranteed to resume control in the application in a good state. Second, we show that BesFS calls are chainable, i.e., the good state after a call can be used as an input to any of the other calls, through a set of safe transition properties (TP1-TP13). This compositionality is crucial to allow executions of benign applications which make a potentially infinite sequence of calls; further, one can model higher-level APIs (e.g., the fprintf interface in libc) by composing two or more BesFS API operations.

Scope.

BesFS aims strictly at the integrity property; it does not claim any guarantees about the privacy or confidentiality of the file operations. A number of side channels and hardware mistakes are known to impact the confidentiality guarantees of SGX (Xu et al.; Wang et al., 2017). Out of the 118 lemmas in BesFS, only one lemma assumes the correctness of the cryptographic operations. Specifically, BesFS assumes the secrecy of its AES-GCM key used to ensure the integrity of the filesystem content. Our lemma assumes that the underlying cryptography does not allow the adversary to bypass the integrity checks by generating valid tags for arbitrary messages. Further, we assume that the adversary does not know the AES-GCM key used by the enclave to generate the integrity tags.

3.2. Interface

Interface   Pre-condition   Transition Relation
fs_close
fs_open
fs_mkdir
fs_create
fs_remove
fs_rmdir
fs_stat
fs_readdir
fs_chmod
fs_seek
fs_read
fs_write
fs_truncate
Table 1. BesFS interface: method API, pre-conditions, transition relations (TP1-TP13), and post-conditions.

The BesFS interface is a state transition system. Specifically, it defines a set of valid filesystem states and methods to move from one state to another. While doing so, BesFS also dictates which transitions are valid via a set of transition properties.

State.

BesFS has a set of type variables which together define a state. Specifically, the BesFS state comprises the valid paths in the filesystem, a mapping from paths to file and directory identifiers and metadata, the set of open files, and the memory map of the file content.

All file and directory paths that exist in the filesystem are captured by a path set, whose elements have the data type path.

BesFS distinguishes directory path types from file path types. We also define a parent operator which takes in a path and returns its parent path. For example, if the path is /foo/bar/file.txt, then the parent operator gives /foo/bar.
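For concreteness, paths and the parent operator can be sketched in Gallina as lists of name components; the following is our own minimal illustration (the path type and the name parent are assumptions, not the exact BesFS definitions):

Require Import Coq.Lists.List Coq.Strings.String.
Import ListNotations.
Open Scope string_scope.

(* A path as a list of name components; the parent drops the last component. *)
Definition path := list string.
Definition parent (p : path) : path := removelast p.

(* The parent of /foo/bar/file.txt is /foo/bar. *)
Example parent_example : parent ["foo"; "bar"; "file.txt"] = ["foo"; "bar"].
Proof. reflexivity. Qed.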

BesFS captures the information about files and directories via a node map, which associates an identifier with each file and directory; this simplifies the operations that work on file handles instead of paths. We represent the user read, write, and execute permissions by Permission. The size field of a file signifies the number of bytes of file content. For directories, the size is supposed to signify the number of files and directories within; for simplicity, BesFS currently does not track the number of elements in a directory, and the size field of every directory is always set to zero. For a path, we use subscript notation to denote its id, permissions, and size.

Each open file is tracked via its file id. BesFS also tracks the current cursor position of the open file to facilitate operations on the file content. Given a tuple in the set of open files, we use subscript notation to denote the id and the cursor position of that file.

The file content is stored in a byte-addressable memory, and each byte can be accessed using the tuple of file id and position within the file.

Thus, the BesFS state is defined by the tuple of these four components. The state variables cannot take arbitrary values; instead, they must abide by a set of state properties defined by BesFS. For the path set, BesFS enforces that the entries are unique and do not contain circular paths (Chari et al., 2010; Canetti et al., 2011). This ensures that each directory contains unique file and directory names by the definition of a path set. All files and directories have unique identifiers and are mapped by a partial function to their metadata such as permission bits and size (SP1).

All open file IDs have to be registered in the node map. The set of open files can only have unique entries, and the cursor of an open file handle cannot take a value larger than that file's current size (SP2-SP4).

The memory map does not allow any overlap between addresses and has a one-to-one mapping from virtual address to content; being a partial function, it ensures this by definition. All file operations are bounded by the file size. Specifically, the memory can be dereferenced only for offsets between 0 and the EOF; any attempt to access file content beyond the EOF is invalid by definition in BesFS. Similarly, the current cursor position can only take values between 0 and the EOF. Such invalid memory accesses are represented by a distinguished error symbol (SP5).
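To make the flavor of these properties concrete, the following is a minimal Gallina sketch of an open-handle component and an SP-style uniqueness invariant (our own illustration with assumed names; the actual BesFS state has more components):

Require Import Coq.Lists.List.
Import ListNotations.

Definition Fid := nat.

(* A fragment of the state: open handles as (file id, cursor) pairs. *)
Record StateFrag := mkStateFrag {
  open_handles : list (Fid * nat)
}.

(* SP-style property: every open file id appears at most once. *)
Definition unique_handles (s : StateFrag) : Prop :=
  NoDup (map fst (open_handles s)).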

State Transitions.

The BesFS interface specifies the set of methods listed in the API column of Table 1. Each of these methods takes a valid state and user inputs and transitions the filesystem to a new state; the BesFS interface thus facilitates safe state transitions. Formally, an interface method invoked on a state produces a new state together with a vector of explicit results. In this way, BesFS enforces state transition atomicity, i.e., if the operation completes successfully then all the changes to the filesystem caused by the operation must be reflected; if the operation fails, then BesFS does not reflect any change to the filesystem state.
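The atomicity discipline can be sketched abstractly as follows (illustrative names, not the BesFS definitions): an operation either yields a new state on success, or returns an error code together with the unchanged input state.

Inductive ErrCode : Type := eSucc | eBadF.

(* An operation that may fail; on failure the caller keeps the old state. *)
Definition atomic_step {S : Type} (op : S -> option S) (s : S) : ErrCode * S :=
  match op s with
  | Some s' => (eSucc, s')
  | None    => (eBadF, s)
  end.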

Safety Properties.

BesFS satisfies the state properties at initialization because the start state is empty: all the lists are empty and the mappings have no entries, so they trivially abide by the state properties SP1-SP5. Once the user starts interfacing with the state, we need to ensure that the state properties (SP1-SP5) still hold. Further, each interface method itself dictates a set of constraints – for example, a file should be opened first in order to close it. Thus, such interface-specific properties not only ensure that the state is valid but also specify the safe behavior of each interface. The transition properties TP1-TP13 in Table 1 specify the relation between the inputs, the state, and the state transition in BesFS.

3.3. How Do Our Properties Defeat Attacks?

Our state properties in Section 3.2 and transition properties in Table 1 are strong enough to defeat attacks described in Section 2.2.

Path Mismatch (A1a).

The BesFS state ensures that each path is uniquely mapped to a directory or a file node. All methods which operate on paths first check whether the path exists and, if so, whether the operation is allowed on that file/directory path. For example, for a method call readdir("foo/bar"), the path foo/bar may not exist or may be a file path instead of a directory path. SP1 ensures that file and directory paths are distinguished, are unique, and are mapped to the right metadata information. Subsequently, any queries or changes to the path structure ensure that these properties are preserved. For example, fs_create checks whether the parent path is valid and whether the file name already exists in the parent path. When all the pre-conditions are met, the corresponding state variables are updated (SP4).

File Descriptor Mismatch (A1b).

Similar to path resolution, file descriptor resolution is critical as well. Once a file is opened successfully, all file-content related operations are facilitated via its file descriptor. BesFS ensures that the file-name-to-descriptor mappings are unique and are preserved while the file is open. Further, BesFS maps any updates to the metadata or file content via the file descriptor in such a way that it can detect any mapping corruption attempts from the OS (SP5).

Size Mismatch (A2).

BesFS's atomicity property ensures that the filesystem completely reflects the semantics of the interface during the state transition. Our file-content-specific operations have properties which ensure that BesFS performs the operation on the entire size specified in the input. The post-conditions of fs_read, fs_write, and fs_truncate reflect this in Table 1.

File Content Manipulation (A3).

The unique mapping property (SP5) of BesFS ensures that the OS cannot reorder or overlap the underlying pages of the file content.

Error Code Manipulation (A4).

All violations of state or transition properties during the execution of the interface correspond to a specific error code, and each of these error codes distinctly represents which property was violated. For example, if the user tries to read using an invalid file descriptor, the SP3 and TP11 properties are violated and BesFS returns an eBadF error code. All error types in BesFS map to standard error codes in the POSIX API specification. If there are no violations and the state transition succeeds, BesFS returns the new filesystem state and a success code. Since the BesFS interface performs its own checks to identify error states, the enclave does not rely on the OS to return the right error codes. This way, we ensure that the OS cannot manipulate the enclave logic by returning wrong error codes.

3.4. Implementation

BesFS defines a collection of data structures that are sufficient to capture the filesystem state and completely model the interfaces in Section 3.2. We build BesFS types from pre-defined types (ascii, list, nat, bool, set, record, string, map) and by composition or induction over one or more types from the standard libraries. We give their simplified definition below.

All files and directories in BesFS have file ids and directory ids respectively. These ids are mapped to the corresponding file and directory nodes. Specifically, a file node stores the file name, permissions, the size of the file, and all the pages that belong to the file; a directory node stores the directory name, permission bits, and the number of files and directories inside it. The filesystem layout stores the file and directory nodes in a tree form to represent the directory tree structure. The list of open file handles stores tuples of file id and cursor position. Lastly, each page is a sequence of Pg_Size bytes (we set the page size Pg_Size to 4096 bytes, a typical value) and has a unique page number. Finally, the entire filesystem memory is stored as a list of pages. In summary, the BesFS implementation represents its filesystem state as the combination of these components.

Our implementation must satisfy the state properties SP1-SP5 and transition properties TP1-TP13 we outlined in Section 3.2. We discuss how we achieve this for each data structure. Table 2 summarizes the invariants for our data structure implementation.

Virtual Memory.

The filesystem memory is represented by a set of virtual memory pages such that each page is a sequence of Pg_Size bytes and is identified by a unique page id. Unallocated pages are marked as free in the pool. Each file comprises an ordered sequence of pages allocated from the pool of free pages. One page can belong to only a single file. This ensures that no two files have overlapping page memory.

Files & Directories.

Each file's information, including the file name, the current size of the file, and the permission bits, is stored in a file node. Each file's content is a sequence of bytes, partitioned into uniformly sized pages. This content is tracked by keeping an ordered list of virtual memory page ids. For example, the first id in a file node's page list points to the exact page in virtual memory where the first Pg_Size bytes of the file content are stored. BesFS also maintains a map which associates each file node with a unique file identifier. Similar to file nodes, BesFS also has directory nodes to track directory information such as names and permissions. Each directory is associated with a unique directory id, and the directory map tracks the one-to-one relationship between ids and nodes.

Layout & Paths.

BesFS tracks the paths of all files and directories via a tree layout. Each node in the tree is either a file node id or a directory node id. Files are leaf nodes; each directory, in turn, can have its own tree layout. Note that BesFS does not allow cycles in the tree layout. Also, each level of the layout tree has non-duplicate directory and file names.
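The layout can be written as an inductive type whose constructors mirror the Fnode/Dnode names used in the induction schemes of Section 4.2; a minimal sketch (with node ids simplified to nat for illustration):

Definition Fid := nat.  (* file node id *)
Definition Did := nat.  (* directory node id *)

(* Files are leaves; a directory carries an id and a forest of subtrees. *)
Inductive Tree : Type :=
  | Fnode : Fid -> Tree
  | Dnode : Did -> list Tree -> Tree.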

Table 2. BesFS data structure invariants from Section 3.4, grouped by component: Virtual Memory, Files & Directories, Layout & Paths, and Open file handles.

Open File Handles.

Each open file has a file handle which is allocated when the file is first opened. The file handle comprises the file id and the current cursor position for that file. BesFS tracks the list of open files via the open file handles list. All operations on an open file are done via its file handle. When the file is closed, the file handle is removed from the list. Further, the list cannot have any duplicates because each open file can have only one handle.

Error Codes.

In cases where the filesystem cannot complete the operation successfully, the enclave should receive the right error code to know the exact reason for the failure. BesFS models a subset of the error codes specified by POSIX. This ensures that the attacker cannot alter the enclave's behavior via error codes.

Good State.

BesFS must satisfy all the data structure invariants in Table 2 before and after any interface invocation to be in a good state. We summarize a state as good if all of these invariants hold simultaneously.
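Abstractly, the good-state predicate is the conjunction of the per-structure invariants from Table 2; a minimal sketch with placeholder invariant names (the names are assumptions for illustration):

Section GoodState.
  Variable State : Type.
  Variables pages_disjoint ids_unique layout_acyclic handles_unique : State -> Prop.

  (* A state is good iff every data structure invariant from Table 2 holds. *)
  Definition good_state (s : State) : Prop :=
    pages_disjoint s /\ ids_unique s /\ layout_acyclic s /\ handles_unique s.
End GoodState.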

Known Limitations.

The BesFS implementation does not support a small set of filesystem operations, such as symbolic links, file-backed mapping, shared files, and rename, because they violate our safety properties. We have consciously decided not to support these functionalities in the first version of BesFS to maintain simplicity. However, there is no fundamental limitation in extending the BesFS specification and proofs to a broader set of file operations in the future.

4. Safety Proof

The key theorems for our implementation are that the BesFS functions meet our interface specifications. For each method of our interface, we must prove that the implementation satisfies the state properties (SP1-SP5) from Section 3.2 and the transition properties (TP1-TP13) outlined in Table 1. We assume BesFS is running on a hostile OS that can take any actions permitted by the hardware.

Theorem 4.1 (State Transition Safety).

Given a good state satisfying the state properties, if we execute any BesFS interface method to reach a new state, then the new state is always a good state, and the relation between the two states is valid according to the corresponding transition relation in Table 1.

We can verify sequences of calls to our functions by inductively chaining this theorem. Our second theorem states that the good-state property is preserved for a composition of any sequence of interface calls. We close the proof loop with induction by starting in a good initial state and using Theorem 4.1 to show that a method invocation in BesFS always produces a good state for a sequential composition of transitions. The proof is dispatched using the Coq proof assistant.

Theorem 4.2 (Compositional Safety).

A good initial state subjected to any sequence of BesFS interface transitions always produces a good final state.
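The shape of this compositional argument can be sketched in Coq over an abstract step relation: if every single transition preserves goodness (Theorem 4.1), then any finite sequence of transitions does. This is an illustration of the proof structure, not our actual theorem statement:

Section Compose.
  Variable State : Type.
  Variable good : State -> Prop.
  Variable step : State -> State -> Prop.
  Hypothesis step_safe : forall s s', good s -> step s s' -> good s'.

  (* Chaining of single transitions into sequences. *)
  Inductive steps : State -> State -> Prop :=
  | steps_refl  : forall s, steps s s
  | steps_trans : forall s s' s'', step s s' -> steps s' s'' -> steps s s''.

  Lemma compose_safe : forall s s', steps s s' -> good s -> good s'.
  Proof.
    intros s s' Hsteps. induction Hsteps; intros Hg.
    - exact Hg.
    - apply IHHsteps. eapply step_safe; eauto.
  Qed.
End Compose.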

4.1. Proof Assistant

As one can readily see, our implementation uses recursive data structures, and its state properties require second-order logic. For example, the filesystem layout in Section 3.4 is defined mutually recursively in terms of a forest (a list of trees). This motivates our choice of Coq, an interactive proof assistant based on the calculus of inductive constructions. Coq allows the prover to write definitions of data structures and the interface specification in a purely functional language called Gallina. The statements of the theorems are written in Gallina as well. The proofs of the statements, called proof scripts, are written in a language called Ltac. Coq makes writing proofs less tedious as it supports a library of “tactics”, or one-line commands that encode standard proof strategies.

The Coq system performs two operations with proof scripts. First, it mechanically checks that the proof script entails the statement of the theorem. If the proof cannot go through, it interacts with the prover by showing the parts of the proof that are not complete as “holes”, prompting the human prover to provide a proof script for each hole. Second, after the entire proof is checked, the proof script is converted into a program whose type is the statement of the theorem. The Coq proof system embodies the Curry-Howard correspondence between typing and programming, enabling rich statements to be written as mathematical types (Pierce et al., 2017).

Gallina.

Gallina is a purely functional programming language, similar in flavor to languages such as OCaml and Haskell. The following code listing shows a snippet of the implementation of the write method. It starts with the keyword Definition and can be split into two parts: the signature (arguments and return type separated by a colon) before := and the body after :=. Most of the code is self-explanatory, so we turn our attention to specific features of Gallina for readers. The return type State FSState ErrCode indicates that we adopt the state monad to ease the state-passing coding style (Wadler, 1992; Peyton Jones and Wadler, 1993). The getFS and putFMap actions can be seen as the getting and setting actions on the state; they are classical syntactic sugar in monadic-style programming: fs is not a variable but an argument representing the state. The externalCall action performs several tasks: it increments the global counter retrieved from the state, performs the external call with the global counter as an additional argument, appends logs to the state, and puts the new counter back into the state.

Definition fs_write (fId: Fid) (buf: string)
   (pos: nat) : State FSState ErrCode :=
  do fs <- getFS;
  let opos := find (fun x => (fst x) =? fId) fs.(open_handles) in
  match opos with
  | None => return_ eBadF
  | Some _ => 
    do err <- externalCall (Call_VLSeek fId (pos_to_vpage pos)) (v_lseek fId (pos_to_vpage pos));
    if (isNotSucc err) then return_ err else 
    putFMap  >>
  end.

Ltac Language.

Coq allows the programmer to write lemmas, which have a representation in Gallina. In the following code listing, the statement is written in Gallina, which actually declares a program whose type is the statement of the lemma. The script of tactics between Proof and Qed is written in Ltac. During interactive development, human provers can see the effect of each tactic on the proof goals and finally prove the lemma by trial and error. From the perspective of Coq, those tactics guide it to construct a program written in Gallina; Coq then checks whether the type of that program is the statement after Lemma. We call this step “mechanized verification”. We also prove helper lemmas to simplify the proofs.

 Lemma fs_write_ok: forall fId buf pos fs err fs’,
   (err, fs’) = fs_write fId buf pos fs -> good_file_system (fsFS fs) ->
   (~ In fId (map fst (openhandleFS fs)) /\ …) \/ 
   (… /\ In (fId, …) (openhandleFS fs’) /\
   (forall id, id <> fId -> (fMapFS fs) id = (fMapFS fs’) id) /\ …)
 Proof.
   intros. unfold fs_write in H. 
   - right. rm_hif_eqn H. 1: left; 
     + right. destruct p0 as 
   - left. inversion H. intuition..
 Qed.

4.2. Challenges

Purely Functional.

The programming language provided by Coq is purely functional, having no global state variables. However, the filesystem is inherently stateful, so we use state passing to bridge this gap. The state resulting from the operation of each method is explicitly passed as a parameter to the next call. Explicitly passing this state in each call is prone to clutter and accidental omission; therefore, we encapsulate it in a state monad. As we can see in the definition of fs_write, the code is purely functional but looks like a traditional imperative program. The benefit of this monadic style of programming is that it hides the explicit state passing, which makes the code more elegant and less error-prone.
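For readers unfamiliar with the idiom, the following is a minimal sketch of such a state monad (simplified; the monad used in BesFS additionally carries error codes and the call log):

(* A state-monad computation maps an input state to a result and an output state. *)
Definition State (S A : Type) : Type := S -> A * S.

Definition ret {S A : Type} (a : A) : State S A :=
  fun s => (a, s).

Definition bind {S A B : Type} (m : State S A) (f : A -> State S B) : State S B :=
  fun s => let (a, s') := m s in f a s'.

(* get/put expose the threaded state, mirroring getFS/putFMap in fs_write. *)
Definition get {S : Type} : State S S := fun s => (s, s).
Definition put {S : Type} (s : S) : State S unit := fun _ => (tt, s).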

During proof script checking, if Coq encounters a memoized expression for an external call, it will skip reasoning about it again. This is a challenge because, in a sequence of system calls, the same call with identical arguments may return different values. Therefore, we have to force Coq to treat each call as different. To implement this, we introduce an implicit counter as an argument to all calls, which increments after each call completes. For example, consider the consecutive external calls read_dir, create_dir, and read_dir. The two read_dir commands may read the same directory (the same argument) but return different values because of the intervening create_dir command. To reason about such cases, the actual arguments passed to the external calls contain not only the common arguments but also an ever-increasing global counter. Thus, in our read_dir example, the two commands with the same original argument carry different counter values, so Coq treats them as different commands.
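A sketch of this counter mechanism (illustrative names): the arguments of every external call are paired with the current counter drawn from the threaded state, and the counter is bumped, so two calls with identical arguments remain syntactically distinct terms.

Record CallCtr := { ctr : nat }.

(* Tag the arguments of an external call with the current counter and increment it. *)
Definition tag_call (args : nat) (c : CallCtr) : (nat * nat) * CallCtr :=
  ((args, ctr c), {| ctr := S (ctr c) |}).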

Atomicity.

The purely functional nature of the proofs helps to prove the atomicity of each method call. In an enclave, the internal state of the enclave is not accessible by the OS; so, in a way, the enclave behaves as a pure function between two OS calls. This allows us to prove atomicity directly. We structure the proof script to check whether an error state is reachable from the input state and the OS-returned values; if so, the input state is retained as the output state. If no error is possible, the output state is set to the new state. For concrete illustration, the write method progressively checks 5 conditions (1: the argument id is in the handle list; 2: the specified position is correct; 3: the write to the copied virtual memory succeeds; 4: the external call to seek succeeds; 5: the external call to write succeeds) before changing the state.

Non-deterministic Recursive Termination.

Coq's consistency guarantees that reasoning about a program is consistent, i.e., a theorem about it cannot be both proven and disproved. Further, all programs in Gallina must terminate, since the type of the program is the statement of a theorem. (A non-terminating program could be assigned an arbitrary type, and hence any theorem would be valid about it.) Coq uses a small set of syntactic criteria to ensure termination. This termination requirement poses challenges for writing the BesFS implementation, which uses recursive data structures. In most cases, the termination proofs for our properties are automatic; however, for a small number of properties, we have to provide an explicit termination proof. For instance, write_to_buffer does not admit a syntactic check for termination, as it contains a non-structural recursive call. To prove termination, we show via induction that the size of the input buffer strictly decreases with each invocation of write. Effectively, we establish that there are no infinite chains of nested recursive calls for that program.
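For intuition, the simplest flavor of such recursion consumes the buffer element by element; this structural form is accepted by Coq's syntactic check automatically (a sketch of the flavor only; the actual write_to_buffer recursion is not syntactically structural, which is why the explicit decreasing-size argument above is needed):

Require Import Coq.Lists.List.
Import ListNotations.

(* Copy a buffer byte by byte; each recursive call is on the strictly
   smaller tail, so the syntactic termination check succeeds. *)
Fixpoint write_bytes (buf acc : list nat) : list nat :=
  match buf with
  | []        => acc
  | b :: rest => write_bytes rest (acc ++ [b])
  end.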

Mutually Recursive Data Structures.

Most of our data structure proofs are based on the induction principle, and Coq always provides an induction scheme for each inductively declared structure. However, the automatically generated induction scheme from Coq is not always strong enough to prove some of our properties. Specifically, this is the case for a key data structure in our design: a tree whose directory nodes carry a list of subtrees, representing the directory and file layout (Section 3.2). We provide an induction principle Tree_ind2 that is stronger than the Coq-provided induction scheme Tree_ind, shown in the following listing. Tree_ind is correct but too weak to be useful here. We dispatch the proof by the principle of strong induction, which is Tree_ind2. Our induction property uses Coq's second-order logic capability: as the following code listing shows, the sub-property P is an input argument to the main property. A number of specific instances of properties instantiate P in our full proof.

Tree_ind: forall P : Tree -> Prop,
  (forall f : Fid, P (Fnode f)) -> (forall (d : Did) (l : list Tree),
  P (Dnode d l)) -> forall t : Tree, P t
Tree_ind2: forall P : Tree -> Prop,
  (forall f : Fid, P (Fnode f)) -> (forall (d : Did) (l : list Tree),
  Forall P l -> P (Dnode d l)) -> forall t : Tree, P t

External Calls to the OS.

In our proof, we assume that calls to the OS always terminate, to allow Coq to complete the proof. If the call terminates, the safety is guaranteed; the OS can, of course, decide not to terminate, which constitutes a denial-of-service attack.

Odds & Ends.

Out of the 118 lemmas, a number are proved using induction while the rest are proved by logical deduction. There are two kinds of induction in our proofs: strong induction and weak induction. Their difference lies in the proof obligation. For example, in weak induction we need to prove "if P(n) is true then P(n+1) is true", while in strong induction the obligation is "if P(m) is true for all m less than or equal to n, then P(n+1) is true". Our customized induction principle for Tree is a typical strong induction. Of the lemmas proved by induction, a few require strong induction and the rest follow by weak induction.
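For natural numbers, the strong (course-of-values) induction principle we rely on is already available in Coq's standard library; Tree_ind2 plays the analogous role for our tree type:

Require Import Coq.Arith.Wf_nat.

(* lt_wf_ind : forall (n : nat) (P : nat -> Prop),
     (forall m, (forall k, k < m -> P k) -> P m) -> P n *)
Check lt_wf_ind.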

We do not implement the function get_next_free_page but instead enforce that an implementation must satisfy the property that a new page allocated by get_next_free_page is not used by existing files and is a valid page (less than the upper-bound limit). Similarly, for the functions new_fid and new_did we enforce that the new ids are unique to avoid conflicts. Note that we only give a specification for allocating new pages and ids for files and directories because we do not want to restrict the page management and namespace management algorithms. This way, the implementation can use a naive strategy of simply allocating a new id/page for each request, employ a sophisticated re-use strategy that reallocates previously freed ids, or use temporal and spatial optimizations for page allocation, as long as they fulfill our safety conditions.

5. BesFS to Executable Code

BesFS's Gallina definitions and proof scripts comprise 3676 LOC with 118 lemmas and 2 main theorems (BesFS will be released at https://github.com/shwetasshinde24/BesFS). The development effort for BesFS was approximately one person-year for designing the specifications and proving them. Our proofs are complete without any unproven axioms. The BesFS implementation has been machine-checked to prove the safety theorems, but we cannot execute the Gallina code directly inside the enclave. Currently, Coq supports automatic extraction to OCaml, Haskell, and Scheme (coq, 2018). In our first round of evaluation, we extracted our code to Haskell and compiled it along with wrappers to tunnel the syscalls to the underlying untrusted OS. This setup ran out of the box in a non-SGX environment. However, we failed to execute our compiled binary implementation inside SGX using existing systems (e.g., Graphene-SGX (Tsai et al., 2017) or Panoply (Shinde et al., 2017)). Our investigation shows that Graphene-SGX cannot support even a simple hello-world Haskell binary. This is because Graphene-SGX does not support a set of syscalls (create_timer, set_timer, delete_timer) used by the Haskell runtime. We attempted to add support for these system calls, but they depend on sigaction handling, which is not yet supported in Graphene-SGX. We ran into similar problems with the OCaml implementation of BesFS. Currently, no other publicly available system supports a Haskell, OCaml, or Scheme run-time inside SGX; in fact, all the current public systems for SGX only support C code. Thus, we have resorted to manual extraction from Gallina to C: we convert the BesFS implementation to C by hand-coding it line by line. Our C implementation comprises 863 LOC of core logic and 586 LOC of helper functions, totaling 1449 LOC (Table 3). Our C code leaves out the implementation of the untrusted POSIX calls; while executing the code inside the enclave, these calls have to be redirected to an actual filesystem provided by the OS.
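For reference, Coq's built-in extraction is driven by vernacular commands like the following (a sketch; the extracted identifiers are placeholders and the target file name is an assumption):

Require Extraction.
Extraction Language Haskell.   (* or: Extraction Language OCaml. *)
(* Extraction "besfs.hs" fs_read fs_write.   <- assuming these definitions are in scope *)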

Component Language LOC Size (in KB)
Machine-proved Implementation
BesFS definitions & Proofs   Coq   3676   1757.38
Hand-coded Implementation
BesFS Implementation   C   863   172.39
External Call Interface C 469 201.55
SGX Utils C 117 667.04
Total 1449 1040.98
Table 3. LOC for various components of BesFS.

Our BesFS implementation can be integrated with any SGX framework (Shinde et al., 2017; Tsai et al., 2017; Arnautov et al., 2016). We tested Graphene-SGX as our first choice for integration and checked whether it can execute our unmodified benchmarks inside an enclave. However, Graphene-SGX segfaults on a large subset of our benchmarks. Next, we chose Panoply as our underlying enclave-execution system (Shinde et al., 2017). We tunnel the POSIX calls from the enclave to the untrusted environment using Panoply's ocall interface. By default, Panoply converts the application call arguments to its own representation, makes the ocall, and converts the return values to the data types expected by the application. For example, BesFS has an internal representation of file descriptors and directory descriptors, whereas the actual API invoked by the application and implemented in the external library uses file pointers (FILE*) or integers for descriptors. BesFS maintains a mapping between its own representation and the descriptors. To add BesFS support, we wrap the application calls and marshal their arguments to make them compatible with the BesFS interface described in Section 3.2. Once Panoply collects the return values from the external call, we unmarshal the return values and pass them to BesFS. Our wrapper then performs its checks on the return values and converts the results back to the data type expected by the application. If BesFS deems the results safe, we return the final output of the API call to the application; otherwise, we flag a safety violation. We add a total of less than 750 LOC to the Panoply code-base, which is within the realm of auditing. Readers can refer to Appendix A.3 for a detailed breakdown of the LOC.

Future work can certify the process of creating machine code from our Gallina implementation. Existing certified compilers do not support extraction from Gallina to enclave-executable code; however, a roadmap to this feasibility is discussed in Appendix A.4.

6. Evaluation

Our evaluation goal is to demonstrate the following:

  • BesFS's safety definition is compatible with the semantics of the POSIX APIs expected by benign applications.

  • Our API has the right abstraction and is expressive enough to support a wide range of applications.

  • Our formal verification efforts uncover bugs in the BesFS implementation.

  • BesFS can be integrated into a real system.

Experimental Setup.

All our experiments were conducted on a machine with an Intel Skylake i7-6600U CPU (2.60GHz, 4 cores), 12GB memory, and 128MB EPC, of which 96MB is available to user enclaves. We execute our benchmarks on Ubuntu 14.04 LTS with Linux kernel 4.2. We use Panoply to run our benchmarks in an enclave, which internally uses the Intel SGX SDK Linux Open Source version 1.6 (sgx, 2018). Our system uses ext4 (ext, 2018) as the underlying POSIX-compliant filesystem for our experiments.

Benchmarks.

We use the benchmark suite from FSCQ (Chen et al., 2015) — a filesystem written and verified in the Coq proof assistant for crash tolerance. It comprises applications that test each system call and different sequences of filesystem operations on large and small files. For testing on real-world applications, we use programs from SPEC CINT2006 (spe, 2018). Panoply's available case studies do not include any of our benchmarks, so we port all of our target benchmarks to Panoply. For our CPU-bound benchmarks, we were able to port 7 programs from SPEC (Table 4). We were unable to port the rest of the benchmarks because some programs from SPEC (omnetpp, perlbench, xalancbmk) use non-C APIs which are not supported in Panoply. Other limitations, such as the lack of support for longjmp in the SDK version we use, prevent us from running the gobmk and gcc programs. Our final evaluation is on a total of 17 applications: 7 programs from SPEC and the remaining programs from the filesystem benchmark suite.

6.1. Expressiveness & Compatibility

Libc Calls | SPEC CINT2006 (astar, mcf, bzip2, hmmer, libqu, h264, sjeng) | FSCQ (single, small, large) | Total
Core Calls
close 3 0 5 0 0 4 0 5 4 2 23
open 6 0 5 0 0 2 0 6 4 2 25
mkdir 0 0 0 0 0 0 0 0 1 0 1
remove 0 0 6 4 0 0 0 0 0 0 10
stat 0 0 0 1 0 0 0 1 0 0 2
chmod 0 0 1 0 0 0 0 0 0 0 1
lseek 0 0 0 0 0 6 0 0 0 4 10
read 33 0 1 0 0 3 0 1 2 3 43
write 0 0 3 0 0 4 0 2 2 2 13
Auxiliary Calls
fread 0 0 3 68 12 1 0 0 0 0 84
fscanf 12 0 0 0 0 9 0 0 0 0 21
fwrite 0 0 4 84 1 4 0 0 0 0 93
fprintf 0 5 89 304 3 308 13 0 1 22 745
fopen 1 2 10 23 2 19 3 0 0 0 60
fseek 0 0 0 11 0 2 0 0 0 0 13
rewind 0 0 1 7 0 0 0 0 0 0 8
ftell 0 0 0 4 0 1 0 0 0 0 5
fgetc 0 0 2 0 1 0 0 0 0 0 3
fgets 0 3 0 47 0 0 0 3 0 0 53
Unsafe Calls
fsync 0 0 0 0 0 0 0 0 0 2 2
rename 0 0 0 1 0 0 6 0 0 0 7
Total 55 10 130 554 19 363 22 18 14 37 1222
Table 4. Frequency of filesystem calls. The first two groups of rows show the frequency of the core and auxiliary calls supported by BesFS, respectively; the last group shows the frequency of unsafe calls for each of our benchmarks.

BesFS maintains compatibility with the filesystem API calls in our benchmarks. We empirically demonstrate that if the underlying filesystem and OS are POSIX compliant and benign, then BesFS is not overly restrictive in its safety conditions. We first analyze all filesystem calls made by our benchmarks for various workloads using strace and ltrace. We then filter out the fraction of calls related to the filesystem. Table 4 shows the type of each filesystem call and its frequency for each of our benchmarks. We observe a total of 1222 filesystem calls comprising 21 unique APIs. BesFS can protect all but the 9 unsafe calls among these. Table 5 shows how we support the calls beyond the core API by composing them from BesFS's core API.

Compositional Power of BesFS.

BesFS directly reasons about calls made through the core APIs outlined in Section 3.2. We use BesFS's composition theorem to support the set of auxiliary APIs that have to be intercepted, so that BesFS checks all file operations for safety. For example, fgets reads a file and stops after an EOF or a newline; the read is limited to at most one fewer character than the size parameter specified in the call. We implement fgets by using BesFS's core API for read (see Table 5). Since we do not know the location of the newline character, we read the input file character by character and stop reading only when we see a newline, reach the end of the file, or have read size - 1 characters. Similarly, when writing content to the output file we already know the total size of the buffer (e.g., after resolving the format specifiers in fprintf), so we write the complete buffer in a single call. Many of the calls allow the application to specify flags that decide which operations the API must perform. For example, the application can use the fopen API to open a file for writing. If the application specifies the append flag ("a"), the library will create the file if it does not exist and position the cursor at the end of the file. To achieve the same functionality using BesFS, we first try to open the file; if this fails with an ENOENT error, we check whether the parent directory exists and, if so, create a new file. If the file exists, we open it and then explicitly seek the cursor to the end of the file. Thus, even when there is a one-to-one mapping from a libc API to a BesFS API, we still have to use multiple BesFS APIs to realize the semantics of the various modes/flags supported by libc. We implement and support a set of flags for each of our APIs which require flags. Note that our implementation currently supports only the common flags used by applications; however, the support can be extended to other flags if necessary for an application. A sketch of the fgets composition appears below.
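As an illustration of this composition, a minimal sketch of fgets built on a character-granular core read follows; besfs_read is a hypothetical stand-in for the BesFS core read API, whose real signature differs.

    #include <stddef.h>

    /* Hypothetical core call: read up to n bytes at the current cursor; returns
     * the number of bytes actually read (0 at end of file, negative on error). */
    extern long besfs_read(int fd, char *buf, size_t n);

    char *composed_fgets(char *s, int size, int fd)
    {
        int i = 0;
        if (s == NULL || size <= 0)
            return NULL;
        while (i < size - 1) {                 /* leave room for the terminating NUL */
            char c;
            long r = besfs_read(fd, &c, 1);    /* read character by character */
            if (r < 0)
                return NULL;                   /* propagate the error */
            if (r == 0)
                break;                         /* end of file */
            s[i++] = c;
            if (c == '\n')
                break;                         /* stop after the first newline */
        }
        if (i == 0)
            return NULL;                       /* nothing read before EOF */
        s[i] = '\0';
        return s;
    }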

BesFS does not reason about the safety of the remaining APIs, which amount to a total of 9 calls in our benchmarks. Although BesFS does not support these unsafe calls, it still allows the enclave to perform them. Only a handful of our benchmarks invoke at least one unsafe API (see Table 4). Importantly, these unsupported calls do not interfere with the runs in our test suite and do not affect our test executions. By virtue of BesFS's atomicity property, the synchronization calls sync/fsync/fdatasync are implicitly required of the OS after each function call to persist that call's changes. We experimentally confirm that the programs produce the same output with and without BesFS, thus reaffirming that we do not alter program behavior because of our safety checks.

Libc API | LOC | Core APIs used for composition of the libc API: fstat, read, open, close, seek, create, mkdir, rmdir, remove, chmod, readdir, truncate, write
read 7
fread 25
fscanf 34
fwrite 12
write 20
fprintf 15
fopen 78
open 60
fclose 9
close 17
fseek 31
lseek 39
rewind 5
creat 30
mkdir 25
unlink 21
chmod 23
ftruncate 5
ftell 12
fgetc 9
fgets 25
readdir 10
Table 5. Expressiveness of BesFS. Each row represents a filesystem API used by our benchmarks. The LOC column gives the lines of code added to implement that API. The remaining columns represent the core APIs supported by BesFS; a ✓ in a cell indicates that the core API is used to compose the libc API.

6.2. Do Proofs Help in Eliminating Bugs?

As part of our proof experience, we encountered many mistakes during development that our proofs eliminated. They highlight the importance of a machine-proved implementation.

Example 1: Seek Specification Bug.

In at least two of our functions, we need to test whether the position of the current cursor is within the range of the file, i.e., less than the length of the file. If the cursor is beyond the end of a file, any further operation such as read or write is illegal. In early versions of our implementation, we simply used “if pos < size” as the check. During the proof, however, we found that we could not prove certain assertions because we had ignored a corner case: when a file has just been created with size 0, the only valid position is also 0. In this sense, the proof helped us find a bug. The sketch below illustrates the two versions of the check.
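For illustration (in C rather than in the Coq development), the two versions of the cursor check might look as follows; the exact predicate in the BesFS specification differs, but the corner case is the same.

    #include <stdbool.h>
    #include <stddef.h>

    /* Naive check from early versions: rejects the only valid position (0)
     * of a freshly created, zero-length file. */
    static bool cursor_ok_buggy(size_t pos, size_t size)
    {
        return pos < size;
    }

    /* Corrected check: a zero-length file still admits position 0. */
    static bool cursor_ok_fixed(size_t pos, size_t size)
    {
        return pos < size || (size == 0 && pos == 0);
    }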

Example 2: Write Implementation Bug.

The function write in BesFS takes pos as an argument, which represents the position at which the buffer is to be written. In our initial implementation of write, we were also using the name pos for the cursor stored in the open handles. Thus, we had two different variables referred to by the same name, and the second value (the cursor) shadowed the write position. Due to this bug, our implementation of write violated the specification for the argument pos. We uncovered it when our proof did not go through; once we fixed the bug by renaming the input argument, we were able to prove the safety of write.

Example 3: Panoply & Intel SGX SDK Overflow Bugs.

When Panoply makes fread and fwrite calls, it passes the size of the buffer and a pointer to the buffer. The default Intel SDK generated code is then responsible for copying the buffer content from the enclave to the untrusted side for fwrite, or the other way around for fread. BesFS piggybacks on these calls to read and write encrypted pages. While integrating the BesFS code in Panoply, our integrity checks after read/write calls were failing. On further inspection, we identified stack corruption bugs in both the fread and fwrite implementations of Panoply. Specifically, if the buffer size is larger than the maximum allowed stack size in the enclave configuration file (greater than 64KB in our experiments), the enclave's stack is corrupted even if we pass the right buffer size. To fix this issue, we changed the SDK code to splice the buffer into smaller chunks (below the maximum stack size) to read/write large buffers. After our fix, the implementation passed the BesFS checks. The sketch below shows the splicing idea.
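A minimal sketch of the splicing fix, assuming a hypothetical ocall_fwrite_chunk helper in place of the edger8r-generated tunneling code:

    #include <stddef.h>

    /* Hypothetical tunneled write for a single chunk; stands in for the
     * generated ocall whose copy buffer lives on the enclave stack. */
    extern long ocall_fwrite_chunk(int fd, const void *buf, size_t len);

    /* Keep each marshaled chunk well below the enclave's configured stack
     * size (64KB in our experiments) so the generated copy code cannot
     * corrupt the stack. */
    #define BESFS_MAX_CHUNK (32 * 1024)

    long spliced_fwrite(int fd, const void *buf, size_t len)
    {
        const char *p = buf;
        size_t done = 0;
        while (done < len) {
            size_t chunk = len - done;
            if (chunk > BESFS_MAX_CHUNK)
                chunk = BESFS_MAX_CHUNK;          /* splice large buffers */
            long ret = ocall_fwrite_chunk(fd, p + done, chunk);
            if (ret < 0)
                return ret;                       /* propagate the error */
            if (ret == 0)
                break;                            /* avoid spinning on a short write */
            done += (size_t)ret;
        }
        return (long)done;
    }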

Example 4: Panoply Error Code Bugs.

The POSIX specification for the fopen call states that the function shall fail with error code ENOENT if a component of the filename does not name an existing file or if the filename is an empty string. When we used Panoply's fopen interface to tunnel BesFS's open call, Panoply did not return the expected error code when the file did not exist. The BesFS checks after the external call flagged a warning of a safety-condition violation: BesFS did not have a record of this file, but the external call claimed that the file existed. We investigated this case and discovered that Panoply had a bug in its errno-passing logic. In fact, on further testing of other functions using BesFS, we found other distinct functions where Panoply's error codes were incorrect. The sketch below illustrates the kind of cross-check that caught this bug.
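The kind of cross-check that caught this bug can be pictured as follows; besfs_state_file_exists and besfs_check_fopen are illustrative names, not the actual BesFS functions.

    #include <errno.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical trusted-state lookup: does BesFS's own metadata say that
     * `path` names an existing file? */
    extern bool besfs_state_file_exists(const char *path);

    /* Cross-check the untrusted fopen result against BesFS's trusted state.
     * Returns 0 if consistent, nonzero to flag a safety violation. */
    int besfs_check_fopen(const char *path, void *ext_handle, int ext_errno)
    {
        bool exists = besfs_state_file_exists(path);

        if (ext_handle != NULL && !exists)
            return -1;   /* OS claims the file exists, but BesFS has no record of it */

        if (ext_handle == NULL && exists)
            return -1;   /* OS claims failure for a file BesFS knows exists */

        if (ext_handle == NULL && !exists && ext_errno != ENOENT)
            return -1;   /* POSIX requires ENOENT for a missing path */

        return 0;
    }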

6.3. Performance

BesFS is the first formally verified filesystem for SGX, and performance is not our primary goal. Future optimizations can use the BesFS API as an oracle for a golden implementation. For completeness, we report our preliminary performance measurements. With our highly unoptimized implementation, the SPEC CINT2006 benchmarks incur a smaller average overhead than the I/O-intensive benchmarks. Interested readers can refer to Appendix A.5 for more details. There is ample scope for SGX optimization using well-known techniques discussed in previous literature (Weisse et al., 2017; Orenbach et al., 2017; Arnautov et al.). We outline a set of optimization strategies in Appendix A.6 for interested readers.

7. Related Work

SGX Attacks & Defenses.

BesFS reasons about the integrity of its filesystem APIs and relies on SGX's hardware integrity guarantees. It makes an assumption on the confidentiality properties of SGX only in one of its lemmas, assuming secrecy of a cryptographic key. This design choice is an important one in light of the many side channels that have been discovered on the SGX platform (Götzfried et al., 2017; Schwarz et al., 2017; Brasser et al., 2017b; Moghimi et al., 2017; Liu et al., 2015; Xu et al.; Shinde et al., 2016; Bulck et al., 2017; Hähnel et al., 2017; Wang et al., 2017; Chen et al., 2017; Lee et al., 2017b), and, more recently, hardware mistakes in speculative execution (Kocher et al., 2018; Lipp et al., 2018). BesFS assumes that the hardware is securely implemented and is agnostic to the defenses the enclave might deploy for ensuring confidentiality (Gruss et al., 2017; Brasser et al., 2017a; Kuvaiskii et al., 2017; Shih et al., 2017; Strackx and Piessens, 2017; Fu et al., 2017; Sasy et al., 2017) on top of integrity properties.

Filesystem Support in SGX.

Ideally, the enclave should not make any assumptions about the faithful execution of the untrusted calls and should do its due diligence before using any (implicit or explicit) results of each untrusted call. The effects of malicious OS behavior on the enclave's execution depend on what counter-measures the enclave has in place to detect and/or protect against an unfaithful OS. Currently, the common ways to facilitate the use of filesystem APIs inside an enclave are:

  • Port the entire filesystem inside the enclave (Ahmad et al., 2018; Hunt et al., 2016).

  • Keep the filesystem outside the enclave (Shinde et al., 2017; che Tsai et al., 2017); and, for each return parameter, check the data types, the bounds on the I/O buffers, and the valid value ranges of API-specific values such as error codes, flags, and structures.

  • Implement a filesystem shield (Arnautov et al.), such that the enclave encrypts all file data before writing it outside and decrypts the data being read.

All of these methods help to reduce the attack surface for file-syscall return-value tampering, but they do not provably thwart all the attacks in Section 2.2. Appendix A.1 details how their claims lack formal proofs of comprehensiveness. There are several other protected filesystems designed to defend against an untrusted OS in a non-enclave setting, but none of them are formally verified (Hofmann et al., 2013; Kwon et al., 2016).

Verified Guarantees for Enclaves.

Formal guarantees have been a subject of investigation in the context of enclaved applications. Various efforts are underway to provide provable confidentiality guarantees for pieces of code executing inside an enclave. Most notably, Moat (Sinha et al., c) formally models various adversary models in SGX and ensures that the enclave code does not have any vulnerabilities which leak confidential information. /Confidential (Sinha et al., b) builds on Moat to provide a narrow information-release channel for enclaves to reduce the attack surface. IMPe builds a type system to provide a strong non-interference-based information security guarantee for enclave code (Gollamudi and Chong, 2016). All these efforts target confidentiality and are orthogonal to BesFS's integrity goals.

Another line of verification research has focused on certifying the properties of the SGX hardware primitive itself, which BesFS assumes to be correctly implemented. Accordion (Leslie-Hurd et al., 2015) provides a DSL and uses model checking to ensure that the concurrent interactions between SGX instructions and the shared hardware state maintain linearizability (Herlihy and Wing, 1990). Komodo (Ferraiuolo et al., 2017) is a formally specified and verified monitor for isolated execution which ensures the confidentiality and integrity of enclaves. TAP (Subramanyan et al., 2017) performs formal modeling and verification to show that SGX and Sanctum (Costan et al.) provide secure remote execution, which includes integrity, confidentiality, and secure measurement. However, existing work on verified filesystems cannot simply be added on top of TAP (Subramanyan et al., 2017) because those filesystems do not reason about an untrusted OS. BesFS is a layer above the hardware abstractions provided by TAP and Komodo.

Filesystem Verification.

Formal verification for large-scale systems such as operating systems (Klein et al., 2009; Gu et al., 2016; Nelson et al., 2017; Yang and Hawblitzel, 2010), hypervisors (Alkassar et al., 2010), driver sub-systems (Chen et al., 2016), and user-level applications (Hawblitzel et al., 2014) has been a long-standing area of research. None of these works consider a Byzantine OS, which leads to a completely different modeling of properties. Filesystem verification for a benign OS is in itself a challenging task (Keller et al., 2013; Joshi and Holzmann, 2008) and is well studied. This includes building abstract specifications (Schierl et al., 2009; Arkoudas et al., 2004; Gardner et al., 2014), and systematically finding bugs (Yang et al., 2006) and POSIX non-compliance (Ridge et al.) in filesystem implementations. Apart from end-to-end verified implementations (Amani et al., 2016; Schellhorn et al., 2014), filesystems are also built to provide crash consistency (Bornholt et al.; Fryer et al., 2012), refinement (Sigurbjarnarson et al.), recovery (Chen et al., 2015), and safety (Chen et al., 2017).

8. Conclusion

BesFS is a formal and provably Iago-safe API specification for the filesystem subset of the POSIX interface. We prove 118 lemmas and two key theorems for the safety properties of the BesFS implementation. The BesFS API is expressive enough to support the 17 real applications we test, and our principled approach eliminates several bugs.

Acknowledgements.
We thank Michael Steiner from Intel for his feedback. Thanks to Shruti Tople, Shiqi Shen, Teodora Baluta and Zheng Leong Chua for their feedback and assistance in the preparation of this draft. This research was partially supported by a grant from the National Research Foundation, Prime Minister's Office, Singapore under its National Cybersecurity R&D Program (TSUNAMi project, No. NRF2014NCR-NCR001-21) and administered by the National Cybersecurity R&D Directorate.

References

  • ext (2018) 2018. Ext4 Filesystem Documentation. https://www.kernel.org/doc/Documentation/filesystems/ext4.txt. (2018).
  • edg (2018) 2018. Intel SGX edger8r Tool. https://github.com/intel/linux-sgx/tree/master/sdk/edger8r/. (2018).
  • int (2018) 2018. Intel Software Guard Extensions SDK - Documentation — Intel Software. https://software.intel.com/en-us/sgx-sdk/documentation. (2018).
  • sgx (2018) 2018. intel/linux-sgx-driver at sgx_driver_1.6. https://github.com/intel/linux-sgx-driver/tree/sgx_driver_1.6. (2018).
  • spe (2018) 2018. SPEC CINT2006 Benchmarks. https://www.spec.org/cpu2006/CINT2006/. (2018).
  • coq (2018) 2018. Standard Library — The Coq Proof Assistant. https://coq.inria.fr/library/Coq.extraction.Extraction.html. (2018).
  • Ahmad et al. (2018) Adil Ahmad, Kyungtae Kim, Muhammad Ihsanulhaq Sarfaraz, and Byoungyoung Lee. 2018. OBLIVIATE: A Data Oblivious File System for Intel SGX. In 25th Annual Network and Distributed System Security Symposium, NDSS.
  • Alkassar et al. (2010) Eyad Alkassar, Mark A. Hillebrand, Wolfgang Paul, and Elena Petrova. 2010. Automated Verification of a Small Hypervisor. In Verified Software: Theories, Tools, Experiments, Gary T. Leavens, Peter O’Hearn, and Sriram K. Rajamani (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 40–54.
  • Amani et al. (2016) Sidney Amani, Alex Hixon, Zilin Chen, Christine Rizkallah, Peter Chubb, Liam O’Connor, Joel Beeren, Yutaka Nagashima, Japheth Lim, Thomas Sewell, Joseph Tuong, Gabriele Keller, Toby Murray, Gerwin Klein, and Gernot Heiser. 2016. Cogent: Verifying High-Assurance File System Implementations. In International Conference on Architectural Support for Programming Languages and Operating Systems. Atlanta, GA, USA, 175–188. https://doi.org/10.1145/2872362.2872404
  • Anand et al. (2017) Abhishek Anand, Andrew Appel, Greg Morrisett, Zoe Paraskevopoulou, Randy Pollack, Olivier Savary Belanger, Matthieu Sozeau, and Matthew Weaver. 2017. CertiCoq: A verified compiler for Coq - POPL 2017. In CoqPL 2017 The Third International Workshop on Coq for Programming Languages (CoqPL’17).
  • Arkoudas et al. (2004) Konstantine Arkoudas, Karen Zee, Viktor Kuncak, and Martin Rinard. 2004. Verifying a File System Implementation. In Formal Methods and Software Engineering, Jim Davies, Wolfram Schulte, and Mike Barnett (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 373–390.
  • Arnautov, Trach, Gregor, Knauth, Martin, Priebe, Lind, Muthukumaran, O’Keeffe, Stillwell, Goltzsche, Eyers, Kapitza, Pietzuch, and Fetzer (Arnautov et al.) Sergei Arnautov, Bohdan Trach, Franz Gregor, Thomas Knauth, Andre Martin, Christian Priebe, Joshua Lind, Divya Muthukumaran, Daniel O’Keeffe, Mark L Stillwell, David Goltzsche, Dave Eyers, Rüdiger Kapitza, Peter Pietzuch, and Christof Fetzer. SCONE: Secure Linux Containers with Intel SGX. In OSDI ’16.
  • Baumann et al. (2014) Andrew Baumann, Marcus Peinado, and Galen Hunt. 2014. Shielding Applications from an Untrusted Cloud with Haven. In OSDI.
  • Bornholt, Kaufmann, Li, Krishnamurthy, Torlak, and Wang (Bornholt et al.) James Bornholt, Antoine Kaufmann, Jialin Li, Arvind Krishnamurthy, Emina Torlak, and Xi Wang. Specifying and Checking File System Crash-Consistency Models (ASPLOS ’16).
  • Brasser et al. (2017a) Ferdinand Brasser, Srdjan Capkun, Alexandra Dmitrienko, Tommaso Frassetto, Kari Kostiainen, Urs Müller, and Ahmad-Reza Sadeghi. 2017a. DR.SGX: Hardening SGX Enclaves against Cache Attacks with Data Location Randomization. CoRR abs/1709.09917 (2017). arXiv:1709.09917 http://arxiv.org/abs/1709.09917
  • Brasser et al. (2017b) Ferdinand Brasser, Urs Müller, Alexandra Dmitrienko, Kari Kostiainen, Srdjan Capkun, and Ahmad-Reza Sadeghi. 2017b. Software Grand Exposure: SGX Cache Attacks Are Practical. In 11th USENIX Workshop on Offensive Technologies (WOOT 17). USENIX Association, Vancouver, BC. https://www.usenix.org/conference/woot17/workshop-program/presentation/brasser
  • Bulck et al. (2017) Jo Van Bulck, Nico Weichbrodt, Rüdiger Kapitza, Frank Piessens, and Raoul Strackx. 2017. Telling Your Secrets without Page Faults: Stealthy Page Table-Based Attacks on Enclaved Execution. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 1041–1056. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/van-bulck
  • Canetti et al. (2011) Ran Canetti, Suresh Chari, Shai Halevi, Birgit Pfitzmann, Arnab Roy, Michael Steiner, and Wietse Venema. 2011. Composable Security Analysis of OS Services. In Proceedings of the 9th International Conference on Applied Cryptography and Network Security (ACNS’11). Springer-Verlag, Berlin, Heidelberg, 431–448. http://dl.acm.org/citation.cfm?id=2025968.2026002
  • Champagne and Lee (2010) D. Champagne and R. B. Lee. 2010. Scalable architectural support for trusted software. In HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture. 1–12. https://doi.org/10.1109/HPCA.2010.5416657
  • Chari et al. (2010) Suresh Chari, Shai Halevi, and Wietse Z. Venema. 2010. Where Do You Want to Go Today? Escalating Privileges by Pathname Manipulation. In NDSS. The Internet Society.
  • che Tsai et al. (2017) Chia che Tsai, Donald E. Porter, and Mona Vij. 2017. Graphene-SGX: A Practical Library OS for Unmodified Applications on SGX. In 2017 USENIX Annual Technical Conference (USENIX ATC 17). USENIX Association, Santa Clara, CA, 645–658. https://www.usenix.org/conference/atc17/technical-sessions/presentation/tsai
  • Checkoway and Shacham (2013) Stephen Checkoway and Hovav Shacham. 2013. Iago Attacks: Why the System Call API is a Bad Untrusted RPC Interface. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’13). ACM, New York, NY, USA, 253–264. https://doi.org/10.1145/2451116.2451145
  • Chen et al. (2017) Haogang Chen, Tej Chajed, Alex Konradi, Stephanie Wang, Atalay Ileri, Adam Chlipala, M. Frans Kaashoek, and Nickolai Zeldovich. 2017. Verifying a high-performance crash-safe file system using a tree specification. In Proceedings of the 26th ACM Symposium on Operating Systems Principles (SOSP 2017). Shanghai, China.
  • Chen et al. (2016) Hao Chen, Xiongnan (Newman) Wu, Zhong Shao, Joshua Lockerman, and Ronghui Gu. 2016. Toward Compositional Verification of Interruptible OS Kernels and Device Drivers. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16). ACM, New York, NY, USA, 431–447. https://doi.org/10.1145/2908080.2908101
  • Chen et al. (2015) Haogang Chen, Daniel Ziegler, Tej Chajed, Adam Chlipala, M. Frans Kaashoek, and Nickolai Zeldovich. 2015. Using Crash Hoare Logic for Certifying the FSCQ File System. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15). ACM, New York, NY, USA, 18–37. https://doi.org/10.1145/2815400.2815402
  • Chen et al. (2017) Sanchuan Chen, Xiaokuan Zhang, Michael K. Reiter, and Yinqian Zhang. 2017. Detecting Privileged Side-Channel Attacks in Shielded Execution with DéJà Vu. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security (ASIA CCS ’17). ACM, New York, NY, USA, 7–18. https://doi.org/10.1145/3052973.3053007
  • Chen et al. (2008) Xiaoxin Chen, Tal Garfinkel, E. Christopher Lewis, Pratap Subrahmanyam, Carl A. Waldspurger, Dan Boneh, Jeffrey Dwoskin, and Dan R.K. Ports. 2008. Overshadow: A Virtualization-based Approach to Retrofitting Protection in Commodity Operating Systems. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIII). ACM, New York, NY, USA, 2–13. https://doi.org/10.1145/1346281.1346284
  • Costan and Devadas (2016) Victor Costan and Srinivas Devadas. 2016. Intel SGX Explained. Cryptology ePrint Archive, Report 2016/086. (2016). http://eprint.iacr.org/2016/086.
  • Costan, Lebedev, and Devadas (Costan et al.) Victor Costan, Ilia Lebedev, and Srinivas Devadas. Sanctum: Minimal Hardware Extensions for Strong Software Isolation. In USENIX Security ’16.
  • Ferraiuolo et al. (2017) Andrew Ferraiuolo, Andrew Baumann, Chris Hawblitzel, and Bryan Parno. 2017. Komodo: Using verification to disentangle secure-enclave hardware from software, In 26th ACM Symposium on Operating Systems Principles (SOSP’17). https://www.microsoft.com/en-us/research/publication/komodo-using-verification-disentangle-secure-enclave-hardware-software/
  • Fryer et al. (2012) Daniel Fryer, Kuei Sun, Rahat Mahmood, Tinghao Cheng, Shaun Benjamin, Ashvin Goel, and Angela Demke Brown. 2012. Recon: Verifying File System Consistency at Runtime. Trans. Storage 8, 4, Article 15 (Dec. 2012), 29 pages. https://doi.org/10.1145/2385603.2385608
  • Fu et al. (2017) Yangchun Fu, Erick Bauman, Raul Quinonez, and Zhiqiang Lin. 2017. Sgx-Lapd: Thwarting Controlled Side Channel Attacks via Enclave Verifiable Page Faults. In Research in Attacks, Intrusions, and Defenses, Marc Dacier, Michael Bailey, Michalis Polychronakis, and Manos Antonakakis (Eds.). Springer International Publishing, Cham, 357–380.
  • Gardner et al. (2014) Philippa Gardner, Gian Ntzik, and Adam Wright. 2014. Local Reasoning for the POSIX File System. In Programming Languages and Systems, Zhong Shao (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 169–188.
  • Gollamudi and Chong (2016) Anitha Gollamudi and Stephen Chong. 2016. Automatic Enforcement of Expressive Security Policies Using Enclaves. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016). ACM, New York, NY, USA, 494–513. https://doi.org/10.1145/2983990.2984002
  • Götzfried et al. (2017) Johannes Götzfried, Moritz Eckert, Sebastian Schinzel, and Tilo Müller. 2017. Cache Attacks on Intel SGX. In Proceedings of the 10th European Workshop on Systems Security (EuroSec’17). ACM, New York, NY, USA, Article 2, 6 pages. https://doi.org/10.1145/3065913.3065915
  • Gruss et al. (2017) Daniel Gruss, Julian Lettner, Felix Schuster, Olya Ohrimenko, Istvan Haller, and Manuel Costa. 2017. Strong and Efficient Cache Side-Channel Protection using Hardware Transactional Memory. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 217–233. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/gruss
  • Gu et al. (2016) Ronghui Gu, Zhong Shao, Hao Chen, Xiongnan Wu, Jieung Kim, Vilhelm Sjöberg, and David Costanzo. 2016. CertiKOS: An Extensible Architecture for Building Certified Concurrent OS Kernels. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, USA, 653–669. http://dl.acm.org/citation.cfm?id=3026877.3026928
  • Hähnel et al. (2017) Marcus Hähnel, Weidong Cui, and Marcus Peinado. 2017. High-Resolution Side Channels for Untrusted Operating Systems. In 2017 USENIX Annual Technical Conference (USENIX ATC 17). USENIX Association, Santa Clara, CA, 299–312. https://www.usenix.org/conference/atc17/technical-sessions/presentation/hahnel
  • Hawblitzel et al. (2014) Chris Hawblitzel, Jon Howell, Jacob R. Lorch, Arjun Narayan, Bryan Parno, Danfeng Zhang, and Brian Zill. 2014. Ironclad Apps: End-to-end Security via Automated Full-system Verification. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI’14). USENIX Association, Berkeley, CA, USA, 165–181. http://dl.acm.org/citation.cfm?id=2685048.2685062
  • Herlihy and Wing (1990) Maurice P. Herlihy and Jeannette M. Wing. 1990. Linearizability: A Correctness Condition for Concurrent Objects. ACM Trans. Program. Lang. Syst. 12, 3 (July 1990), 463–492. https://doi.org/10.1145/78969.78972
  • Hofmann et al. (2013) Owen S. Hofmann, Sangman Kim, Alan M. Dunn, Michael Z. Lee, and Emmett Witchel. 2013. InkTag: Secure Applications on an Untrusted Operating System. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’13). ACM, New York, NY, USA, 265–278. https://doi.org/10.1145/2451116.2451146
  • Hu et al. (2015) Hong Hu, Zheng Leong Chua, Sendroiu Adrian, Prateek Saxena, and Zhenkai Liang. 2015. Automatic Generation of Data-Oriented Exploits. In Proceedings of the 24th USENIX Security Symposium.
  • Hu et al. (2016) Hong Hu, Shweta Shinde, Sendroiu Adrian, Zheng Leong Chua, Prateek Saxena, and Zhenkai Liang. 2016. Data-Oriented Programming: On the Expressiveness of Non-control Data Attacks. In IEEE Symposium on Security and Privacy, SP 2016, San Jose, CA, USA, May 22-26, 2016. 969–986. https://doi.org/10.1109/SP.2016.62
  • Hunt et al. (2016) Tyler Hunt, Zhiting Zhu, Yuanzhong Xu, Simon Peter, and Emmett Witchel. 2016. Ryoan: A Distributed Sandbox for Untrusted Computation on Secret Data. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16). USENIX Association, GA, 533–549. https://www.usenix.org/conference/osdi16/technical-sessions/presentation/hunt
  • Johnson and Wagner (2004) Rob Johnson and David Wagner. 2004. Finding User/Kernel Pointer Bugs with Type Inference. In Proceedings of the 13th Conference on USENIX Security Symposium - Volume 13 (SSYM’04). USENIX Association, Berkeley, CA, USA, 9–9. http://dl.acm.org/citation.cfm?id=1251375.1251384
  • Joshi and Holzmann (2008) Rajeev Joshi and Gerard J. Holzmann. 2008. A Mini Challenge: Build a Verifiable Filesystem. Springer Berlin Heidelberg, Berlin, Heidelberg, 49–56. https://doi.org/10.1007/978-3-540-69149-5_6
  • Keller et al. (2013) Gabriele Keller, Toby Murray, Sidney Amani, Liam O’Connor, Zilin Chen, Leonid Ryzhyk, Gerwin Klein, and Gernot Heiser. 2013. File Systems Deserve Verification Too!. In Proceedings of the Seventh Workshop on Programming Languages and Operating Systems (PLOS ’13). ACM, New York, NY, USA, Article 1, 7 pages. https://doi.org/10.1145/2525528.2525530
  • Klein et al. (2009) Gerwin Klein, Kevin Elphinstone, Gernot Heiser, June Andronick, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch, and Simon Winwood. 2009. seL4: Formal Verification of an OS Kernel. In Proceedings of the ACM SIGOPS 22Nd Symposium on Operating Systems Principles (SOSP ’09). ACM, New York, NY, USA, 207–220. https://doi.org/10.1145/1629575.1629596
  • Kocher et al. (2018) Paul Kocher, Daniel Genkin, Daniel Gruss, Werner Haas, Mike Hamburg, Moritz Lipp, Stefan Mangard, Thomas Prescher, Michael Schwarz, and Yuval Yarom. 2018. Spectre Attacks: Exploiting Speculative Execution. ArXiv e-prints (Jan. 2018). arXiv:1801.01203
  • Kuvaiskii et al. (2017) Dmitrii Kuvaiskii, Oleksii Oleksenko, Sergei Arnautov, Bohdan Trach, Pramod Bhatotia, Pascal Felber, and Christof Fetzer. 2017. SGXBOUNDS: Memory Safety for Shielded Execution. In Proceedings of the Twelfth European Conference on Computer Systems (EuroSys ’17). ACM, New York, NY, USA, 205–221. https://doi.org/10.1145/3064176.3064192
  • Kwon et al. (2016) Youngjin Kwon, Alan M. Dunn, Michael Z. Lee, Owen Hofmann, Yuanzhong Xu, and Emmett Witchel. 2016. Sego: Pervasive Trusted Metadata for Efficiently Verified Untrusted System Services. In ASPLOS.
  • Lee et al. (2017a) Jaehyuk Lee, Jinsoo Jang, Yeongjin Jang, Nohyun Kwak, Yeseul Choi, Changho Choi, Taesoo Kim, Marcus Peinado, and Brent ByungHoon Kang. 2017a. Hacking in Darkness: Return-oriented Programming against Secure Enclaves. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 523–539. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/lee-jaehyuk
  • Lee et al. (2017b) Sangho Lee, Ming-Wei Shih, Prasun Gera, Taesoo Kim, Hyesoon Kim, and Marcus Peinado. 2017b. Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing. In 26th USENIX Security Symposium (USENIX Security 17). USENIX Association, Vancouver, BC, 557–574. https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/lee-sangho
  • Leroy (2018) Xavier Leroy. 2005 - 2018. The CompCert verified compiler. http://compcert.inria.fr/. (2005 - 2018).
  • Leslie-Hurd et al. (2015) Rebekah Leslie-Hurd, Dror Caspi, and Matthew Fernandez. 2015. Verifying Linearizability of Intel® Software Guard Extensions. In Computer Aided Verification, Daniel Kroening and Corina S. Păsăreanu (Eds.). Springer International Publishing, Cham, 144–160.
  • Lipp et al. (2018) Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, and Mike Hamburg. 2018. Meltdown. ArXiv e-prints (Jan. 2018). arXiv:1801.01207
  • Liu et al. (2015) F. Liu, Y. Yarom, Q. Ge, G. Heiser, and R.B. Lee. 2015. Last-Level Cache Side-Channel Attacks are Practical. In IEEE S&P.
  • Maas et al. (2013) Martin Maas, Eric Love, Emil Stefanov, Mohit Tiwari, Elaine Shi, Krste Asanovic, John Kubiatowicz, and Dawn Song. 2013. PHANTOM: Practical Oblivious Computation in a Secure Processor. In Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security (CCS ’13). ACM, New York, NY, USA, 311–324. https://doi.org/10.1145/2508859.2516692
  • McKeen et al. (2013) Frank McKeen, Ilya Alexandrovich, Alex Berenzon, Carlos V. Rozas, Hisham Shafi, Vedvyas Shanbhogue, and Uday R. Savagaonkar. 2013. Innovative Instructions and Software Model for Isolated Execution. In Proceedings of the 2Nd International Workshop on Hardware and Architectural Support for Security and Privacy (HASP ’13). ACM, New York, NY, USA, Article 10, 1 pages. https://doi.org/10.1145/2487726.2488368
  • Moghimi et al. (2017) Ahmad Moghimi, Gorka Irazoqui, and Thomas Eisenbarth. 2017. CacheZoom: How SGX Amplifies The Power of Cache Attacks. CoRR abs/1703.06986 (2017). arXiv:1703.06986 http://arxiv.org/abs/1703.06986
  • Nelson et al. (2017) Luke Nelson, Helgi Sigurbjarnarson, Kaiyuan Zhang, Dylan Johnson, James Bornholt, Emina Torlak, and Xi Wang. 2017. Hyperkernel: Push-Button Verification of an OS Kernel. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP ’17). ACM, New York, NY, USA, 252–269. https://doi.org/10.1145/3132747.3132748
  • Orenbach et al. (2017) Meni Orenbach, Pavel Lifshits, Marina Minkin, and Mark Silberstein. 2017. Eleos: ExitLess OS Services for SGX Enclaves. In Proceedings of the Twelfth European Conference on Computer Systems (EuroSys ’17). ACM, New York, NY, USA, 238–253. https://doi.org/10.1145/3064176.3064219
  • Peyton Jones and Wadler (1993) Simon L. Peyton Jones and Philip Wadler. 1993. Imperative Functional Programming. In Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’93). ACM, New York, NY, USA, 71–84. https://doi.org/10.1145/158511.158524
  • Pierce et al. (2017) Benjamin C. Pierce, Arthur Azevedo de Amorim, Chris Casinghino, Marco Gaboardi, Michael Greenberg, Cǎtǎlin Hriţcu, Vilhelm Sjöberg, and Brent Yorgey. 2017. Software Foundations.
  • Ports and Garfinkel (2008) Dan R. K. Ports and Tal Garfinkel. 2008. Towards Application Security on Untrusted Operating Systems. In HOTSEC.
  • Ridge, Sheets, Tuerk, Giugliano, Madhavapeddy, and Sewell (Ridge et al.) Tom Ridge, David Sheets, Thomas Tuerk, Andrea Giugliano, Anil Madhavapeddy, and Peter Sewell. SibylFS: Formal Specification and Oracle-based Testing for POSIX and Real-world File Systems (SOSP ’15).
  • Sasy et al. (2017) Sajin Sasy, Sergey Gorbunov, and Christopher W. Fletcher. 2017. ZeroTrace : Oblivious Memory Primitives from Intel SGX. Cryptology ePrint Archive, Report 2017/549. (2017). https://eprint.iacr.org/2017/549.
  • Schellhorn et al. (2014) Gerhard Schellhorn, Gidon Ernst, Jörg Pfähler, Dominik Haneberg, and Wolfgang Reif. 2014. Development of a Verified Flash File System. In Abstract State Machines, Alloy, B, TLA, VDM, and Z, Yamine Ait Ameur and Klaus-Dieter Schewe (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 9–24.
  • Schierl et al. (2009) Andreas Schierl, Gerhard Schellhorn, Dominik Haneberg, and Wolfgang Reif. 2009. Abstract Specification of the UBIFS File System for Flash Memory. In Proceedings of the 2Nd World Congress on Formal Methods (FM ’09). Springer-Verlag, Berlin, Heidelberg, 190–206. https://doi.org/10.1007/978-3-642-05089-3_13
  • Schwarz et al. (2017) Michael Schwarz, Samuel Weiser, Daniel Gruss, Clémentine Maurice, and Stefan Mangard. 2017. Malware Guard Extension: Using SGX to Conceal Cache Attacks. CoRR abs/1702.08719 (2017). arXiv:1702.08719 http://arxiv.org/abs/1702.08719
  • Shih et al. (2017) Ming-Wei Shih, Sangho Lee, Taesoo Kim, and Marcus Peinado. 2017. T-SGX: Eradicating Controlled-Channel Attacks Against Enclave Programs (NDSS). Internet Society.
  • Shinde et al. (2016) Shweta Shinde, Zheng Leong Chua, Viswesh Narayanan, and Prateek Saxena. 2016. Preventing Page Faults from Telling Your Secrets. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security (ASIA CCS ’16). ACM, New York, NY, USA, 317–328. https://doi.org/10.1145/2897845.2897885
  • Shinde et al. (2017) Shweta Shinde, Dat Le Tien, Shruti Tople, and Prateek Saxena. 2017. Panoply: Low-TCB Linux Applications With SGX Enclaves. In 24th Annual Network and Distributed System Security Symposium, NDSS.
  • Shinde, Tople, Kathayat, and Saxena (Shinde et al.) Shweta Shinde, Shruti Tople, Deepak Kathayat, and Prateek Saxena. PodArch: Protecting Legacy Applications with a Purely Hardware TCB. Technical Report.
  • Sigurbjarnarson, Bornholt, Torlak, and Wang (Sigurbjarnarson et al.) Helgi Sigurbjarnarson, James Bornholt, Emina Torlak, and Xi Wang. Push-button Verification of File Systems via Crash Refinement (OSDI’16).
  • Sinha et al. (b) Rohit Sinha, Manuel Costa, Akash Lal, Nuno Lopes, Sanjit Seshia, Sriram Rajamani, and Kapil Vaswani. A Design and Verification Methodology for Secure Isolated Regions. In PLDI ’16.
  • Sinha et al. (a) Rohit Sinha, Manuel Costa, Akash Lal, Nuno P. Lopes, Sriram Rajamani, Sanjit A. Seshia, and Kapil Vaswani. A Design and Verification Methodology for Secure Isolated Regions (PLDI ’16).
  • Sinha et al. (c) Rohit Sinha, Sriram Rajamani, Sanjit Seshia, and Kapil Vaswani. Moat: Verifying Confidentiality of Enclave Programs (CCS ’15).
  • Strackx and Piessens (2017) R. Strackx and F. Piessens. 2017. The Heisenberg Defense: Proactively Defending SGX Enclaves against Page-Table-Based Side-Channel Attacks. ArXiv e-prints (Dec. 2017). arXiv:cs.CR/1712.08519
  • Subramanyan et al. (2017) Pramod Subramanyan, Rohit Sinha, Ilia Lebedev, Srinivas Devadas, and Sanjit A. Seshia. 2017. A Formal Foundation for Secure Remote Execution of Enclaves. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS ’17). ACM, New York, NY, USA, 2435–2450. https://doi.org/10.1145/3133956.3134098
  • Wadler (1992) Philip Wadler. 1992. The Essence of Functional Programming. In Proceedings of the 19th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’92). ACM, New York, NY, USA, 1–14. https://doi.org/10.1145/143165.143169
  • Wang et al. (2017) Wenhao Wang, Guoxing Chen, Xiaorui Pan, Yinqian Zhang, XiaoFeng Wang, Vincent Bindschaedler, Haixu Tang, and Carl A. Gunter. 2017. Leaky Cauldron on the Dark Land: Understanding Memory Side-Channel Hazards in SGX. CoRR abs/1705.07289 (2017). arXiv:1705.07289 http://arxiv.org/abs/1705.07289
  • Weisse et al. (2017) Ofir Weisse, Valeria Bertacco, and Todd Austin. 2017. Regaining Lost Cycles with HotCalls: A Fast Interface for SGX Secure Enclaves. In Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA ’17). ACM, New York, NY, USA, 81–93. https://doi.org/10.1145/3079856.3080208
  • Xu, Cui, and Peinado (Xu et al.) Yuanzhong Xu, Weidong Cui, and Marcus Peinado. Controlled-Channel Attacks: Deterministic Side Channels for Untrusted Operating Systems. In S&P ’15’.
  • Yang and Hawblitzel (2010) Jean Yang and Chris Hawblitzel. 2010. Safe to the Last Instruction: Automated Verification of a Type-safe Operating System. In Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’10). ACM, New York, NY, USA, 99–110. https://doi.org/10.1145/1806596.1806610
  • Yang et al. (2006) Junfeng Yang, Paul Twohey, Dawson Engler, and Madanlal Musuvathi. 2006. Using Model Checking to Find Serious File System Errors. ACM Trans. Comput. Syst. 24, 4 (Nov. 2006), 393–423. https://doi.org/10.1145/1189256.1189259

Appendix A Appendix

A.1. Defenses Against Iago Attacks in Existing Systems.

The following are verbatim quotes from the research papers of existing systems; none of them makes a concrete, formal claim.

Haven.

We use established techniques to correctly implement the OS primitives in the presence of a malicious host: careful defensive coding, exhaustive validation of untrusted inputs, and encryption and integrity protection of any private data exposed to untrusted code.

Scone.

The enclave code handling system calls also ensures that pointers passed by the OS to the enclave do not point to enclave memory. This check protects the enclave from memory-based Iago attacks [12] and is performed for all shield libraries.

Panoply.

The shim library performs checks for Iago attacks, safeguarding against low-level data-tampering for OS services.

Graphene-SGX.

Any SGX framework must provide some shielding support, to validate or reject inputs from the untrusted OS. The complexity of shielding is directly related to the interface complexity: inasmuch as a library OS or shim can reduce the size or complexity of the enclave API, the risks of a successful Iago attack are reduced.

Ryoan.

Ryoan allows files to be preloaded in memory, and the list of preloaded files must be determined before the module is confined; e.g., they can be listed in the DAG specification, or requested by the module during initialization. Ryoan presents POSIX-compatible APIs to access preloaded files that are available even after the module is confined. Second, a confined module can create temporary files and directories (which Ryoan keeps in enclave memory). When the module is destroyed or reset, all temporary files and directories are destroyed, and all changes to preloaded files are reverted.

A.2. Layers in Filesystem Stack

The higher the layer we safeguard, the larger the attack surface we can eliminate, and the more implementation-agnostic the API becomes. Figure 2 shows the various layers where one can intercept the filesystem operations for integrity checks, with the application being the topmost layer and the device driver the lowest layer.

Figure 2. Layers of the filesystem, where the highest layer is the enclave application and the lowest layer is the persistent device storage. The dotted area shows the components within an untrusted OS.
API Name | Trusted, Custom (BesFS wrapper) | Trusted, Custom | Trusted, Auto | Untrusted, Custom | Untrusted, Auto | Total
close 13 4 16 1 3 37
create 33 4 28 1 3 69
open 63 13 13 1 3 93
mkdir 28 4 28 1 3 64
remove 5 4 27 1 3 40
rmdir 5 4 27 1 3 40
stat 1 4 40 1 3 49
readdir 12 4 16 2 3 37
chmod 26 4 28 1 3 62
lseek 11 4 18 1 3 37
read 8 39 30 4 3 84
write 12 39 29 2 3 85
ftruncate 2 4 17 1 3 27
TOTAL 219 131 317 18 39 724
Table 6. LOC for implementing BesFS APIs in Panoply. The first data column represents the wrapper code to integrate the BesFS implementation in Panoply; the remaining columns represent the additions to Panoply for integration or for fixing bugs. Custom implies hand-written code and Auto implies that the code was generated by the Intel SGX SDK's edger8r tool. The Trusted code runs inside the enclave, whereas the Untrusted code runs outside the enclave.

A.3. Implementation Details

Table 6 shows the detailed breakdown of the LOC added to support each BesFS API in Panoply.

A.4. Feasibility of Machine-checked Executable Code

Note that our primary goal in this paper is not to generate certified assembly code but to certify higher-level properties of the BesFS implementation. Currently, BesFS only guarantees certified correctness for its Coq implementation. However, multiple projects have shown that it is possible to extend certification all the way to assembly; thus, there are no fundamental limitations to certifying BesFS's machine code in the future. For our specific setup, one option is to use CertiCoq (Coq to C) and CompCert (C to assembly). CertiCoq (Anand et al., 2017) is a certified compiler from Coq to CompCert C light. CompCert (Leroy, 2018) is a certified, optimized C compiler which ensures that the generated machine code for various processors is efficient and behaves exactly as prescribed by the semantics of the source program. Thus, both of these certified compilers can be composed to give a certified Coq-to-assembly compiler, which we can use to certify the machine code for BesFS. We have contacted the authors of CertiCoq, who report that the tool is under active development, not available publicly, and cannot be used for our implementation yet. We believe that once CertiCoq is fully functional, BesFS can ensure that its conversion from the machine-proved Coq implementation to assembly code executing inside the enclave is certified end-to-end; however, this is beyond the realm of demonstration today.

A.5. Performance

We perform the following measurements for our benchmarks:

  1. Enclaved execution in Panoply without BesFS checks.

  2. Enclaved execution in Panoply with BesFS checks.

All our results are aggregated over multiple runs. For the non-I/O-intensive benchmarks (SPEC CINT2006), we observe a smaller overhead than for the FSCQ benchmarks, where the single-syscall tests and the large I/O workloads incur larger overheads; overall, BesFS adds a measurable average CPU overhead compared to the baseline. Our breakdown shows that a large fraction of BesFS's overhead is due to page-level AES-GCM encryption-decryption for preserving integrity and the system-call latency of Panoply's synchronous ocall mechanism. We present a set of optimizations (Appendix A.6) so that real-world applications need not incur excessive overhead.

Single-syscalls.

We use the single-syscall micro-benchmarks to measure the performance of the I/O-intensive calls in BesFS. Table 7 shows the overhead of BesFS when a single system call is invoked multiple times; the overheads across these tests average roughly 3x. We observe that read-write operations incur a large overhead. Specifically, our read operation is slowed down by 3.7x, while create+write is 5.4x slower. The primary reason for this is that BesFS performs page-level AES-GCM authenticated encryption when the file content is stored on disk. Thus, each read and write operation leads to encryption-decryption and integrity computation of at least one page.

Large I/O Workloads.

We test the performance of BesFS under various file access patterns. We run all the tests in Panoply with a fixed configuration of block size, I/O transfer size, and total file size, and perform a fixed number of each type of operation on the files. The BesFS checks add overhead across all of these workloads. The benchmark performs a series of sequential write, sequential read, re-read, random read, random write, multi-write, and multi-read operations. Figure 3 shows the bandwidth for each of these operations. Sequential accesses incur relatively less performance overhead because they consolidate the page-level encryption-decryption across consecutive bytes. Random accesses, on the other hand, are more expensive because each read/write may cause a page-level encryption-decryption. Since BesFS does not cache any page content, re-reads incur the same overhead as sequential reads.

Test | Panoply Time (usec) | Panoply+BesFS Time (usec) | Overhead
multicreate 517264 939727 0.8x
multiwrite 424193 1025797 1.4x
multiread 1007232 4756286 3.7x
multicreatewrite 245901 1578016 5.4x
multiopen 668430 2868140 3.3x
multicreatemany 21607 102655 3.8x
Table 7. Single-syscall performance. Execution time for single-syscall benchmarks in Panoply with and without BesFS checks.
Figure 3. Performance for Large I/O Workloads.
Figure 4. Performance for SPEC CINT2006 Benchmarks.

SPEC CINT2006 Benchmarks.

We test 7 benchmarks, namely astar, bzip2, h264ref, hmmer, libquantum, mcf, and sjeng, from SPEC. Each of these benchmarks takes in a configuration file and optionally an input file to produce an output file. Figure 4 shows the performance for each of these benchmarks. With our analysis in Table 4, we also measure the frequency of each call per application. The programs hmmer, h264ref, sjeng, and libquantum have relatively low overhead. On the other hand, astar, bzip2, and mcf exhibit larger overhead. On further inspection, we notice that astar and mcf use fscanf to read the configuration files; thus, reading each character leads to a page read and the corresponding decryption and integrity check. Further, astar reads a large binary input for processing. As shown by our single-syscall measurements (Table 7), reads are expensive. Both of these factors amplify the slowdown for astar. bzip2 and mcf output the benchmark results to new files, which leads to a slowdown. Specifically, bzip2 reads the input file in chunks whose size spans two pages, which leads to a 2-page read/write and decrypt/encrypt per chunk. Finally, libquantum has the lowest overhead because it does not perform any file operations.

a.6. Optimizations

Note that we do not include any optimizations for caching or memory management at the moment. There is scope for improving BesFS's performance with various optimizations which are independent of the safety properties.

(O1) Reduce OCALLs.

When the application invokes a protected API, BesFS immediately relays the call to the OS. This results in a large number of ocalls. For example, we implement fgets using fgetc because we do not know beforehand how much of the buffer we need to read until we encounter a newline. This is safe but very slow: each character read causes an I/O of at least one page. We can see the effect of this in the mcf benchmark, which reads character by character. An alternative is to maintain a buffer inside the enclave which reflects the changes, instead of doing an immediate call (and hence an ocall) for each operation. The accumulated changes of the batched calls can then be flushed to the OS periodically. We can do similar optimizations for writes. A buffered-read sketch appears below.
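A minimal sketch of such an in-enclave read buffer, assuming a hypothetical besfs_read core call; one batched read replaces a per-character ocall:

    #include <stddef.h>

    /* Hypothetical core call: read up to n bytes at the current cursor. */
    extern long besfs_read(int fd, char *buf, size_t n);

    #define RDBUF_SZ 4096        /* one batched read per refill instead of one per character */

    typedef struct {
        char   buf[RDBUF_SZ];
        size_t len;              /* valid bytes currently in the buffer */
        size_t off;              /* next unread byte */
    } rdbuf_t;                   /* caller initializes len = off = 0 */

    /* Return the next character, refilling the in-enclave buffer only when it
     * runs dry; -1 signals end of file or error. */
    int buffered_getc(int fd, rdbuf_t *rb)
    {
        if (rb->off == rb->len) {                    /* buffer exhausted: batched read */
            long r = besfs_read(fd, rb->buf, RDBUF_SZ);
            if (r <= 0)
                return -1;
            rb->len = (size_t)r;
            rb->off = 0;
        }
        return (unsigned char)rb->buf[rb->off++];
    }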

(O2) Batch Processing.

Since we integrate with Panoply, we have to interface at the libc API level for tunneling the calls to the untrusted OS. However, other systems such as Scone, Haven, and Graphene-SGX keep the C library inside the enclave and interface with the OS purely at the syscall level. All modern C libraries (e.g., musl-libc, glibc, eglibc) optimize the number of syscalls: they do not invoke an underlying syscall for each API. Instead, they batch as many I/O calls as possible to avoid expensive context switches.

(O3) Optimized Page Allocation Algorithm.

BesFS ensures that each page in memory is used by only a single file. Thus, when BesFS wants a new page, its lemma states that the page allocation algorithm should return an unused page. Similarly, when a page is deallocated, BesFS states that no file should use that page after deallocation. To satisfy this lemma, the BesFS implementation in Panoply keeps a page bitmap, which is used for allocation and deallocation. A minimal sketch of such a bitmap allocator appears below.
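A minimal sketch of a bitmap-based page allocator of this kind; the pool size and names are illustrative, not those of the BesFS implementation.

    #include <stddef.h>
    #include <stdint.h>

    #define NPAGES 1024                       /* illustrative pool size */

    static uint8_t page_bitmap[NPAGES / 8];   /* one bit per page: 1 = in use */

    /* Return the index of an unused page and mark it used, or -1 if none is
     * free.  The corresponding BesFS lemma requires that the returned page is
     * not currently used by any file. */
    long page_alloc(void)
    {
        for (size_t i = 0; i < NPAGES; i++) {
            if (!(page_bitmap[i / 8] & (1u << (i % 8)))) {
                page_bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
                return (long)i;
            }
        }
        return -1;
    }

    /* Mark a page unused; after this, no file may reference it. */
    void page_free(size_t idx)
    {
        if (idx < NPAGES)
            page_bitmap[idx / 8] &= (uint8_t)~(1u << (idx % 8));
    }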

(O4) Optimized Block Alignment.

Our current implementation assumes a fixed page size. For each such page, BesFS uses an initial portion of the page to store file content and the remaining bytes for metadata such as the integrity tags. Thus, a single-page operation in an I/O-intensive program can incur a read/write (and hence decrypt/encrypt) of two pages. Applications which do custom block alignment to tune their performance will see an added slowdown. Our choice of page size was a design choice and is independent of any proofs. The BesFS proofs and implementation use a macro for these values, and, if required, the developer can change them to suit their requirements. The developer can also change the block size in the application to match BesFS's per-page content size.

(O5) Reduce OCALL Costs.

Our current implementation is integrated with Panoply, which performs synchronous ocalls. It has been experimentally shown that asynchronous ocalls are much faster and can speed up applications by an order of magnitude (Weisse et al., 2017; Orenbach et al., 2017; Arnautov et al.). As long as the asynchronous call implementations obey the syscall semantics enforced by BesFS, our implementation will work out of the box with this optimization.