Software Fault Isolation for Robust Compilation

by Ana Nora Evans, et al.
University of Virginia

Memory corruption vulnerabilities are endemic to unsafe languages such as C, and they can even be found in safe languages that are themselves implemented in unsafe languages or linked with libraries implemented in unsafe languages. Robust compilation mitigates the threat of linking with memory-unsafe libraries. The source language is a C-like language enriched with a notion of component, which encapsulates data and code and exposes functionality through well-defined interfaces. Robust compilation defines which security properties a component retains even if one or more other components are compromised. The main contribution of this work is to demonstrate that the compartmentalization necessary for a compiler with the robust compilation property can be realized on a basic RISC processor using software fault isolation.




1. Problem and Motivation

Formal definitions of secure compilation have been proposed by Juglaret et al. (7) and, more recently, by Garg et al. (4). This work is part of the effort to propose a new definition for robust compilation of unsafe low-level languages (3). A compiler has the robust compilation property if any attack on a compiled variant of a program (a set of components) that can be mounted by a component linked and executed with it can also be mounted at the source level by a source-level component. In the source-level semantics, it is impossible to write to another component's memory, and only procedures exported by the callee and imported by the caller can be called. Thus, for the robust compilation property to hold, a strong machine-level separation of the compiled program and the target context is necessary. Juglaret et al.'s (6) implementation targeted a micro-policy architecture (2) with special tagging capabilities at the level of individual memory locations. This work focuses on supporting the new definition of secure compilation on a generic RISC processor, without specialized hardware. We use software fault isolation (13) mechanisms to provide a proof-of-concept implementation of a compiler back-end for a basic RISC machine.

2. Background and Related Work

Software fault isolation was proposed in 1993 by Wahbe et al. (13). A distrusted module is sandboxed into its own fault domain, a logical region of the address space. To prevent it from modifying data or executing code belonging to the rest of the application, its object code is instrumented. The physical address is split logically into a segment id and an offset, and the introduced instrumentation prevents writes outside the data domain and prevents execution from escaping the code domain, other than through predefined exit points. Many applications of software fault isolation followed. Google's Native Client (14) uses software fault isolation to sandbox C/C++ code in the Chrome web browser. Morrisett et al. (10) proposed a semantics of the x86 architecture and constructed a machine-verified checker for Native Client. ARMor (15) is a machine-verified system that uses software fault isolation to sandbox application code running on embedded processors. In this research, we combine ideas from this previous work and apply them to support robust compilation on a processor without specialized hardware.

Abadi (1) defined full abstraction as the property of a compiler to preserve and reflect observational equivalence. Achieving observational equivalence in the presence of side channels, such as timing, is impossible. Instead, robust compilation focuses only on mapping back to the source level a context that induces a certain behavior in a program. The robust compilation property for unsafe languages proposed by Fachini et al. (3) can be stated as follows.

For all source-level programs P and all low-level contexts C_T, there exists a source-level context C_S, with no undefined behavior, such that the low-level trace of compiled P linked with C_T and the source-level trace of P linked with C_S match up to an undefined behavior in P.
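The prose statement above can be sketched as a formula. The notation here is an assumption made for illustration, not necessarily the one used in (3): $P$ is a source-level program, $P{\downarrow}$ its compilation, $C_T$ a low-level context, $C_S$ a source-level context with no undefined behavior, $C[P] \rightsquigarrow t$ means that $P$ linked with $C$ produces trace $t$, and $t' \preceq_P t$ means that $t'$ is a prefix of $t$ that ends with an undefined behavior in $P$.

```latex
\forall P\; C_T\; t.\quad
  C_T[P{\downarrow}] \rightsquigarrow t
  \;\Longrightarrow\;
  \exists C_S\; t'.\quad
  C_S[P] \rightsquigarrow t'
  \;\wedge\;
  \bigl( t' = t \;\vee\; t' \preceq_P t \bigr)
```

Read this way, the two traces either coincide or agree up to the point where the source-level execution encounters an undefined behavior in $P$.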

3. Approach and Uniqueness

The work presented in this abstract is part of a project (3) that aims at defining a new security property and implementing a proof-of-concept compiler from a C-like language with components to two target machines: a generic RISC processor and a micro-policy machine (2). With the back-end compiler phase targeting the generic RISC processor, the generated executable runs on the bare hardware. While promising, the micro-policy machine (2) does not exist yet. Here we target a generic load-store machine with no specialized hardware for protection. The novelty of this software fault isolation implementation is that, instead of protecting an application from one or more potentially malicious libraries, it treats all components as potentially malicious and, thus, mutually distrustful.

In our approach, a source-level program is translated from the C-like language with components to an intermediate level language that uses a similar memory model to CompCert (9) enriched with a notion of component and interfaces between components. The addresses are not resolved and the interface calls between components are abstract. Our work implements a compiler pass in Coq (12). It takes this intermediate program and generates a RISC assembly program that satisfies the following invariants:

  1. a component can write only within its own data memory;

  2. a component can only jump within its own code memory, except for predefined exit points allowed by the interface; and

  3. if, after a call to another component, the execution is transferred back to the caller component, then it will always return to the instruction after the call.

This research assumes that the basic RISC machine has a minimal load-store instruction set. The register file contains a set of registers dedicated to the software fault isolation instrumentation. The memory is unbounded and is split into slots. The slots are allocated statically to each component, and their type, code or data, is also statically determined. A physical address is an unbounded integer whose bits, starting from the least significant, are: offset within the slot, component identifier, slot identifier. The offset and the component identifier are bounded, and the slot identifier is not. Thus, each component has an unbounded memory, but a limit on the contiguous memory it can allocate.
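A minimal Python sketch of this address layout follows. The field widths (`OFFSET_BITS`, `COMPONENT_BITS`) and the function names are assumptions chosen for illustration; the text does not specify them. Python integers are unbounded, which mirrors the unbounded slot identifier.

```python
# Hypothetical field widths; the text only says both fields are bounded.
OFFSET_BITS = 12      # assumed: bits for the offset within a slot
COMPONENT_BITS = 2    # assumed: bits for the component identifier


def encode(slot, component, offset):
    """Pack (slot, component, offset) into one address, offset in the low bits."""
    assert 0 <= offset < (1 << OFFSET_BITS)
    assert 0 <= component < (1 << COMPONENT_BITS)
    assert slot >= 0  # the slot identifier is unbounded
    return (slot << (OFFSET_BITS + COMPONENT_BITS)) | (component << OFFSET_BITS) | offset


def decode(addr):
    """Split an address back into (slot, component, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    component = (addr >> OFFSET_BITS) & ((1 << COMPONENT_BITS) - 1)
    slot = addr >> (OFFSET_BITS + COMPONENT_BITS)
    return slot, component, offset
```

Because the offset field is bounded, a single slot caps the size of any contiguous allocation, while the unbounded slot field lets a component own arbitrarily many slots.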

To enforce the first two invariants, this work uses a strategy from Wahbe et al. (13) that has two extra instructions and three dedicated registers. Using binary bitwise operations on an address, the bits corresponding to the component identifier are set to the current one. All the data slots are odd and the instrumentation for the store instruction sets the least significant bit of the slot. All the code slots are even and the instrumentation for jump resets the least significant bit of the slot. Thus, no writes are possible in the code segment.
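The masking described above can be sketched in Python as follows. This is only an illustration of the bitwise idea: the field widths and function names are assumptions, and the sketch does not model the extra instructions or dedicated registers of the actual instrumentation.

```python
# Hypothetical field widths (not given in the text).
OFFSET_BITS = 12
COMPONENT_BITS = 2
SLOT_SHIFT = OFFSET_BITS + COMPONENT_BITS
COMPONENT_MASK = ((1 << COMPONENT_BITS) - 1) << OFFSET_BITS


def force_component(addr, component):
    """Overwrite the component-identifier bits with the current component."""
    return (addr & ~COMPONENT_MASK) | (component << OFFSET_BITS)


def sandbox_store(addr, component):
    """Store target: own component, slot forced odd (a data slot)."""
    return force_component(addr, component) | (1 << SLOT_SHIFT)


def sandbox_jump(addr, component):
    """Jump target: own component, slot forced even (a code slot)."""
    return force_component(addr, component) & ~(1 << SLOT_SHIFT)
```

An address that already satisfies the invariant passes through unchanged, while any other address is redirected into the current component's own data or code slots instead of being rejected.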

For the enforcement of the cross-component control flow, we use a dedicated protected control stack and a dedicated register for the stack pointer. The protected control stack is kept in a reserved memory, which can be accessed only from special instrumentation sequences. To ensure continuous execution of a certain number of instructions needed for managing the protected control stack, we align the instructions (10).

The first two sandboxing invariants do not protect the currently executing component, but rather protect all other components from it. Special care must be taken to protect the control stack. First, the procedures called externally are placed at an unaligned address and are preceded by a instruction. Thus, spurious pushes onto the protected control stack are avoided. Second, to avoid the error of popping from an empty stack, the execution starts by pushing the address of a instruction on the protected control stack, and then the execution is transferred to the main function.
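The role of the protected control stack in the third invariant can be illustrated with a small Python model. The class name, the `instr_size` parameter, and the `sentinel_addr` argument are hypothetical; the sketch abstracts away alignment and the instrumentation sequences and only shows that the return target comes from the protected stack rather than from the callee.

```python
class ProtectedControlStack:
    """Toy model of a control stack that only instrumentation code may touch."""

    def __init__(self, sentinel_addr):
        # Execution starts with one address already pushed, so the final
        # return never pops from an empty stack.
        self._stack = [sentinel_addr]

    def call(self, call_site_addr, instr_size=1):
        """Cross-component call: record the address of the instruction after the call."""
        self._stack.append(call_site_addr + instr_size)

    def ret(self, claimed_return_addr=None):
        """Cross-component return: ignore whatever the callee claims and
        return to the address recorded at call time."""
        return self._stack.pop()
```

For example, after `call(100)` a return always yields `101`, even if the callee passes a different `claimed_return_addr`.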

4. Results and Contributions

The project is implemented in Coq (12) and uses the QuickChick (11) framework to test the three invariants. A test consists of the following steps: randomly generate an intermediate program using QuickChick's primitives (8), compile it with our proof-of-concept compiler, execute it in a simulator while recording a log specific to each invariant using a state monad, and verify the log with a checker (8). The intermediate programs were syntactically correct and no tests were discarded. Currently, we are working on simulating an attack by randomly injecting a change to the data memory of a component.
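The generate, execute, log, and check pipeline can be illustrated with a small Python analogue for the first invariant. This is not the project's Coq and QuickChick code: it uses Python's standard `random` module, and the field widths and all function names are assumptions made for the sketch.

```python
import random

# Hypothetical field widths (not given in the text).
OFFSET_BITS = 12
COMPONENT_BITS = 2
SLOT_SHIFT = OFFSET_BITS + COMPONENT_BITS
COMPONENT_MASK = ((1 << COMPONENT_BITS) - 1) << OFFSET_BITS


def instrumented_store_target(addr, component):
    """Address a store actually hits after the sandboxing instrumentation."""
    return ((addr & ~COMPONENT_MASK) | (component << OFFSET_BITS)) | (1 << SLOT_SHIFT)


def run_random_test(rng, n_stores=50):
    """Generate random stores, 'execute' them, and log (component, final address)."""
    log = []
    for _ in range(n_stores):
        component = rng.randrange(1 << COMPONENT_BITS)
        raw_addr = rng.getrandbits(40)  # arbitrary, possibly hostile address
        log.append((component, instrumented_store_target(raw_addr, component)))
    return log


def check_store_log(log):
    """Checker: every logged store lands in an odd (data) slot of its own component."""
    for component, addr in log:
        owner = (addr >> OFFSET_BITS) & ((1 << COMPONENT_BITS) - 1)
        slot = addr >> SLOT_SHIFT
        if owner != component or slot % 2 == 0:
            return False
    return True
```

Calling `check_store_log(run_random_test(random.Random(0)))` runs one such test; keeping the checker separate from the executor means the same log format could also be checked after a simulated attack.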

The robust compilation property definition cannot be directly applied at the target level, where the addresses are resolved and a certain layout in memory and instrumentation are expected. Here, the adversarial context is linked and compiled together with the program, and the robust compilation property is restated for this setting as follows.

In Figure 1, the program has three components and is linked with an adversarial component. Together, they are compiled and executed in the target machine semantics, producing a target-level trace. By robust compilation, there exists an intermediate-level component, with no undefined behavior, such that the program together with this component can be executed in the intermediate semantics, producing an intermediate-level trace. The intermediate-level trace is a prefix of the target-level trace, up to the point where an undefined behavior is induced in the program.

Figure 1. Robust Compilation Intermediate to Target

In conclusion, we designed and implemented a compiler transformation from a RISC-like intermediate language to a basic RISC assembly language that uses software fault isolation mechanisms to provide the memory and control-flow separation required by the robust compilation property. We tested the implementation using property-based testing (5) and the QuickChick framework (11).

The robust compilation property does not require specialized hardware. More work is needed to support system calls and dynamic loading, but this is an encouraging first step.