Memory Safety Preservation for WebAssembly

by Marco Vassena, et al.

WebAssembly (Wasm) is a next-generation portable compilation target for deploying applications written in high-level languages on the web. In order to protect their memory from untrusted code, web browser engines confine the execution of compiled Wasm programs in a memory-safe sandbox. Unfortunately, classic memory-safety vulnerabilities (e.g., buffer overflows and use-after-free) can still corrupt the memory within the sandbox and allow Wasm code to mount severe attacks. To prevent these attacks, we study a class of secure compilers that eliminate (different kinds of) memory-safety violations. Following a rigorous approach, we discuss memory safety in terms of hypersafety properties, which let us identify suitable secure compilation criteria for memory-safety-preserving compilers. We conjecture that, barring some restrictions at module boundaries, the existing security mechanisms of Wasm may suffice to enforce memory-safety preservation in the short term. In the long term, we observe that certain features proposed in the design of a memory-safe variant of Wasm could allow compilers to lift these restrictions and enforce relaxed forms of memory safety.




1. Introduction

WebAssembly (Wasm) has gained traction as the new portable compilation target for deploying on the web applications written in high-level languages like C, C++, and Rust. The fruit of an unprecedented collaboration between four major browser vendors, Wasm ensures that even buggy or malicious code downloaded from untrusted sources can be executed safely in a web browser (Haas:2017). To enforce security, Wasm programs are first validated (type-checked) and then executed inside a sandbox that isolates untrusted code from the browser. Memory safety is key to the isolation mechanism of the sandboxed execution environment: well-typed programs cannot corrupt the memory outside the sandbox (e.g., the JavaScript virtual machine). Unfortunately, Wasm is still far from secure: buffer overflows and use-after-free can still corrupt the memory of a program within the sandbox, opening the door to attacks like cross-site scripting and remote code execution (Chasm:WASM). The presence of memory vulnerabilities in Wasm thwarts the strenuous efforts devoted to securing unsafe languages like C (checkedc; cets; ccured; ccured-toplas; Agten:2015) and to developing resource-aware memory-safe languages like Rust (Matsakis:2014; Jung:2017). Current compilers (e.g., Emscripten) do not attempt to protect compiled programs from Wasm-level attackers exploiting well-known memory vulnerabilities. Following the principled tradition of secure compilation (surv; Abadi:2012; rhc), we propose to strengthen the Wasm compilation chain with a provably secure memory-safety-preserving compiler.

Fortunately, several aspects of Wasm promote rigorous reasoning and help us in our study. In particular, Wasm (1) has a (mostly) deterministic formal semantics that rules out undefined behaviour and (2) is type-safe (Haas:2017). The specification of Wasm has even been mechanized and verified (Watt:2018). Furthermore, the existing security mechanisms of Wasm reduce the attack surface available to target-level attackers and thus simplify the job of our secure compiler. Wasm features structured control flow and separates code and data memory segments, which, in combination, enforce coarse-grained control-flow integrity (Abadi:2009; Abadi:2005), ruling out by construction classic stack-smashing and return-oriented programming attacks. In addition, Wasm provides state and memory encapsulation through modules, which represent natural boundaries at which to enforce security (Haas:2017).

Assuming some degree of freedom when setting module boundaries, we believe that a secure compiler could reuse the existing mechanisms of Wasm to enforce memory safety at the target level, in the short term. However, this approach rests on a strong assumption, namely that the compiler has direct control over how code gets compartmentalized. As this may not always be the case, and thus for a long-term solution, we draw inspiration from Memory Safe WebAssembly (MS-Wasm), a recent design proposal for extending Wasm with hardware-supported progressive memory-safety capabilities (mswasm). A secure compiler relying on MS-Wasm language-level support for memory-safety enforcement could allow looser module boundaries.

In the rest of this short paper, we discuss what notions of memory safety we wish to enforce and how to formally express them as (hyper)properties. (Properties are defined over single runs of a program, while hyperproperties involve multiple runs (ClarksonS10).) Then, we outline MS-Wasm and argue that it is a suitable target candidate for secure compilation. Finally, we discuss which secure compilation criterion to use when preserving memory safety to MS-Wasm.

2. Memory Safety as a (Hyper)Property

Establishing rigorous security guarantees for our compiler requires a formal definition of memory safety, an intuitive notion that has been surprisingly hard to pin down (ms-hicks-blog). The exact definition of memory safety has important ramifications for our work because it determines what class of security properties our compiler has to preserve and thus what protection mechanisms are needed (rhc).

Previous works on safe variants of C (checkedc; cets; ccured; ccured-toplas; Agten:2015) treat memory safety as a simple safety property enforceable by reference monitors (Schneider:2000) that detect specific memory violations (e.g., accessing freed memory or an array out of bounds). Seeking a definition that transcends bad behaviours, the authors of (memsafety) associate memory safety with reasoning principles about state akin to non-interference (Goguen82). Since non-interference relates pairs of executions, their definition ascribes memory safety to the class of 2-hypersafety (ClarksonS10), which is arguably harder to preserve robustly than safety (rhc; rsc).

Here, we consider a notion of memory safety based on color tags, inspired by a line of work on micro-policies for tag-based security monitors (Amorim15; Dhawan:2015). Briefly, memory locations and pointers are tagged with colors and a memory violation occurs when a pointer accesses memory tagged with a different color. Unlike the definitions of the works mentioned above, this safety property is trace-based and agnostic to the specific semantics and syntax of the languages involved: the trace only contains memory-relevant actions (i.e., memory allocation, free, read and write). Furthermore, this definition lets us study various relaxations of memory safety that could describe precisely the progressive guarantees of MS-Wasm, including spatial safety, (relaxed) temporal safety, and pointer integrity, as well as novel properties that consider only data integrity. Relaxed temporal safety allows memory accesses through dangling pointers as long as the memory pointed to has not been reallocated (mswasm). Regarding data integrity, some tools reduce the overhead of enforcing memory safety by supporting modes that check only memory writes (softbound; Duck:2016; Duck:17).
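The color-tag definition can be illustrated with a small trace checker. The following is a minimal sketch, not the formal definition: the `Event` record, the `first_violation` function, the convention that every allocation receives a fresh color, and the choice to flag a free of a non-live color are all assumptions made for illustration. With `relaxed=True`, freed cells keep their tag until they are reallocated, which approximates relaxed temporal safety as described above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Event:
    """One memory-relevant action of a trace (hypothetical encoding)."""
    kind: str      # "alloc", "free", "read", or "write"
    color: int     # color carried by the pointer performing the action
    addr: int = 0  # base address (alloc) or accessed address (read/write)
    size: int = 0  # number of cells (alloc only)


def first_violation(trace: Iterable[Event], relaxed: bool = False) -> Optional[int]:
    """Return the index of the first violating event, or None if none occurs.

    Assumption: each allocation uses a fresh color. A violation is an access
    whose pointer color differs from the color of the accessed cell.
    """
    tags: Dict[int, int] = {}  # address -> color currently tagging that cell
    live: Set[int] = set()     # colors of allocations that are not yet freed
    for i, ev in enumerate(trace):
        if ev.kind == "alloc":
            live.add(ev.color)
            for a in range(ev.addr, ev.addr + ev.size):
                tags[a] = ev.color  # retag cells, overwriting stale colors
        elif ev.kind == "free":
            if ev.color not in live:
                return i  # assumption: double or invalid free is a violation
            live.remove(ev.color)
            if not relaxed:
                # Strict mode: freed cells lose their tag immediately.
                for a in [a for a, c in tags.items() if c == ev.color]:
                    del tags[a]
        elif ev.kind in ("read", "write"):
            if tags.get(ev.addr) != ev.color:
                return i  # untagged cell or cell of a different color
        else:
            raise ValueError(f"unknown event kind: {ev.kind}")
    return None
```

Because the checker stops at the first offending event, every violation is witnessed by a finite prefix of the trace, which is what makes this notion a safety property. A write-only variant in the spirit of the data-integrity relaxation could skip the check for `"read"` events.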

3. Memory Safe WebAssembly

Memory Safe WebAssembly (MS-Wasm) is an extension of Wasm designed to capture sufficient metadata about pointers and memory regions to enforce memory safety efficiently, leveraging dedicated hardware. In particular, MS-Wasm promotes a progressive enforcement of memory safety, i.e., depending on application-specific security-performance trade-offs and what particular hardware is available, the same abstractions can enforce “weaker” forms of memory safety.

The core features of the MS-Wasm design are segment memories, i.e., linearly addressable, zero-initialized, manually managed extents of memory, and handles, i.e., possibly-corrupted (forged) pointers enriched with bounds metadata. To enforce memory safety, MS-Wasm restricts the interaction between segments and memories appropriately (e.g., only handles can access segments, provided that they point within their bounds).
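To make the roles of segments and handles concrete, here is a minimal sketch of the checks such a design implies. It is not the MS-Wasm semantics, which still has to be formalized: the names `Segment`, `Handle`, `Trap`, `new_segment`, `handle_add`, `load`, `store` and `free_segment` are hypothetical, and the sketch enforces strict (not relaxed) temporal safety by trapping on any access to a freed segment.

```python
from dataclasses import dataclass, replace


class Trap(Exception):
    """Raised when an access would violate memory safety."""


class Segment:
    """A linearly addressable, zero-initialized, manually managed extent."""

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.live = True


@dataclass(frozen=True)
class Handle:
    """A pointer enriched with bounds metadata and a validity flag."""
    segment: Segment
    base: int          # first accessible offset
    bound: int         # one past the last accessible offset
    offset: int = 0    # current position, may drift out of bounds
    valid: bool = True # False models a corrupted (forged) handle


def new_segment(size: int) -> Handle:
    return Handle(Segment(size), base=0, bound=size)


def handle_add(h: Handle, n: int) -> Handle:
    # Pointer arithmetic is unchecked; bounds are enforced on access.
    return replace(h, offset=h.offset + n)


def _check(h: Handle) -> None:
    if not h.valid:
        raise Trap("corrupted handle")
    if not h.segment.live:
        raise Trap("segment already freed")
    if not (h.base <= h.offset < h.bound):
        raise Trap("access out of bounds")


def load(h: Handle) -> int:
    _check(h)
    return h.segment.cells[h.offset]


def store(h: Handle, value: int) -> None:
    _check(h)
    h.segment.cells[h.offset] = value


def free_segment(h: Handle) -> None:
    if not h.valid or not h.segment.live:
        raise Trap("invalid free")
    h.segment.live = False
```

In this sketch, spatial safety comes from the bounds test, temporal safety from the liveness test, and pointer integrity from the validity flag; dropping one of the three tests gives a weaker guarantee, loosely mirroring the progressive enforcement described above.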

In order to use MS-Wasm as a target language in our secure compilation chain, we have to first formalize its design and semantics. Then, using variations of our trace-based definition of memory safety from above, we intend to prove its progressive memory-safety guarantees involving spatial, relaxed temporal safety, and pointer integrity, and establish their relative strengths. With the help of MS-Wasm abstractions, we are then going to design a class of secure compilers that preserve clearly-defined notions of memory safety.

4. Secure Compilation to MS-Wasm

To establish the security guarantees of our compilers, we prove that they attain a secure compilation criterion. To further clarify their security guarantees, we consider general compilation criteria that preserve whole classes of security properties (including memory safety), instead of using an ad-hoc criterion. Given that we can express memory safety as a safety property as well as a 2-hypersafety property, we adopt two of the robust compilation criteria proposed in (rhc), namely Robust Safety Property Preservation (RSP) and Robust 2-Hypersafety Preservation (R2HSP). Intuitively, these criteria require compilers to preserve (hyper)properties of source programs even when they are compiled and linked with arbitrary target code, thus protecting robustly against all active target-level attackers. In practice, equivalent property-free characterizations significantly simplify the proofs that a compiler satisfies these robust criteria (rhc). Specifically, for any compiled program and target context triggering a bad behaviour, we have to find a corresponding source-level context that produces the same bad behaviour. To reconstruct suitable source-level contexts systematically, we can apply known proof techniques based on backtranslation (rsc; rhc; NewBA16; DevriesePP16).
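As a sketch of what these property-free characterizations look like, assume the following notation, which is not fixed in this paper: $P$ is a source program, $P{\downarrow}$ its compilation, $C_S$ and $C_T$ are source and target contexts, and $C[P] \leadsto m$ means that the linked program can produce a trace with finite prefix $m$. The names RSC and R2HSC for the two characterizations are likewise an assumption. Under these assumptions, the characterizations can be written as:

```latex
% Requires amsmath and amssymb (for \leadsto).
\begin{align*}
\textsf{RSC}:\quad
  & \forall P\; C_T\; m.\;\;
    C_T[P{\downarrow}] \leadsto m
    \implies \exists C_S.\; C_S[P] \leadsto m \\
\textsf{R2HSC}:\quad
  & \forall P\; C_T\; m_1\; m_2.\;\;
    \bigl(C_T[P{\downarrow}] \leadsto m_1 \,\wedge\, C_T[P{\downarrow}] \leadsto m_2\bigr) \\
  & \qquad \implies \exists C_S.\;
    \bigl(C_S[P] \leadsto m_1 \,\wedge\, C_S[P] \leadsto m_2\bigr)
\end{align*}
```

Note that in the second statement a single source context $C_S$ has to reproduce both prefixes, which is why the backtranslation for 2-hypersafety is more demanding than the one for safety.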

The proofs of RSP and R2HSP differ mainly in the kind of bad behaviours involved, which are determined by the properties that they preserve (safety and 2-hypersafety). Since safety is a simple property, bad behaviours are just finite traces (prefixes) in RSP. In R2HSP, bad behaviours consist of pairs of prefixes because 2-hypersafety is just a generalization of safety to a 2-hyperproperty (ClarksonS10). By including all memory-relevant actions in our traces, we gain confidence that the criteria above correctly characterize the class of memory-safety-preserving compilers that we intend to study.
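The difference between the two kinds of bad behaviour can be illustrated with a toy non-interference check, where a violation needs two prefixes to be witnessed. This is a simplified sketch under stated assumptions: the event encoding (`"in_pub"`, `"in_sec"`, `"out_pub"`) and the functions `bad_pair` and `find_bad_pair` are hypothetical, and the check assumes that all inputs precede all outputs in each prefix.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple

# Hypothetical event encoding: (kind, value) with kind in
# {"in_pub", "in_sec", "out_pub"}.
Event = Tuple[str, int]
Prefix = Tuple[Event, ...]


def _values(prefix: Prefix, kind: str) -> List[int]:
    return [v for k, v in prefix if k == kind]


def bad_pair(m1: Prefix, m2: Prefix) -> bool:
    """True if (m1, m2) witnesses a non-interference violation.

    The two prefixes agree on public inputs, yet their public outputs
    already disagree at some common position.
    """
    if _values(m1, "in_pub") != _values(m2, "in_pub"):
        return False
    outs1, outs2 = _values(m1, "out_pub"), _values(m2, "out_pub")
    return any(a != b for a, b in zip(outs1, outs2))


def find_bad_pair(prefixes: Sequence[Prefix]) -> Optional[Tuple[Prefix, Prefix]]:
    """Search a set of observed prefixes for a 2-hypersafety witness."""
    for m1, m2 in combinations(prefixes, 2):
        if bad_pair(m1, m2):
            return (m1, m2)
    return None
```

No single prefix is bad on its own here: a run that outputs a secret-dependent value only becomes a violation when compared with a second run on the same public inputs, which is why the bad behaviours of R2HSP are pairs of prefixes.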

Acknowledgements: This work was partially supported by the German Federal Ministry of Education and Research (BMBF) through funding for the CISPA-Stanford Center for Cybersecurity (FKZ: 13N1S0762).