oo7: Low-overhead Defense against Spectre Attacks

07/16/2018 ∙ by Guanhua Wang, et al. ∙ National University of Singapore

The Spectre vulnerability in modern processors was reported earlier this year (2018). The key insight in this vulnerability is that speculative execution in processors can be misused to access secrets speculatively. Even though the speculatively executed states are subsequently squashed, the secret may linger in micro-architectural data structures such as the cache, and hence can potentially be accessed by an attacker via side channels. In this report, we propose oo7, a binary analysis framework to check and fix code snippets against potential vulnerability to Spectre attacks. Our solution employs control flow extraction, taint analysis and address analysis to detect tainted conditional branches and their ability to impact memory accesses. Fixing is achieved by selectively inserting a small number of fences, instead of inserting fences after every conditional branch. Due to the accuracy of our analysis, oo7 suggests inserting fewer fences, and is shown experimentally to impose acceptably low performance overheads; less than 2% is observed in our experiments on GNU Core utilities. Moreover, the accuracy of the analysis allows oo7 to effectively detect fourteen (14) out of the fifteen (15) Spectre vulnerable code patterns proposed by Paul Kocher, a feat that could not be achieved by the Spectre mitigation in the C/C++ compiler proposed by Microsoft. While oo7 is both low-overhead and effective, for large scale deployment of our solution we need to investigate and optimize the time taken by our compile-time analysis. Finally, we show that similar binary analysis solutions are possible for detecting and fixing Meltdown.


1. Introduction

The Spectre (Kocher2018spectre) vulnerabilities in processors were revealed in early 2018. The attacks that exploit these vulnerabilities can potentially affect almost all modern processors irrespective of the vendor (Intel, AMD, ARM) and the computer system (desktop, laptop, mobile) as long as the processor performs speculative execution. Speculative execution (hennessy2011computer) is an indispensable micro-architectural optimization for performance enhancement, ubiquitous in almost all modern processors except for the simplest micro-controllers. It is an aggressive optimization where instructions are executed speculatively, but the temporary results created by the speculatively executed instructions are maintained in internal micro-architectural states that cannot be accessed by software. The results are committed to the programmer-visible architectural states (registers and memory) only when the speculation is found to be correct; otherwise, the internal micro-architectural states are flushed. The most common example is that of conditional branches being predicted in hardware, with the instructions along the predicted branch path executed speculatively. Once the conditional branch direction is resolved, the instructions along the speculative path are squashed in case of a wrong prediction.

Spectre attacks exploit speculation to deliberately target the execution of certain “transient” instructions. These transient instructions are speculatively executed, and are tricked into bringing secret data into the cache. The transient instructions are subsequently squashed but the secret remains, for example, in the cache. The attacker then carefully accesses the secret content (which is supposed to be hidden from the outside world) through different micro-architectural covert channels, for example, a cache side channel (yarom2014flush+). The website (website) of Spectre states that “As [Spectre] is not easy to fix, it will haunt us for a long time.”

We focus on identifying program binaries that are vulnerable to Spectre attack and patch those binaries with minimal performance overhead as a mitigation technique. We present a comprehensive and scalable solution, called oo7, based on static program analysis. Our solution employs control flow extraction, taint analysis and address analysis at the binary level. Moreover, the analysis needs to model the transient instructions along the speculative path that has never been required in traditional program analysis dealing with only programmer visible execution. We have successfully introduced accurate modeling of speculative execution in oo7.

Once vulnerable code snippets are detected by oo7, we introduce fence instructions at selected program points to prevent speculative execution and thereby protect the code from Spectre attack. We have validated the functional correctness of our protection mechanism with the fifteen litmus test codes from (spectremitigations) on an Intel Xeon platform. We note that the current Spectre mitigation approach introduced by the Microsoft C/C++ compiler (developerguidance) detects and protects only 2 out of 15 litmus tests for Spectre vulnerabilities (spectremitigations), whereas oo7 can detect all fifteen purpose-built Spectre-vulnerable code patterns. We can launch a successful Spectre attack to access arbitrary locations in the victim code prior to the insertion of fences by oo7; but our attempts at Spectre attacks fail after oo7-directed automated identification and patching of the victim code. We experimentally measure the performance overheads from our selective fence insertion and find that the overheads are around 4% on average on SPECint 2006, thereby indicating the practicality of our approach. We also report the results of a large-scale experimental study on applying oo7 to over 500 program binaries (average binary size 261 KB) from different real-world projects.

We demonstrate that oo7 can be tuned to defend against multiple different variants of Spectre attack (see Table 1) that exploit vulnerabilities in the victim code through speculative execution. Moreover, we adapt our approach to detect malicious program binaries that can potentially launch Meltdown and Foreshadow attacks. We also note the limitations of our analysis-based approach in defending against certain variants of Spectre and Foreshadow attacks. The variants that cannot be addressed by oo7 have potential system-level solutions introduced by different vendors with reasonably low overhead. The Spectre variants handled by oo7 with low performance overhead are either not amenable to system-level defense mechanisms, incur high performance overhead or escape detection with existing approaches. Thus oo7 approach via binary analysis is complementary to all the other efforts in mitigating the impact of security vulnerabilities due to speculative execution.

Contributions

The contributions of this paper can be summarized as follows. First, we present a program analysis based approach called oo7 for mitigating Spectre attacks. Our solution is based on binary analysis and does not involve changes to the underlying operating system and hardware. It uses taint analysis, address analysis and speculation modeling to check potentially vulnerable program binaries, and inserts a small number of fences to mitigate the risks of Spectre attack. Our approach is accurate in identifying all the litmus tests for Spectre vulnerabilities (spectremitigations), has low performance overhead (average 4% overhead for SPECint benchmark suite), and is scalable as evidenced by our analysis of over 500 large program binaries.

We show that our program analysis based approach can detect and mitigate certain variants of Spectre vulnerabilities in the application code, but not all (see Table 1). It can also detect malicious program binaries that can potentially launch Meltdown and Foreshadow attacks. Thus our work provides an understanding of the class of attacks for which an analysis based mitigation may be suitable, and for which a system level solution is suitable.

So far, no Spectre attacks have been found in the wild. We have made oo7 available on request in the public domain from https://oo7.comp.nus.edu.sg/. We hope that the search for zero day Spectre attacks in the wild can be substantially accelerated via community participation using our tool.

2. Spectre, Meltdown, and Foreshadow Variants

| Classification | Exploit name | Public vulnerability name | oo7 capability |
|---|---|---|---|
| Vulnerability in victim code | Spectre variant 1 | Bounds Check Bypass (BCB) | Detect and patch vulnerable victim code |
| | Spectre variant 1.1 | Bounds Check Bypass Store (BCBS) | Detect and patch vulnerable victim code |
| | Spectre variant 1.2 | Read-only protection bypass (RPB) | Detect and patch vulnerable victim code |
| | Spectre-NG variant 4 | Speculative Store Bypass (SSB) | Potentially possible but not handled yet by oo7 |
| BTB or RSB poisoning | Spectre variant 2 | Branch Target Injection (BTJ) | - |
| | Spectre RSB | Return Mispredict | - |
| Transient out-of-order execution | Meltdown | Rogue Data Cache Load (RDCL) | Detect malicious code |
| | ForeShadow | L1 Terminal Fault-SGX | Detect malicious code |
| | Spectre-NG variant 3a | Rogue System Register Read (RSRR) | Detect malicious code |
| | Spectre-NG LazyFP | Lazy FP State Restore | Detect malicious code |
| | ForeShadow-NG | L1 Terminal Fault-OS/SMM | - |
| | | L1 Terminal Fault: VMM | - |

Table 1. The existing speculative execution based attacks and the ability of oo7 to handle them

A number of Spectre, Meltdown, and Foreshadow vulnerabilities that all take advantage of speculative execution in modern processors have been disclosed recently. A summary of these variants appears in Table 1. We classify the different vulnerabilities into three categories:

(a) Vulnerability in victim code: Many Spectre attacks rely on vulnerable code snippets inside the victim process and trigger speculative execution of the code snippet to read secret data by supplying carefully selected inputs to the victim process. We detect these vulnerabilities in oo7 by identifying the potentially susceptible code fragments via binary analysis and then introducing fences at selected program points to prevent speculative execution and thereby harden the victim software against any such attacks.

(b) BTB or RSB poisoning: In these Spectre variants, the attacker poisons the Branch Target Buffer (BTB) or Return Stack Buffer (RSB) in the micro-architecture. The victim process, while using the poisoned BTB or RSB for speculative execution, is then misled to branch or return to a gadget that leaks the sensitive data. Any indirect branch or return instruction in the victim code is vulnerable to this attack and hence we do not attempt to mitigate these attacks in oo7. There exist potential solutions such as Retpoline (retpoline) or RSB refilling (koruyeh2018spectre) for these vulnerabilities.

(c) Transient out-of-order execution: These attacks can be directly launched by malicious code (malware) without requiring any specific vulnerable code fragment or pattern in the victim process. Unlike the first class of attacks where the defense mechanism is to harden the victim software, here oo7 performs malware detection, i.e., it looks for malicious code patterns within a binary.

2.1. Vulnerability in Victim Code

Spectre Variant 1

The following victim code fragment exhibits Spectre vulnerability Variant 1.

```c
void victim_function_v01(size_t x) {
    if (x < array1_size) {          // TB: Tainted Branch
        y = array1[x];              // RS: Read Secret y
        temp &= array2[y * 256];    // LS: Leak Secret y
    }
}
```

In this example, the parameter x is under the attacker's control in the sense that x can be influenced by external input. Hence we consider the conditional branch as a Tainted Branch (TB). The attacker first trains the branch predictor to expect that the branch will be true (i.e., the array bound check will pass). The attacker then invokes the code with an input x value outside the bound of array1. The branch predictor expects the branch condition to be true and the CPU speculatively reads y using the malicious value x outside the array bound. We call this action Read Secret (RS) because y can be a potential secret that is not legitimately accessible through malicious input without speculation. This is followed by the CPU speculatively accessing array2 using an address that is dependent on the secret y, leading to a cache state change. We call this action Leak Secret (LS) because the change in the cache state lingers even after the CPU realizes that the branch prediction was wrong and squashes the speculatively executed instructions. The attacker can now launch a cache side-channel attack (yarom2014flush+) to detect this change in cache state and discover the secret y.

Specifically, for a Prime+Probe side-channel attack, the attacker ensures that array2 was not cached before the memory access by evicting the cache line through priming the cache set. Then the attacker triggers the LS action to leak the secret to the cache side channel. Finally, the attacker performs the probe phase to get the timing of the memory accesses for array2 and discovers the value of y. The multiplier 256 in array2[y*256] guarantees that different values of y lead to different cache line accesses; normally, this value is greater than or equal to the cache line size.
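To make the role of the multiplier concrete, the following minimal sketch computes which cache line each probe address array2[y * 256] falls into. It assumes one-byte array elements, a 64-byte cache line, and a line-aligned array2; the helper names probe_offset, cache_line_of and all_values_distinguishable are hypothetical and not part of oo7.

```c
#include <stddef.h>
#include <stdint.h>

#define STRIDE    256  /* multiplier used in array2[y * 256] */
#define LINE_SIZE 64   /* assumed cache line size in bytes */

/* Byte offset into array2 touched by the Leak Secret (LS) access. */
size_t probe_offset(uint8_t y) {
    return (size_t)y * STRIDE;
}

/* Index of the cache line, relative to a line-aligned array2, holding that offset. */
size_t cache_line_of(size_t offset) {
    return offset / LINE_SIZE;
}

/* Returns 1 if every pair of distinct byte values maps to distinct cache lines. */
int all_values_distinguishable(void) {
    for (int a = 0; a < 256; a++)
        for (int b = a + 1; b < 256; b++)
            if (cache_line_of(probe_offset((uint8_t)a)) ==
                cache_line_of(probe_offset((uint8_t)b)))
                return 0;
    return 1;
}
```

Because the stride is at least as large as the assumed line size, no two secret values share a line, so a single slow or fast probe identifies y. With a stride smaller than the line size, several values would alias to the same line and the channel would lose resolution.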

Spectre Variant 1.1

The idea behind Spectre Variant 1.1, also known as Bounds Check Bypass Store (BCBS), is to bypass the bound check and execute a store instruction speculatively (kiriansky2018speculative). In the following example, x can potentially be under attacker control; hence, the conditional x < array1_size is a Tainted Branch (TB). However, unlike the Read Secret (RS) in Spectre Variant 1, this variant uses a Speculative Write (SW) to modify an arbitrary memory location. For instance, the example modifies an arbitrary memory location pointed to by array1[x] when the conditional branch is mispredicted for a value x >= array1_size. Although this speculative store is squashed upon resolving the branch outcome, it can leak secret values from the program. For instance, the write to array1[x] may overwrite the return address and transfer control to a gadget that leaks an arbitrary secret value via a side channel (similar to LS in Spectre Variant 1).

```c
void victim_function_v1.1(size_t x, y) {
    if (x < array1_size) {    // TB: Tainted Branch
        array1[x] = y;        // SW: Speculative Write
    }
}
```

Spectre Variant 1.2

This vulnerability bypasses the protection enforced by read-only memory, e.g., code pointers (kiriansky2018speculative). Consider the victim_function_v1.1 where the true valuation of the conditional captures whether x points outside the read-only memory. If x is under attacker control, then the write to a read-only memory can be speculatively executed and modify crucial data structures such as code pointers in the cache. As a result, like Spectre Variant 1.1, the program control may transfer to arbitrary location to execute attacker chosen code. Like Spectre Variant 1.1, this variant also requires the presence of TB and SW.

Spectre Variant 4

Spectre Variant 4, also called Speculative Store Bypass (SSB), is based on the fact that the processor may execute a load instruction speculatively even when a prior store instruction in program order is pending because the address for the store is not yet known. Thus a speculative load may read a stale value that should have been modified by a prior store instruction if they access the same memory address; in that case, the speculative load should be squashed after the store address is known.

oo7 can detect and patch victim binary code with potential Spectre Variant 1, 1.1, and 1.2 vulnerabilities. oo7 can potentially handle Spectre Variant 4 by identifying the vulnerable code pattern, but this requires precise address analysis (to establish that the load and the store access the same memory address), which is not yet supported in our framework.

2.2. BTB or RSB poisoning

Spectre Variant 2

Most architectures support indirect branches in the form of “jmp [r1]”. For such jump instructions, the program control is diverted to a location stored in the register r1. To improve program performance, the processor leverages the Branch Target Buffer (BTB) to store the frequently used target locations of branch instructions, including indirect branches. An attacker can poison the BTB to include its preferred target locations. When the victim executes an indirect branch instruction, it consults this poisoned BTB and the speculative execution can potentially be misled to a target location chosen by the attacker. Any indirect branch is vulnerable to this attack. Indirect branches can be easily identified by static analysis and mitigated by the Retpoline (retpoline) approach. Thus we do not consider this variant of Spectre for our analysis based solution.

SpectreRSB

SpectreRSB vulnerability (koruyeh2018spectre) is similar to the Spectre Variant 2. Instead of poisoning the BTB with attacker chosen location, the SpectreRSB vulnerability manipulates the return stack buffer (RSB), which is used by the processor to predict the return address. As a result of a successful exploit, a function may return to an attacker controlled location due to the mis-prediction of the return address inflicted by an attacker. Subsequently, the program may execute arbitrary code in the attacker-controlled location until the return address is finally resolved. All return instructions are potentially vulnerable to such exploit. RSB refilling is a potential approach to mitigate SpectreRSB (koruyeh2018spectre). We do not consider mitigating this attack in oo7.

2.3. Transient out-of-order execution

We now discuss the attacks that can be directly launched from the attacker code and without the requirement of any vulnerability in the victim code. All such attack code share the following common features: (i) To exploit the out-of-order execution and illegally access sensitive data (e.g., kernel memory), and (ii) To use a covert channel, e.g., cache to leak the sensitive data read during the first step.

Meltdown, Foreshadow, and Spectre-NG Variant 3a

Meltdown exploits out-of-order execution to read sensitive data, e.g., kernel memory (Lipp2018meltdown). The basic idea behind Meltdown is shown in the following example. Reading the kernel memory array1[x] raises an exception. However, before the exception is handled, the processor goes ahead with the execution (for performance reasons) and leaks the value of array1[x] via indexing to array2. We note that the leakage of secret is exactly the same as Leak Secret (LS) in Spectre Variant 1.

```c
void attack_function_v03(size_t x, y) {
    // array1 points to kernel memory
    y = array1[x];
    // leak array1[x] via out-of-order exec and cache
    temp &= array2[y * 256];
}
```

Foreshadow (van2018foreshadow), a more recent variant of Meltdown, exploits out-of-order execution, as described above, in Intel SGX (costan2016intel) to leak sensitive data from the enclave to user space. Finally, the exploit Rogue System Register Read (Spectre-NG Variant 3a) exploits out-of-order execution to read system control registers instead of kernel memory. oo7 can detect all these variants.

Spectre-NG LazyFP

Lazy FPU State Leak (stecklina2018lazyfp) is an exploit to illegally read a floating point register in a victim process. In particular, many operating systems today support lazy FPU context switching, where the FPU register state is only restored when necessary, to reduce the context-switch delay incurred in restoring the large floating point registers. Specifically, the operating system tracks the owner of the FPU. When a process that does not own the FPU accesses it for the first time, an exception is raised to restore the FPU context for the current process. However, due to out-of-order execution, the FPU instruction is still executed in the context of the old owner before the exception is handled. Hence, there is potential leakage of data from the security domain of the old owner to the current process. This feature can be exploited by the attacker to steal sensitive data from the victim process via a cache side channel. In contrast to the basic version of Meltdown, in this attack the illegal access spans all floating point registers available in the processor. oo7 can detect this variant.

Foreshadow-NG

Foreshadow-NG (weisse2018foreshadow) is an extension of Foreshadow that exploits the L1 terminal fault to leak sensitive data between the security domains, for example, leakage from victim process space to the attacker process (Foreshadow-OS), leakage from the victim VMs or the hypervisor itself (Foreshadow-VMM). The attacker can use the legal user-space virtual addresses that belong to the attacker process; hence, it is not possible to precisely identify the illegal memory accesses via static analysis. However, it is possible to identify the code that leaks the sensitive data, e.g., the code capturing cache flushes and timing a number of cache accesses when a cache covert channel is used.

oo7 can detect malicious code with the potential to launch Meltdown, Foreshadow, Spectre-NG Variant 3a, and Spectre-NG LazyFP attacks.

3. Spectre Vulnerability Detection

To describe our analysis, we use the notations in Table 2. We say that an instruction $inst$ is tainted, i.e., $\tau(inst)$ is true, if and only if $inst$ operates on some tainted operands. We will first discuss the checker for Spectre Variant 1.

3.1. Detecting Spectre Variant 1

An important concept that we need for our analysis is the Speculative Execution Window, abbreviated SEW. We posit that information about SEW needs to be exposed by processor designers for the sake of detecting Spectre attacks. By default, it seems that SEW can be set to the size of the re-order buffer in an out-of-order processor. However, if the size of the re-order buffer is $n$, a lookahead of $n$ instructions from a tainted conditional branch TB is not sufficient in our search for the memory accesses RS and LS. During execution, each instruction is decoded into a sequence of micro-ops, and each micro-op occupies one slot of the re-order buffer. However, micro-ops can be fused (opsfusion) both within an instruction and across instructions. When micro-ops are fused across instructions (also called macro-fusion), the micro-ops of at most two instructions can be fused into a single micro-op. For this reason, if the size of the re-order buffer is $n$, we conservatively set SEW to $2n$ in our analysis, so as to avoid any false negatives.

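| Symbol | Interpretation |
|---|---|
| $Br(inst)$ | $inst$ is a branch instruction |
| $Time(inst)$ | $inst$ is a timing instruction, e.g., rdtsc |
| $Mem(inst)$ | $inst$ is a memory access instruction |
| $Load(inst)$ | $inst$ is a load instruction |
| $Addr(inst)$ | the data memory address accessed by a memory-related instruction $inst$ |
| $Reg(inst)$ | set of registers accessed by instruction $inst$ |
| $Set(addr)$ | cache set accessed by memory address $addr$ |
| $\tau(inst)$ | instruction $inst$ is tainted |
| $\tau(x)$ | instruction operand $x$ is tainted, where $x$ could be a register, a memory location or the value located in a memory location |
| $\Delta(inst_1, inst_2)$ | minimum no. of instructions executed to reach $inst_2$ from $inst_1$. If $inst_2$ is unreachable from $inst_1$, then $\Delta(inst_1, inst_2) = \infty$. |
| $Dep(inst_1, inst_2)$ | $inst_2$ is data-dependent on instruction $inst_1$ |
| $CDep(inst)$ | set of instructions control-dependent on $inst$ |
| $[addr]$ | value located at memory address $addr$ |
| $SEW$ | Speculative Execution Window $= 2n$, where $n$ is the size of the re-order buffer in the processor |

Table 2. Symbols used in describing oo7

$$
\begin{aligned}
\Phi_{spectre} \equiv{} & Br(TB) \wedge Mem(RS) \wedge Mem(LS) \\
& \wedge\ \tau(TB) \wedge \tau(RS) \wedge \tau(LS) \\
& \wedge\ Dep(RS, LS) \wedge \Delta(TB, RS) \le SEW \wedge \Delta(TB, LS) \le SEW
\end{aligned}
\qquad (1)
$$

We now elaborate the checking condition for detecting Spectre. oo7 locates TB, RS, and LS by checking $\Phi_{spectre}$. Intuitively, the first two lines of $\Phi_{spectre}$ capture the presence of a tainted branch instruction TB and tainted memory-access instructions RS and LS. The last line shows that RS and LS are located within the speculation window of TB, and that they are data-dependent. $\Phi_{spectre}$ reflects Spectre Variant 1. Later we show that $\Phi_{spectre}$ can easily be changed to detect other Spectre variants, e.g., Variant 1.1.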
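The checking condition described above can be sketched as a predicate over an annotated instruction sequence. This is only an illustration under simplifying assumptions, not oo7's implementation: the code is straight-line, so the instruction distance is just an index difference; each instruction records at most one direct data dependence; and the type Inst and the functions sew_from_rob, is_spectre_v1 and find_spectre_v1 are hypothetical names.

```c
#include <stddef.h>

/* One instruction of a straight-line sequence, annotated by earlier analyses. */
typedef struct {
    int is_branch;  /* conditional branch */
    int is_mem;     /* memory access (load or store) */
    int tainted;    /* operates on at least one tainted operand */
    int dep_on;     /* index of the instruction it is data-dependent on, or -1 */
} Inst;

/* Speculative Execution Window: twice the re-order buffer size. */
size_t sew_from_rob(size_t rob_entries) {
    return 2 * rob_entries;
}

/* Checks one candidate triple (tb, rs, ls) against the Variant 1 condition. */
int is_spectre_v1(const Inst *code, size_t n,
                  size_t tb, size_t rs, size_t ls, size_t sew) {
    if (tb >= n || rs >= n || ls >= n || tb >= rs || rs >= ls)
        return 0;
    int kinds   = code[tb].is_branch && code[rs].is_mem && code[ls].is_mem;
    int taints  = code[tb].tainted && code[rs].tainted && code[ls].tainted;
    int depends = code[ls].dep_on >= 0 && (size_t)code[ls].dep_on == rs;
    int window  = (rs - tb) <= sew && (ls - tb) <= sew;
    return kinds && taints && depends && window;
}

/* Exhaustive search; writes the first match to out[] and returns 1, else 0. */
int find_spectre_v1(const Inst *code, size_t n, size_t sew, size_t out[3]) {
    for (size_t tb = 0; tb < n; tb++)
        for (size_t rs = tb + 1; rs < n; rs++)
            for (size_t ls = rs + 1; ls < n; ls++)
                if (is_spectre_v1(code, n, tb, rs, ls, sew)) {
                    out[0] = tb; out[1] = rs; out[2] = ls;
                    return 1;
                }
    return 0;
}
```

For the three-instruction pattern of victim_function_v01 (tainted branch, tainted load, dependent tainted load), find_spectre_v1 reports the triple when the window is large enough and rejects it when the window is shrunk below the distance to the second load or when the branch is not tainted. A real checker would compute distances over the control flow graph rather than by index.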

3.2. Taint Analysis

We use taint analysis (brumley2011bap) to determine whether conditional branch instructions (e.g., TB) and the memory-access instructions (e.g., RS and LS) can be controlled via untrusted inputs. In the following, we outline the taint propagation policies and rules used to detect Spectre vulnerabilities. To illustrate our taint propagation policy, we use the following kinds of instructions.

  • Binary operation: applies an operation op to two registers. The operation op can be either an arithmetic operation (e.g., addition or subtraction) or a logical operation (e.g., a logical comparison).

  • Unary operation: applies an operation op to one register. The operation op can be either arithmetic (e.g., unary minus) or logical (e.g., logical negation).

  • Memory load: loads the value at a memory address into a register.

  • Memory store: stores the value of a register to a memory address.

  • Conditional branch: branches to a label if a logical formula is true.

Taint Propagation Policies:

Initially, all variables that read values from untrusted sources (e.g., files, network) are tainted. The taint from these variables is then propagated via a well-defined set of rules shown in the following; for each rule, the premises appear above the horizontal bar and the conclusions appear below it. Our taint propagation tracks both data dependencies and control dependencies (also known as implicit flows in taint analysis). Typically such implicit flows come in the form of tainted data enabling or disabling a branch condition, and the outcome of that branch affecting the computation of a variable which would not be tainted otherwise purely by tracking data dependencies.

[Box: Taint Propagation Rules. Rule labels: Binary operation, Unary operation, Memory load, Memory load, Memory store, and two unlabeled rules.]

Taint Propagation Rules:

Based on the discussions in the preceding paragraphs, the taint propagation rules are shown. In these rules, we consider the current instruction for which the taint propagation is being computed, together with the set of operands written by a given set of instructions. We observe how the taint status is computed for an operand, where the operand could be a register, a memory location or the value located in a memory location. The last propagation rule captures control-dependence-based taint tracking. Our taint propagation rules take care to avoid any under-tainting, by considering the forward transitive closure of all control and data dependencies from the taint sources.
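As a rough illustration of the policy described in this section, the sketch below propagates taint in a single forward pass over a tiny straight-line instruction sequence. It is not oo7's or BAP's rule set: the instruction encoding, the abstract memory cells, the region field that marks how many following instructions are control-dependent on a branch, and the names Inst and propagate_taint are all assumptions made for this example.

```c
#include <stddef.h>

#define NREGS  8   /* registers r0..r7 */
#define NCELLS 8   /* abstract memory cells */

typedef enum { BINOP, UNOP, LOAD, STORE, BRANCH } Kind;

typedef struct {
    Kind kind;
    int dst;     /* BINOP/UNOP/LOAD: destination register */
    int a;       /* first source; address register for LOAD/STORE; condition for BRANCH */
    int b;       /* BINOP: second source; STORE: value register */
    int cell;    /* LOAD/STORE: abstract memory cell accessed */
    int region;  /* BRANCH: number of following instructions control-dependent on it */
} Inst;

/* reg[] and mem[] hold taint bits and are updated in place;
 * inst_taint[i] is set to 1 if instruction i operates on a tainted operand. */
void propagate_taint(const Inst *code, size_t n,
                     int reg[NREGS], int mem[NCELLS], int inst_taint[]) {
    size_t ctrl_end = 0;  /* indices below this are control-dependent on a tainted branch */
    for (size_t i = 0; i < n; i++) {
        const Inst *in = &code[i];
        int ctrl = i < ctrl_end;  /* implicit flow: any write here becomes tainted */
        int t = 0;
        switch (in->kind) {
        case BINOP: t = reg[in->a] || reg[in->b];    reg[in->dst] = t || ctrl;  break;
        case UNOP:  t = reg[in->a];                  reg[in->dst] = t || ctrl;  break;
        case LOAD:  t = reg[in->a] || mem[in->cell]; reg[in->dst] = t || ctrl;  break;
        case STORE: t = reg[in->a] || reg[in->b];    mem[in->cell] = t || ctrl; break;
        case BRANCH:
            t = reg[in->a];
            if (t && i + 1 + (size_t)in->region > ctrl_end)
                ctrl_end = i + 1 + (size_t)in->region;
            break;
        }
        inst_taint[i] = t;
    }
}
```

A load is treated as tainted when either its address register or the accessed cell is tainted, which is what lets a tainted index x taint the value read by array1[x]. A production analysis would iterate to a fixed point over the control flow graph instead of making one pass over straight-line code.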

3.3. Detecting other Spectre variants

In the preceding section, we discussed the detection of Spectre Variant 1 (Kocher2018spectre). Note that Spectre Variant 1 can leak the secret data in other ways instead of performing exactly the LS action. Such variants can be detected via a simple manipulation of $\Phi_{spectre}$:

$$
\Phi^{weak}_{spectre} \equiv Br(TB) \wedge Mem(RS) \wedge \tau(TB) \wedge \tau(RS) \wedge \Delta(TB, RS) \le SEW
\qquad (2)
$$

Our oo7 approach can be fine-tuned to detect a variety of other Spectre variants. For instance, consider Spectre Variant 1.1 (cf. Table 1). Such a variant can easily be detected by the following condition:

$$
\Phi_{spectre1.1} \equiv Br(TB) \wedge Store(SW) \wedge \tau(TB) \wedge \tau(SW) \wedge \Delta(TB, SW) \le SEW
\qquad (3)
$$

where $Store(SW)$ captures the presence of a speculative write instruction, as needed to exploit Spectre Variant 1.1. Spectre Variant 1.2 (read-only protection bypass) needs exactly the same condition to be satisfied, except that the speculative write (SW) happens to be in read-only memory. For the rest of the paper, we do not distinguish between Spectre Variant 1.1 and 1.2, as oo7 uses the same condition to detect both variants.

To detect Spectre Variant 4, we need to check whether a load instruction follows a store to the same address, yet can speculatively load a value not yet written by that store. Checking for this condition requires accurate address analysis, more accurate than what we can currently support. We are currently working in this direction.
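The Variant 1.1 and 1.2 condition discussed earlier in this section (a tainted branch followed, within the speculation window, by a tainted speculative write) can be sketched in the same style. As before, this is a hypothetical illustration on straight-line code, where instruction distance is approximated by index difference; Inst and is_spectre_v11 are names introduced here, not oo7 interfaces.

```c
#include <stddef.h>

/* Minimal per-instruction annotation needed for the Variant 1.1/1.2 check. */
typedef struct {
    int is_branch;  /* conditional branch */
    int is_store;   /* memory write */
    int tainted;    /* operates on at least one tainted operand */
} Inst;

/* Returns 1 if (tb, sw) is a tainted branch followed by a tainted store
 * that lies within sew instructions of the branch. */
int is_spectre_v11(const Inst *code, size_t n,
                   size_t tb, size_t sw, size_t sew) {
    if (tb >= n || sw >= n || sw <= tb)
        return 0;
    return code[tb].is_branch && code[tb].tainted
        && code[sw].is_store && code[sw].tainted
        && (sw - tb) <= sew;
}
```

Unlike the Variant 1 check, no second memory access and no data dependence between accesses are required: the speculative write alone is treated as the dangerous action.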

3.4. Code repair

Our repair strategy is based on systematically inserting memory fences after each tainted branch (i.e., TB) in vulnerable code fragments for Spectre Variants 1, 1.1 and 1.2. The original article describing Spectre attacks (Kocher2018spectre) suggests insertion of memory fences following each conditional branch. However, using our analysis, we can obtain the exact sequence ⟨TB, RS, LS⟩ (for Variant 1) or the sequence ⟨TB, SW⟩ (for Variant 1.1 and Variant 1.2) vulnerable to Spectre attacks. As a result, we can accurately locate the program point where the memory fence should be inserted. In particular, we insert memory fences following instruction TB and immediately before the execution of RS and SW, respectively, for Spectre Variant 1 and Variants 1.1 and 1.2. This prevents execution from speculatively loading the secret value into the cache (for Variant 1) and speculatively writing to an attacker-controlled location (for Variants 1.1 and 1.2).

Nevertheless, inserting memory fences may affect the overall program performance. oo7 inserts memory fences only for the branches identified as TB (for Variants 1, 1.1 and 1.2). This incurs lower overhead than inserting fences after each conditional branch, or after each tainted conditional branch. We show empirically that such a strategy has an acceptable performance overhead of 4% on average for SPECint benchmarks.
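At source level, the effect of this repair on the Variant 1 example can be pictured as follows. oo7 itself patches assembly code; this C rendering is only a sketch. It assumes an x86 target with SSE2, where the _mm_lfence intrinsic from emmintrin.h emits lfence. The SPEC_FENCE macro, the array contents, and the function name victim_function_v01_fenced are made up for the example, and the non-x86 fallback is merely a stand-in so the code compiles, not a speculation barrier.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>                     /* _mm_lfence */
#define SPEC_FENCE() _mm_lfence()
#else
#define SPEC_FENCE() __sync_synchronize()  /* placeholder only on non-x86 targets */
#endif

uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
size_t  array1_size = 16;
uint8_t array2[256 * 256];
uint8_t temp = 0xFF;

void victim_function_v01_fenced(size_t x) {
    if (x < array1_size) {        /* TB: Tainted Branch */
        SPEC_FENCE();             /* inserted after TB and before RS */
        uint8_t y = array1[x];    /* RS: no longer issued until TB resolves */
        temp &= array2[y * 256];  /* LS */
    }
}
```

The fence does not change the architectural result: in-bounds calls behave as before and out-of-bounds calls still do nothing. Only the speculative window between the bounds check and the first load is closed, which is why a single fence per detected TB suffices for this pattern.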

4. Evaluation Setup

Figure 1. Overview of oo7 framework. The components in gray are added by oo7.

4.1. Architecture of the tool

Figure 1 provides an overview of oo7 tool. oo7 contains two main modules: a vulnerability detection module for detecting the Spectre vulnerabilities, and a code repair module to fix the Spectre vulnerabilities.

We adopt BAP (brumley2011bap) as our primary taint analysis platform (cf. Section 3.2). BAP provides a toolkit for implementing automated binary analysis and it supports multiple architectures such as x86, x86-64, ARM, PowerPC, and MIPS. In our oo7 framework, BAP first takes a binary program and the taint sources as inputs. A taint source is an API that imports the data from an un-trusted channel such as network, user input or file reader interface. We consider all user inputs (e.g., via console, file and network) as tainted.

Vulnerability detection module:

The detailed architecture of the vulnerability detection module (cf. Section 3) is outlined in Figure 1. BAP disassembles and lifts binary code into a RISC-like intermediate representation (IR) named BAP Instruction Language (BIL). Program analysis is performed on the BIL representation and is therefore architecture independent. BAP contains a microexecution framework named Primus to interpret a lifted program, i.e., the low-level intermediate representation of the code that BAP creates by lifting the binary. The core component of Primus is the interpreter. It emulates the execution of a program by using the underlying technology of forced execution (peng2014x).

BAP provides several interfaces to export crucial information to other analysis modules during the interpretation. Such interfaces use a publish/subscribe architecture to watch the interpreter events. The subscribers are allowed to listen to arbitrary changes in the interpreter state (i.e., Global states). During the analysis, BAP wakes up the specific subscriber when analyzing the events registered by the subscriber. For example, the taint engine module is invoked by the interpreter when it completes the interpretation of an instruction (post-execution event). When the subscriber of the taint engine is invoked, it checks the taint data from the taint source and propagates it if the instruction satisfies the taint policy (cf. Section 3.2). The Spectre detector module is invoked by BAP interpreter after a branch is executed to check whether it is tainted and whether it is followed by possible speculative load/store instructions as per the Spectre variants. Spectre detector checks the state of the interpreted instruction in the light of satisfying the condition explained in Equation 1.

Vulnerability repair module:

Once a vulnerable code fragment is detected in the binary, we locate the corresponding assembly code for repair. To this end, we first mark the address(es) of Spectre vulnerable code, as obtained during the detection stage of oo7. Concurrently, we obtained the disassembled code from the binary and the assembly code from the source (via “-S” option in gcc compiler). Since most compiler optimizations are employed during the compiling stage, there does not exist substantial difference between the assembly code and the respective disassembled code. This allows us to easily map the disassembled code back to the assembly code and locate the instructions vulnerable to Spectre.

Finally, our repair module directly modifies the assembly code by inserting memory fence instructions at the appropriate locations (e.g., inserting lfence before RS to mitigate Spectre Variant 1).
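A source-level analogue of this repair is sketched below. oo7 itself patches the assembly code, so this C version only illustrates where the fence sits relative to the bounds check and the secret-dependent load. The array names follow Kocher's examples; the `SPECULATION_BARRIER` macro is a hypothetical helper, and the inline assembly assumes a GCC- or Clang-compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* lfence: later instructions do not execute until prior ones complete. */
#define SPECULATION_BARRIER() __asm__ __volatile__("lfence" ::: "memory")
#else
/* Non-x86 fallback: a compiler barrier only, NOT a speculation barrier. */
#define SPECULATION_BARRIER() __asm__ __volatile__("" ::: "memory")
#endif

uint8_t array1[16] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
size_t array1_size = 16;
uint8_t array2[256 * 512];
uint8_t temp = 0xff;

void victim_function_fenced(size_t x) {
    assert(array1_size <= sizeof array1);
    if (x < array1_size) {                /* TB: branch on untrusted x      */
        SPECULATION_BARRIER();            /* fence placed before RS         */
        temp &= array2[array1[x] * 512];  /* RS: array1[x], LS: array2[...] */
    }
}
```

The architectural behavior is unchanged: in-bounds calls still update `temp`, and out-of-bounds calls do nothing. The fence only prevents the loads from being issued while the bounds check is still unresolved.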

4.2. Subject Programs

We conduct evaluation on three sets of subject programs.

  • We first apply oo7 on 15 code examples purpose-built to demonstrate different variations of Spectre vulnerabilities from Paul Kocher’s blog post (spectremitigations). We call these Litmus Tests.

  • Next, we conduct evaluation on the SPECint2006 benchmarks, which have been well studied by the computer architecture community. These are detailed in Table 3. We concentrate on a complete analysis of the SPECint (integer) benchmark suite because it contains more control-intensive code than SPECfp (floating point), and Spectre exploits conditional branches. Branches make up 18.31% of the instruction mix in SPECint, compared to only 5.75% in SPECfp (kejariwal2008comparative).

  • Last but not least, we conduct evaluation on a large number of software projects from the Google OSS-Fuzz repository (oss-fuzz) and GitHub. The program binaries in these projects include the main application and miscellaneous support tools. Table 4 summarizes the characteristics of these projects, which comprise a total of 509 program binaries with sizes ranging from 8.5KB to 21.8MB (average size 261.4KB).

4.3. Evaluation Platform

We conduct the experimental evaluation on an Intel Xeon Gold 6126 (xeongold) running at 2.6GHz with 192GB memory. The underlying micro-architecture is Skylake with a 224-entry reorder buffer (ROB) (skylake). Due to the potential micro-operation fusion in x86 micro-architectures (Section 3), we conservatively set the speculative window to twice the effective ROB size, i.e., 2 × 224 = 448 entries. The Intel Xeon Gold 6126 is equipped with 12 cores and a 19.25MB non-inclusive shared last-level cache (LLC) with a 64-byte line size. The LLC miss penalty is about 200 cycles. A non-inclusive LLC is more secure than an inclusive cache and can thwart certain LLC-based side-channel attacks (e.g., Flush+Flush, Prime+Probe). However, it is still vulnerable to the Flush+Reload attack. Thus the Spectre and Meltdown attacks can potentially be carried out on this platform.

5. Evaluation Results

Our evaluation investigates three different aspects:

  1. Effectiveness: How effective is oo7 in detecting Spectre vulnerabilities in program binaries?

  2. Analysis Time: How long does oo7 take to detect Spectre vulnerabilities?

  3. Performance Overhead: How much performance overhead does oo7 introduce to protect vulnerable code fragments?

5.1. Evaluation with Litmus Tests

oo7 can correctly identify all code snippets purpose-built with different variations of Spectre vulnerabilities (spectremitigations) as potential victim code fragments. 14 code examples are identified with taint propagation only along data dependencies. The remaining code example is detected with taint propagation along program (both control and data) dependencies.

The latest Microsoft Visual C++ compiler (developerguidance) has integrated a /Qspectre switch for mitigating a limited set of potentially vulnerable code patterns related to the Spectre vulnerabilities. Specifically, when an application is compiled with /Qspectre enabled, the Visual C++ compiler attempts to insert an lfence instruction upon detecting Spectre code patterns. Paul Kocher (spectremitigations) has evaluated the Microsoft compiler using the 15 litmus tests. The blog post (spectremitigations) mentions that only two of the micro-benchmarks are identified and protected by the Visual C++ compiler. In contrast, oo7 can correctly detect all 15 code examples as potential victims.

The example (v13 (spectremitigations)) that requires taint propagation along both control and data dependencies is given below.

    __inline int is_x_safe(size_t x) {
        if (x < array1_size) return 1;
        else return 0;
    }

    void victim_function_v13(size_t x) {
        if (is_x_safe(x))
            temp &= array2[array1[x] * 512];
    }

The branch in the victim function victim_function_v13 is tainted because the return value of is_x_safe(x) is controlled via the untrusted input x. However, the return value of is_x_safe(x) is control-dependent, not data-dependent, on x. Thus oo7 can detect this code pattern as a potential vulnerability only if both data- and control-dependent taint propagation are applied.
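The difference between the two propagation strategies on v13 can be illustrated with a toy model. This is a hand-written sketch, not oo7's taint engine: the type `taint_policy` and both function names are hypothetical, and the model hard-codes the structure of the v13 example.

```c
#include <assert.h>
#include <stdbool.h>

/* Taint propagation strategy (illustrative names). */
typedef enum { DATA_DEP, PROGRAM_DEP } taint_policy;

/* Taint of a value that is one of two constants selected by a condition.
 * No operand carries taint, so data dependence alone yields "untainted".
 * Under program dependence the value inherits the taint of the condition
 * that selects it. */
bool taint_of_selected_constant(bool cond_tainted, taint_policy p) {
    assert(p == DATA_DEP || p == PROGRAM_DEP);
    return p == PROGRAM_DEP && cond_tainted;
}

/* v13: is the branch `if (is_x_safe(x))` tainted when x is tainted? */
bool v13_branch_is_tainted(bool x_tainted, taint_policy p) {
    /* `x < array1_size` is data-dependent on x under either policy. */
    bool inner_cond_tainted = x_tainted;
    /* The return value (1 or 0) is only control-dependent on it. */
    bool ret_tainted = taint_of_selected_constant(inner_cond_tainted, p);
    /* The outer branch tests the return value directly. */
    return ret_tainted;
}
```

With a tainted `x`, the model reports the outer branch as tainted only under `PROGRAM_DEP`, which mirrors why v13 is the one litmus test that needs control-dependence tracking.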

We design an attacker process to steal secrets from the victim process (a litmus test example) via a cache side channel once the secret data has been brought into the cache through a Spectre attack. We successfully extract data from arbitrary memory locations in the victim process on our platform. We then allow oo7 to automatically insert lfence instructions at the appropriate program locations to prevent speculation in vulnerable code fragments. We verify that the attacker process can no longer extract data from the victim process running with the oo7 fix.

5.2. Results on SPECint benchmarks

Figure 2. Runtime overhead due to protection against Spectre Variant 1, 1.1, 1.2 compared to original code in SPECint2006. All denotes the overhead from inserting fences at all conditional branches. The overheads from fences introduced by oo7 to protect against Spectre variants 1, 1.1, 1.2 are plotted.

| Program | Binary size | Analysis time (h or s) | Repair time (s) | # Conditional branches | # TB | # ⟨TB, RS⟩ | # Variant 1.1/1.2 pairs |
|---|---|---|---|---|---|---|---|
| perlbench | 1.2MB | 125h | 5 | 21972 | 60 | 18 | 5 |
| bzip2 | 69KB | 0.45h | 1 | 942 | 102 | 81 | 5 |
| gcc | 3.6MB | 134h | 11 | 59614 | 15 | 0 | 0 |
| mcf | 23KB | 58s | 1 | 202 | 1 | 0 | 0 |
| gobmk | 3.9MB | 0.86h | 1 | 11549 | 35 | 0 | 0 |
| hmmer | 319KB | 2.49h | 1 | 4468 | 31 | 13 | 0 |
| sjeng | 153KB | 1.9h | 1 | 2146 | 14 | 3 | 0 |
| libquantum | 51KB | 1.3h | 1 | 444 | 30 | 0 | 0 |
| h264ref | 577KB | 22h | 1 | 6743 | 5 | 0 | 0 |
| omnetpp | 768KB | 20.6h | 2 | 4812 | 48 | 26 | 3 |
| astar | 52KB | 864s | 1 | 541 | 0 | 0 | 0 |
| xalancbmk | 5.8MB | 142h | 15 | 62209 | 9 | 4 | 2 |

Table 3. Results for the detection of Spectre vulnerable code fragments in SPECint2006. # TB denotes the tainted conditional branches detected by oo7. A ⟨TB, RS⟩ pair satisfies the Spectre Variant 1 condition, while the pairs counted in the last column satisfy the Spectre Variant 1.1, 1.2 conditions as detected by oo7.

We use the SPECint 2006 CPU benchmark suite (henning2006spec) to quantify the performance overhead of the oo7 protection mechanism, as well as to evaluate the efficacy of our detection and repair. The SPECint 2006 benchmark suite contains 12 programs in C and C++. Table 3 outlines the salient features of these programs: the binary size, the analysis and repair time, the number of conditional branches, the number of tainted branches (TB), the number of ⟨TB, RS⟩ pairs, and the number of pairs matching Spectre Variants 1.1 and 1.2.

We note that seven out of twelve programs exhibit the vulnerability pattern of Spectre variant 1 as evidenced by the presence of the ⟨TB, RS⟩ pattern (perlbench, bzip2, gcc, hmmer, sjeng, omnetpp, xalancbmk). By looking for the ⟨TB, RS⟩ pattern, we conservatively assume the strictest security requirement, namely that merely reading secret data must be prevented. The subsequent mechanism to leak the secret data can vary, the most common being the cache side channel with the ⟨TB, RS, LS⟩ code pattern. Four of the programs (perlbench, bzip2, omnetpp, xalancbmk) are vulnerable to Spectre variants 1.1 and 1.2, as evidenced by the presence of the corresponding pattern (last column of Table 3). The analysis time varies from 58 seconds (mcf) to 142 hours (xalancbmk). The analysis time depends not only on the binary size but also on the complexity of the program logic, more specifically, the number of branches. Our repair works on the assembly code and completes within 15 seconds for these benchmarks.

We evaluate the runtime overhead due to fence insertion by executing each modified program ten times and reporting the average values. Figure 2 shows the normalized execution time. The average performance overhead is 442.2% when fences are inserted naively at both destinations of all conditional branches. This is the safest strategy in the absence of an accurate program analyzer such as oo7. In contrast, oo7 inserts fences only at the detected conditional branches covering Spectre Variants 1, 1.1, 1.2, and incurs only 4.46% overhead on average.
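The relation between normalized execution time and percentage overhead can be written down as a short sketch. It assumes the overhead of a benchmark is computed from the mean of its repeated runs, which the text suggests but does not spell out; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n run times (e.g., in seconds). */
double mean_time(const double *runs, size_t n) {
    assert(runs != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) sum += runs[i];
    return sum / (double)n;
}

/* Normalized execution time: fenced mean divided by baseline mean. */
double normalized_time(const double *fenced, const double *base, size_t n) {
    return mean_time(fenced, n) / mean_time(base, n);
}

/* Overhead in percent: 100 * (normalized time - 1). */
double overhead_percent(const double *fenced, const double *base, size_t n) {
    return 100.0 * (normalized_time(fenced, base, n) - 1.0);
}
```

Under this definition, a normalized execution time of 1.0446 corresponds to the reported 4.46% overhead, and a normalized time of 5.422 to 442.2%.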

5.3. Evaluation on various software projects

| Project | Project description | # of programs | Avg. binary size | # Vulnerable programs (DD) | # Vulnerable programs (PD) | Avg. analysis time in h (DD) | Avg. analysis time in h (PD) |
|---|---|---|---|---|---|---|---|
| samba | SMB/CIFS networking protocol | 230 | 124.0KB | 50 | 83 | 0.24 | 0.68 |
| coreutils | GNU OS file, shell and text manipulation utilities | 114 | 125.3KB | 71 | 81 | 0.32 | 0.92 |
| cups | Common UNIX Printing System | 52 | 134.3KB | 29 | 46 | 0.18 | 0.29 |
| freeradius | Popular open-source RADIUS server | 47 | 49.9KB | 10 | 24 | 0.19 | 0.42 |
| openldap | Lightweight Directory Access Protocol | 31 | 1.3MB | 10 | 12 | 3.6 | 5.2 |
| openssh | Network utilities based on SSH protocol | 11 | 791.9KB | 3 | 6 | 2.5 | 4.7 |
| xrdp | Remote desktop protocol (rdp) server | 10 | 107.3KB | 0 | 0 | 0.14 | 0.62 |
| ppp | PPP daemon and associated utilities | 4 | 322.0KB | 1 | 2 | 0.68 | 1.3 |
| dropbear | Small SSH server and client | 4 | 1.2MB | 2 | 2 | 5.6 | 13.2 |
| botan | A cryptography library | 2 | 13.9MB | 0 | 1 | 28 | 74 |
| netdata | Distributed real-time performance monitoring | 2 | 1.9MB | 2 | 2 | 4.3 | 7.9 |
| wget | Content retrieval from web servers | 1 | 937.2KB | 1 | 1 | 2.8 | 4.5 |
| darknet | Convolutional Neural Networks | 1 | 663.9KB | 1 | 1 | 2.2 | 6.5 |
| Total | - | 509 | - | 180 | 261 | - | - |

Table 4. Software projects used from GitHub and OSS-Fuzz and the detected Spectre v1 vulnerabilities. DD=Data-dependence, PD=Program-dependence (data & control dependence).

In our experiments with the SPECint benchmarks, we observe that the detection of Spectre Variant 1 takes the longest time. We therefore apply oo7 to 509 binaries (Table 4) to detect potential Spectre variant 1 code snippets and thereby evaluate the scalability of our analysis. The column "# of vulnerable programs" in Table 4 shows the number of program binaries in each project with potential vulnerabilities. We identify a program as vulnerable if it has at least one ⟨TB, RS⟩ pattern in the code that can potentially be exploited by the attacker to read secret data, i.e., we conservatively assume the strictest security requirement. As mentioned earlier, the subsequent mechanism to leak the secret data can vary, the most common being the cache side channel with the ⟨TB, RS, LS⟩ code pattern. For each project, we report the number of vulnerable programs under two different taint propagation strategies: data dependencies and program (data & control) dependencies. For example, in project samba, out of a total of 230 programs, oo7 detects 50 and 83 binaries as potential victims under data- and program-dependence taint propagation, respectively. Program-dependence based taint propagation identifies additional vulnerable code fragments compared to data dependence only. Altogether, 180 (or 261) out of 509 programs are labeled as potential victims by oo7 under data- (or program-) dependence taint propagation. Table 4 also shows the analysis time in hours for detecting ⟨TB, RS⟩ patterns using data dependencies and program dependencies.

Figure 3. Potential Spectre vulnerability in trie.c within project freeradius. argv is tainted from a taint source. The ⟨TB, RS, LS⟩ triplet is highlighted.
Potential Spectre Vulnerability in Large-scale Code

We show only one example of a Spectre vulnerability unearthed by oo7 in Figure 3. This code snippet is identified by oo7 in a program (src/lib/util/trie.c) within the project freeradius. Note that oo7 identifies the vulnerability at the binary level. For the sake of exposition and brevity, we show only the portions corresponding to the Spectre vulnerability pattern at the source code level. As comments, we highlight the code fragments detected as TB, RS, and LS. The argument argv is a tainted array read from an external file through the taint source gets(). The conditional check in the function command_lcp is therefore a tainted branch (TB). Taint is propagated to the function fr_trie_path_lcp via the parameter keylen2. Consequently, the array load lcp_end_bit may use the value of e2 (potentially controlled by the attacker through argv[3]) during speculative execution. This speculative execution may take place due to the misprediction of the conditional branch in command_lcp that implements a bounds check. Finally, the array access xor2lcp may reveal information beyond the boundary of array lcp_end_bit[] via a cache side channel. Although oo7 finds the ⟨TB, RS, LS⟩ pattern in the wild, the vulnerable code fragment is executed only once at run-time, making it impossible for the adversary to poison the branch and launch an attack. The distance between TB and RS is 145 x86 instructions. The example illustrates that a Spectre vulnerability in real-world code may span multiple functions, requiring inter-procedural program analysis.

Figure 4. Cumulative distribution of distance (#instructions) between TB and RS for binaries in Table 4 under data- and program dependence based taint propagation.
Sensitivity to Speculative Execution Window size

We set the speculative execution window size to twice the effective ROB size of our platform. This is a conservative assumption that accounts for micro-operation fusion. We investigate the sensitivity of our analysis to this value. Figure 4 shows the distance in instructions between TB and RS for vulnerable code fragments across all 509 binaries. The results show that 83% and 79% of the tainted memory accesses (RS) occur within 100 instructions of the tainted branch (TB) for data- and program-dependence based taint propagation, respectively.
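The window check and one point of the cumulative distribution in Figure 4 can be sketched as follows. The constant assumes the window equals twice the 224-entry ROB of the evaluation platform; the inclusive comparison at the window boundary and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Twice the 224-entry ROB of the evaluation platform (assumed window). */
enum { ROB_ENTRIES = 224, SPEC_WINDOW = 2 * ROB_ENTRIES };

/* A tainted access is reported only if it lies within the window,
 * with the distance counted in instructions from the tainted branch. */
int within_window(unsigned distance, unsigned window) {
    return distance <= window;
}

/* Fraction of TB-to-RS distances that are <= limit: one point of the CDF. */
double cdf_at(const unsigned *dist, size_t n, unsigned limit) {
    assert(dist != NULL && n > 0);
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (dist[i] <= limit) hits++;
    return (double)hits / (double)n;
}
```

For instance, the freeradius example with a TB-to-RS distance of 145 instructions falls inside the 448-entry window, while it would be missed with a window of 100.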

Figure 5. Cumulative distribution of analysis time for binaries in Table 4 under data- and program dependence based taint propagation.
Analysis and Repair Time of oo7

The analysis time depends on the size and complexity of the binary. Figure 5 shows the distribution of analysis time across all the binaries. Under data-dependence based taint propagation, the analysis time is less than 20 minutes for 78% of the binaries. Program-dependence based taint propagation increases the analysis time; still the analysis completes within 20 minutes for 55% of the binaries. The repair time is minimal; for all of the 509 binaries, it is within seconds.

Quantitative Analysis of Vulnerabilities

Our analysis shows that on average only 7.3% (variance 0.3%) of conditional branches are tainted across the 238 programs with at least one tainted branch. Moreover, 271 out of 509 binaries do not have any tainted branch at all. Next we check the percentage of conditional branches in each program binary that are tainted (TB) and are followed by a tainted memory access (RS) within the speculative execution window. If we want to ensure strict security requirements, then an lfence instruction should be inserted after all these tainted branches. On average, our analysis shows that only 3.5% (variance 0.3%) of conditional branches satisfy this criterion, leading to very low overhead in fixing the Spectre vulnerability. Finally, we check the percentage of conditional branches in each program binary that are tainted (TB), are followed by a tainted memory access (RS) within the speculative execution window, and are followed by a subsequent tainted memory access (LS) that leaks the data to the cache. This is denoted as TB+RS+LS. If we assume a cache side-channel attack as the only mechanism to leak the secret brought into the cache, then oo7 only needs to add an lfence instruction after these branches. On average, only 2.3% of conditional branches (variance 0.1%) satisfy this criterion. This strongly indicates that the performance overhead from inserting the fences suggested by our technique will be low. However, we could not collect the exact performance overhead for all of these 509 binaries, since doing so would involve running each binary against many inputs and averaging the performance overhead across inputs. Furthermore, for some of the binaries, such as those in coreutils, a large set of inputs (each input being a command) is possible. For this reason, we computed performance overheads on the SPEC2006 benchmarks instead, which form a standard benchmark suite with inputs specified for performance analysis.
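The two fencing criteria discussed above can be contrasted in a small sketch. The struct fields stand for the per-branch facts that the analysis would produce; the type and function names are hypothetical and not part of oo7.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-branch analysis facts (illustrative). */
typedef struct {
    bool tainted;       /* TB: branch condition is attacker-controlled    */
    bool rs_in_window;  /* RS: tainted access within speculative window   */
    bool ls_follows;    /* LS: later tainted access that leaks via cache  */
} cond_branch;

/* STRICT fences every TB+RS; CACHE_ONLY fences only TB+RS+LS. */
typedef enum { POLICY_STRICT, POLICY_CACHE_ONLY } fence_policy;

bool needs_fence(const cond_branch *b, fence_policy p) {
    assert(b != NULL);
    if (!b->tainted || !b->rs_in_window) return false;
    return p == POLICY_STRICT || b->ls_follows;
}

/* Number of fences a policy would insert over n conditional branches. */
size_t count_fences(const cond_branch *bs, size_t n, fence_policy p) {
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        if (needs_fence(&bs[i], p)) k++;
    return k;
}
```

The strict policy always inserts at least as many fences as the cache-only policy, which matches the reported averages of 3.5% versus 2.3% of conditional branches.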

6. Limitations of our approach

oo7 demonstrates the possibility of low-overhead Spectre mitigation by judiciously inserting fences while still ensuring safety. Still, oo7 has its limitations.

oo7 relies on BAP, which in turn incorporates a taint analysis engine. The taint analysis statically interprets the code by unrolling loops up to a certain depth. To ensure that our approach does not introduce false negatives, we need to pay attention to the following three issues.

  • Optimistic loop unrolling may introduce false negatives (missing vulnerabilities) in oo7. However, with correct or worst-case loop bounds supplied to BAP, such a limitation can be mitigated.

  • Secondly, taint sources are provided to the taint analysis engine, and if taint sources are under-specified then the taint analysis may not identify all the branches that can be controlled by the attacker. We conservatively assume all user inputs via console, file, and network as taint sources to avoid this problem.

  • Finally, the completeness of the control-flow extraction also plays a role to decide whether our analysis will introduce false negatives. If the branch targets of register indirect jumps are not identified, the control flow graph extracted from the binary will not be complete, and as a result the taint analysis results may miss tainted branches. Thus, our approach always depends on the control flow graph being as complete as possible, in trying to ensure that we do not have false negatives in our analysis.

oo7 finds tainted memory accesses following a tainted conditional branch within a fixed speculation window. An incorrect setting of this speculation window size may lead to false positives (window size too big) or false negatives (window size too small). We conservatively set the window length to twice the size of the reorder buffer, as explained earlier.

oo7 works on native code for program binaries. We have not investigated Spectre detection on interpreted code.

The capability of taint analysis gives oo7 the flexibility to adapt to various Spectre variants, in addition to detecting malware with potential Meltdown or Foreshadow attacks. We have discussed in detail (Section 2) the classes of Spectre, Meltdown, and Foreshadow variants that we can handle and the ones that we cannot. In addition, since new variants are constantly being found, we could in future face variants that oo7 cannot be adapted to handle.

7. Combatting Meltdown

Meltdown is a recently discovered vulnerability that exploits the side effects of out-of-order execution (Lipp2018meltdown). Meltdown does not rely on software vulnerabilities and can be launched directly from the attacker's code. We can easily adapt the machinery of oo7 to detect Meltdown signatures in potential attack code.

For launching Meltdown, the attacker aims to load a value from a kernel address $a$. This, of course, would result in an exception. However, exploiting out-of-order execution, the value at address $a$ might already be brought into the cache before the exception is raised. This cached value can then be exfiltrated via standard cache side-channel attacks (e.g., using a probe array such as array2 in the Spectre code).

Let us assume that $K$ captures the set of sensitive addresses that the attacker does not have permission to access. For example, $K$ might capture the set of kernel memory addresses. A Meltdown signature is detected if a load instruction $LD$ points to an address in $K$ and a memory access $MA$ is data-dependent on $LD$. Specifically, the detection of Meltdown is captured via the condition

$$\mathit{Addr}(LD) \cap K \neq \emptyset \;\wedge\; \mathit{Dep}(LD, MA) \qquad (4)$$

where $\mathit{Addr}(LD)$ captures the set of addresses accessed by load instruction $LD$, and $\mathit{Dep}(LD, MA)$ holds when $MA$ is data-dependent on $LD$. We use value set analysis (shoshitaishvili2016state) to detect the set of values/addresses accessed by an instruction. This computes a sound over-approximation of $\mathit{Addr}(LD)$. If any kernel address belongs to the set $\mathit{Addr}(LD)$, then the program dependency graph is checked to discover an $MA$ dependent on $LD$.

We note that the original Meltdown code involves a flush instruction (to flush the probe array from the cache) and the RDTSC instruction (to time accesses to the probe array). Detecting the presence of these instructions is straightforward, and we omit it from Equation 4 for brevity.
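The detection condition (4) can be pictured with a simplified check. The sketch below approximates a value set by a single closed address interval, which is coarser than a real value set analysis, and it takes the data-dependence fact as a precomputed flag. All type and function names are hypothetical, and the sensitive range is supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Closed interval of addresses; a coarse stand-in for a value set. */
typedef struct { uint64_t lo, hi; } addr_range;

typedef struct {
    addr_range may_access;      /* over-approximated addresses of the load  */
    bool has_dependent_access;  /* some memory access is data-dependent     */
} load_insn;

/* True if the two closed intervals share at least one address. */
bool ranges_overlap(addr_range a, addr_range b) {
    assert(a.lo <= a.hi && b.lo <= b.hi);
    return a.lo <= b.hi && b.lo <= a.hi;
}

/* Signature: the load may touch a sensitive address AND a later memory
 * access depends on the loaded value. */
bool meltdown_signature(const load_insn *ld, addr_range sensitive) {
    return ranges_overlap(ld->may_access, sensitive)
        && ld->has_dependent_access;
}
```

Because the address set is an over-approximation, such a check can flag loads that never touch a sensitive address at run time, but it does not miss loads that do.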

8. Discussion

We have built oo7 for detecting Spectre vulnerabilities in binary code and protecting against the attack with minimal overhead. Our approach is employed post-compilation on native code to take into account all the compiler optimizations. No change to the operating system or the processor is needed, as the approach proceeds by program analysis. We demonstrate that systematic analysis is useful both for detecting Spectre vulnerabilities and for repairing them with minimal performance overhead. Our work shows the importance of exposing selected micro-architectural information for enhanced application security, and can help strengthen the dialogue between the architecture and program analysis communities. Our work also provides an understanding of the classes of Spectre attacks for which an analysis-based mitigation may be suitable, and of those for which a system-level solution is suitable.

To detect Spectre vulnerabilities in the wild and to promote further research in the area, we have made our Spectre vulnerability detection code accessible at https://oo7.comp.nus.edu.sg/

Acknowledgments

This research is supported in part by the National Research Foundation, Prime Minister’s Office, Singapore under its National Cybersecurity R&D Program (Award No. NRF2014NCR-NCR001-21) and administered by the National Cybersecurity R&D Directorate. The research is also partially supported by SUTD research grant no. SRIS17123.

References