Processor Hardware Security Vulnerabilities and their Detection by Unique Program Execution Checking

12/05/2018
by Mohammad Rahmani Fadiheh, et al.

Recent discovery of security attacks in advanced processors, known as Spectre and Meltdown, has resulted in high public alertness about security of hardware. The root cause of these attacks is information leakage across "covert channels" that reveal secret data without any explicit information flow between the secret and the attacker. Many sources believe that such covert channels are intrinsic to highly advanced processor architectures based on speculation and out-of-order execution, suggesting that such security risks can be avoided by staying away from high-end processors. This paper, however, shows that the problem is of wider scope: we present new classes of covert channel attacks which are possible in average-complexity processors with in-order pipelining, as they are mainstream in applications ranging from Internet-of-Things to Autonomous Systems. We present a new approach as a foundation for remedy against covert channels: while all previous attacks were found by clever thinking of human attackers, this paper presents an automated and exhaustive method called "Unique Program Execution Checking" which detects and locates vulnerabilities to covert channels systematically, including those to covert channels unknown so far.



I Introduction

Subtle behaviors of microprocessors, notable at the level of their hardware (HW) implementation in digital logic, are the root cause of security breaches demonstrated in the Spectre [1] and Meltdown [2] attacks. In design flows commonly used in industry, the digital designer describes the logic behavior of the processor in a clock cycle-accurate way and defines for each instruction the elementary steps of its execution based on the processor’s registers. Such design descriptions at the “Register-Transfer Level (RTL)” are often referred to as the microarchitecture of a processor. Numerous degrees of freedom exist for the designer in choosing an appropriate microarchitecture for a specification of the processor given at the level of its instruction set architecture (ISA).

However, the same degrees of freedom that the designer uses for optimizing a processor design may also lead to microarchitectural side effects that can be exploited in security attacks. In fact, it is possible that, depending on the data that is processed, one and the same program may behave slightly differently in terms of what data is stored in which registers and at which time points. These differences only affect detailed timing at the microarchitectural level and have no impact on the correct functioning of the program at the ISA level, as seen by the programmer. However, if these subtle alterations of program execution at the microarchitectural level can be caused by secret data, this may open a “side channel”. An attacker may trigger and observe these alterations to infer secret information.

In microarchitectural side channel attacks, the possible leakage of secret information is based on some microarchitectural resource which creates an information channel between different software (SW) processes that share this resource. For example, the cache can be such a shared resource, and various attacking schemes have been reported which can deduce critical information from the footprint of an encryption software on the cache [3, 4, 5, 6]. Also other shared resources can be (mis-)used as channel in side-channel attacks, as has been shown for DRAMs [7] and other shared functional units [8].

In these scenarios, the attacker process alone is not capable of controlling both ends of a side channel. In order to steal secret information, it must interact with another process initiated by the system, the “victim process”, which manipulates the secret. This condition for an attack actually allows for remedies at the SW level which are typically applied to security-critical SW components like encryption algorithms. Common measures include constant time encryption [9] and cache access pattern obfuscation [10]. They prohibit the information flow at one end of the channel which is owned by the victim process.

This general picture was extended by the demonstration of the Spectre [1] and Meltdown [2] attacks. They constitute a new class of microarchitectural side channel attacks which are based on so called “covert channels”. These are special cases of microarchitectural side channels in which the attacker controls both ends of the channel, the part that triggers the side effect and the part that observes it. In this scenario, a single user-level attacker program can establish a microarchitectural side channel that can leak the secret although it is not manipulated by any other program. Such HW covert channels not only can corrupt the usefulness of encryption and secure authentication schemes, but can steal data essentially anywhere in the system.

This paper presents new covert channels in average complexity processors that can have severe implications for a wide range of applications in Embedded Systems and Internet-of-Things (IoT) where simple in-order processors are commonly used. Our results show that HW vulnerabilities by covert channels are not only a consequence of early architectural decisions on the features of a processor, such as out-of-order execution or speculative execution. In fact, vulnerabilities can also be introduced at a later design stage in the course of microarchitectural optimizations, targeting speed and energy, for example.

Clearly, it cannot be expected from a designer to anticipate all “clever thinking” of potential attackers who attempt to create covert channels. Therefore, this paper is dedicated to presenting a new technique which automatically detects all microarchitectural side effects and points the designer to the HW components that may be involved in the possible creation of a covert channel.

Fig. 1: In-order pipeline vulnerability: the design decision whether or not a cancellation mechanism is implemented for the transaction between memory and cache affects neither functional design correctness nor the correct implementation of the memory protection. However, it may insert a microarchitectural side effect that can form the basis for a covert channel-based attack.

Fig. 1 illustrates how a HW vulnerability may be created by certain design decisions even in a simple in-order pipeline. Consider two instructions as they are executed by a 5-stage pipeline, and let us assume that register R1 contains the address of some secret data that should not be accessible for a user-level program. (Note that the address itself can be routinely processed by any user program.) Let us consider the routine situation where the cache holds a copy of the secret data from an earlier execution of operating system-level code. The first instruction, as it enters the memory (M) stage of the pipeline, makes a READ request to the protected address, attempting to read the secret data. The second instruction at the same time enters the execute (EX) stage of the pipeline and uses the secret data in an address calculation. For performance reasons, a cached copy of the requested secret data may be forwarded directly to the EX stage. This is a common and correct microarchitectural forwarding [11] feature and, by itself, doesn’t cause a problem if the memory protection scheme in place makes sure that neither instruction #1 nor any of the subsequent instructions in the pipeline can complete. We assume that such a memory protection scheme is correctly implemented in our scenario.

In the next clock cycle, as shown in the two bottom parts of Fig. 1, instruction #1 enters the write-back (WB) stage and instruction #2 enters the M stage of the pipeline. Concurrently, the memory protection unit has identified in this clock cycle that instruction #1 attempts access to a protected location. This raises an exception for illegal memory access and causes the pipeline to be flushed. From the perspective of the programmer at the ISA level the shown instructions are ignored by the system. In place of these instructions a jump to the exception handler of the operating system (OS) is executed.

However, depending on certain design decisions, the instruction sequence may have a side effect. Instruction #2 triggers a READ request on the cache. Let us assume that this request is a cache miss, causing the cache controller to fill the corresponding cache line from main memory. Even though the READ request by the pipeline is never fulfilled because it is canceled when the exception flushes the pipeline, it may have measurable consequences in the states of the microarchitecture. In the scenario on the left (“vulnerable design”) the state of the cache is changed because the cache line is updated.

Such a cache footprint dependent on secret data is the basis for the Meltdown attack. It can be exploited as detailed in [2]. However, in Meltdown this critical step to open a covert channel is only accomplished by exploiting the out-of-order execution of instructions. In our example, however, the behavior of the cache-to-memory interface may cause the security weakness. Not only the core-to-cache transaction but also the cache-to-memory transaction must be canceled in case of an exception to avoid any observable side effect (“secure design” variant on the right). This is not at all obvious to a designer. There can be various reasons why designers may choose not to implement cancellation of cache updates. First of all, both design variants are functionally correct. Moreover, the vulnerable design prohibits access to protected memory regions. Implementing the cancellation feature incurs additional cost in terms of design and verification efforts, hardware overhead, power consumption, etc. There is no good reason for implementing the feature unless the designer is aware of the vulnerability.
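To make the two design variants of Fig. 1 concrete, the following Python sketch models them side by side. It is a toy illustration only (class and parameter names are invented here, and the 8-bit line index anticipates the simplified cache organization of Sec. III); it does not reproduce the actual RTL behavior of any particular processor.

    # Toy model (hypothetical, not actual RTL): a cache that either cancels or
    # completes a pending line fill when an exception flushes the pipeline.
    class ToyCache:
        def __init__(self, cancel_fill_on_flush):
            self.lines = {}                          # line index -> cached value
            self.cancel_fill_on_flush = cancel_fill_on_flush

        def read(self, index, main_memory, flushed):
            if index in self.lines:                  # cache hit
                return self.lines[index]
            if flushed and self.cancel_fill_on_flush:
                return None                          # secure: fill canceled, no footprint
            self.lines[index] = main_memory[index]   # vulnerable: footprint remains
            return None                              # request itself never completes

    def cache_footprint(secret, cancel_fill_on_flush):
        main_memory = {i: 0 for i in range(256)}
        cache = ToyCache(cancel_fill_on_flush)
        # Instruction #2 reads an address derived from the secret while the
        # exception raised for instruction #1 is in flight (flushed=True).
        cache.read(secret & 0xFF, main_memory, flushed=True)
        return set(cache.lines)                      # observable cache state

    print(cache_footprint(0x42, cancel_fill_on_flush=False))  # {66}: depends on secret
    print(cache_footprint(0x42, cancel_fill_on_flush=True))   # set(): independent of secret

In both variants the architectural outcome is identical (the exception is taken and no register receives the secret), but only the variant that cancels the fill leaves a cache state independent of the secret.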

This example shows that a Meltdown-style attack can be based on even subtler side effects than those resulting from out-of-order execution. This not only opens new possibilities for known Meltdown-style attacks in processors with in-order pipelines. These subtle side effects can also form the basis for new types of covert channel-based attacks which have been unknown so far (as demonstrated in Sec. III).

The key contributions of this paper are summarized as follows.

  • This paper, for the first time, presents the experimental evidence that new kinds of covert channel attacks are also possible in simple in-order processors. We present the Orc attack, which uses a so far unknown type of covert channel.

  • We present a method for security analysis by Unique Program Execution Checking (UPEC). UPEC employs a rigorous mathematical (“formal”) analysis on the microarchitectural level (RTL). By employing the proposed UPEC methodology the designer can precisely assess during design time to what extent hardware security is affected by the detailed design decisions.

  • Based on UPEC, for the first time, covert channels can be detected by a systematic and largely automated analysis rather than only by anticipating the clever thinking of a possible attacker. UPEC can even detect previously unknown HW vulnerabilities, as demonstrated by the discovery of the Orc attack in our experiments.

II Related Work

Information flow tracking (IFT) has been widely used in the field of security for HW/SW systems. Its core idea is to enhance the hardware and/or the software in a way that the flow of information is explicitly visible. Different techniques have been proposed in order to instrument the processor with additional components enabling it to monitor information flows and possibly block the illegal flows at run time [12, 13]. Also CC-hunter [14] targets illegal information flows at run time. It instruments a processor with a special module called covert channel auditor which uncovers covert channel communications between a trojan and a spy process through detecting specific patterns of conflicting accesses to a shared resource [14]. All these methods incur high overhead on the design and demand modifications at different levels of the system, such as in the instruction set architecture (ISA), the operating system (OS) and the HW implementation. CC-hunter also requires defining the conflict indicator event for each resource which can open a covert channel. This demands a priori knowledge about possible covert channels. In order to capture and remove timing-based side channels in the design, the gate-level netlist can be instrumented with IFT capabilities [15]. Such a gate-level IFT method is meant to be applied to selected modules such as a crypto-core or a bus controller. It faces complexity problems when addressing security issues of the larger system. Moreover, since the underlying analysis is based on simulation, relevant corner cases may be missed.

Ardeshiricham et al. [16] developed an RTL transformation framework, called RTLIFT, for enhancing an existing RTL model with information flow tracking (IFT). The method is mainly based on gate-level IFT [13] and has been extended by the Clepsydra approach [17] to covering information flow through timing channels. However, any approach based on inserting constant-time operations into the design, due to the induced overhead, cannot provide a universal solution to the existence of microarchitectural attacks, such as Meltdown [6].

Software security has been seen as a crucial factor in overall system security and security specification and verification have long been studied by the software community [18, 19]. One of the main challenges in this field is the inability of classical property languages such as CTL to specify the needed security requirements [20]. Hyperproperties and also HyperLTL and HyperCTL* have been proposed for specifying and verifying security requirements of software [20, 21]. Hyperproperties provide the possibility of effectively specifying general security requirements such as non-interference and observational determinism [20]. Detecting security vulnerabilities and verifying information flow in software are also targeted by the methods of taint analysis and symbolic execution, as surveyed in [22].

These works have proven to be effective in finding vulnerabilities in software. However, the security threats in HW/SW systems are not limited to the software alone. Vulnerabilities can emerge from HW/SW interaction or from the hardware itself. Also, certain SW vulnerabilities are exploitable through the existence of microarchitectural features (e.g., cache side channel attacks). This category of security threats may only be visible at a specific level of hardware abstraction. Although the security requirements covering such issues are related to the ones addressed in SW verification, the SW security notions cannot be easily adopted in the domain of hardware and HW-dependent software. Open questions such as how to model hardware and the information flow in hardware, how to formulate security requirements for hardware and how to handle the computational complexity pose significant challenges. Especially since Spectre and Meltdown have created awareness for these questions, new and intensive research activities have been sparked in the domain of hardware security.

Instruction level abstraction (ILA) is an approach to create a HW-dependent model of the software and formally verify security requirements [23]. This method has the advantage of capturing vulnerabilities which are not visible in the software/firmware itself, but rather manifest themselves through the communication between software and hardware. However, HW security vulnerabilities in the RTL implementation of the hardware cannot be detected by this approach.

The use of formal techniques for security verification in hardware was pioneered in [24, 25, 26] by adopting the idea of taint analysis which originally comes from the software domain. This is the research most closely related to our work. In those approaches a HW security requirement such as a specific confidentiality requirement is formulated in terms of a taint property [24] along a certain path in the design. The property fails if the taint can propagate, i.e., the information can flow from the source of the path to its destination, while conforming to a propagation condition. This is also related to the notion of observational determinism defined in [20] in the sense that a taint property checks whether the observations at the end of the taint path (destination) are functionally independent of values at the source of the path. This was further developed in [25] and a portfolio of different taint properties in the CTL language was created. In order to formulate the properties in CTL, certain assumptions about the attack are required which significantly restrict the coverage of the method.

As an alternative, a miter-based equivalence checking technique, with some resemblance to our computational model in Fig. 3, has been used in previous approaches [26, 27]. Although this increases the generality of the proof, it still restricts the attack to a certain path. Moreover, since some of this work considers verification at the architectural level, vulnerabilities based on microarchitectural side channels are not detectable.

Taint analysis by these approaches has shown promise for formal verification of certain problems in HW security, for example, for proving key secrecy in an SoC. It also proved useful for security analysis in abstract system models. Note, however, that all these approaches mitigate the complexity of formal security analysis by decomposing the security requirements into properties along selected paths. This demands making assumptions about what paths are suspicious and requires some “clever thinking” along the lines of a possible attacker. As a result, non-obvious or unprecedented side channels may be missed.

Any leakage determined by a taint property will also be found using UPEC because any counterexample to the taint property is also a counterexample to the UPEC property of Eq. (1). The important advantage of UPEC is, however, that it does not need any specification of the expected leakage path.

Micro-architectural side channel attacks are a security threat that cannot be discovered by looking only at software and verifying hyperproperties or taint properties. In order to secure the hardware against side-channel attacks at design time, the notion of non-interference has been adopted in [28, 29] to prove the absence of side channels in a hardware design. Non-interference in hardware is a strong and also conservative security requirement. However, the amount of incurred overhead is usually prohibitive in many hardware design scenarios. Designers usually try to quantitatively evaluate the existence and leakiness of side channels to make a leak/risk tradeoff [30]. Such quantitative approaches usually either use simulation-based methods to collect and analyze a set of execution traces [31] or estimate leaks through mathematical modeling [32].

The whole picture has been significantly altered after the emergence of Meltdown and Spectre, which proved the existence of a new class of side-channel attacks that do not rely on a valid execution of victim software and can be carried out even if there is no vulnerability in the deployed software. The new findings prove the inadequacy of existing verification techniques and call for new methods capable of addressing the new issues. While software verification methods are not capable of finding covert channel attacks because they abstract away the whole microarchitecture, hardware taint property techniques also fell short of capturing these vulnerabilities due to path limitation, as mentioned above.

This paper addresses the issue of covert channel attack vulnerabilities by proving unique program execution. UPEC defines the requirement for security against covert channel attacks in microarchitecture designs at the RTL (in which most of the related vulnerabilities appear) and provides a methodology to make proofs feasible for mid-size processors. Unlike the above techniques for software verification, which try to verify a given program, UPEC models the software symbolically so that it exhaustively searches for a program exposing a HW security vulnerability, or proves that no such program exists.

Language-based security is another line of research which advocates the use of more expressive security-driven hardware description languages. SecVerilog [33] extends the Verilog language with a security type system. The designer needs to label storage elements with security types, which enables enforcing information flow properties. Although using Verilog as the base of the language eases the adoption of the method, the labeling process is complicated and the designer may need to relabel the design in order to verify different security properties. All in all, system-wide labeling of an RTL design with thousands of state bits usually is not a trivial task.

Program synthesis techniques have been proposed to target HW security vulnerabilities related to Meltdown, Spectre and their variants [34]. With the help of these techniques, program tests capable of producing a specific execution pattern for a known attack can be generated automatically. In order to evaluate security for a certain microarchitecture against the attack, the user needs to provide a formal description of the microarchitecture and the execution pattern associated with the attack. The synthesis tool can then synthesize the attacker program which can be later used to test the security of the computing system against the considered attack. Although the method automates the generation of an attacker program, the execution pattern of the considered attack must be specified by the user, i.e., the user still needs to develop the attack in an abstract way by reasoning along the lines of a possible attacker. Furthermore, the coverage of the generated tests is restricted to known attacks and vulnerabilities.

III Orc: A New Kind of Covert Channel Attack

The term Read-After-Write (RAW) Hazard denotes a well-known design complication resulting directly from pipelining certain operations in digital hardware [11]. The hazard occurs when some resource is to be read after it is written causing a possible stall in the pipeline. RAW hazards exist not only in processor execution pipelines but also elsewhere, e.g., in the core-to-cache interface.

For reasons of performance, many cache designs employ a pipelined structure which allows the cache to receive new requests while still processing previous ones. This is particularly beneficial for the case of store instructions, since the core does not need to wait until the cache write transaction has completed. However, this can create a RAW hazard in the cache pipeline, if a load instruction tries to access an address for which there is a pending write.

A RAW hazard needs to be properly handled in order to ensure that the correct values are read. A straightforward implementation uses a dedicated hazard detection unit that checks for every read request whether or not there is a pending write request to the same cache line. If so, the read request is not accepted until the pending write has completed. The processor pipeline is stalled and repeatedly sends read requests until the cache interface accepts them.
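As a minimal sketch of this hazard-detection rule (in Python, with invented names; the 8-bit line index matches the simplified cache organization assumed later in this section):

    # Hazard detection sketch: stall a read if any pending write targets the
    # same cache line. Assumes the lower 8 bits of an address select the line.
    def cache_line(addr):
        return addr & 0xFF

    def raw_hazard(read_addr, pending_writes):
        return any(cache_line(read_addr) == cache_line(w) for w in pending_writes)

    pending = [0x1234]                       # store accepted, write still pending
    assert raw_hazard(0xAB34, pending)       # same line (0x34): core is stalled
    assert not raw_hazard(0xAB35, pending)   # different line: no stall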

In the following, we show an example how such a cache structure can create a security vulnerability allowing an attacker to open a covert channel. Let’s assume we have a computing system with a cache with write-back/write-allocate policy and the RAW hazard resolution just described. In the system, some confidential data (secret data) is stored in a certain protected location (protected address).

For better understanding of the example, let us make some more simplifying assumptions that, however, do not compromise the generality of the described attack mechanism. We assume that the cache holds a valid copy of the secret data (from an earlier execution of privileged code). We also simplify by assuming that each cache line holds a single byte, and that a cache line is selected based on the lower 8 bits of the address of the cached location. Hence, in our example, there are a total of 256 cache lines.

1: li x1, #protected_addr   // x1 ← #protected_addr
2: li x2, #accessible_addr  // x2 ← #accessible_addr
3: addi x2, x2, #test_value // x2 ← x2 + #test_value
4: sw x3, 0(x2)             // mem[x2+0] ← x3
5: lw x4, 0(x1)             // x4 ← mem[x1+0]
6: lw x5, 0(x4)             // x5 ← mem[x4+0]
Fig. 2: Example of an Orc attack: in this RISC-V code snippet, accessible_addr is an address within the accessible range of the attacker process. Its lower 8 bits are zero, and all addresses within a 256-byte offset from it are also accessible. #test_value is a value in the range of 0…255.

The basic mechanism for the Orc attack is the following. Every address in the computing system’s address space is mapped to some cache line. If we use the secret data as an address, then the secret data also points to some cache line. The attacker program “guesses” which cache line the secret data points to. It sets the conditions for a RAW hazard in the pipelined cache interface by writing to the guessed cache line. If the guess was correct then the RAW hazard occurs, leading to slightly longer execution time of the instruction sequence than if the guess was not correct. Instead of guessing, of course, the attacker program iteratively tries all 256 possible cache locations until successful.
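The guessing loop can be summarized by the following Python sketch. Here run_snippet() is a hypothetical primitive that executes the code of Fig. 2 with a given #test_value and returns the measured execution time in cycles (e.g., obtained from a cycle counter); it is an assumption of this sketch, not part of any real attack toolkit.

    # Orc guessing loop (sketch): the guess that provokes the RAW hazard stalls
    # the pipeline, so the longest-running iteration reveals the secret's low byte.
    def recover_low_byte(run_snippet):
        timings = [run_snippet(test_value) for test_value in range(256)]
        return max(range(256), key=lambda v: timings[v])

    # Demo with a fake timing model: the correct guess costs extra stall cycles.
    secret = 0x9C
    fake_run = lambda v: 100 + (8 if v == (secret & 0xFF) else 0)
    assert recover_low_byte(fake_run) == 0x9C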

Fig. 2 shows the instruction sequence for one such iteration. The shown #test_value represents the current guess of the attacker and sets the RAW hazard conditions for the guessed cache line. The sequence attempts an illegal memory access in instruction #5 by trying to load the secret data from the protected address into register x4. The processor correctly intercepts this attempt and raises an exception. Neither is the secret data loaded into x4 nor is instruction #6 executed because the exception transfers control to the operating system with the architectural state of instruction #5. However, before control is actually transferred, instruction #6 has already entered the pipeline and has initiated a cache transaction. The cache transaction has no effect on the architectural state of the processor. But the execution time of the instruction sequence depends on the state of the cache. When probing all values of #test_value, the case will occur where the read request affects the same cache line as the pending write request, thus creating a RAW hazard and a stall in the pipeline. It is this difference in timing that can be exploited as a side channel.

Let us look at the sequence in more detail. The first three instructions are all legal for the user process of the attacker. Instruction #1 makes register x1 a pointer to the secret data. Instruction #2 makes register x2 a pointer to some array in the user space. The address of this array is aligned such that the 8 least significant bits are 0. Instruction #3 adds #test_value as an offset to x2. This value is in the range 0…255.

Instruction #4 is a store of some value x3 into the user array at x2. This is a legal instruction that results in a write request to the cache. Note that the destination cache line is determined by #test_value since the lower 8 bits of the write address are used for cache line addressing (cf. Sec. III). The cache accepts the write request and immediately becomes ready for a new request. The cache controller marks this write transaction as pending until it is complete. (This takes a few clock cycles in case of a cache hit and even significantly longer in case of a miss.)

In the next clock cycle, instruction #5 attempts an illegal load from the secret address, producing a read request to the cache. It will take a small number of clock cycles before the instruction has progressed to the write-back (WB) stage of the processor pipeline. In this stage the exception will be raised and control will be transferred to the OS kernel. Until the instruction reaches the WB stage, however, all components including the cache keep working. Since the cache has a valid copy of the secret data, it instantly answers the read request and returns the secret data to the core where it is stored in some internal buffer (inaccessible to software).

Even though the illegal memory access exception is about to be raised, instruction #6 is already in the pipeline and creates a third request to the cache. This request is created by a forwarding unit using the value stored in the internal buffer. Note that the request is for an address value equal to the secret data. (It does not matter whether an attempt to load from this address is legal or not.)

Let us assume that this address value happens to be mapped to the same cache line as the write request to the user array from instruction #4. This will be a read-after-write (RAW) hazard situation, in case the write request from instruction #4 is still pending. The read transaction must wait for the write transaction to finish. The cache controller stalls the core until the pending write transaction has completed.

In case that the read request affects a different cache line than the pending write request there is no RAW hazard and the processor core is not stalled.

In both cases, the processor will eventually raise the exception and secret data will not appear in any of the program-visible registers. However, the execution time of the instruction sequence differs in the two cases because of the different number of stall cycles. The execution time depends on whether or not a RAW hazard occurs, i.e., whether or not #test_value is equal to the 8 lower bits of the secret data.

Assuming the attacker knows how many clock cycles it takes for the kernel to handle the exception and to yield the control back to the parent process, the attacker can measure the difference in execution time and determine whether the lower 8 bits of the secret are equal to #test_value or not. By repeating the sequence for up to 256 times (in the worst case), the attacker can determine the lower 8 bits of the secret. If the location of the secret data is byte-accessible, the attacker can reveal the complete secret by repeating the attack for each byte of the secret. Hardware performance counters can further ease the attack since they make it possible to explicitly count the number of stalls.

This new covert channel can be illustrated using the example of the RISC-V RocketChip [35]. The original RocketChip design is not vulnerable to the Orc attack. However, with only a slight modification (17 lines of code (LoC) in an RTL design of 250,000 LoC) and without corrupting the functionality, it was possible to insert the vulnerability. The modifications actually optimized the performance of the design by bypassing a buffer in the cache, by which an additional stall between consecutive load instructions with data dependency was removed. There was no need to introduce any new state bits or to change the interface between core and cache. The attack does not require the processor to start from a specific state: any program can precede the code snippet of Fig. 2. The only requirement is that protected_addr and accessible_addr reside in the cache before executing the code in Fig. 2.

The described vulnerability is a very subtle one, compared to Meltdown and Spectre. It is caused by a RAW hazard not in the processor pipeline itself but in its interface to the cache. It is very hard for a designer to anticipate an attack scenario based on this hazard. The timing differences between the scenarios where the RAW hazard is effective and those where it isn’t are small. Nevertheless, they are measurable and can be used to open a covert channel.

This new type of covert channel discovered by UPEC gives some important messages:

  • Subtle design changes in standard RTL processor designs, such as adding or removing a buffer, can open or close a covert channel. This raises the question whether Spectre and Meltdown are only the tip of the iceberg of covert channels existing in today’s designs. Although specific to a particular design, the newly discovered vulnerabilities may inflict serious damage, once such a covert channel becomes known in a specific product.

  • The Orc attack is based on the interface between the core (a simple in-order core in this case) and the cache. This provides the insight that the orchestration of component communication in an SoC, such as RAW hazard handling in the core-to-cache interface, may also open or close covert/side channels. Considering the complex details of interface protocols and their implementation in modern SoCs, this can further complicate verifying security of the design against covert channel attacks.

  • The new insight that the existence of covert channels does not rely on certain types of processors but on decisions in the RTL design phase underlines the challenge in capturing such vulnerabilities and calls for methods dealing with the high complexity of RTL models.

  • The presented attack is based on a so far unsuspicious microarchitectural feature as its covert channel. This makes it resistant to most existing techniques of security verification, as discussed in Sec. II. The verification method, therefore, should be exhaustive and must not rely on a priori knowledge about the possible attacks.

These challenges motivate the proposed UPEC approach. It is meant to be used by the designer during the RTL design phase to detect all possible cases of a covert channel.

IV Unique Program Execution Checking (UPEC)

Confidentiality in HW/SW systems requires that an untrusted user process must not be able to read protected secret data. In case of a microarchitectural covert channel attack, the attacker cannot read the secret data directly. Nevertheless, confidentiality is violated because the execution timing of the attacker process depends on the secret data, and the timing information is measurable, e.g., through user-accessible counters. These timing differences may stem from various sources that need to be exhaustively evaluated when verifying confidentiality.

In the following, we refer to the computing system to be analyzed for security as System-on-Chip (SoC) and divide its state variables into two sets: state variables associated with the content of its memory (main memory and memory-mapped periphery) and state variables associated with all other parts of the hardware, the logic parts.

Definition 1 (Microarchitectural State Variables).


The microarchitectural state variables of an SoC are the set of all state variables (registers, buffers, flip-flops) belonging to the logic part of the computing system’s microarchitecture.

A subset of these microarchitectural state variables are program-visible:

Definition 2 (Architectural State Variables).


The architectural state variables of an SoC are the subset of microarchitectural state variables that define the state of program execution at the ISA level (excluding the program state that is represented in the program’s memory).

Definition 3 (Secret Data, Protected Location).

A set of secret data D is the content of memory at a protected location A, i.e., there exists a protection mechanism such that a user-level program cannot access A to read or write D.

The protected location may be in the main memory space, in peripherals or in other types of storage in the non-logic part of the computing system. In addition, there may exist temporary copies of the secret data in the cache system.

Definition 4 (Unique Program Execution).

A program executes uniquely w.r.t. a secret D if and only if the sequence of valuations to the set of architectural state variables is independent of the values of D, in every clock cycle of program execution.

In other words, a user-level program executes uniquely if different secrets in the protected location do not lead to different values of the architectural states or to different time points when these values are assigned.
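Operationally, Def. 4 can be read as a two-run experiment, as in the following Python sketch: execute the same program on two copies of the system that differ only in the secret, and compare the architectural state cycle by cycle. Here step() stands for a hypothetical cycle-accurate model of the SoC; it is an assumption of this sketch, not an artifact of UPEC.

    # Uniqueness check (sketch): a program executes uniquely w.r.t. the secret
    # iff both runs produce identical architectural-state sequences, i.e., the
    # same values at the same clock cycles.
    def executes_uniquely(init_state, step, secret_a, secret_b, cycles):
        s1, s2 = init_state(secret_a), init_state(secret_b)
        for _ in range(cycles):
            s1, arch1 = step(s1)      # step() returns (next state, architectural state)
            s2, arch2 = step(s2)
            if arch1 != arch2:        # value or timing difference: not unique
                return False
        return True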

Definition 5 (Confidentiality/Observability).

A set of secret data D in a protected location A is confidential if and only if any user-level program executes uniquely w.r.t. D. Otherwise D is observable.

Based on this definition, confidentiality is established by proving unique program execution at the RTL. In UPEC this is analyzed by a mathematically rigorous, “formal”, method. The requirement of unique program execution is formalized as a “property” expressed in a property language which is understood by a (commercially available) tool for property checking. Current tools for property checking, however, are built for functional design verification. In order to make property checking applicable to UPEC, we present a tailor-made computational model and formulate a specific property to be proven on this model.

Fig. 3: Computational model for UPEC: two almost identical instances of the computing system are considered. They contain the same data, except for the secret data that may vary between the two instances. The memory block in this figure stands for any resource that may carry secret data including peripheral devices.

Fig. 3 shows the model that is used in our UPEC approach. It can be derived automatically from the RTL description of the design and only requires the user to provide the protected memory region. In this model, SoC_1 and SoC_2 are two identical instances of the logic part of the SoC under verification. Memory_1 and Memory_2, as indicated in the figure, hold the same set of values except for the memory location of a defined secret data.

Based on this model, we propose the UPEC property: For a system to be secure w.r.t. covert channel attacks, the computational model derived from the design’s RTL description has to fulfill the following property expressed in the CTL property language [36]:

AG (secret_data_protected ∧ (micro_soc_state_1 = micro_soc_state_2) → AG (soc_state_1 = soc_state_2))     (1)

In this formulation, micro_soc_state is a vector of all microarchitectural state variables, as defined in Def. 1; soc_state is some vector of state variables which includes, as a subset, all architectural state variables as defined in Def. 2, but not necessarily all other microarchitectural state variables. The constraint secret_data_protected specifies that a protection mechanism in the hardware is enabled for the secret data memory location. The CTL operator AG denotes that the following condition must be fulfilled at all times and for all possible runs of the system (“All paths, Globally”).

The property in Eq. (1) fails if and only if, in the system under verification, there exists a state, soc_state, such that the transition to the next state, soc_state’, depends on the secret data. This covers all situations in which program execution is not unique. Commercial property checking tools are available to check such a property based on standardized languages like SystemVerilog Assertions (SVA) [37]. For reasons of computational complexity, however, standard solutions will fail, so that a method specialized to this problem has been developed, as described in Sec. V.

Importantly, in our methodology we will consider situations where soc_state, besides the architectural state variables of the SoC, includes some or all microarchitectural state variables, such as the pipeline buffers. Producing a unique sequence for a superset of the architectural state variables represents a sufficient but not a necessary condition for unique program execution. This is because secret data may flow to microarchitectural registers which are not observable by the user program, i.e., they do not change program execution at any time, and, hence, no secret data is leaked.

We therefore distinguish the following kinds of counterexamples to the UPEC property:

Definition 6 (L-alert).


A leakage alert (L-alert) is a counterexample leading to a state with soc_state_1 ≠ soc_state_2 where the differing state bits are architectural state variables.

L-alerts indicate that secret data can affect the sequence of architectural states. This reveals a direct propagation of secret data into an architectural register (that would be considered a functional design bug), or a more subtle case of changing the timing and/or the values of the sequence without violating the functional design correctness and without leaking the secret directly. UPEC will detect the HW vulnerability in both cases. While the former case can be covered also by standard methods of functionally verifying security requirements, this is not possible in the latter case. Here, the opportunity for a covert channel attack may be created, as is elaborated for the Orc attack in Sec. III. Since the functional correctness of a design is not violated, such a HW vulnerability will escape all traditional verification methods conducted during the microarchitectural design phase.

Definition 7 (P-alert).

A propagation alert (P-alert) is a counterexample leading to a state with soc_state_1 ≠ soc_state_2 where the differing state bits are microarchitectural state variables that are not architectural state variables.

P-alerts show possible propagation paths of secret data from the cache or memory to program-invisible, internal state variables of the system. A P-alert very often is a precursor to an L-alert, because the secret often traverses internal, program-invisible buffers in the design before it is propagated to an architectural state variable like a register in the register file.

The reason why soc_state in our methodology may also include program-invisible state variables will be further elaborated in the following sections. In principle, our method could be restricted to architectural state variables and L-alerts. P-alerts, however, can be used in our proof method as early indicators for a possible creation of a covert channel. This contributes to mitigating the computational complexity when proving the UPEC property.
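The classification of counterexamples can be stated compactly, as the Python sketch below shows: given the set of state variables that differ between the two SoC instances (the variable names here are invented for illustration), an L-alert is raised as soon as an architectural variable differs, and a P-alert otherwise.

    # Counterexample classification following Defs. 6 and 7.
    def classify_alert(differing_vars, architectural_vars):
        if not differing_vars:
            return None               # states match: no alert
        if differing_vars & architectural_vars:
            return "L-alert"          # secret reached ISA-visible state
        return "P-alert"              # propagation into hidden state only

    assert classify_alert({"regfile.x4"}, {"regfile.x4"}) == "L-alert"
    assert classify_alert({"lsu.fill_buffer"}, {"regfile.x4"}) == "P-alert"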

V UPEC on a bounded model

Proving the property of Eq. (1) by classical unbounded CTL model checking is usually infeasible for SoCs of realistic size. Therefore, we pursue a SAT-based approach based on “any-state proofs” in a bounded circuit model. This variant of Bounded Model Checking (BMC) [38] is called Interval Property Checking (IPC) [39] and is applied to the UPEC problem in a similar way as in [40] for functional processor verification.

For a full proof of the property in Eq. (1) by our bounded approach, in principle, we need to consider a time window as large as the sequential depth, d, of the logic part of the examined SoC. This will be infeasible in most cases. However, employing a symbolic initial state enables the solver to often capture hard-to-detect vulnerabilities within much smaller time windows. A violation of the UPEC property is actually guaranteed to be indicated by a P-alert in only a single clock cycle needed to propagate secret data into some microarchitectural state variable of the logic part of the SoC. In practice, however, it is advisable to choose a time window for the bounded model which is as long as the length, k, of the longest memory transaction. When the secret is in the cache, k is usually the number of clock cycles for a cache read. When the secret is not in the cache, k is the number of clock cycles the memory response takes until the data has entered the cache system, e.g., in a read buffer. This produces P-alerts of higher diagnostic quality and provides a stronger basis for inductive proofs that may be conducted subsequently, as discussed below.

V-A Dealing with spurious counterexamples

In the IPC approach, the symbolic initial state also includes unreachable states of the system. This may result in counterexamples which cannot happen in the normal operation of the system after reset (spurious counterexamples). This problem is usually addressed by strengthening the symbolic initial state with invariants, in order to exclude certain unreachable states. However, developing invariants is mainly done in an ad-hoc way and it is not a trivial task to find sufficient invariants in a large SoC. The UPEC computational model and property formulation, by contrast, make it possible to address the problem of spurious counterexamples in a structured way. Since both SoC instances start with the same initial state, all of the unreachable initial states and spurious behaviors have the same manifestation in both SoC instances and therefore do not violate the uniqueness property, except for the states related to the memory elements holding the secret value. Here, we address the problem by adding three additional constraints on the symbolic initial state.

Constraint 1, “no on-going protected accesses”. Protected memory regions are inaccessible to certain user processes but the OS-level kernel process can freely access these regions. This can create a scenario for confidentiality violation in which the kernel process loads secret data from a protected region to a general-purpose register and then instantly branches into a malicious user-level process. Such an explicit revelation of secret data can be considered a spurious counterexample and excluded from consideration since it cannot happen in an SoC running an operating system within its normal operation. To exclude these trivial leakage scenarios, the proof must be constrained to exclude initial states that implicitly represent ongoing memory transactions in which protected memory regions are accessed. This constraint can be formally expressed by assuming that, at the starting time point, the buffers holding the addresses for ongoing memory transactions do not contain protected addresses. These buffers can be easily identified by inspecting the fan-in of the memory interface address port.

Constraint 2, “cache I/O is valid”. Since the cache can hold a copy of the secret data, an unreachable initial state for the cache controller may lead to a spurious counterexample where the secret data is leaked in an unprivileged access. Deriving invariants for a complex cache controller in order to exclude such spurious behavior requires in-depth knowledge about the design and, also, expertise in formal verification. However, the task can be significantly simplified if we assume that the cache is implemented correctly. This is justifiable because UPEC-based security verification should be carried out on top of conventional functional verification, in order to target vulnerabilities not corrupting functionality but compromising security.

Communication between the cache and the processor and also higher levels of memory typically happens based on a well-defined protocol. Spurious behaviors reveal themselves by violating this protocol. Protocol compliance is usually verified as a part of functional verification (through simulation or formal techniques), and the requirements for such compliance are usually defined by the protocol specification and do not require in-depth knowledge about the implementation. For UPEC, we ensure the protocol compliance and valid I/O behavior of the cache by instrumenting the RTL with a special cache monitor which observes the transactions between the cache and other parts of the system (processor, main memory, etc.) and raises a flag in case of an invalid I/O behavior. Developing hardware monitors is standard practice in verification and does not impose significant overhead in terms of the effort involved.

Constraint 3, “secure system software”. Memory protection mechanisms are meant to protect the content of memory from user software; however, high-privilege software can freely access the contents of memory (including the secret data). Since it is not possible to have the complete OS/kernel in our bounded model (due to complexity), this may lead to trivial false counterexamples in which the system software copies the secret data into an unprotected memory location or architectural state. In order to exclude such trivial cases, a constraint is needed restricting the search to systems with secure system software. The system software is considered secure iff, under any possible input, there is no load instruction accessing secret data, i.e., before any load instruction, there is always an appropriate bounds check. In order to reflect that in the proof, the added constraint specifies that when the processor is in kernel mode, at the ISA level of program execution, there is no load instruction of the system software that accesses the secret. At the microarchitectural level this means that either no load instruction of the system software is executed that accesses the secret, or the execution of such a load instruction is invalid, i.e., the load instruction has been speculatively executed for a mispredicted bounds check and is invalid at the ISA level.

It should be noted that these constraints do not restrict the generality of our proof. They are, actually, invariants of the global system. Their validity follows from the functional correctness of the OS and the SoC.

V-B Mitigating complexity

As discussed in Sec. IV, in our computational model the content of the memory is excluded from the soc_state in the UPEC property (Eq. (1)). Therefore, the content of the memory influences neither the assumption nor the commitment of the property and, thus, can be disregarded by the proof method.

This observation helps us to mitigate the computational proof complexity by black-boxing (i.e., abstracting away) the data fields in the cache tables. This significantly reduces the state space of the model while it does not affect the validity of the proof. It is important to note that the logic parts of the cache, such as tag array, valid bits, cache controller, memory interface, etc., must not be black-boxed since they are part of the microarchitectural state and they play a key role for the security of the system.

In order to ensure that the partly black-boxed cache also conforms with the assumption made about the memory by the UPEC computational model, another constraint needs to be added to the property:

Constraint 4, “equality of non-protected memory”. For any read request to the cache, the same value must be returned by the cache in both instances of the SoC, unless the access is made to the memory location holding the secret data. This constraint is always valid for a cache that is functionally correct and therefore does not restrict the generality of the proof.

V-C UPEC interval property

The interval property for UPEC is shown in Fig. 4. The macro secret_data_protected() denotes that in both SoC instances, a memory protection scheme is enabled in the hardware for the memory region holding the secret data. The macro no_ongoing_protected_access() defines that the buffers in each SoC instance holding the addresses for ongoing transactions do not point to protected locations (Constraint 1). The macro cache_monitor_valid_IO() specifies that in each clock cycle during the time window of the property, the cache I/O behaviors in the two SoC instances comply with Constraints 2 and 4. The last assumption implements Constraint 3.

assume:
  at t: secret_data_protected();
  at t: micro_soc_state_1 = micro_soc_state_2;
  at t: no_ongoing_protected_access();
  during t..t+k: cache_monitor_valid_IO();
  during t..t+k: secure_system_software();
prove:
  at t+k: soc_state_1 = soc_state_2;
Fig. 4: UPEC property (Eq. (1)) formulated as an interval property: it can be proven based on a bounded model of length k using Satisfiability (SAT) solving and related techniques.

VI Methodology

Fig. 5: UPEC Methodology: counterexamples to the UPEC property, P-alerts and L-alerts, denote propagation of secret data into microarchitectural registers. Only L-alerts prove a security violation. P-alerts are only necessary but not sufficient to detect a covert channel. However, they can be computed with less effort. Based on this iterative process the designer can browse through different counterexamples and exploit the trade-off between the computational complexity and the diagnostic expressiveness of counterexamples to prove or disprove security.

Fig. 5 shows the general flow of UPEC-based security analysis of computing systems. Checking the UPEC property (Eq. (1)) is at the core of a systematic, iterative process by which the designer identifies and qualifies possible hardware vulnerabilities in the design. The UPEC property is initialized on a bounded model of length k and with a proof assumption and obligation for the complete set of microarchitectural state variables.

If the UPEC property can be successfully verified, then the design is proven to be free of side effects that can be exploited as covert channels. If the property fails, it produces a counterexample which can be either an L-alert or a P-alert. An L-alert exposes a measurable side effect of the secret data on the architectural state variables, rendering the design insecure. A P-alert documents a side effect of the secret data on microarchitectural state variables that are not directly accessible by the attacker program. In principle, the designer can now remove the affected microarchitectural state variables from the proof obligation of the UPEC property (while keeping the complete set of microarchitectural state variables in the proof assumption), and then re-iterate the process to search for a different counterexample. The process is bound to terminate because, eventually, either an L-alert occurs or the design is secure.
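The iteration just described can be summarized by the following Python sketch, where check_upec() stands for one run of the property checker on the current proof obligation and returns the set of differing state variables (empty if the property holds); both names are placeholders for this illustration.

    # UPEC iteration (sketch): drop P-alerting variables from the proof
    # obligation and re-check, until the property holds or an L-alert occurs.
    def upec_iteration(check_upec, architectural_vars, all_microarch_vars):
        obligation = set(all_microarch_vars)
        while True:
            diff = check_upec(obligation)
            if not diff:
                return "secure within the considered bound"
            if diff & architectural_vars:
                return "L-alert: design insecure"
            obligation -= diff        # P-alert: analyze, then re-iterate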

In practice, however, the designer will not simply eliminate a P-alert but instead will analyze the counterexample. As mentioned before, an L-alert may have one or several shorter P-alerts as precursors. Since the P-alert is a shorter counterexample than the corresponding L-alert it can be computed with less computational effort, including shorter proof times. A P-alert belonging to an L-alert documents the earliest manifestation of a side effect and points the designer to the source of the vulnerability that causes the side effect. If the security compromise is already obvious from the P-alert the designer may abort the iterative process of Fig. 5 by deeming the P-alert as “insecure” and change the RTL in order to remove the vulnerability. This may be as simple as adding or removing a buffer. Note that if the designer wrongly deems a P-alert as “secure” then the security compromise is detected by another P-alert later, or, eventually, by an L-alert. The penalty for making such a mistake is the increase in run times for checking the later UPEC property instances.

If the procedure terminates without producing an L-alert, this is not a complete proof that the design is secure, unless we increment the length of the model to the sequential depth d of the logic part of the SoC. The alternative is to take the P-alerts as starting point for proving security by an inductive proof of the property in Eq. (1) for the special case of an initial state derived from the considered P-alert. A P-alert can be deemed as secure if the designer knows that the values in the affected microarchitectural state variables will not propagate under the conditions under which the P-alert has occurred. In other words, in a secure P-alert, a certain condition holds which implicitly represents the fact that the propagation of secret data to architectural registers will be blocked. In order to conduct the inductive proof the designer must identify these blocking conditions for each P-alert. Based on the UPEC computational model (Fig. 3) the inductive proof checks whether or not the blocking condition always holds for the system once the corresponding P-alert has been reached.

Finally, there is always the conservative choice of making design changes until no P-alerts occur anymore, thus establishing full security of the modified design w.r.t. covert channels.

VII Experiments

Our new insights into covert channels were obtained by implementing UPEC in a prototype verification environment for security analysis. It interfaces with a commercially available property checking tool, OneSpin 360 DV-Verify, which is used to conduct the IPC proofs underlying UPEC.

We explored the effectiveness of UPEC by targeting different design variants of RocketChip [35], an open-source RISC-V [41] SoC generator. The considered RocketChip design is a single-core SoC with an in-order pipelined processor and separate level-1 instruction and data caches.

All results were obtained using the commercial property checker OneSpin 360 DV-Verify on an Intel Core i7-6700 CPU running at 3.4 GHz with 32 GB of RAM.

In order to evaluate the effectiveness of UPEC in capturing vulnerabilities, we targeted the original design of RocketChip as well as two design variants made vulnerable, by only minimal design modifications, to (a) a Meltdown-style attack and (b) an Orc attack. Functional correctness was not affected, and the modified designs successfully passed all tests provided by the RISC-V framework. UPEC captured all vulnerabilities. In addition, UPEC found an ISA incompliance in the Physical Memory Protection (PMP) unit of the original design.

For the Meltdown-style attack we modified the design such that a cache line refill is not canceled in case of an invalid access. The illegal access itself does not succeed and raises an exception; nevertheless, the cache content is modified and can be analyzed by an attacker. We call this a Meltdown-style attack since the instruction sequence for carrying out the attack is similar to the one reported in [2]. Note, however, that in contrast to previous reports we create the covert channel based on an in-order pipeline.
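For illustration, the following C fragment sketches the attacker's sequence on the vulnerable design, following the pattern of [2]. It assumes a bare-metal environment whose trap handler skips the faulting load; the names PROT_ADDR (the protected location X) and probe_array are ours.

```c
#include <stdint.h>

#define CACHE_LINE 64
extern volatile uint8_t probe_array[256 * CACHE_LINE]; /* attacker-owned, uncached */
extern volatile uint8_t *PROT_ADDR;                    /* protected location X */

void transmit(void)
{
    /* Architecturally, this load only raises an access fault. On the
     * vulnerable design the cache line refill it triggers is not canceled,
     * so a secret-dependent footprint is left in the cache before the
     * exception takes architectural effect. */
    uint8_t secret = *PROT_ADDR;

    /* Encode the secret into the cache state: exactly one line of
     * probe_array, indexed by the leaked byte, becomes cached. */
    (void)probe_array[secret * CACHE_LINE];
}
```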

For the Orc attack, we conditionally bypassed one buffer, as described in Sec. III, thereby creating a vulnerability that allows an attacker to open a covert timing side channel.

In the following experiments, the secret data is assumed to be in a protected location, X, in the main memory. Protection was implemented using the Physical Memory Protection (PMP) scheme of the RISC-V ISA [42], by configuring the memory region holding X as inaccessible in user mode.
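As a minimal sketch of such a configuration, assuming machine-mode execution and the PMP CSR layout of [42]; the base address and the 4 KiB region size are illustrative:

```c
#include <stdint.h>

#define SECRET_BASE  0x80004000UL                    /* 4 KiB-aligned, illustrative */
#define NAPOT_4K     ((SECRET_BASE >> 2) | 0x1FFUL)  /* 9 trailing ones = 2^12 bytes */

#define PMP_A_NAPOT  0x18  /* A field = 11 (naturally aligned power of two) */
#define PMP_NO_RWX   0x00  /* R = W = X = 0: no access in user mode */

static inline void protect_secret(void)
{
    /* PMP entry 0 matches the region holding X with all permissions
     * cleared, so any user-mode access to it is denied. */
    asm volatile("csrw pmpaddr0, %0" :: "r"(NAPOT_4K));
    asm volatile("csrw pmpcfg0,  %0" :: "r"((unsigned long)(PMP_A_NAPOT | PMP_NO_RWX)));
}
```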

VII-A Experiments on the original RocketChip design

We conducted experiments on the original design for two cases: (1) X initially resides in the data cache and in the main memory, and (2) X initially resides only in the main memory; cf. the columns labeled "X in cache" and "X not in cache" in Tab. I.

Separating the two cases and applying UPEC to them individually is beneficial for the overall efficiency of the procedure, because the solver does not have to consider both cases implicitly within a single problem instance.

                                      X in cache       X not in cache
Needed k                              5                34
Feasible k                            9                34
# of P-alerts                         20               0
# of RTL registers causing P-alerts   23               N/A
Proof runtime                         3 hours          35 min
Proof memory consumption              4 GB             8 GB
Inductive proof runtime               5 min            N/A
Manual effort                         10 person days   5 person hours
TABLE I: UPEC methodology experiments: For the original RocketChip design, the two settings distinguish whether or not there is a valid copy of the secret data X in the cache. For each case, the computational and manual efforts of following the UPEC methodology and analyzing possible alerts are reported.

For the experiment with X not in the cache, UPEC proves that there exists no P-alert. This means that the secret data cannot propagate to any part of the system: the user process can neither fetch the secret data into the cache nor access it in any other way. As a result, the system is proven to be secure for the case that X is not in the cache. Since the property proves already in the first iteration of the UPEC methodology that there is no P-alert, the verification completes within a few minutes of CPU time and without any manual analysis.

For the case that X is initially in the cache, we need to apply the iterative UPEC methodology (Fig. 5) in order to find all possible P-alerts. We also tried to capture an L-alert by increasing the length k of the time window until the solver aborted because of complexity. The second row of Tab. I shows the maximum k that was feasible; the following rows show the computational effort for this k.

Each P-alert means that the secret influences certain microarchitectural registers. For each P-alert it needs to be verified whether or not it can be extended to an information flow into program-visible architectural registers. As elaborated in Sec. VI, using standard procedures of commercially available property checking, we can establish proofs by mathematical induction, taking the P-alerts as the base case of the induction. The inductive proofs build upon the UPEC computational model and check whether or not a state sequence exists from any of the known P-alerts to any other P-alert or to an L-alert. If no such sequence exists, then the system is proven to be secure. In this way, we proved security against covert channels also for the case that the secret is in the cache. The manual effort for this amounts to a few person days, which is small compared to the total effort for a processor design, usually on the order of person years for a processor of medium complexity. The complexity of the inductive proof for one selected P-alert is given in Tab. I as an example.

VII-B Experiments on the modified RocketChip designs

Design variant/vulnerability    Orc     Meltdown-style
Window length for P-alert       2       4
Proof runtime for P-alert       1 min   1 min
Window length for L-alert       4       9
Proof runtime for L-alert       3 min   18 min
TABLE II: Detecting vulnerabilities in modified designs: For two different attack scenarios, the window lengths and the computational effort needed to obtain the alerts disproving security are listed.

Tab. II shows the proof complexity for finding the vulnerabilities in the modified designs. For each case, the UPEC methodology produced meaningful P-alerts and L-alerts. When incrementing the window length in search of an L-alert, new P-alerts occurred which were obvious indications of security violations. None of these violations exploits any branch prediction of the RocketChip. For example, for the Meltdown-style vulnerability, UPEC produced within seconds a P-alert in which the non-uniqueness manifests itself in the valid bits and tags of certain cache lines. This is a well-known starting point for side channel attacks, so that, in practice, no further examination would be needed. However, a designer without such knowledge may continue the procedure without any manual analysis until an L-alert occurs; this took about 18 min of CPU time. For the design vulnerable to an Orc attack the behavior was similar, as detailed in Tab. II.
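The receiving end of such a cache-state channel is the usual timing probe. The sketch below assumes user-mode access to the RISC-V cycle counter; probe_array and the interpretation of the (platform-dependent) latencies are illustrative.

```c
#include <stdint.h>

#define CACHE_LINE 64
extern volatile uint8_t probe_array[256 * CACHE_LINE];

static inline uint64_t rdcycle(void)
{
    uint64_t c;
    asm volatile("rdcycle %0" : "=r"(c));
    return c;
}

/* Return the index of the fastest-loading line: on the vulnerable design,
 * this is the line cached by the transmit sequence, i.e., the leaked byte. */
int recover_byte(void)
{
    uint64_t best = UINT64_MAX;
    int leaked = -1;
    for (int i = 0; i < 256; i++) {
        uint64_t t0 = rdcycle();
        (void)probe_array[i * CACHE_LINE];
        uint64_t dt = rdcycle() - t0;
        if (dt < best) { best = dt; leaked = i; }
    }
    return leaked;
}
```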

VII-C Violation of memory protection in RocketChip

UPEC also found a case of ISA incompliance in the implementation of the RISC-V Physical Memory Protection (PMP) mechanism in RocketChip. PMP is managed through pairs of address and configuration registers, called PMP entries. There are 16 address registers for 32-bit addresses, each associated with an 8-bit configuration register storing access attributes.

There is also a locking mechanism by which software can lock PMP entries against updates, i.e., only a system reboot can change the contents of a locked PMP entry. According to the ISA specification, if a memory range is defined by its start and end addresses and the PMP entry holding the end address is locked, then the PMP entry holding the start address must be locked automatically as well, regardless of the contents of its associated configuration register.

This mechanism is not correctly implemented in RocketChip: the start address of a locked memory range can be modified in privileged mode without a reboot. This is clearly a vulnerability as well as a bug with respect to the specification. The detection of this security violation in PMP shows that the UPEC property of Eq. IV also covers information leakage through a "main channel", i.e., the case where an attacker gains direct access to a secret. Such cases can be identified by conventional functional design verification techniques. In UPEC, however, they are covered without targeting any security specification and are identified in the same verification framework as side channel-based vulnerabilities. In the current version of RocketChip, the bug has been fixed.
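A machine-mode compliance check for this locking rule can be sketched as follows, assuming a top-of-range (TOR) entry pair; the addresses and the decision to return a flag are illustrative.

```c
#define PMP_A_TOR  0x08  /* A field = 01 (top-of-range) */
#define PMP_LOCK   0x80  /* L bit */

static inline unsigned long read_pmpaddr0(void)
{
    unsigned long v;
    asm volatile("csrr %0, pmpaddr0" : "=r"(v));
    return v;
}

/* Returns 1 if the implementation is compliant, 0 if it exhibits the bug. */
int check_tor_lock(void)
{
    /* Define the range [0x80004000, 0x80008000): pmpaddr0 holds the start,
     * pmpaddr1 the end (word addresses, i.e., byte address >> 2). */
    asm volatile("csrw pmpaddr0, %0" :: "r"(0x80004000UL >> 2));
    asm volatile("csrw pmpaddr1, %0" :: "r"(0x80008000UL >> 2));

    /* Lock entry 1 (config byte 1 of pmpcfg0) in TOR mode. Per the ISA,
     * pmpaddr0 is now implicitly locked as well. */
    asm volatile("csrw pmpcfg0, %0" :: "r"((unsigned long)(PMP_A_TOR | PMP_LOCK) << 8));

    unsigned long before = read_pmpaddr0();
    asm volatile("csrw pmpaddr0, %0" :: "r"(0UL)); /* must be ignored */
    return read_pmpaddr0() == before;              /* unfixed RocketChip: 0 */
}
```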

VIII Conclusion

This paper has shown that covert channel attacks are not limited to high-end processors but can affect a larger range of architectures. While all previous attacks were found by clever thinking of a human attacker, this paper presented UPEC, an automated method to systematically detect vulnerabilities to covert channels, including covert channels unknown so far. Future work will explore measures to improve the scalability of UPEC to larger processors. As explained in Sec. V, if there is a security violation, it is guaranteed to be indicated within a single clock cycle by a P-alert. Longer unrollings of the UPEC computational model were chosen only to facilitate the manual diagnosis and the setting up of induction proofs. We therefore intend to automate the construction of the induction proofs needed for the methodology of Sec. VI. This will not only remove the manual effort for P-alert diagnosis and for setting up the induction proofs; it will also allow the UPEC computational model to be restricted to only two clock cycles, drastically reducing the computational complexity. In addition, a compositional approach to UPEC will be explored.

Acknowledgment

We thank Mark D. Hill (U. of Wisconsin), Quinn Jacobson (Achronix Semiconductor Corp.) and Simha Sethumadhavan (Columbia U.) for their valuable feedback. The reported research was partly supported by BMBF KMU-Innovativ 01IS17083C (Proforma) and by DARPA.

References

  • [1] P. Kocher, D. Genkin, D. Gruss, W. Haas, M. Hamburg, M. Lipp, S. Mangard, T. Prescher, M. Schwarz, and Y. Yarom, “Spectre attacks: Exploiting speculative execution,” arXiv preprint arXiv:1801.01203, 2018.
  • [2] M. Lipp, M. Schwarz, D. Gruss, T. Prescher, W. Haas, S. Mangard, P. Kocher, D. Genkin, Y. Yarom, and M. Hamburg, “Meltdown,” arXiv preprint arXiv:1801.01207, 2018.
  • [3] Y. Yarom, D. Genkin, and N. Heninger, “Cachebleed: a timing attack on OpenSSL constant-time RSA,” Journal of Cryptographic Engineering, vol. 7, no. 2, pp. 99–112, 2017.
  • [4] C. Percival, “Cache missing for fun and profit,” in BSDCan, 2005. [Online]. Available: http://www.daemonology.net/papers/htt.pdf
  • [5] D. Gullasch, E. Bangerter, and S. Krenn, “Cache games–bringing access-based cache attacks on AES to practice,” in IEEE Symposium on Security and Privacy (SP).   IEEE, 2011, pp. 490–505.
  • [6] Y. Yarom and K. Falkner, “FLUSH+RELOAD: a high resolution, low noise, L3 cache side-channel attack,” in USENIX Security Symposium, vol. 1, 2014, pp. 22–25.
  • [7] P. Pessl, D. Gruss, C. Maurice, and S. Mangard, “Reverse engineering intel DRAM addressing and exploitation,” ArXiv e-prints, 2015.
  • [8] O. Aciicmez and J.-P. Seifert, “Cheap hardware parallelism implies cheap security,” in Workshop on Fault Diagnosis and Tolerance in Cryptography (FDTC).   IEEE, 2007, pp. 80–91.
  • [9] D. Jayasinghe, R. Ragel, and D. Elkaduwe, “Constant time encryption as a countermeasure against remote cache timing attacks,” in IEEE 6th International Conference on Information and Automation for Sustainability (ICIAfS).   IEEE, 2012, pp. 129–134.
  • [10] J. Kong, O. Aciicmez, J.-P. Seifert, and H. Zhou, “Deconstructing new cache designs for thwarting software cache-based side channel attacks,” in Proceedings of the 2nd ACM workshop on Computer security architectures.   ACM, 2008, pp. 25–34.
  • [11] D. A. Patterson and J. L. Hennessy, Computer Organization and Design, Fifth Edition: The Hardware/Software Interface, 5th ed.   San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2013.
  • [12] G. E. Suh, J. W. Lee, D. Zhang, and S. Devadas, “Secure program execution via dynamic information flow tracking,” in ACM Sigplan Notices, vol. 39, no. 11.   ACM, 2004, pp. 85–96.
  • [13] M. Tiwari, H. M. Wassel, B. Mazloom, S. Mysore, F. T. Chong, and T. Sherwood, “Complete information flow tracking from the gates up,” in ACM Sigplan Notices, vol. 44, no. 3.   ACM, 2009, pp. 109–120.
  • [14] J. Chen and G. Venkataramani, “CC-Hunter: Uncovering covert timing channels on shared processor hardware,” in Annual IEEE/ACM Intl. Symp. on Microarchitecture.   IEEE, 2014, pp. 216–228.
  • [15] J. Oberg, S. Meiklejohn, T. Sherwood, and R. Kastner, “Leveraging gate-level properties to identify hardware timing channels,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 33, no. 9, pp. 1288–1301, 2014.
  • [16] A. Ardeshiricham, W. Hu, J. Marxen, and R. Kastner, “Register transfer level information flow tracking for provably secure hardware design,” in Design, Automation & Test in Europe Conference (DATE).   IEEE, 2017, pp. 1695–1700.
  • [17] A. Ardeshiricham, W. Hu, and R. Kastner, “Clepsydra: Modeling timing flows in hardware designs,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD).   IEEE, 2017, pp. 147–154.
  • [18] D. Von Oheimb and S. Mödersheim, “ASLan++—a formal security specification language for distributed systems,” in Intl. Symp. on Formal Methods for Components and Objects.   Springer, 2010, pp. 1–22.
  • [19] G. Zanin and L. V. Mancini, “Towards a formal model for security policies specification and validation in the selinux system,” in Proceedings of the ninth ACM symposium on Access control models and technologies.   ACM, 2004, pp. 136–145.
  • [20] M. R. Clarkson and F. B. Schneider, “Hyperproperties,” Journal of Computer Security, vol. 18, no. 6, pp. 1157–1210, 2010.
  • [21] M. R. Clarkson, B. Finkbeiner, M. Koleini, K. K. Micinski, M. N. Rabe, and C. Sánchez, “Temporal logics for hyperproperties,” in International Conference on Principles of Security and Trust.   Springer, 2014, pp. 265–284.
  • [22] E. J. Schwartz, T. Avgerinos, and D. Brumley, “All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask),” in IEEE symposium on Security and privacy (SP).   IEEE, 2010, pp. 317–331.
  • [23] P. Subramanyan, S. Malik, H. Khattri, A. Maiti, and J. Fung, “Verifying information flow properties of firmware using symbolic execution,” in Design, Automation & Test in Europe Conference (DATE).   IEEE, 2016, pp. 337–342.
  • [24] P. Subramanyan and D. Arora, “Formal verification of taint-propagation security properties in a commercial SoC design,” in Design, Automation & Test in Europe Conference (DATE).   IEEE, 2014, p. 313.
  • [25] G. Cabodi, P. Camurati, S. Finocchiaro, C. Loiacono, F. Savarese, and D. Vendraminetto, “Secure embedded architectures: Taint properties verification,” in International Conference on Development and Application Systems (DAS).   IEEE, 2016, pp. 150–157.
  • [26] G. Cabodi, P. Camurati, S. F. Finocchiaro, F. Savarese, and D. Vendraminetto, “Embedded systems secure path verification at the HW/SW interface,” IEEE Design & Test, vol. 34, no. 5, pp. 38–46, 2017.
  • [27] W. Hu, A. Ardeshiricham, and R. Kastner, “Identifying and measuring security critical path for uncovering circuit vulnerabilities,” in International Workshop on Microprocessor and SOC Test and Verification (MTV).   IEEE, 2017.
  • [28] H. M. Wassel, Y. Gao, J. K. Oberg, T. Huffmire, R. Kastner, F. T. Chong, and T. Sherwood, “SurfNoC: a low latency and provably non-interfering approach to secure networks-on-chip,” in ACM SIGARCH Computer Architecture News, vol. 41, no. 3.   ACM, 2013, pp. 583–594.
  • [29] M. Tiwari, J. K. Oberg, X. Li, J. Valamehr, T. Levin, B. Hardekopf, R. Kastner, F. T. Chong, and T. Sherwood, “Crafting a usable microkernel, processor, and I/O system with strict and provable information flow security,” in ACM SIGARCH Computer Architecture News, vol. 39, no. 3.   ACM, 2011, pp. 189–200.
  • [30] J. Demme and S. Sethumadhavan, “Side-channel vulnerability metrics: SVF vs. CSV,” in Proc. of 11th Annual Workshop on Duplicating, Deconstructing and Debunking (WDDD), 2014.
  • [31] J. Demme, R. Martin, A. Waksman, and S. Sethumadhavan, “Side-channel vulnerability factor: A metric for measuring information leakage,” in Annual International Symposium on Computer Architecture (ISCA).   IEEE, 2012, pp. 106–117.
  • [32] L. Domnitser, N. Abu-Ghazaleh, and D. Ponomarev, “A predictive model for cache-based side channels in multicore and multithreaded microprocessors,” in International Conference on Mathematical Methods, Models, and Architectures for Computer Network Security.   Springer, 2010, pp. 70–85.
  • [33] D. Zhang, Y. Wang, G. E. Suh, and A. C. Myers, “A hardware design language for timing-sensitive information-flow security,” ACM SIGPLAN Notices, vol. 50, no. 4, pp. 503–516, 2015.
  • [34] C. Trippel, D. Lustig, and M. Martonosi, “MeltdownPrime and SpectrePrime: Automatically-synthesized attacks exploiting invalidation-based coherence protocols,” arXiv preprint arXiv:1802.03802, 2018.
  • [35] K. Asanovic, R. Avizienis, J. Bachrach, S. Beamer, D. Biancolin, C. Celio, H. Cook, D. Dabbelt, J. Hauser, A. Izraelevitz et al., “The rocket chip generator,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2016-17, 2016.
  • [36] E. A. Emerson and E. M. Clarke, “Using branching time temporal logic to synthesize synchronization skeletons,” Science of Computer programming, vol. 2, no. 3, pp. 241–266, 1982.
  • [37] SystemVerilog Language Working Group, “IEEE standard for system verilog – unified hardware design, specification, and verification language,” IEEE Std. 1800-2009, 2009. [Online]. Available: http://www.systemverilog.org/
  • [38] A. Biere, A. Cimatti, E. Clarke, and Y. Zhu, “Symbolic model checking without BDDs,” in Proceedings of the 5th International Conference on Tools and Algorithms for Construction and Analysis of Systems, ser. TACAS ’99.   London, UK, UK: Springer-Verlag, 1999, pp. 193–207.
  • [39] M. D. Nguyen, M. Thalmaier, M. Wedler, J. Bormann, D. Stoffel, and W. Kunz, “Unbounded protocol compliance verification using interval property checking with invariants,” IEEE Transactions on Computer-Aided Design, vol. 27, no. 11, pp. 2068–2082, November 2008.
  • [40] M. R. Fadiheh, J. Urdahl, S. S. Nuthakki, S. Mitra, C. Barrett, D. Stoffel, and W. Kunz, “Symbolic quick error detection using symbolic initial state for pre-silicon verification,” in Design, Automation & Test in Europe Conference (DATE).   IEEE, 2018, pp. 55–60.
  • [41] A. Waterman, Y. Lee, D. A. Patterson, and K. Asanović, “The RISC-V instruction set manual, volume I: User-level ISA, version 2.0,” California Univ Berkeley Dept of Electrical Engineering and Computer Sciences, Tech. Rep., 2014.
  • [42] A. Waterman, Y. Lee, R. Avizienis, D. A. Patterson, and K. Asanović, “The RISC-V instruction set manual volume II: Privileged architecture version 1.9,” EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2016-161, 2016.