Glitching, or fault injection, has been used for over a decade [ARKM] to attack software running on secure execution environments. Due to the upward trend in pricing in the software exploit market [scip] and the increased hardening of security in consumer devices, there has been a rise in popularity of injecting faults to gain control of a device. Fault injections can be used to cause a malfunction in the target’s system-on-chip (SoC) and, when the malfunction is controlled properly, can be used by an attacker to take full control of the device.
Voltage glitching is a specific kind of fault injection and is attractive because it is inexpensive to set up and is applicable to most chips. Crowbar voltage glitching was introduced by O’Flynn [OFlynn2016FaultIU] and implemented in the ChipWhisperer open hardware platform to bring these attacks to the mainstream. It works by abusing the capacitance ringing effect caused by introducing a crowbar circuit into the existing system. The ringing causes faults that can be exploited.
We looked at the prior attempts at modeling voltage fault injections and ordered them by level of abstraction (see table I). Most recently, Timmers, Spruyt, and Witteman [TSW] created an architectural model of fault injection for ARM devices. Their model considers instruction corruption due to bit-flips caused by the fault, and it is applicable to many kinds of fault injection.
Another paper we drew inspiration from is by Zussa, Dutertre, Clédière and Tria [ZDCT]. Their work focused on confirming empirically that faults induced by voltage glitching are caused by setup/hold time violations. They concluded that voltage glitches increase the propagation time of combinational logic, which creates setup/hold time violations. While their paper sought empirical evidence for a widely held belief about how voltage glitches work, we want a rigorous theoretical model for the same belief.
One level down from that is the paper by Djellid-Ouar, Cathebras and Bancel [DCB] which concluded that D-flip-flops (and therefore most memory elements) were mostly immune to standard voltage attacks. Their paper analyzed bistable CMOS elements using the small signal model.
In addition to modeling voltage glitches, there have been many experiments in applying voltage fault injections to affect processor operations on complex SoCs. The aforementioned paper from Timmers, Spruyt, and Witteman [TSW] details a bypass of a secure boot code integrity check on an ARM processor through voltage glitching. O’Flynn [OFlynn2016FaultIU] used his crowbar method on a Raspberry Pi computer to modify the result of a running counter.
The PlayStation Vita was a hand-held gaming console released in 2012. It used a custom-designed Samsung 45nm SoC [anandtech]. The SoC includes a MeP architecture processor, which we nicknamed “F00D,” that performs cryptographic tasks and serves as the boot processor. The boot ROM used by F00D is unmapped early in the boot process and cannot then be read out through purely software means.
Our contribution will be in two separate domains. In section II, we will analyze the CMOS transistor behavior in order to understand when the combinational logic is most susceptible to voltage glitch induced faults. Then, in section III we will apply our understanding to perform a fault injection attack on the PlayStation Vita’s SoC to gain early (boot time) execution control of F00D in order to dump the boot ROM.
Table I: Prior models of fault injection, ordered by level of abstraction
|Paper||Year||Abstraction level||Applicability||Main result|
|TSW [TSW]||2016||Architectural||ARM, any glitches||Faults modeled by corrupted instructions|
|ZDCT [ZDCT]||2013||Digital Logic||Voltage/clock glitching||Fault caused by setup/hold violations|
|DCB [DCB]||2006||Gates/Elements||Voltage glitching||D-flip-flops not susceptible to voltage glitches|
|This Paper||2018||Transistor||Voltage glitching||Effects are data dependent|
II CMOS Voltage Glitch Model
To analyze the CMOS behavior during a voltage glitch, we will only consider the voltages near the gate itself. This simplification disregards everything that happens when the voltage at the IC's pads is suddenly changed. For example, if a crowbar circuit is used to quickly short the supply rail to ground for some amount of time, we do not actually observe a short at the MOSFET. Instead, the capacitance and the power-delivery network of the circuit will create a ringing effect [OFlynn2016FaultIU] that will be observed at the MOSFET. Our analysis will therefore only consider the duration and amplitude of these rings as the “glitch” and not the source of them. We note that a more in-depth analysis can incorporate such external effects without affecting our understanding of what happens at the MOSFET.
We will use standard notation where applicable. For convenience, the symbols are defined in table II along with other labels relevant to our analysis.
|Supply voltage (normal operations)|
|Ground voltage (typically )|
|Glitch supply voltage (typically )|
|CMOS input voltage|
|CMOS output voltage|
|Gate load capacitance (simplified)|
|Voltage from PMOS source to PMOS gate|
|PMOS threshold voltage|
|Switching threshold for input low|
|Switching threshold for input high|
|PMOS equivalent resistance with source|
|NMOS equivalent resistance with source|
|PMOS equivalent resistance with source|
|Propagation delay for output going low|
|Propagation delay for output going high|
|Glitch start time|
|Glitch end time|
|Propagation delay for output going low during glitch (to be defined)|
|Propagation delay for output going high during glitch (to be defined)|
Consider a standard 1-input CMOS gate (an inverter, figure 1), with the input and output voltages labeled as in table II. We define a voltage glitch to be the span of time, from the glitch start time to the glitch end time, during which the supply voltage is set to the glitch voltage (ideally ground). The glitch width is the difference between the end and start times. The load capacitance includes the gate parasitic capacitances, the wire capacitance, and the input capacitance of the next gate.
We will focus on the span of time during which the voltage glitch happens. Looking at only one inverter, we see that in that span of time, the input could either toggle at least once or not toggle at all. We will analyze the behavior of the inverter in both cases to see when the output value can be influenced by the glitch.
First, consider the case when the input does not change during the voltage glitch. If the input is a logical ‘1’, then the output is already low. Since the input is high, the PMOS is off and there is no path through which the output can change. Therefore the voltage glitch causes no change in behavior.
If the input is at logical ‘0’, then before the voltage glitch the PMOS is conducting and the output sits at the supply voltage. When the glitch happens, the PMOS will be on, so we can use the equivalent resistance model defined in Chapter 5.4 of [RC1] to analyze the dynamic behavior.
Using equation 5.17 in [RC1], we find the fall time to be
where the resistance term is the on-resistance of the PMOS at the glitch voltage. Note that this is slightly different from the commonly used fall time of the inverter, because the latter is defined with respect to the on-resistance of the NMOS.
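For reference, the equivalent-resistance model treats the conducting transistor as a resistor that charges or discharges the load capacitance, so the delay takes a first-order RC form. The block below is a sketch of that form only. The symbols $t_{f,G}$ (fall time during the glitch), $R_{eqp}(V_G)$ (PMOS on-resistance at the glitch voltage) and $C_L$ (load capacitance) are assumed names chosen for illustration, and the 50% crossing is taken as the reference point.

```latex
% Output discharging through the PMOS on-resistance toward the glitch voltage.
% A first-order RC response reaches its 50% point after ln(2) time constants.
t_{f,G} \approx \ln(2)\, R_{eqp}(V_G)\, C_L \approx 0.69\, R_{eqp}(V_G)\, C_L
```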
The output goes to ‘0’ for the duration of the glitch and is then restored to ‘1’ as the supply voltage returns to its pre-glitch value. This means that if the input is ‘0’ at the start of the voltage glitch and does not toggle, then the output will be low from one fall time after the glitch starts until the glitch ends (assuming the glitch is wider than the fall time). Note this is similar to a static hazard, which can cause setup and hold time issues.
Now consider a transition from a logical ‘0’ to ‘1’ at the input during the voltage glitch. Since this turns off the PMOS, there should be no change in behavior at the output (compared to what happens if it toggles while there is no voltage glitch). However, if the transition is from a logical ‘1’ to ‘0’, then according to [ZDCT], the rise time of the inverter output increases as the supply voltage decreases. Additionally, if the glitch voltage is low enough, the output will not go high for the duration of the voltage glitch (the rise time would effectively be infinite). So the propagation delay grows as the glitch voltage decreases, until the output cannot rise at all during the glitch. This also causes setup and hold time issues.
In both cases, we see that the delay introduced to a single gate by a voltage glitch is composed of a rise or fall time (depending on the input value), the glitch width, and finally the delay for when the supply voltage is “restored” and the “correct” input must propagate to the output again. We can define the glitched propagation delay of the inverter as the delay from the input to the output during a voltage glitch (we consider the value to be “propagated” only when the output has the correct value). From above, we see this can be broken into four cases.
Now, if we have a chain of inverters, we can find the total propagation delay through the chain by considering that only the first inverter is affected by the voltage glitch (with a delay equal to the average of the two cases that are affected by a voltage glitch). Note that if two inverters in a chain are affected, we do not care about the status of the later inverter because the earlier one will already propagate its corrupted value and later corrected value down the chain.
In practice, the four rise, fall, and propagation delay terms will all be very small compared to the glitch width (see [RC1] for their typical order of magnitude in CMOS technology). Therefore we can simplify equation 3 to:
Note that the second part is just the CMOS propagation time with fanout 1. That means as long as the glitch width is much greater than the rise/fall time of the output, the propagation delay is bounded by the CMOS delay plus the glitch width.
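The bound stated above can be sketched numerically. The following is a minimal illustration, assuming the first-order RC form of the equivalent-resistance model and a chain of identical gates. The function names and the component values in the example are hypothetical and not taken from the analysis.

```python
import math


def rc_delay(r_eq_ohms: float, c_load_farads: float) -> float:
    """First-order RC delay to the 50% point: ln(2) * R * C."""
    return math.log(2) * r_eq_ohms * c_load_farads


def glitched_chain_delay_bound(n_gates: int, gate_delay_s: float,
                               glitch_width_s: float) -> float:
    """Approximate bound on the delay through a chain of gates.

    Assumes only the first gate is disturbed and that the glitch width
    is much larger than the individual rise/fall times, so the bound is
    the ordinary chain delay plus the glitch width.
    """
    return n_gates * gate_delay_s + glitch_width_s


# Illustrative numbers only: 10 kOhm on-resistance, 3 fF load, 20 gates,
# and a 50 ns glitch.
t_gate = rc_delay(10e3, 3e-15)
print(glitched_chain_delay_bound(20, t_gate, 50e-9))
```

With values like these the glitch width dominates the total, which is the regime the simplification relies on.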
Of course, in reality the analysis gets a lot more complicated. First, the voltage glitch will not affect every CMOS gate at the same time. There will be a sort of “propagation delay” of the voltage change itself. Second, when we consider gates with two or more inputs, there could be a mixture of non-toggling and toggling behavior at each gate. Finally, the non-linear capacitances make the delays difficult to compute.
However, there are some conclusions we can draw from this analysis.
Asynchronous circuits are most affected by voltage glitches due to the introduction of hazards.
Synchronous circuits are not immune if the voltage glitch causes a setup/hold time violation.
Critical paths can be extended if a voltage glitch happens at the right time (0-static or 1-toggling).
Long critical paths are the best targets for voltage glitching (e.g., the processor ALU).
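The conclusions about synchronous circuits and long critical paths can be illustrated with a simple setup-time check. This is a sketch under the assumption that a glitch adds a fixed extra delay to a combinational path. All names and numbers are hypothetical.

```python
def setup_slack(path_delay_s: float, clock_period_s: float,
                setup_time_s: float) -> float:
    """Time to spare before the data would arrive too late for the register."""
    return clock_period_s - setup_time_s - path_delay_s


def violates_setup(path_delay_s: float, added_delay_s: float,
                   clock_period_s: float, setup_time_s: float) -> bool:
    """True if the glitch-extended path misses the setup window."""
    return added_delay_s > setup_slack(path_delay_s, clock_period_s, setup_time_s)


# A long path (for example through an ALU) has little slack, so the same
# added delay that a short path absorbs is enough to cause a violation.
period, setup, added = 10e-9, 0.5e-9, 2e-9
print(violates_setup(9e-9, added, period, setup))  # long path
print(violates_setup(2e-9, added, period, setup))  # short path
```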
III Vita Glitching
The firmware and boot loader are stored on external eMMC storage, which has logical sectors 512 bytes wide. Upon boot, sector 0, the master boot record (MBR), is read. The Vita’s MBR is a custom format not used in any other device. The details of this MBR format are beyond the scope of this paper, but two fields are of importance: the boot loader offset and the boot loader size, each stored at a fixed position in the MBR. These two fields are both 4 bytes wide, and both the offset and the size are given in number of sectors. They are used by the boot ROM to determine where the boot loader is located.
One thing we discovered early on (through trial and error) was that if the size field is too large, then an assertion fails and the device is rebooted. Otherwise, the eMMC is read starting at the given offset for the given number of blocks. Our hypothesis is that there is a fixed-size buffer that the boot loader is read into, which necessitates the size check. If we use a fault injection to bypass this check, we can introduce a buffer overflow vulnerability.
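The hypothesized logic can be modeled in a few lines. This is a sketch of the hypothesis only, not the boot ROM's actual code: the buffer capacity `MAX_SECTORS`, the `skip_check` flag standing in for a successful fault, and all function names are assumptions made for illustration.

```python
SECTOR_SIZE = 512   # from the text: logical sectors are 512 bytes wide
MAX_SECTORS = 8     # hypothetical buffer capacity, for illustration only


class AssertionReboot(Exception):
    """Models the assertion failure that reboots the device."""


def load_bootloader(read_sector, offset: int, size: int, skip_check: bool = False):
    """Read `size` sectors starting at `offset` into a fixed-size buffer.

    `skip_check=True` models a fault that bypasses the size comparison.
    Returns the buffer contents and any bytes that would land past its end.
    """
    if size > MAX_SECTORS and not skip_check:
        raise AssertionReboot("size field too large")
    buffer = bytearray(MAX_SECTORS * SECTOR_SIZE)
    overflow = bytearray()
    for i in range(size):
        data = read_sector(offset + i)
        if i < MAX_SECTORS:
            buffer[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE] = data
        else:
            overflow += data  # would overwrite whatever follows the buffer
    return bytes(buffer), bytes(overflow)
```

In this model, an oversized size field is harmless while the check runs and becomes an overflow once the check is skipped.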
III-A Experimental Setup
There were three main components to our setup. First, we needed a way to monitor the eMMC traffic and use it as a trigger for the voltage glitch. Second, we needed a way to perform the voltage glitch. Finally, we wished to automate the steps in order to find the optimal parameters for glitching.
Fortunately, the ChipWhisperer gave us an easy way to do all of this. The hardware has a MOSFET that performs the crowbar voltage glitch [OFlynn2016FaultIU]. The open hardware design allowed us to implement a custom eMMC trigger for the MOSFET (see appendix A-A). Finally, because the timing and duration of the voltage glitch are highly dependent on the power distribution network of the device [OFlynn2016FaultIU], it is difficult to compute the optimal timing parameters for a successful glitch on the size check. Instead, we exhaustively searched for the timing offset and width of the crowbar activation, relative to the trigger, that result in a successful fault injection.
Additionally, we made sure to synchronize the ChipWhisperer’s glitch module clock with the device’s external clock input. This way we can make sure the two devices are in phase and decrease the variance in finding working parameters.
For a successful fault injection, we need to find three parameters: the clock frequency shared by the glitch module and the Vita’s external clock input, the offset (the number of cycles after seeing the eMMC trigger before activating the crowbar circuit), and the width M (the number of cycles to hold the crowbar on before releasing it). Note that these parameters determine the response of the ringing effect that will ultimately cause a fault in the size comparison. (Many sources mention removing decoupling capacitors for better results without giving a detailed reason. We were able to get voltage glitches to work both with and without removing the decoupling capacitors. It is our belief that removing the decoupling capacitors changes the response of the ringing and therefore the parameters for a successful glitch. But in our case, it does not make the search any more or less tractable.)
We know from experimentation (toggling a GPIO and measuring with the Rigol DS1054Z) that the F00D boot ROM runs at a rate derived from the external clock input. A faster clock will increase the chance of a successful fault (due to timing violations). In practice, we cannot over-clock much past the default rate of 37MHz or we will run into timing violations unrelated to the voltage glitch that prevent the circuit from working properly (better cooling, for example, might prevent this, but we did not go down this avenue). However, in our case, because of the noisy design of our glitching setup, we picked the fastest rate we were able to reach without running into a variety of signal integrity issues.
For the offset and the width, we chose to brute force every possible value to find ones that worked.
For the parameter search, we first used manual analysis to get a ballpark idea of when the size field is being checked. We hypothesized that the check must happen after the response packet for the eMMC request for the MBR block. First we narrowed the search window to the period of time between commands. Using a Rigol DS1054Z oscilloscope, we measured the time from the response to the MBR read (with a valid size field) to the request packet of the next command in the eMMC read procedure [emmc]. This interval yields an upper bound for the offset. (We discovered that, due to a bug, the boot ROM spins the processor to wait for the request to complete. However, the smallest granularity of the spin time is about 1000 times the average amount of time it takes for the command to complete, as we observed on the DS1054Z. So to make things easier, our brute force actually ran backwards starting from the far end of the window.)
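Converting the measured window into an upper bound on the offset is a one-line calculation. The sketch below assumes the offset is counted in cycles of the glitch module clock; the function names are hypothetical. The second helper orders candidates from the far end of the window, mirroring the backwards search described above.

```python
import math


def max_offset_cycles(window_s: float, clock_hz: float) -> int:
    """Largest offset (in clock cycles) that still falls inside the window."""
    return math.ceil(window_s * clock_hz)


def offsets_far_to_near(window_s: float, clock_hz: float) -> range:
    """Candidate offsets, starting from the far end of the window."""
    return range(max_offset_cycles(window_s, clock_hz), -1, -1)
```

For example, a window of 0.5 s at a 4 Hz clock (deliberately unrealistic numbers) gives the candidates 2, 1, 0.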
When an attempt with a particular offset and width pair does not halt the search, we observe one of the following behaviors:
The device halts. Usually we only see this with very large M so it’s likely the CMOS is losing power and shutting off.
The device reboots, which we observe on the eMMC bus. This is the most common observation. (We believe this is due to the fault happening at the wrong place or not happening at all, either of which would cause an assertion to fail. Originally we wanted to try a timing attack to differentiate the two cases. However, we later found out that the designers anticipated this: any reboot triggered by an assertion failure first spins for a random number of cycles determined by a TRNG, to make it hard to determine “which” assertion failure caused the reboot.)
The device goes into an unknown state and makes an unexpected request (and possibly restarts).
The device requests the first block of the bootloader (this is the success case).
We therefore only need to record the next packet seen after reading MBR (or time out) and check if it is the first block of the boot loader to indicate success. Figure 3 summarizes the process.
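The four observed behaviors map naturally onto a small classifier over the next packet seen. The following is a sketch with hypothetical packet labels; how packets are actually captured and represented depends on the trigger hardware and is not modeled here.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    HALT = "halt"          # nothing seen before the timeout
    REBOOT = "reboot"      # device restarted after a failed assertion
    UNKNOWN = "unknown"    # unexpected request
    SUCCESS = "success"    # first block of the boot loader requested


def classify(next_packet: Optional[str], bootloader_request: str,
             reboot_packet: str) -> Outcome:
    """Classify one attempt from the packet seen after the MBR read.

    `next_packet` is None when the capture timed out. The two reference
    packets are supplied by the caller; their string form is an assumption.
    """
    if next_packet is None:
        return Outcome.HALT
    if next_packet == bootloader_request:
        return Outcome.SUCCESS
    if next_packet == reboot_packet:
        return Outcome.REBOOT
    return Outcome.UNKNOWN
```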
After writing a script with the ChipWhisperer API to try all possible offset and width values (see appendix A-B) and running it overnight, we found a successful case (see table III). Due to the effects of wire capacitance and external capacitance, the parameters are highly specific to our setup and environment. However, after finding a valid offset and width pair, even with environmental variations, we can find another valid pair close by. If the equipment and target board are not moved or touched at all, we can reproduce the injected fault with the same pair in a fraction of attempts.
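The structure of such a search can be sketched independently of the glitching hardware. In the code below, `attempt` is a hypothetical callable that arms the glitch with a given offset and width, resets the target, and reports success. It stands in for the ChipWhisperer-specific script in the appendix and does not use the ChipWhisperer API. The `nearby` helper reflects the observation that a new valid pair tends to lie close to a known one.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, List, Tuple

Pair = Tuple[int, int]  # (offset in cycles, width in cycles)


def search(attempt: Callable[[int, int], bool],
           offsets: Iterable[int], widths: Iterable[int]) -> Iterator[Pair]:
    """Try every (offset, width) combination and yield the ones that succeed."""
    for offset, width in product(offsets, widths):
        if attempt(offset, width):
            yield offset, width


def nearby(center: Pair, radius: int) -> List[Pair]:
    """Candidates around a known-good pair, closest first."""
    o0, w0 = center
    candidates = [(o, w)
                  for o in range(max(0, o0 - radius), o0 + radius + 1)
                  for w in range(max(1, w0 - radius), w0 + radius + 1)]
    return sorted(candidates, key=lambda p: abs(p[0] - o0) + abs(p[1] - w0))
```

After the setup has been disturbed, re-running `attempt` over `nearby(known_pair, radius)` is much cheaper than repeating the full sweep.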
We managed to find glitch parameters that fault the check. However, this only gives us a vulnerability. We still need to exploit it. Fortunately, this was made easy when we observed that when the glitch was successful, we see exactly blocks read if . If , then it will read blocks exactly (with a successful glitch). Therefore we guessed that we were overwriting the currently executing code, and that guess was correct. With that, we launched a payload that dumped everything we could read through UART.
Unfortunately, we then discovered the boot ROM does not seem to be mapped anywhere. It appears that hardware copies the contents of the boot ROM to SRAM and the reset vector points directly to SRAM. That means the boot ROM is able to clean up parts of itself as it executes. To get the remaining parts (that were cleaned up), we had to glitch different parts of the boot ROM (running in SRAM) and gain code execution earlier and earlier on. Each time we dump more code, we gain more information and can develop more specific glitch targets. The details of these additional injected faults and their subsequent exploitation are beyond the scope of this paper.
By looking at how voltage glitches introduce timing violations into a digital circuit, we can find good snippets of code to glitch. Once a target is found, we can search for the right timing parameters for our crowbar circuit to cause a fault. We do an exhaustive search because it is difficult to predict how changing the offset and width parameters actually affects the CMOS circuits. Finally, the injected fault introduces a software vulnerability that can be exploited to gain code execution. All of this can be done at a low cost thanks to the open hardware interface of the ChipWhisperer. With a custom script written for ChipWhisperer, we created a working attack on a security-hardened consumer device.