1 Introduction
Physical Unclonable Functions (PUFs) [28] are considered promising security primitives for the Internet of Things (IoT) owing to their lightweight hardware implementation. PUFs can be used in a variety of applications, such as identification [1] or secret key generation. The randomness of a PUF is extracted from uncontrollable process variations, and its behavior, captured by its Challenge Response Pairs (CRPs) [3], is uniquely tied to a given device and is hard to predict or replicate. Since the first physical unclonable identification was fabricated in [4], extensive effort has been devoted to the area, and different silicon PUF implementations have been proposed, including the Arbiter PUF [5], the Ring Oscillator (RO) PUF [6], the SRAM PUF [29], and many other variations. (Footnote 1: This work is a significant extension of [30].)
The instability of a parametric PUF potentially limits its practical application. Since these PUFs are parametric, they are by nature susceptible to environmental variations, and the behavior of a PUF can differ systematically between two environments. Making a PUF more stable requires extra overhead in hardware or latency [31]. Techniques such as error correction codes (ECC) or helper data come at the cost of extra hardware or possible security concerns [32].
Recently, a stability-guaranteed Locally Enhanced Defectivity PUF (LEDPUF) proposed in [33] showed completely stable responses by utilizing random hard defects generated by the Directed Self Assembly (DSA) process. However, it is difficult to fabricate given that DSA is not yet well accepted in commercial silicon manufacturing. In [34], a reliable PUF based on Resistive Random Access Memory (RRAM) is presented with actual fabrication. However, an off-chip characterization of the split current and offset of the sense amplifiers is required, and the reliability results under voltage variations, which can dramatically impact the stability, are not reported. Another reliable PUF using Hot Carrier Injection (HCI) is presented in [35]. However, post-calibration steps are still needed and the randomness of the most stable responses was not reported. In [36], the authors apply high voltage to induce gate oxide breakdown of transistors to extract stable randomness. However, an "afterburn" phase is applied to all broken oxides to enhance the stability, which requires additional hardware and calibration. In [15], the authors intentionally introduce oxide breakdown by violating antenna rules to generate stable random bits. However, the response time may be long because, if no breakdown occurs, only a limited leakage current is available to charge the ID generation output.
A PUF designer needs to meet a desired security level without using an excessive number of gates. The two most dominant factors that can reduce the security level of a PUF are bias of the PUF response and instability, that is, noise. Various methods for evaluating how secure a PUF is have therefore been presented. Among the most popular are the inter- and intra-Fractional Hamming Distance (FHD) [6], as well as the NIST test suite for random and pseudorandom number generators [7]. These methods do not require directly evaluating the underlying probability mass function according to which a PUF response is drawn. On the other hand, they do not provide a single measure that quantifies the interplay between noise and bias in terms of the security level: inter-FHD is related to bias whereas intra-FHD is related to noise, so when using these measures it is not clear which of two PUFs with different inter-FHD and intra-FHD values is more secure; the NIST test suite is extremely sensitive to bias and does not take into account the effect of noise. Other methods that rely on evaluation of the underlying probability mass function are the min-entropy [27], mutual information [8] and guesswork [33]. These methods do incorporate the effect of noise and bias into a single quantifiable measure.
1.1 Contributions

We implement stable PUFs using randomness extracted from the plasma induced oxide breakdown and the voltage stressed oxide breakdown.

Test structures violating antenna rules are fabricated with 65nm CMOS bulk technology. Measured results from 99 testchips show that the responses are highly stable across combinations of voltage (0.8V, 1.0V, 1.2V) and temperature variations (25°C, 100°C). Compared to a practical SRAM PUF, significant area reduction can be achieved from eliminating ECC implementation for the highly stable responses.

We analyze the data from these testchips and show based on various statistical distance measures that pairs of bits with the same antenna ratio as well as bits that are located next to each other are effectively statistically independent. We also propose to use guesswork as a new measure for statistical distance that has operational meaning in terms of security.

We discuss various methods for evaluating the security level of PUFs such as min-entropy, guesswork, inter- and intra-FHD, as well as the NIST test suite. In addition, we present the merits of guesswork as a method to evaluate the security level of a PUF. Furthermore, we present the tradeoff between hardware size, bias and guesswork based on our measured results from 99 testchips.
2 Stable PUFs Using Gate Oxide Breakdown
In this section, we first introduce the gate oxide breakdown and describe two approaches exploiting the gate oxide breakdown as randomness sources of stable PUFs, followed by PUF bit generation and attack resilience analysis.
2.1 Gate Oxide Breakdown
Gate oxide breakdown is detrimental to metal-oxide-semiconductor (MOS) devices because it can cause significant drifts of transistor parameters. The breakdown can be categorized into two types, soft breakdown and hard breakdown, and both mechanisms introduce a significant sudden increase of the leakage current. For soft breakdown, the conducting path from the gate to the substrate is formed by the charged traps in the gate oxide. Once there is conduction, new traps begin to accumulate due to thermal damage, which in turn increases the conductance. This positive feedback eventually leads to thermal runaway, and the oxide physically melts at the breakdown spot. This type of breakdown is called hard breakdown. The gate leakage current of an oxide with either soft or hard breakdown can be 100X larger than the leakage current of an oxide without breakdown.
2.2 Plasma Induced Gate Oxide Breakdown
During silicon wafer fabrication, plasma processes are widely used for etching, photoresist stripping, or ion implantation. In the plasma ambient, metal segments, VIAs, or polysilicon electrodes, which are the antenna segments, can be electrically charged by ions or electrons, and therefore produce the antenna voltage. For the antenna segments connected to the gate inputs, the resulting electrical stress from the antennas can potentially damage the underlying gate oxide and create a conducting path from the gate to the substrate. The phenomenon is called plasma induced gate oxide breakdown, or the antenna effect.
Though the maximum voltage rise can be modeled, the actual voltage cannot be predicted because the exact motion and amount of ions and electrons collected by the antenna segment are random. The higher the gate voltage, the higher the probability that gate oxide breakdown occurs and causes a device to fail. Also, systematic plasma variation across the wafer does not have much impact on the local randomness because the variation is negligible within a die.
To avoid the antenna effect, design rules of the antenna ratio (AR) as shown in equation (1) must be strictly followed during fabrication. Practical design rules of AR range from 100 to 5000 depending on the process details.
(1) 
Since both soft breakdown and hard breakdown can induce about 100X or more leakage current than a good oxide, they are both considered as breakdown in our proposed stable PUF construction. Since the process parameters of our testchip fabrication were unknown prior to manufacturing, we implemented a variety of antenna ratios to measure breakdown probabilities, which are presented in Section 3.2. While foundries try to avoid the antenna effect during manufacturing, we exploit this uncontrollable physical phenomenon as another randomness source for a stable PUF.
2.3 Voltage Stressed Gate Oxide Breakdown
The purpose of antenna rules is to protect all transistors from parameter deviations, for example a 20% gate leakage increase at 1.4×VDD [9], which could be harmful for normal fabrication but is still far from causing a real breakdown. Therefore, to introduce a noticeable plasma induced breakdown (100X increase of leakage current) with 50% probability in a transistor, an AR exceeding 1000X the antenna rule limit may be required, which can result in large area overhead.
To avoid using large antenna segments, one way is to apply high voltage stress to the gate of a transistor directly. By voltage stressing the gate terminal of a transistor, oxide breakdown can be introduced with small AR or even without violating the antenna rules. On the other hand, such a PUF construction requires an additional stress step post manufacturing (or during PUF enrollment). Please note the voltage stressed gate oxide breakdown mechanism is different from the Erasable PUF proposed in [10], where oxide breakdown is introduced to erase targeted bit cells instead of being used as a stable source of randomness.
2.4 Stable Signal Unit Construction
The permanent gate oxide breakdown mechanism, which can be caused by plasma damage or voltage stress damage, is used to construct a Stable Signal Unit (SSU) as a source of permanent defectivity. An SSU is a pMOS transistor designed to violate antenna rules, and its drain, source, and bulk terminals are connected together to capture the effect of gate oxide breakdown at all possible locations. Similar to the gate oxide breakdown model given in [11], the SSU is attached in series to a precision resistor as shown in Fig. 1, where Fig. 1 (a) shows an SSU without oxide breakdown and Fig. 1 (b) shows an SSU with oxide breakdown. If no breakdown occurs, as depicted in Fig. 1 (a), the device is essentially a capacitor or a resistor much larger than the precision resistor, so the output voltage is lower than 50% VDD when the evaluation signal EVA is VDD. If a breakdown happens, as shown in Fig. 1 (b), the device can be seen as a resistor much smaller than the precision resistor, so the output voltage is higher than 50% VDD when EVA is VDD. The resistance of the precision resistor (10 MΩ) is determined by actual measurements from 99 testchips as described in Section 3.2.
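The bit decision described above can be illustrated with a minimal sketch. It assumes an ideal resistive divider in which EVA = VDD drives the SSU and the 10 MΩ precision resistor ties the output node to ground; this topology, the function names, and the example resistances are illustrative assumptions rather than details taken from Fig. 1.

```python
def ssu_output_voltage(r_ssu_ohm, r_precision_ohm=10e6, vdd=1.0):
    """Voltage at the node between the SSU and the precision resistor.

    Assumption: ideal divider, EVA = VDD on the SSU side and the
    precision resistor connected between the output node and ground.
    """
    return vdd * r_precision_ohm / (r_ssu_ohm + r_precision_ohm)


def ssu_bit(r_ssu_ohm, r_precision_ohm=10e6, vdd=1.0):
    """Return 1 if the output exceeds 50% VDD (breakdown), else 0."""
    v_out = ssu_output_voltage(r_ssu_ohm, r_precision_ohm, vdd)
    return int(v_out > 0.5 * vdd)


# Hypothetical resistances: a broken-down oxide (a few kilo-ohms) and an
# intact oxide (assumed here to be 1 giga-ohm).
print(ssu_bit(5e3))  # SSU with breakdown
print(ssu_bit(1e9))  # SSU without breakdown
```

Under this model the decision threshold sits exactly where the SSU resistance equals the precision resistance, which is why a wide gap between the two resistance populations translates into a stable bit.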
2.5 Attack Resilience
It is worth mentioning that the SSU is more secure than an antifuse cell because an antifuse cell is programmed with hard breakdown only, while the output of the SSU is decided by both soft breakdown and hard breakdown, and a soft breakdown is much harder to detect than a hard breakdown (albeit possible for a very resourceful attacker). For probing attacks, the efficiency is limited by mechanical constraints. For imaging attacks, such as Scanning Electron Microscopy (SEM), Transmission Electron Microscopy (TEM), or Electron Beam Induced Currents (EBIC), it is difficult to efficiently identify a soft breakdown because its physical appearance is very similar to that of a fresh gate oxide, without any visible holes. It is also challenging for EBIC to identify a soft breakdown because the limited current of a soft breakdown makes the measurement noisy, and the throughput of the electron beam is low. Finally, it is also difficult to observe a soft breakdown from a top-down or cross-section TEM because the image does not effectively reveal the depth of the traps.
3 Testchip Fabrication and Measurement
3.1 SSU Implementations
The proposed SSUs are implemented and fabricated on 99 testchips in a commercial 65nm GP 1P9M_6X1Z1U bulk CMOS technology with a 1V nominal voltage. The smallest gate size (0.0072 µm²) of the technology is used for all the SSUs. In our testchips, the fabricated SSUs intentionally violate antenna rules by a few hundred to a few thousand times on different layers.
On each chip, 29 SSUs are implemented with 17 different ARs, so the total number of SSU implementations is 2871 from 99 chips. For each of the SSUs, the cell area and detailed antenna violation report are given in Table I, where a zero indicates that there is no antenna rule violation on that layer. The antenna rule violation reports are provided to the foundry so that the corresponding design rule checks can be skipped. The M_T, V_T, and P_T structures test the effects of the metal, VIA, and polysilicon layers, respectively, from small AR to large AR. For each of the M_T, V_T, and P_T structures, two SSUs with the same AR are implemented, so 24 bits of responses are obtained from these SSUs on a chip. The remaining five test structures are various combinations of the violating layers, and one SSU is implemented for each of them. In summary, 29 bits are measured on each chip, and 24 of them are obtained from the duplicated 12 structures of M_T, V_T, and P_T.
Table I: Cell area and antenna violation report for each SSU.
Cell  Area (µm²)  VIA  Metal  Poly  Poly Perimeter
M_T1  36  0.87  1144.57  0.00  0.00
M_T2  360  1.17  1468.57  0.00  0.00
M_T3  1200  0.00  4398.88  0.00  0.00
M_T4  4800  0.16  36781.89  0.00  0.00
V_T1  2.4  0.87  1108.57  0.00  0.00
V_T2  8  2.31  1108.57  0.00  0.00
V_T3  90  15.27  1185.66  0.00  0.00
V_T4  804  144.91  1895.05  0.00  0.00
P_T1  4.8  1.26  1917.53  0.00  0.00
P_T2  27  1.26  1917.53  18.17  55.59
P_T3  203  1.26  1917.53  180.07  128.43
P_T4  1800  1.26  1917.53  1800.07  222.46
Test1  804  1071.86  5631.11  0.00  0.00
Test2  4.7  1.86  0.00  0.00  0.00
Test3  80  0.26  299.20  0.00  0.00
Test4  60  20.84  318.78  28.07  83.81
Test5  118  54.40  617.25  56.39  164.72
3.2 Breakdown Probability Evaluation
To determine the gate oxide breakdown of an SSU, we use an Agilent 34411A Digital Multimeter to measure the equivalent resistance $R_{eq}$ of each SSU, and from the distribution of $R_{eq}$ we choose a proper precision resistor, as shown in Fig. 1, to determine whether or not an oxide breakdown has occurred. Fig. 2 shows the $R_{eq}$ distribution of an SSU implementation (V_T1) with plasma induced and voltage stressed breakdown on 99 chips, in increasing order, at 25°C and 1V. For both distributions, the $R_{eq}$ of an SSU implementation with oxide breakdown is at least 100X smaller than that of an SSU without oxide breakdown. After voltage stress, the $R_{eq}$ values are in general smaller and many more oxide breakdowns are introduced. The results are similar for all SSUs. The large gap in the figure can be exploited to generate stable digital signals from SSUs. Therefore, we choose, according to the measurements, a 10 MΩ precision resistor to detect the gate oxide breakdown of each SSU.
For the plasma induced breakdown, the breakdown probabilities of the SSU implementations on 99 chips are shown in Table II. From the table we see that the breakdown probability of each SSU after plasma induced oxide damage is well below 50%. This means the responses of the SSUs are highly biased, which is undesirable because each response bit carries little randomness. Using a larger AR to further increase the breakdown probability may not be a proper approach due to the large area overhead.
For the voltage stressed breakdown, we stress 24 SSUs (the M_T, V_T, and P_T groups) on each testchip by applying 5.5V to EVA for 10 seconds. The results of the stress are shown in Table II. From the table we can see that the breakdown probabilities, which are only slightly correlated with the ARs, are elevated to at least 50% even for the SSUs with the smallest ARs. These results show that less biased responses than those of plasma induced breakdown can be achieved by using small SSUs such as V_T1. Therefore, an SSU can be implemented with much smaller area than in the plasma induced breakdown approach, possibly even without violating the antenna rule.
Table II: Breakdown probabilities of SSU implementations on 99 chips.
SSU  Plasma Induced  Voltage Stressed
M_T1  0.5%  57.6%
M_T2  0.5%  51.5%
M_T3  2.5%  57.1%
M_T4  2.0%  51.0%
V_T1  0.5%  50.0%
V_T2  6.1%  54.0%
V_T3  0.0%  64.7%
V_T4  0.0%  58.6%
P_T1  1.0%  50.5%
P_T2  2.5%  51.5%
P_T3  1.0%  58.6%
P_T4  1.0%  60.0%
Test1  16.2%  N/A
Test2  2.0%  N/A
Test3  5.1%  N/A
Test4  1.0%  N/A
Test5  3.0%  N/A
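The thresholding procedure of Section 3.2 can be sketched as follows. The sketch assumes that comparing the measured equivalent resistance against the 10 MΩ precision resistor is equivalent to the 50% VDD decision of the SSU; the function names and the resistance values are hypothetical and are not measured data.

```python
def breakdown_bits(r_eq_ohms, threshold_ohm=10e6):
    """Map measured equivalent resistances to bits (1 = breakdown)."""
    return [int(r < threshold_ohm) for r in r_eq_ohms]


def breakdown_probability(bits):
    """Fraction of SSUs of one type that show breakdown."""
    return sum(bits) / len(bits)


# Hypothetical measurements of one SSU type on 8 chips, in ohms.
r_eq = [4e3, 2e9, 7e3, 3e9, 1e9, 9e3, 5e9, 6e3]
bits = breakdown_bits(r_eq)
print(bits, breakdown_probability(bits))
```

Applying the same two functions per SSU type across all chips would yield one breakdown probability per row, in the manner of Table II.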
3.3 Stability Evaluation
To evaluate the stability of the SSUs, we measure all SSU responses from 99 chips at 6 corners: temperatures at 25°C and 100°C with voltage variation at 0.8V, 1V, and 1.2V.
3.3.1 Plasma Induced Breakdown
For the plasma induced breakdown, all SSUs from 99 chips (2871 bits in total) are completely stable at all corners over multiple measurements. This can be explained by the fact that the change of the equivalent resistance $R_{eq}$ across corners is limited. Fig. 3 shows the change of $R_{eq}$ of an SSU (Test1) under voltage and temperature variations. In Fig. 3 (a), the $R_{eq}$ of the SSU with breakdown is only a few kΩ and its changes under extreme temperature and voltage variations are limited. On the other hand, Fig. 3 (b) shows an SSU without oxide breakdown, where $R_{eq}$ remains at less than 45 MΩ, which is still orders of magnitude larger than that of the SSU with oxide breakdown.
3.3.2 Voltage Stressed Breakdown
Unlike the plasma induced breakdown, for the voltage stressed breakdown an extremely small portion of the SSUs are not completely stable. To quantify the stability of the voltage stressed breakdown, each SSU is measured 10 times at each corner, and we define the responses measured at 25°C with 1V, where all responses are consistent, as the reference responses. An SSU is unstable at a corner if at least one of its values from the 10 measurements differs from the reference response. We define the bit error rate (BER) as the number of unstable bits divided by 2376, which is the total number of SSUs stressed (24 SSUs on each of the 99 chips). Table III shows the BER at each corner. We found that at several corners, 1 to 3 SSUs out of the 2376 SSUs implemented are unstable for the voltage stressed breakdown. Since most responses of the unstable SSUs are still consistent with the reference responses, taking the majority vote of multiple measurements can effectively eliminate the erroneous responses.
Table III: BER of the voltage stressed SSUs at each corner.
Corner  0.8V  1V  1.2V
25°C  0.04%  0.00%  0.12%
100°C  0.08%  0.08%  0.08%
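The instability criterion, the BER definition, and the majority vote of Section 3.3.2 can be sketched as below. The function names and the example read-out are hypothetical; only the total of 2376 stressed SSUs comes from the text.

```python
from collections import Counter


def is_unstable(measurements, reference):
    """An SSU is unstable at a corner if any read differs from the reference."""
    return any(m != reference for m in measurements)


def bit_error_rate(num_unstable, total_ssus=2376):
    """BER as defined in the text: unstable SSUs over all stressed SSUs."""
    return num_unstable / total_ssus


def majority_vote(measurements):
    """Most frequent value among the reads (ties resolve to the first seen)."""
    return Counter(measurements).most_common(1)[0][0]


# Hypothetical SSU: ten reads at one corner with a single erroneous read.
reads = [1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
print(is_unstable(reads, reference=1), majority_vote(reads))
for unstable in (1, 2, 3):
    print(unstable, f"{100 * bit_error_rate(unstable):.3f}%")
```

An odd number of reads avoids ties in the vote; with the ten reads used in the text, a tie would require five erroneous reads, which the reported stability makes unlikely.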
3.4 Uniqueness Evaluation
The inter-Fractional Hamming Distance (inter-FHD) [12] is calculated as the uniqueness evaluation of the SSUs. Considering the 24 voltage stressed SSUs on each chip as a 24-bit weak PUF, the distribution of the inter-FHD over the 99 chips is presented in Fig. 4. The average inter-FHD is 51.7% and the standard deviation is 11.4%, whereas for an ideal Binomial distribution with success probability P=0.5, the mean is 50% and the standard deviation is 10.2%. Please note that the uniqueness evaluation focuses on the voltage stressed breakdown SSUs because, for the plasma induced breakdown SSUs, the responses are highly biased and post processing would be required to extract randomness, for example using OR gates at the outputs of multiple SSUs to generate an unbiased bit as explained in Section 4.1.
3.5 Statistical Analysis of the PUF Responses
We evaluate the statistical dependence between pairs of bits generated by SSUs after voltage stressed oxide breakdown using various statistical distance measures. We consider pairs because we have only a limited number of bits per location, so going beyond the pairwise probability mass function can lead to a noisier and less reliable evaluation. We are interested in the level of independence because the more independent the bits are, the more secure the PUF is.
Essentially, we use the data to evaluate the pairwise probability mass functions of bits under the following two restrictions: the pairwise probability mass function of bits that have the same antenna ratio, and the pairwise probability mass function of bits that are located next to each other. This in turn enables us to evaluate the statistical dependence of the elements that are most likely to be statistically dependent, that is, dependence due to similar design rules as well as dependence between PUFs that are close together.
We calculate the distance between the evaluated pairwise probability mass function (denoted $P$) and an independent one with the same marginal probability mass functions (denoted $Q$) by assigning them to various statistical distance measures. This enables us to demonstrate the level of independence between pairs of bits. The results are presented in Table IV for the following statistical distance measures: the Kullback-Leibler (KL) divergence [8], which is defined as
(2)  $D(P\|Q) = \sum_{x} P(x)\log\frac{P(x)}{Q(x)}$
and the total variation distance (TVD) [13]
(3)  $\delta(P,Q) = \frac{1}{2}\sum_{x} \left|P(x)-Q(x)\right|$
Table IV shows that the average statistical distance between $P$ and $Q$ is very small across measures; note that both measures are bounded for pairs of binary random variables, and that they are equal to zero when the two probability mass functions are identical. Hence, these results indicate that this PUF response is very close to being statistically independent.
Note that there are many other statistical distance measures that can be used for this purpose, and here we provide only two of the most popular ones. The KL divergence measures the distance between two probability mass functions in terms of the increase in the average codeword length when a source distributed according to $P$ is compressed with a code that is optimal for $Q$, whereas the total variation distance is the normalized distance between the two probability mass functions in terms of the $\ell_1$ norm. However, the operational meaning of these statistical distances, as well as of many others, cannot be directly related to security; in Subsection 5.5 we propose guesswork as a new statistical distance measure that has an operational meaning from a security perspective.
Statistical Distance  Max  Min  Mean 

KL  
TVD 
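The pairwise evaluation of Section 3.5 can be sketched as follows: estimate the joint probability mass function of a pair of bit locations from the per-chip bit pairs, build the independent probability mass function from its marginals, and compute the KL divergence and the total variation distance between the two. The function names, the use of base-2 logarithms, and the toy bit pairs are assumptions made for illustration.

```python
import math
from collections import Counter


def joint_pmf(pairs):
    """Empirical joint pmf of two bits from a list of (bit_a, bit_b) pairs."""
    counts = Counter(pairs)
    n = len(pairs)
    return {(a, b): counts.get((a, b), 0) / n for a in (0, 1) for b in (0, 1)}


def product_of_marginals(p):
    """Independent pmf with the same marginals as the joint pmf p."""
    pa = {a: p[(a, 0)] + p[(a, 1)] for a in (0, 1)}
    pb = {b: p[(0, b)] + p[(1, b)] for b in (0, 1)}
    return {(a, b): pa[a] * pb[b] for a in (0, 1) for b in (0, 1)}


def kl_divergence(p, q):
    """KL divergence in bits; terms with p(x) = 0 contribute zero."""
    return sum(p[x] * math.log2(p[x] / q[x]) for x in p if p[x] > 0)


def total_variation(p, q):
    """Half of the l1 distance between the two pmfs."""
    return 0.5 * sum(abs(p[x] - q[x]) for x in p)


# Toy data: one (bit_a, bit_b) pair per chip for a hypothetical pair of SSUs.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (0, 0), (1, 0), (0, 1)]
p = joint_pmf(pairs)
q = product_of_marginals(p)
print(kl_divergence(p, q), total_variation(p, q))
```

Because the independent pmf is built from the marginals of the joint pmf, it is nonzero wherever the joint pmf is nonzero, so the logarithm in the KL term is always well defined.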
4 Gate Oxide Breakdown PUF Implementations
4.1 Plasma Induced Breakdown PUF
To reduce the bias in this structure, we propose to use OR gates at the outputs of the SSUs, which is a more area-efficient approach than using even larger antenna segments, since larger antennas show limited impact on the breakdown probability. Fig. 5 (a) shows an exemplary implementation of the plasma induced breakdown PUF. The on-chip 10 MΩ precision resistor is shared between two SSUs, and only one of EVA1 and EVA2 is asserted at a time. Please note that a precision resistor can be shared by more than two SSUs to reduce the effective area required per bit, but only one of the SSUs is asserted at a time. The outputs of the buffer gates are determined by the breakdown of the SSUs.
Take Test3 as an example. When 11 Test3 SSUs are ORed together, the probability of generating a zero is $(1-0.051)^{11} \approx 0.56$, and the area is 880 µm², which is still more area-efficient than a practical SRAM PUF implementation where a (511,19,119)-BCH code is suggested to correct a 15% error probability at different corners [16]. For such an SRAM PUF to generate 19 information bits, the estimated BCH implementation is 12000 XOR gates [23], or an area of 54000 µm² for the 65nm technology we used. To generate the same 19 bits of response with Test3, the estimated area is about 16720 µm². The comparison shows that the SRAM PUF is more than 3X the size of the plasma induced breakdown PUF. In addition, the ECC execution latency is eliminated for the plasma induced breakdown PUF.
4.2 Voltage Stressed Breakdown PUF
The probability of voltage stressed breakdown is much higher than that of plasma induced breakdown; therefore no OR gates are needed to reduce the response bias, but a stress path for each SSU is required. Fig. 5 (b) shows an exemplary implementation of the voltage stressed breakdown PUF. A precision resistor is shared by 3 SSUs. Before response generation, the PUF is stressed through the stress path while the outputs of the SSUs are connected to GND and all EVA signals are set to zero. Once the SSUs are stressed, a normal voltage is applied to the stress path and one of the EVAs is asserted at a time for evaluation. To generate a bit, approximately 1 inverter and 4 transistors are needed, which translates to an area of only 4 µm² in the 65nm technology. The PUF can be stressed on chip, for example with a charge pump with an area overhead of 12200 µm². Therefore, to generate 19 bits of response, the total area is approximately 12276 µm², which is roughly 27% smaller than the plasma induced breakdown PUF (16720 µm²). As the number of bits increases, the area reduction becomes more evident since the charge pump is shared among multiple bits. The PUF can also be stressed from outside the chip to save even more area, but an antifuse cell may be needed on the stress path. To stress the PUF, the antifuse cell has to be permanently programmed to the closed state. Therefore, if the antifuse cell is already in the closed state before stress, the PUF has been contaminated and should be discarded. Please note that if the PUF is stressed from outside the chip, an attacker may destroy the PUF or introduce more breakdowns by further stressing it, but the PUF is not programmable or clonable because the breakdown of each transistor cannot be controlled.
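The area arithmetic of Sections 4.1 and 4.2 can be reproduced with a short sketch. The numbers are the ones quoted in the text (a Test3 breakdown probability of 5.1%, a Test3 cell area of 80, a per-bit area of 4, a charge pump area of 12200, and a BCH area of 54000, all in the same area unit); the function names are hypothetical, and the sketch ignores shared resistors and routing, as the text's estimate appears to do.

```python
def or_zero_probability(p_breakdown, num_ssus):
    """Probability that an OR over num_ssus independent SSUs outputs zero."""
    return (1.0 - p_breakdown) ** num_ssus


def plasma_puf_area(num_bits, ssus_per_bit, ssu_area):
    """Area of a plasma induced breakdown PUF built from ORed SSUs."""
    return num_bits * ssus_per_bit * ssu_area


def stressed_puf_area(num_bits, bit_area, charge_pump_area):
    """Area of a voltage stressed PUF with one shared charge pump."""
    return num_bits * bit_area + charge_pump_area


p_zero = or_zero_probability(0.051, 11)
plasma = plasma_puf_area(num_bits=19, ssus_per_bit=11, ssu_area=80)
stressed = stressed_puf_area(num_bits=19, bit_area=4, charge_pump_area=12200)
sram_bch = 54000

print(f"P(zero) with 11 ORed Test3 SSUs: {p_zero:.3f}")
print(f"plasma PUF area: {plasma}, stressed PUF area: {stressed}")
print(f"SRAM/plasma ratio: {sram_bch / plasma:.2f}")
print(f"stressed vs plasma reduction: {100 * (1 - stressed / plasma):.1f}%")
```

Since the charge pump area is a fixed term, increasing `num_bits` in this sketch shows how the relative saving of the voltage stressed PUF grows with the response length.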
5 Guesswork for Evaluating Security
A PUF is expected to provide a certain security level; PUFs are implemented in hardware and so it is desired to minimize the hardware size required to achieve this security level. Therefore, accurate tools for evaluating the security level as a function of the hardware size are needed. Guesswork has been suggested as a measure for the security level of PUFs [33] by connecting PUF security to the framework of password security. In this section we present guesswork along with other measures for the security level of PUFs, and discuss differences between those measures.
In many scenarios it is reasonable to assume that an attacker can make multiple guesses in an attempt to find a PUF response or, alternatively, to learn its structure. Guesswork can be used to evaluate the security level under various types of attacks of this sort, such as key stretching [24], the guesswork of strong PUFs when bias is present (e.g., when a model building attack enables an attacker to better predict the responses to the next challenges), and the average number of guesses for various probabilities of attack failure.
In this context it is important to note that in terms of the attack model, it is assumed that the attacker can generate responses to challenges based on the statistical profile of the PUF, but does not have access to the device itself (i.e., it is not doing model building). Essentially it means that in this section we focus on attacks against weak PUFs rather than machine learning attacks against strong PUFs. Finally, we also propose guesswork as a new measure for statistical distance that quantifies from security perspective how close random variables are to being statistically independent.
5.1 Some Background
The inherent random signature in the hardware determines how hard it is to guess the response of a weak PUF. The number of guesses required to correctly find the response is termed guesswork [14], [26] and denoted by $G(X)$, where $X$ is a random variable whose response is guessed. $G(X)$ is also a random variable, where the probability of having $G(x)$ guesses is $P_X(x)$. The $\rho$th moment of guesswork is
(4)  $E\left[G(X)^{\rho}\right] = \sum_{x} G(x)^{\rho} P_X(x)$
and the $\rho$th moment of the conditional guesswork is defined to be
(5)  $E\left[G(X|Y)^{\rho}\right] = \sum_{y} P_Y(y)\, E\left[G(X|Y=y)^{\rho}\right]$
where $G(X|Y=y)$ is the guesswork when $Y=y$. Furthermore, it has been shown [14] that a dictionary attack, that is, guessing values in descending order of the probability mass function (i.e., the values of $P_X(x)$), is optimal in the sense that it minimizes any moment of guesswork and maximizes the probability of guessing the right response within a certain number of guesses.
Arikan [14] presented bounds for the $\rho$th moment of the optimal guesswork, $G^{*}(X|Y)$, based on which he showed that when the elements of $X$ and $Y$ are drawn i.i.d., the exponential growth rate of the optimal guesswork is
(6)  $\lim_{n\to\infty} \frac{1}{n}\log E\left[G^{*}(X|Y)^{\rho}\right] = \rho\, H_{\frac{1}{1+\rho}}(X_1|Y_1)$
where $n$ is the size of $X$ and $Y$, $P(x,y)$ is the probability that $X_i=x$, $Y_i=y$, where $X_i$, $Y_i$ are the $i$th elements in $X$ and $Y$ respectively, and
(7)  $H_{\frac{1}{1+\rho}}(X_1|Y_1) = \frac{1}{\rho}\log \sum_{y}\left(\sum_{x} P(x,y)^{\frac{1}{1+\rho}}\right)^{1+\rho}$
is Renyi's conditional entropy with parameter $\frac{1}{1+\rho}$ [14]. Note that the average guesswork is obtained when $\rho=1$, in which case the growth rate is equal to $H_{1/2}(X_1) \geq H(X_1)$ with equality only when $P_{X_1}$ is the uniform probability mass function; that is, the rate at which the average guesswork increases is larger than the Shannon entropy, which corresponds to guessing over the typical set.
Note that although equation (6) is asymptotic in $n$, that is, $n\to\infty$ (i.e., it considers the growth rate of guesswork), the guesswork converges very fast to the exponential term, that is, to $2^{n\rho H_{\frac{1}{1+\rho}}(X_1|Y_1)}$. It is also important to note that, just like guesswork, min-entropy [27] and mutual information [8] also represent exponential growth rates that have different operational meanings in terms of security.
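As an illustration of the growth rate discussed above, the following sketch computes the moments of the optimal guesswork by brute force for a short string of i.i.d. biased bits (no side information) and compares the normalized logarithm of the average guesswork with the Renyi entropy of order 1/2. The function names, the moment order `rho`, and the chosen bias and length are assumptions for illustration; enumerating all strings is only feasible for small lengths.

```python
import itertools
import math


def sequence_probabilities(p_one, n):
    """Probabilities of all 2^n strings of n i.i.d. Bernoulli(p_one) bits."""
    probs = []
    for bits in itertools.product((0, 1), repeat=n):
        ones = sum(bits)
        probs.append(p_one ** ones * (1 - p_one) ** (n - ones))
    return probs


def guesswork_moment(probs, rho=1):
    """rho-th moment of the optimal guesswork (guess in descending probability)."""
    ordered = sorted(probs, reverse=True)
    return sum(p * rank ** rho for rank, p in enumerate(ordered, start=1))


def renyi_entropy(p_one, alpha):
    """Renyi entropy in bits of one Bernoulli(p_one) bit, for alpha != 1."""
    return math.log2(p_one ** alpha + (1 - p_one) ** alpha) / (1 - alpha)


n = 10
for p_one in (0.5, 0.6):
    average = guesswork_moment(sequence_probabilities(p_one, n), rho=1)
    print(p_one, math.log2(average) / n, renyi_entropy(p_one, 0.5))
```

For uniform bits the average guesswork over `n` bits is exactly `(2**n + 1) / 2`, which gives a quick sanity check of the enumeration.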
5.2 A Review of Contemporary Security Measures for PUFs
We divide the methods of evaluating the security level into two groups: methods that do not require a direct evaluation of the probability mass function, and methods that do require this kind of evaluation.
The main two methods for evaluating the security level without estimating the probability mass function are as follows.

Inter- and Intra-FHD: Calculating the inter- and intra-FHD is a very popular method of evaluating the security level [6, 12, 3]. The closer the average inter-FHD is to 50%, the more unique a PUF is considered to be (i.e., the PUF response is more balanced), whereas the smaller the average intra-FHD is, the more reliable the PUF response is, that is, the more stable it is.

NIST statistical test suite for random and pseudorandom number generators: NIST offers a statistical test suite [7] that makes it possible to determine whether or not a string of bits is random enough according to various criteria. This approach can be taken, for example, when considering PUF-based pseudorandom number generators [25].
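The inter- and intra-FHD computations in the first item above can be sketched as follows. The function names and the toy responses are hypothetical; the ideal standard deviation assumes that the bits of two responses differ independently with probability 0.5, as in the Binomial reference used in Section 3.4.

```python
import math
from itertools import combinations


def fractional_hamming_distance(a, b):
    """Fraction of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def inter_fhd(responses):
    """FHD between the responses of every pair of distinct chips."""
    return [fractional_hamming_distance(a, b) for a, b in combinations(responses, 2)]


def intra_fhd(reference, remeasurements):
    """FHD between a chip's reference response and its re-measurements."""
    return [fractional_hamming_distance(reference, r) for r in remeasurements]


def ideal_fhd_std(num_bits, p_differ=0.5):
    """Standard deviation of the FHD when bits differ i.i.d. with p_differ."""
    return math.sqrt(p_differ * (1 - p_differ) / num_bits)


# Toy 4-bit responses from three hypothetical chips.
chips = [[0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
print(inter_fhd(chips))
print(intra_fhd(chips[0], [[0, 0, 1, 1], [0, 0, 1, 0]]))
print(ideal_fhd_std(24))
```

For 24-bit responses, `ideal_fhd_std(24)` gives the Binomial reference against which a measured inter-FHD standard deviation can be compared.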
The other two methods for evaluating the security level that rely on evaluating the underlying probability mass function are the following.

Mutual Information: The mutual information between two random variables $X$ and $Y$ (i.e., $I(X;Y)$) quantifies the amount of information that $X$ reveals about $Y$ and vice versa. For example, when $X$ and $Y$ are independent $I(X;Y)=0$, whereas when $X=Y$ then $I(X;Y)=H(X)$. Mutual information has been proposed as a measure for the security level of PUFs [17, 22].

Min-entropy: In the context of guessing a secret, the min-entropy [27] represents the maximum probability of guessing a secret in a single guess. Since the PUF response is a secret which is chosen at random, min-entropy has also been proposed as a measure for the security level [21]. In the context of machine learning attacks, min-entropy has been related to how quickly a strong PUF can be broken [20]. Furthermore, it provides a guarantee for the size of a uniformly distributed key [19] that can be extracted.
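The two measures above can be computed directly from a probability mass function, as in the following sketch. The function names, the base-2 logarithms, and the example distributions are illustrative assumptions.

```python
import math


def min_entropy(pmf):
    """Min-entropy in bits: minus log2 of the most likely outcome's probability."""
    return -math.log2(max(pmf))


def mutual_information(joint):
    """Mutual information in bits, where joint[x][y] = P(X = x, Y = y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log2(pxy / (px[x] * py[y]))
        for x, row in enumerate(joint)
        for y, pxy in enumerate(row)
        if pxy > 0
    )


# A uniform 2-bit secret has 2 bits of min-entropy.
print(min_entropy([0.25, 0.25, 0.25, 0.25]))
# Independent bits share no information; identical bits share one full bit.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))
```

Min-entropy depends only on the single most likely response, which is why it captures a one-guess attacker, whereas mutual information depends on the whole joint distribution.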
5.3 The Merits of Guesswork
In this subsection we explain why guesswork can also be considered a good measure for the security level of PUFs by going through each of the criteria presented in Subsection 5.2 and explaining the differences between what they evaluate and what guesswork evaluates.
Guesswork and inter- and intra-FHD: The two elements that affect the security level of a PUF are bias and noise. Inter-FHD provides an evaluation of how biased a PUF is and intra-FHD evaluates the noise level, yet they do not provide a single quantifiable measure that enables one to accurately evaluate the interplay between the two. This in turn may lead to an inaccurate evaluation of the security level, which might result in either a PUF of excessive size or a PUF that does not meet the required security level.
On the other hand, Guesswork incorporates the effect of bias and noise into a single measure that enables us to evaluate how much they affect the security level. This interplay is presented for the th moment of guesswork in [18] by the following equation
(8) 
where is the optimal guesswork when is drawn i.i.d. Bernoulli, where , and the samples of the PUF encounter an additive noise which is drawn i.i.d. Bernoulli, whose entropy is . This is proven in [18].
Therefore, when a PUF designer considers the following two PUFs: a PUF with inter-FHD and intra-FHD ; another PUF with inter-FHD and intra-FHD , the designer cannot determine which one is more secure based on inter- and intra-FHD alone. On the other hand, by assigning the bias and level of noise to equation (8) the designer can see that when the noise level is and bias is (which leads to inter-FHD when the bits are i.i.d.) the average guesswork is equal to , whereas when the noise is and the bias is the average guesswork equals . Therefore, guesswork enables a PUF designer to determine which one is more secure and by how much; in this example the PUF with inter-FHD and intra-FHD is times more secure than the other one.
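To make the notion of average guesswork concrete, the following sketch computes it exactly for short responses by brute force. It is not equation (8): it assumes noiseless i.i.d. Bernoulli bits and an attacker who guesses strings in decreasing order of probability. The function name `average_guesswork` and the parameter values are hypothetical.

```python
import math

def average_guesswork(n, p):
    """Exact average number of guesses for n i.i.d. Bernoulli(p) bits (no noise).

    The attacker guesses strings from most to least probable. Strings with the
    same number of ones are equally likely, so they are handled as one group.
    """
    p = min(p, 1.0 - p)      # by symmetry, treat '1' as the less likely bit
    guessed = 0              # number of strings guessed before this group
    total = 0.0
    for w in range(n + 1):   # fewer ones means a more probable string
        count = math.comb(n, w)
        prob = (p ** w) * ((1.0 - p) ** (n - w))
        # Sum of the guess indices guessed+1, ..., guessed+count.
        rank_sum = count * guessed + count * (count + 1) // 2
        total += prob * rank_sum
        guessed += count
    return total

print(average_guesswork(8, 0.5))   # unbiased: (2**8 + 1) / 2 guesses on average
print(average_guesswork(8, 0.3))   # biased bits need fewer guesses on average
```

For unbiased bits every string is equally likely and the average is (2^n + 1)/2; any bias lowers it, which is the effect the single guesswork figure captures.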
NIST statistical test suite: The main disadvantage of the NIST statistical test suite is that it is not designed to take into account the effect of noise on the security level. Therefore, a PUF with inter-FHD and intra-FHD (i.e., a stable PUF), as well as a PUF with inter-FHD and intra-FHD (i.e., a maximally noisy PUF), will both pass the NIST test suite.
Moreover, it determines whether or not a string is sufficiently random compared to strings that are drawn i.i.d. Bernoulli. Therefore, when there is any inherent bias (e.g. the bits are drawn i.i.d. Bernoulli) the PUF does not pass these tests. However, the effect of small bias is extremely small in terms of guesswork. This can be seen by assigning a small bias to equation (8); for example, when the bits of a stable PUF are drawn i.i.d. Bernoulli the average guesswork is equal to .
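The contrast between failing a randomness test and losing very little guesswork can be sketched numerically. The example below uses the frequency (monobit) test from the NIST suite [7] and the asymptotic per-bit exponent of the average guesswork for noiseless i.i.d. bits, which by Arikan's result [14] is the Rényi entropy of order 1/2. The bias of 0.51, the string length of one million bits, and the function names are assumptions made for illustration; this is not equation (8).

```python
import math

def monobit_p_value(n_ones, n):
    """P-value of the NIST frequency (monobit) test for a string with n_ones ones."""
    s_obs = abs(2 * n_ones - n) / math.sqrt(n)   # |#ones - #zeros| / sqrt(n)
    return math.erfc(s_obs / math.sqrt(2.0))

def average_guesswork_exponent(p):
    """Per-bit growth exponent (in bits) of the average guesswork, noiseless case.

    This is the Renyi entropy of order 1/2 of a Bernoulli(p) bit.
    """
    return 2.0 * math.log2(math.sqrt(p) + math.sqrt(1.0 - p))

n = 1_000_000
n_ones = 510_000                      # a string whose fraction of ones is exactly 0.51

p_value = monobit_p_value(n_ones, n)
print(p_value < 0.01)                 # True: the string is rejected at the 1% level
print(average_guesswork_exponent(0.51))
print(average_guesswork_exponent(0.5))
```

Under these assumptions the biased string is rejected decisively, while the guesswork exponent stays above 0.999 bits per response bit, compared with 1 for unbiased bits.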
Mutual information and minentropy: In many cases, an attacker can have multiple guesses in which he tries to find the PUF response or learn its structure. Guesswork provides a framework through which a PUF designer can evaluate the security level under such attacks.
Similarly to guesswork, mutual information and min-entropy both incorporate noise and bias into a single expression. In fact, in the context of finding the correct response to a challenge, mutual information and min-entropy are both special cases of guesswork; hence, from this perspective, guesswork can be viewed as a generalization of these two methods.
Min-entropy is the exponent of the maximum probability that the number of guesses is equal to 1, which is also the probability that the optimal guesswork is equal to 1; that is, in the i.i.d. case the min-entropy is
(9) 
In addition, it is shown in [18] that min-entropy also captures the average guesswork for strong PUFs under model building attacks.
In [18] it is also shown that the mutual information between the initial PUF observation and a noisy one is the average guesswork when guessing across the typical set [8], that is, the most probable set. In this case the probability that a PUF response is outside this set is very close but not equal to , that is, the average number of guesses is approximately when the probability of attack failure is , where . Therefore, when it comes to guessing, mutual information is the same as guessing with a very small probability of attack failure; this is again a special case of guesswork. The bounds in equation (8) are achieved when the attacker stops only once he has successfully guessed the secret, even if he has to go through all possibilities.
5.4 The Impact of Noise on the Security Level
In this subsection we evaluate the impact of noisy responses in terms of the average guesswork of a fixed-length response; we consider guesswork for the reasons given in Subsection 5.3. We compare the voltage-stressed breakdown PUF presented in Section 4.2, whose noise level is 0.12%, with some weak PUFs that have been reported in the literature [16], and show that it performs better in terms of the average number of guesses required to break it. For a 128-bit response generated by a PUF, we define the number of effective bits as , that is, the effective number of bits according to which the exponent of the average guesswork increases when there is bias and noise as defined in equation (8).
For the voltage-stressed breakdown PUF, is 0.9866 as calculated from equation (8) for its 0.12% noise level at a worst corner; its number of effective bits is therefore , which gives an average guesswork of approximately . For a weak PUF with 15% error probability, the number of effective bits is only , with an average guesswork of approximately . Our proposed PUFs therefore present a significant improvement in terms of the average guesswork. Figure 6 shows the number of effective bits for various noise levels. The number of effective bits drops dramatically as the noise level increases, which reflects how severely noise can affect the security level in terms of the average guesswork.
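The per-bit figure of 0.9866 quoted above is consistent with an exponent of 1 - h(q) for unbiased bits, where h is the binary entropy function and q is the noise level. The sketch below uses that closed form as an assumption inferred from the quoted number; it is not a restatement of equation (8), and the function names are hypothetical.

```python
import math

def binary_entropy(q):
    """Binary entropy h(q) in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)

def effective_bits(n_bits, noise):
    """Effective bits of an unbiased n_bits-bit response (assumed exponent 1 - h(noise))."""
    return n_bits * (1.0 - binary_entropy(noise))

for noise in (0.0012, 0.05, 0.15):
    per_bit = 1.0 - binary_entropy(noise)
    print(f"noise={noise:.4f}  per-bit exponent={per_bit:.4f}  "
          f"effective bits={effective_bits(128, noise):.1f}")
```

Under this assumption a 0.12% noise level leaves a little over 126 of the 128 bits effective, whereas a 15% error probability leaves only about 50.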
5.5 Guesswork as a Statistical Distance Measure
In contrast with the previous subsections, which used guesswork to evaluate the security level, in this subsection we propose guesswork as a new statistical measure of the distance between random variables. The advantage of using guesswork as a statistical distance measure over other measures, such as those presented in Subsection 3.5, is that the number associated with this distance has operational meaning in terms of security.
The new statistical distance is defined as follows:
(10) 
When it means for large that
(11) 
Hence, when and are independent and so , whereas leads to as well as . We present our evaluations of and the marginals , when (i.e., the average guesswork) in Table V. These results show that when is conditioned on there is a loss of about on average in terms of the exponent of , that is, when , . Therefore, the statistical dependence between and is very weak in terms of guesswork.
Statistical Distance  Max  Min  Mean 

GW 
6 Conclusion
In this paper we implement and analyze highly stable PUFs exploiting uncontrollable plasma-induced and voltage-stressed gate oxide damage. The proposed SSUs are fabricated and measured from 99 testchips. Measurement results show that the SSUs are highly stable, so significant area reduction can be achieved by eliminating the ECC implementation. Furthermore, we show that the responses are unbiased and unique, and we analyze the data of our testchips using various statistical distance measures to show that these bits are independent. Finally, we present the merits of guesswork as a measure for evaluating the security level of PUFs.
References
 [1] A. R. Krishna et al. MECCA: a robust low-overhead PUF using embedded memory array. In CHES, Oct 2011.
 [2] M. T. Rahman, D. Forte, Q. Shi, G. K. Contreras, and M. Tehranipoor. CSST: Preventing distribution of unlicensed and rejected ICs by untrusted foundry and assembly. In IEEE International Symposium on DFT, Oct 2014.
 [3] R. Maes and I. Verbauwhede. Physically Unclonable Functions: A Study on the State of the Art and Future Research Directions. In Towards Hardware-Intrinsic Security. Springer Berlin Heidelberg, 2010.
 [4] K. Lofstrom, W. R. Daasch, and D. Taylor. IC identification circuit using device mismatch. In Proc. ISSCC, Feb 2000.
 [5] J.W. Lee et al. A technique to build a secret key in integrated circuits for identification and authentication applications. In IEEE International Symposium on VLSI Circuits, 2004.
 [6] A. Maiti and P. Schaumont. Improving the quality of a Physical Unclonable Function using configurable Ring Oscillators. In International Conference on FPL, Aug 2009.
 [7] A. Rukhin et al. A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. NIST, 2010.
 [8] T.A. Cover and J.A. Thomas. Elements of Information Theory. John Wiley & Sons, 2006.
 [9] P. J. Liao et al. Physical origins of plasma damage and its process/gate area effects on high-k metal gate technology. In IEEE IRPS, April 2013.
 [10] U. Rührmair, C. Jaeger, and M. Algasinger. An attack on PUF-based session key exchange and a hardware-based countermeasure: Erasable PUFs. In International Conference on FC, 2011.
 [11] K. K. Kim. Reliable CMOS VLSI Design Considering Gate Oxide Breakdown.
 [12] A. Maiti, V. Gunreddy, and P. Schaumont. A Systematic Method to Evaluate and Compare the Performance of Physical Unclonable Functions. In Embedded Systems Design with FPGAs, 2013.
 [13] J. E. Kennedy and M. P. Quine. The total variation distance between the binomial and Poisson distributions. Ann. Probab., Jan 1989.
 [14] E. Arikan. An inequality on guessing and its application to sequential decoding. IEEE Trans. on Inf. Th., 1996.
 [15] F. Tang et al. CMOS On-Chip Stable True-Random ID Generation Using Antenna Effect. IEEE Electron Device Letters, Jan 2014.
 [16] J. Guajardo et al. FPGA Intrinsic PUFs and Their Use for IP Protections. In CHES, Berlin, Heidelberg, Sep 2007. Springer Berlin Heidelberg.
 [17] T. Ignatenko et al. Estimating the Secrecy-Rate of Physical Unclonable Functions with the Context-Tree Weighting Method. In IEEE ISIT, July 2006.
 [18] W. Wang, Y. Yona, S. Diggavi, and P. Gupta. Design and analysis of stability-guaranteed PUFs. Available at https://arxiv.org/abs/1701.05637.
 [19] Y. Dodis, L. Reyzin, and A. Smith. Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data. In Proceedings of EUROCRYPT, May 2004.
 [20] J. Delvaux, D. Gu, and I. Verbauwhede. Upper bounds on the min-entropy of RO Sum, Arbiter, Feed-Forward Arbiter, and S-ArbRO PUFs. In IEEE AsianHOST, Dec 2016.
 [21] S. Katzenbeisser et al. PUFs: Myth, Fact or Busted? A Security Evaluation of Physically Unclonable Functions (PUFs) Cast in Silicon. In CHES, Sept 2012.
 [22] P. Tuyls et al. Information-Theoretic Security Analysis of Physical Uncloneable Functions. In Financial Cryptography and Data Security, Feb 2005.
 [23] X. Zhang. VLSI Architectures for Modern Error-Correcting Codes. New York, NY, USA: Taylor & Francis, 2015.
 [24] J. Kelsey et al. Secure applications of low entropy keys. In Proc. of ISW, Sep 1998.
 [25] C.W. O’Donnell, G.E. Suh, and S. Devadas. PUF-Based Random Number Generation. In MIT CSAIL CSG Technical Memo 481, 2004.
 [26] J.L. Massey. Guessing and entropy. In ISIT, 1994.
 [27] Y. Dodis, L. Reyzin, and A. Smith. Fuzzy Extractors: A Brief Survey of Results from 2004 to 2006. In Security with Noisy Data. Springer Berlin Heidelberg, Berlin, Heidelberg, 2007.
 [28] B. Gassend, D. Clarke, M.V. Dijk, and S. Devadas. Silicon physical random functions. In Proc. CCSC, 2002.
 [29] D.E. Holcomb, W.P. Burleson, and K. Fu. Power-Up SRAM State as an Identifying Fingerprint and Source of True Random Numbers. IEEE Transactions on Computers, 2009.
 [30] W. C. Wang et al. Implementation of Stable PUFs Using Gate Oxide Breakdown. In IEEE AsianHOST, Oct 2017.
 [31] M. Majzoobi, F. Koushanfar, and S. Devadas. FPGA PUF using programmable delay lines. In IEEE International Workshop on WIFS, Dec 2010.
 [32] J. Delvaux and I. Verbauwhede. Key-recovery attacks on various RO PUF constructions via helper data manipulation. In Proc. DATE, March 2014.
 [33] W. Wang, Y. Yona, S. Diggavi, and P. Gupta. LEDPUF: Stability-Guaranteed Physical Unclonable Functions through Locally Enhanced Defectivity. In IEEE HOST, May 2016.
 [34] R. Liu, H. Wu, Y. Pang, H. Qian, and S. Yu. A highly reliable and tamper-resistant RRAM PUF: Design and experimental validation. In IEEE HOST, pages 13–18, May 2016.
 [35] M. Bhargava and K. Mai. A High Reliability PUF Using Hot Carrier Injection Based Response Reinforcement. In CHES, Aug 2013.
 [36] N. Liu, S. Hanson, D. Sylvester, and D. Blaauw. OxID: On-chip one-time random ID generation using oxide breakdown. In Symposium on VLSI Circuits, June 2010.