I Introduction
Reverse engineering of an integrated circuit (IC) aims to reconstruct a behavioral model of the design implemented in the IC. Destructive reverse engineering is an expensive and tedious process which leaves the IC under test unusable [1]. In recent years, nondestructive reverse engineering to recover the functionality of a given IC has gained much interest [2]. Nondestructive techniques that reconstruct device-layer models of the IC using high-tech X-ray tomography equipment have been proposed [3, 4, 5]. They require expensive, sophisticated infrastructure and can be extremely time-consuming. Black-box functional analysis techniques, which characterize the machine behavior using only input-output observations, have also been proposed. These usually perform brute-force exploration [6, 7, 8, 9]. They are relatively inexpensive but are limited to extremely small machines due to their exponential algorithmic complexity.
Power analysis attacks are side-channel attacks which use power consumption values to leak information from devices. These attacks are noninvasive in nature and use relatively inexpensive equipment [10]. By observing the power consumption trace of a system with respect to a series of input vectors, it is possible to guess the internal operations or the data being processed.
With the explosive growth of IoT devices, smart cards and other small electronic gadgets, it is essential to understand the various types of vulnerabilities they face. In this paper, we propose a noninvasive reverse engineering attack against small-scale digital systems. Using combined functional and power analysis, we propose a method to recover finite state machines from their synchronous sequential circuit implementations, as shown in Figure 1. Combining the two reduces the attack time and memory requirements while increasing the scalability of the attack.
II Groundwork: HD-Model from Power Analysis
Let $M = (I, O, S, \delta, \lambda, s_0)$ be a deterministic finite state machine (FSM) or Moore machine, where $I$, $O$ and $S$ are finite nonempty sets of inputs, outputs and states respectively, $\delta: S \times I \to S$ is a state transition function, $\lambda: S \to O$ is an output function and $s_0 \in S$ is the start state.
In sequential circuit implementations of FSMs, states are encoded as Boolean vectors. Let $B: S \to \{0,1\}^k$ denote a state encoding function where each state is mapped to a Boolean vector of size $k$ and is stored in a state register with $k$ flip-flops.
Let $HD(B(s_i), B(s_j))$ denote the Hamming distance (HD) between two Boolean vectors $B(s_i)$ and $B(s_j)$ of the same length. Circuit implementations in CMOS technology are susceptible to information leakage through power side channels [10, 11]. The Hamming distance model assumes that the dynamic power dissipated by a sequential CMOS circuit during its transition from state $s_i$ to state $s_j$ is correlated to $HD(B(s_i), B(s_j))$. Given an unknown FSM, we are interested in finding the Hamming distances of its transitions using power analysis attacks in order to discover the state encodings.
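As a small illustration of the quantity the HD model relies on, the sketch below computes the Hamming distance between two state encodings. Representing each encoding as a Python integer whose bits mirror the flip-flops of the state register is an assumption made here for convenience; the function name is hypothetical.

```python
def hamming_distance(enc_a: int, enc_b: int) -> int:
    """Count the bit positions in which two equal-width encodings differ."""
    # XOR leaves a 1 exactly where the two encodings disagree.
    return bin(enc_a ^ enc_b).count("1")


# A 3-flip-flop state register moving from 101 to 011 toggles two flip-flops.
toggled = hamming_distance(0b101, 0b011)

# A self-loop leaves the register unchanged, so no flip-flop toggles.
self_loop = hamming_distance(0b110, 0b110)
```

Under the HD model, the first transition would be expected to draw more dynamic power than the self-loop.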
First, we perform HD-model-based power analysis on known FSMs to derive a mapping between their transition Hamming distances and the observed power values. This mapping is then used to estimate the transition HD values of unknown FSM implementations through the power side channel during a reverse engineering attack.
Every FSM state register stores the encoding of the current state of the FSM. During a transition, the contents of the state register are updated, which results in power consumption. The Hamming distance between the old and new contents should be strongly correlated to the power consumed. In order to verify the degree of dependency and generate a lookup table to deduce the HD of unknown transitions, sample benchmark machines of varying sizes and connectivity have been tested.
In order to deduce the relationship between the power values and the HD between states for the SAED90nm CMOS technology, a sample set of LGSynth'91 benchmark FSMs [12] of varying sizes was tested with input sequences of varying lengths. Table I shows the Pearson correlation between the HD values and power measurements for 1000 random input vectors. A strong correlation exists between all three statistical measurements of current consumption during transitions and the HD values. In this paper we use average current to infer the Hamming distance of transitions. Figure 2 shows the average current consumption of 1000 transitions and the corresponding Hamming distance for the TBK FSM from the LGSynth'91 suite. The slight overlap between the average current values of consecutive Hamming distances in the figure indicates a possible error of $\pm 1$ during HD inference from power analysis. It is also evident that all 0-HD transitions (self-loops) consume the least power and are easily identifiable. These observations justify the error margin in HD inference and allow self-loops to be identified efficiently while reverse engineering the behavior of an unknown machine.
Benchmark  Input Vectors  Pearson Correlation Coefficient (three current measurements)
DK15  1000  0.96795  0.92984  0.97055
BEECOUNT  1000  0.94014  0.93116  0.94269
BBSSE  1000  0.93645  0.89478  0.93203
TBK  1000  0.9444  0.95789  0.96032
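The correlation reported in Table I can be computed from paired per-transition samples as in the sketch below. The sample values are hypothetical and are not taken from the paper's measurements; the function name is likewise an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: known transition HDs and the average current (uA)
# measured for each transition. NOT the paper's measurements.
hd_values = [0, 1, 1, 2, 3, 3, 4]
avg_current_ua = [22.0, 61.0, 70.0, 118.0, 150.0, 163.0, 190.0]

r = pearson(hd_values, avg_current_ua)
```

A coefficient close to 1 for a known machine is what would justify using average current as a proxy for HD on an unknown one.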
The robustness of the power attack and the reliability of the derived power models can be demonstrated as follows. A power attack is performed on the SSE benchmark FSM while treating it as an "unknown" machine. The attacker can find out that it has 7 inputs, 7 outputs and at least 16 states, and that it is synthesized using the SAED90nm technology. The unknown machine is tested with 500 randomized input sequences using HSPICE simulation, and the average current values for all 500 transitions are stored. Table II demonstrates the accuracy of the Hamming distances inferred using this power attack. It is observed that 426 out of 500 transitions are inferred correctly and the rest are within the error range of $\pm 1$.




Error in Inferred HD  Transitions  Percentage
0  426  85.20%
+1  60  12.00%
-1  14  2.80%
+2  0  0.00%
-2  0  0.00%
After performing similar attacks on a sample of benchmark FSMs {dk15, beecount, bbsse, tbk} from the LGSynth'91 suite, we summarize the findings in Table III in the form of a lookup table. This table can now be used to attack an unknown machine and infer the Hamming distances of its unknown transitions based solely on its power consumption values.
Average Current (µA)  Inferred Hamming Distance
<40  0
40 to 95  1 (±1)
95 to 140  2 (±1)
140 to 170  3 (±1)
170 to 205  4 (±1)
205 to 230  5 (±1)
>230  6 (±1)
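The lookup table can be applied mechanically, as in the following sketch. Several details are assumptions rather than statements from the paper: each band is treated as a half-open interval (a value exactly on a boundary falls into the higher band), the tolerance on nonzero bands is a parameter defaulting to one, and a current above the self-loop band is never mapped to HD 0, in line with the observation that self-loops are easily identifiable. All names are hypothetical.

```python
# (upper bound in uA, nominal HD) for each band of the lookup table
CURRENT_BANDS = [(40, 0), (95, 1), (140, 2), (170, 3), (205, 4), (230, 5)]
MAX_NOMINAL_HD = 6  # band for currents of 230 uA and above


def infer_hd_range(avg_current_ua, tol=1):
    """Map an average current to an inclusive (min, max) range of HD values."""
    nominal = MAX_NOMINAL_HD
    for upper, hd in CURRENT_BANDS:
        if avg_current_ua < upper:
            nominal = hd
            break
    if nominal == 0:
        # Self-loops are assumed to be identified exactly.
        return (0, 0)
    # Assumption: a nonzero band never widens down to a self-loop.
    return (max(1, nominal - tol), nominal + tol)


self_loop_range = infer_hd_range(12.0)   # (0, 0)
mid_range = infer_hd_range(100.0)        # (1, 3)
```

The upper end of the range is not clipped to the register width here; a caller that knows the encoding length would cap it accordingly.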
III Boolean Constraint Based Reverse Engineering Attack
To perform the attack, random input sequences are used for machine traversal and the output sequences along with the corresponding average power traces are captured. These responses are converted into a set of Boolean constraints, which can be solved using a satisfiability solver.
III-A Constraint Formulation for Reverse Engineering
III-A1 Power Analysis Constraints
III-A2 Functional Analysis Constraints
The output function of a Moore FSM depends only on its current state. Therefore, for any two transitions resulting in different outputs, it can be inferred that their resulting states are distinct from one another. On the other hand, identical outputs after two transitions do not necessarily imply identical new states.
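The functional observation above can be turned into a list of state pairs that must receive different encodings. In this sketch, `outputs[t]` is assumed to hold the output observed in the state reached at step `t` of a trace; the function name and the trace are hypothetical.

```python
from itertools import combinations


def distinct_state_pairs(outputs):
    """Return index pairs (i, j), i < j, whose observed outputs differ.

    In a Moore machine, different outputs imply different states, so each
    returned pair must be given distinct encodings. Pairs with equal
    outputs are left unconstrained: they may or may not be the same state.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(outputs)), 2)
        if outputs[i] != outputs[j]
    ]


# Hypothetical trace: steps 0 and 2 show the same output, step 1 differs.
pairs = distinct_state_pairs(["00", "01", "00"])
```

Here steps 0 and 2 are not reported as distinct, which reflects the caveat that identical outputs carry no information about state identity.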
III-A3 Boolean SAT Formulation
The problem of generating a logically equivalent state machine can be expressed as a Boolean satisfiability (SAT) problem. Let input vectors be applied to the target circuit, resulting in output vectors and ranges of inferred Hamming distance values as per Equation 1:
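To discover a binary encoding $B$ of bit-length $k$, we define a set of constraints on the encoding. We define a predicate IdenticalStates for states $s_i$ and $s_j$ being identical (in a self-loop transition) by requiring $HD(B(s_i), B(s_j))$ to be equal to zero:
$$\mathrm{IdenticalStates}(s_i, s_j) := \big( HD(B(s_i), B(s_j)) = 0 \big) \qquad (2)$$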
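Similarly, we define a predicate InferredHD based on Equation 1, where the Hamming distance of the transition from $s_i$ to $s_j$ lies within a given range $[h_{min}, h_{max}]$ of observed Hamming distances:
$$\mathrm{InferredHD}(s_i, s_j, h_{min}, h_{max}) := \big( h_{min} \le HD(B(s_i), B(s_j)) \le h_{max} \big) \qquad (3)$$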
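Non-identical outputs within the observed Output set at the end of transitions must imply distinct states. We define a predicate DistinctStates for states $s_i$ and $s_j$ by requiring the Hamming distance to be a positive integer:
$$\mathrm{DistinctStates}(s_i, s_j) := \big( HD(B(s_i), B(s_j)) > 0 \big) \qquad (4)$$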
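To make the three predicates concrete, the sketch below evaluates them on candidate encodings given as Python integers. This only checks a proposed assignment; it is not the Z3 formulation used in the paper, and the function names are chosen here for illustration.

```python
def hd(enc_i, enc_j):
    """Hamming distance between two encodings given as integers."""
    return bin(enc_i ^ enc_j).count("1")


def identical_states(enc_i, enc_j):
    """IdenticalStates: a self-loop keeps the register contents unchanged."""
    return hd(enc_i, enc_j) == 0


def inferred_hd(enc_i, enc_j, hd_min, hd_max):
    """InferredHD: the transition HD lies in the range from power analysis."""
    return hd_min <= hd(enc_i, enc_j) <= hd_max


def distinct_states(enc_i, enc_j):
    """DistinctStates: differing outputs force differing encodings."""
    return hd(enc_i, enc_j) > 0


# Hypothetical candidate: a transition 010 -> 111 with an inferred range of 1..3
ok = inferred_hd(0b010, 0b111, 1, 3)
```

A solver searches for encodings that make every instantiated predicate true at once, whereas this snippet only verifies one candidate pair at a time.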
To obtain a valid state machine that is logically equivalent to the target machine, we need to find a state assignment with an encoding of length $k$ that satisfies all of the above constraints. In this research, we use the Z3 SMT solver [13] to find a valid state assignment, since it is a highly efficient solver that can generate models involving bit-vectors and solve constraints over them.
III-B Algorithm for Reverse Engineering Attack
Algorithm 1 shows the process of instantiating and solving the constraints (2), (3) and (4) while progressively increasing the encoding length $k$. The algorithm finds a valid state encoding for the smallest value of $k$ for which one exists.
The algorithm initially assumes that every transition results in a new state. Therefore, for $n$ random input vectors, $n$ transitions occur, resulting in $n$ new states. The parameter $n$ is selected based on the number of states and input bits of the target machine (as explained later). As equivalent states are recognized with the help of power analysis and the IdenticalStates constraint, the states are implicitly merged or folded, i.e. the solver assigns the same encoding to these states. Relations between the other states are also revealed during power analysis, and these translate to the InferredHD constraint. Both constraints are applied in lines (6-12), depending on the inferred Hamming distances. In addition, functional analysis reveals input-output behavior, which helps determine distinct states within the unknown machine. Lines (13-17) apply the DistinctStates constraint after comparing every transition in the observed Output set. Upon finding a satisfiable solution, lines (18-20) print the solution; otherwise, lines (21-23) increment $k$ by one. The initial value of $k$ is determined by the number of unique output values observed during application of the $n$ vectors.
It should be noted that the encodings generated lead to recovery of a state machine which is isomorphically equivalent to the implemented one.
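The overall loop can be sketched as follows. This is not the paper's Algorithm 1: the Z3 solver is replaced by a plain backtracking search so that the example needs only the standard library, which limits it to very short traces. The start state is pinned to the all-zero code, which loses no generality because Hamming distances and distinctness are unchanged when every code is XORed with the same constant. All names, the `k_max` cutoff and the sample trace are assumptions.

```python
from math import ceil, log2


def solve_for_k(k, hd_ranges, distinct_pairs):
    """Search for k-bit codes for states 0..n; return a list or None.

    hd_ranges[t-1] is the inclusive (min, max) HD range inferred for the
    transition from state t-1 to state t; distinct_pairs holds (i, j) with
    i < j that must receive different codes.
    """
    n_states = len(hd_ranges) + 1
    codes = [0] * n_states  # state 0 is pinned to the all-zero code

    def consistent(t):
        lo, hi = hd_ranges[t - 1]
        if not lo <= bin(codes[t - 1] ^ codes[t]).count("1") <= hi:
            return False
        return all(codes[i] != codes[t] for i, j in distinct_pairs if j == t)

    def extend(t):
        if t == n_states:
            return True
        for candidate in range(2 ** k):
            codes[t] = candidate
            if consistent(t) and extend(t + 1):
                return True
        return False

    return list(codes) if extend(1) else None


def recover_encoding(hd_ranges, outputs, k_max=8):
    """Return (k, codes) for the smallest feasible k, or None if k_max is hit."""
    if len(outputs) != len(hd_ranges) + 1:
        raise ValueError("need one output per state, including the start state")
    distinct_pairs = [
        (i, j)
        for i in range(len(outputs))
        for j in range(i + 1, len(outputs))
        if outputs[i] != outputs[j]
    ]
    # Unique outputs give a lower bound on the number of distinct states.
    k = max(1, ceil(log2(len(set(outputs)))))
    while k <= k_max:
        codes = solve_for_k(k, hd_ranges, distinct_pairs)
        if codes is not None:
            return k, codes
        k += 1
    return None


# Hypothetical trace: three transitions, the second one a self-loop.
result = recover_encoding([(1, 2), (0, 0), (1, 2)], ["0", "1", "1", "0"])
```

In this trace the self-loop folds states 1 and 2 into one code, and the search also assigns the final state the same code as the start state, so a single flip-flop suffices.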
Selection of an appropriate number of input vectors is essential to ensure traversal of as many transitions as possible. For a machine with $|S|$ states and $p$ primary inputs, if the total number of transitions to be recovered is $T$, then $T = |S| \cdot 2^{p}$. The size of the input vector set is selected to be at least double the value of $T$ so that the algorithm can explore that many transitions in one round; hence we choose $n \geq 2T$. It is still quite likely that not all transitions will be explored. To cover the missing transitions, the algorithm is repeated with a new set of randomized input vectors to obtain a new state transition graph. By identifying common transitions, based on the input, state and change in output value, the two graphs can be merged to find new transitions that were not explored in previous rounds. Since every subsequent round yields diminishing returns, in our experimental implementation we terminate the process when it has recovered 90% of the state transitions of the target machine. Figure 3 shows the methodology for generating more input vectors as needed.
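The sizing rule can be worked through numerically. The sketch assumes a fully specified deterministic machine, so that every state has one outgoing transition per input combination; the function names and the 4-state example are hypothetical.

```python
def total_transitions(num_states, num_input_bits):
    """One transition per (state, input combination) pair."""
    return num_states * 2 ** num_input_bits


def vectors_per_round(num_states, num_input_bits, factor=2):
    """Input vectors per round: at least double the transition count."""
    return factor * total_transitions(num_states, num_input_bits)


def recovery_reached(recovered, total, target=0.90):
    """Stopping rule: enough of the transitions have been recovered."""
    return recovered >= target * total


# Hypothetical machine with 4 states and 2 primary input bits
t = total_transitions(4, 2)   # 4 * 2**2 = 16 transitions
n = vectors_per_round(4, 2)   # 2 * 16 = 32 vectors per round
done = recovery_reached(15, t)
```

Because random vectors revisit transitions, even twice the transition count rarely covers everything in one round, which is why further rounds and a merge step are needed.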
IV Experimental Results and Analysis
All experiments are performed using FSMs from the LGSynth'91 benchmark suite. The machines are translated to Moore machine style by changing the output function while preserving the integrity of all state transitions, state reachability and transition loops. Each FSM is converted to Verilog RTL and synthesized with the Synopsys SAED90nm cell library. The resulting gate-level netlists are translated to corresponding Spice netlists using a Verilog-to-Spice converter for power and logic simulations. Power traces for power analysis are obtained using Synopsys HSPICE and NanoSim. The benchmark machines tested have up to 13 states and 1600 transitions. The number of input bits ranges from 1 to 7 and the number of output bits ranges from 1 to 9 [12].
The execution time of the Z3 solver depends on the size of the machine and the number of test vectors. Due to the randomized nature of stimulus selection, a given target machine with the same test vector size will exhibit different runtimes in every round. Hence, we report the average execution time to compare the overall performance on different benchmarks. Table IV summarizes the average runtimes for different machines with sets of 100 and 1000 input vectors, and Figure 4 shows the recovery percentage at the end of the first iteration. The brute-force recovery technique [8] based on input-output analysis could recover machines with a single input bit and up to 25 transitions in 1 minute, whereas the technique in [7] could take several hours and lacks applicability due to its requirement of terminating states. Our technique can handle machines that are 64x larger and also achieves much faster convergence.
FSM  Average Runtime (s), Test Vectors=100  Average Runtime (s), Test Vectors=1000
dk27  0.839  176.275 
lion  0.259  0.884 
shiftreg  0.623  45.652 
train4  0.258  0.833 
bbtas  0.854  146.489 
modulo12  0.71  125.518 
dk17  0.798  132.212 
mc  0.675  39.61 
ex5  0.823  601.05 
lion9  0.653  132.242 
ex3  0.801  102.515 
ex7  0.526  74.304 
train11  0.691  81.086 
beecount  0.586  205.046 
dk14  0.859  101.984 
tav  0.771  70.926 
s8  0.53  45.067 
s27  0.565  40.344 
ex6  1.027  657.988 
bbara  0.86*  179.694 
opus  0.83*  230.357 
ex4  1.2*  301.787 
s386  0.95*  462.514 
*Equivalent machine not recovered due to limited exploration 
The overall performance of the proposed algorithm depends largely on the number of Z3 solver constraints and on how relaxed or tight a given set of constraints is. The following contributing factors are worth noting:
IV-1 Self-loops
The solver quickly generates a satisfiable model for machines with a large number of 0-HD transitions because their search space is restricted. For example, benchmarks lion, train4 and s8 have more than 50% of their total transitions as self-loops and converge faster than other machines of similar size.
IV-2 Number of Primary Outputs and Output Function
Machines with fewer primary outputs naturally have a smaller output alphabet, so their output function maps multiple states to the same output symbol. Fewer state pairs with dissimilar outputs lead to a smaller set of state-output-based constraints. Due to such relaxed constraints, the solver converges faster for these machines. This benefit in speed, however, comes at the cost of suboptimal state folding. Benchmarks train4, train11 and s27 have one primary output and exhibit this behavior, whereas benchmark ex6 has 8 primary outputs and takes the longest to converge.
IV-3 Timeout Parameter
The proposed algorithm aims to obtain a minimal-length state encoding, so the solver requires more time to converge as the number of bit-vector variables and constraints increases. For machines with over 35 states, Z3 fails to generate a minimal-length encoding within a reasonable time, but succeeds in producing a model with a longer encoding length. Non-minimal state encodings are undesirable because state folding is then not optimal and many indistinguishable states are misidentified as distinct.
V Conclusion
This work proposed a novel approach of combined functional and power analysis to efficiently discover a logically equivalent state machine structure for a target sequential circuit implementation. The proposed technique is faster than existing methods and scales to larger machines. Recovery of 90%-100% was achieved for all benchmark FSMs in under 11 minutes. Future work on adaptive input vector generation to perform guided exploration can uncover the remaining transitions for complete recovery.
References
 [1] R. Torrance and D. James, "The state-of-the-art in semiconductor reverse engineering," in DAC '11: Proceedings of the 48th Design Automation Conference, San Diego, California, 2011.
 [2] D. M. T. Office, "Integrity and Reliability of Integrated Circuits (IRIS)," Available online: http://www.darpa.mil/Our_Work/MTO/Programs/Trusted_Integrated_Circuits_(TRUST).aspx, DARPA, 2013.
 [3] S. Lau, "Non destructive failure analysis technique with a laboratory based 3D X-ray nanotomography system," in LSI Testing Symposium, 2006.
 [4] W. Yun, S. Wang, D. Scott, K. Nill and W. Haddad, "X-ray nanotomography (XRMT) tool for nondestructive high-resolution imaging of ICs," in ISTFA 2001: Proceedings of the 27th International Symposium for Testing and Failure Analysis, 2001.
 [5] Z. H. Levine, A. R. Kalukin, S. P. Frigo, I. McNulty and M. Kuhn, "Tomographic reconstruction of an integrated circuit interconnect," Applied Physics Letters, vol. 74, no. 1, 1999.
 [6] M. Brutscheck, Systematic analysis of unknown integrated circuits (Doctoral Thesis), Dublin Institute of Technology, 2009.
 [7] M. Brutscheck, B. Schmidt, M. Franke, A. T. Schwarzbacher and S. Becker, "Identification of deterministic sequential finite state machines in unknown CMOS ICs," in IET Irish Signals and Systems Conference (ISSC 2009), Dublin, 2009.
 [8] J. Smith, A nondestructive analysis method for integrated circuit-based finite state machines (Doctoral Thesis), Washington State University, 2016.
 [9] J. Smith, "Nondestructive state machine reverse engineering," in 2013 6th International Symposium on Resilient Control Systems (ISRCS), San Francisco, 2013.
 [10] P. Kocher, J. Jaffe and B. Jun, "Differential Power Analysis," in Advances in Cryptology, CRYPTO '99, 1999.
 [11] S. Mangard, E. Oswald and T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards, Springer Science+Business Media, LLC, 2007.
 [12] S. Yang, "Logic Synthesis and Optimization Benchmarks User Guide Version 3.0," Technical Report 1991-IWLS-UG-Saeyang, MCNC, Microelectronics Center of North Carolina, 1991.
 [13] L. de Moura and N. Bjørner, "Z3: An Efficient SMT Solver," in International Conference on Tools and Algorithms for the Construction and Analysis of Systems, 2008.