1. Introduction
Simulation-based functional verification is a crucial yet time-consuming step in modern electronic design automation flows (Foster, 2015). In this step, a design is simulated with a large number of input stimuli, and signals are monitored to determine if coverage goals and/or functional requirements are met. For complex designs, each input stimulus typically spans a large number of clock cycles. Since exhaustive simulation is impractical for real designs, using “good quality” stimuli that result in adequate coverage of the system’s runs in targeted corners is extremely important (Bening and Foster, 2001). Constrained random verification (CRV) (Yuan et al., 2006; Bhadra et al., 2007; Kitchen and Kuehlmann, 2007; Naveh et al., 2007) offers a practical solution to this problem. In CRV, the user provides constraints to ensure that the generated stimuli are valid and also to steer the system towards bug-prone corners. To ensure diversity, CRV allows randomization in the choice of stimuli satisfying a set of constraints. This can be very useful when the exact inputs needed to meet coverage goals or to test functional requirements are not known (Kitchen and Kuehlmann, 2007; Benito et al., 2020). In such cases, it is best to generate stimuli such that the resulting runs are uniformly distributed in the targeted corners of the system’s behavior space. Unfortunately, state-of-the-art CRV tools
(3; Iman and Joshi, 2007; Yehia, 2014; 2; Spear, 2008) do not permit such uniform random sampling of input stimuli. Instead, they allow inputs to be assigned random values from a constrained set at specific simulation steps. This of course lends diversity to the generated stimuli. However, it gives no guarantees on the distribution of the resulting system runs. In this paper, we take a first step towards remedying this problem. Specifically, we present a technique for generating input stimuli that guarantees a uniform (or user-specified bias in the) distribution of the resulting system runs. Note that this is significantly harder than generating any one run satisfying a set of constraints. We represent a run of the system by the sequence of states through which it transitions in response to a (multi-cycle) input stimulus. Important coverage metrics (viz. transition coverage, state sequence coverage, etc. (Foster et al., 2004)) are usually boosted by choosing stimuli that run the system through diverse state sequences. Similarly, functional requirements (viz. assertions in SystemVerilog (Spear, 2008), PSL (25), Specman E (Iman and Joshi, 2007), UVM (3) and other formalisms (Wile et al., 2005)) are often stated in terms of temporal relations between states in a run of the system. Enhancing the diversity of state sequences in runs therefore improves the chances of detecting violations, if any, of functional requirements. Consequently, generating input stimuli such that the resulting sequences of states, or traces, are uniformly distributed among all traces consistent with the given constraints is an important problem. Significantly, given a sequence of states and the next-state transition function, the input stimuli needed to induce the required state transitions at each clock cycle can be easily obtained by independent SAT/SMT calls for each cycle.
Hence, our focus in the remainder of the paper is the core problem of sampling a system’s traces uniformly at random from the set of all traces (of a given length) that satisfy user-specified constraints.
To see why state-of-the-art CRV techniques (3; Iman and Joshi, 2007; Yehia, 2014; 2; Spear, 2008) often fail to generate stimuli that produce a uniform distribution of traces, consider the sequential circuit with two latches and one primary input shown in Fig. 1a. The state transition diagram of the circuit is shown in Fig. 1b. Suppose we wish to uniformly sample traces that start from the initial state and have four consecutive state transitions. From Fig. 1b, there are 7 such traces. Hence, each of these traces must be sampled with probability 1/7. Unfortunately, the state transition diagram of a sequential circuit can be exponentially large (in the number of latches), and is often infeasible to construct explicitly. Hence we must sample traces without generating the state transition diagram explicitly. The primary facility in existing CRV techniques for attempting such sampling is to choose values of designated inputs randomly at specific steps of the simulation. In our example, without any information about the state transition diagram, the primary input of the circuit in Fig. 1a would be assigned the value 0 (or 1) with some fixed bias, independently in each of the four steps of simulation. The resulting distribution over the seven traces is far from uniform. In fact, it can be shown that for every choice of bias for sampling values of the primary input at each state, we get a non-uniform distribution over the seven traces. The trace-sampling problem can be shown to be at least as hard as uniformly sampling satisfying assignments of Boolean formulas. The complexity of the latter problem has been extensively studied (Sipser, 1983; Jerrum et al., 1986; Bellare et al., 2000), and no efficient algorithms are known. Therefore, efficient algorithms for sampling traces are unlikely to exist. Nevertheless, a trace-sampling technique that works efficiently in practice for many problem instances is likely to be useful even beyond CRV, viz. in test generation using Bounded Model Checking (Hamon et al., 2004).
The primary contributions of this paper are as follows:

A novel algorithm for sampling fixed-length traces of a transition system using Algebraic Decision Diagrams (ADDs) (Bahar et al., 1997), with provable guarantees of uniformity (or of a user-provided bias). The following are distinctive features of our algorithm.

It uses iterative squaring, thereby requiring only logarithmically many ADDs (in the trace length) to be precomputed when sampling traces with a given number of consecutive state transitions. This allows our algorithm to scale to traces of a few hundred transitions in our experiments.

It is easily adapted when the trace length is not a power of 2, and also when implementing weighted sampling of traces with multiplicative weights.

It precompiles multi-step transition relations, for different step counts, into ADDs. This allows it to quickly generate multiple trace samples once the ADDs are constructed. Thus the cost of ADD construction gets amortized over the number of samples, which is beneficial in CRV settings.


A comparative study of an implementation of our algorithm against alternative approaches, based on (almost-)uniform sampling of propositional models, that provide similar uniformity guarantees. Our experiments demonstrate that our approach offers significant speedups and is the fastest on over 90% of the benchmarks.
2. Preliminaries
2.1. Transition Systems and Traces
A synchronous sequential circuit with n latches implicitly represents a transition system with 2^n states. Hence, synchronous sequential circuits serve as succinct representations of finite-state transition systems. We use “sequential circuits” and “transition systems” interchangeably in the remainder of the paper to refer to such systems.
Formally, a transition system with Boolean state variables and primary inputs is a tuple comprising the set of states, the input alphabet, the set of initial states, the set of target (or final) states, and the state transition function, which relates a state, an input and a successor state iff there is a transition from the state on that input to the successor. We view each state as a valuation of the state variables. For notational convenience, we use the decimal representation of this valuation as a subscript to refer to individual states; for instance, the states with all-zero and all-one assignments to the state variables are indexed accordingly. We also refer to multiple renamed copies of the state variables.
Given a transition system, a trace of length N is a sequence of states whose first state is an initial state and in which every consecutive pair of states is related by the transition function. We denote the set of all traces of length N accordingly. Given a trace, finding an input sequence whose elements induce the successive transitions requires N independent SAT solver calls, one per transition. With state-of-the-art SAT solvers (Soos et al., 2009), this is unlikely to be a concern even with the number of primary inputs ranging up to tens of thousands. Therefore, finding a sequence of inputs that induces a trace is relatively straightforward, and we will not dwell on this any further. Our goal, instead, will be to sample a trace
uniformly at random. Formally, if a random variable corresponds to the random choice of a trace, we would like each trace to be chosen with probability inversely proportional to the total number of traces. Given a weight function on traces, the related problem of weighted trace sampling requires us to sample each trace with probability proportional to its weight. Since we are concerned only with sequences of states, we will henceforth assume that transitions of the system are represented by a transition relation, obtained from the transition function by existentially abstracting the input. For notational convenience, we abuse notation and reuse the same symbol for the transition relation when there is no confusion.
A multiplicative weight function assigns a weight to each state transition, and defines the weight of a trace as the product of the weights of the transitions in the trace. Formally, a weight function for state transitions assigns a positive weight to every pair of states related by the transition relation, and weight 0 otherwise; the multiplicative weight of a trace is then the product of the weights of its transitions. The unweighted uniform sampling problem is the special case where every existing transition has weight 1.
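As a concrete reading of this definition, here is a small sketch; the state names and weight table are illustrative only:

```python
from math import prod

# Illustrative weight table: w[(s, t)] > 0 iff the transition s -> t exists.
w = {("A", "A"): 1.0, ("A", "B"): 3.0, ("B", "A"): 2.0}

def trace_weight(trace):
    # Multiplicative weight: product of per-transition weights; a trace
    # using a non-existent transition gets weight 0.
    return prod(w.get((s, t), 0.0) for s, t in zip(trace, trace[1:]))

print(trace_weight(["A", "B", "A"]))  # 3.0 * 2.0 = 6.0
print(trace_weight(["A", "A", "B"]))  # 1.0 * 3.0 = 3.0
print(trace_weight(["B", "B"]))       # no such transition -> 0.0
```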
Symbol  Meaning
-  Set of Boolean variables
-  Set of states
-  Set of all traces of a given length
-  Transition function
-  Weight function
-  Set of all paths in a DD starting at a given node
2.2. Decision Diagrams
We use Binary Decision Diagrams (BDDs) (Bryant, 1986) and their generalizations called Algebraic Decision Diagrams (ADDs) (Bahar et al., 1997) to represent transition functions/relations and counts of traces of various lengths between states. Formally, both ADDs and BDDs are tuples comprising a set of Boolean variables, a finite carrier set, a diagram variable order, and a rooted directed acyclic graph satisfying the following properties: (i) every terminal node of the graph is labeled with an element of the carrier set, (ii) every non-terminal node is labeled with a Boolean variable and has two outgoing edges labeled 0 and 1, and (iii) on every path in the graph, the labels of visited non-terminal nodes must occur in increasing order under the variable order.
ADDs and BDDs differ in the carrier set: for BDDs the carrier set is {0, 1}, while for ADDs it can be an arbitrary finite set (e.g., of numeric values). Thus BDDs represent Boolean functions, while ADDs represent functions from Boolean assignments to the carrier set, as directed acyclic graphs (DAGs). Many operations on such functions can be performed in time polynomial in the size of their ADDs/BDDs. These include conjunction, disjunction, if-then-else (ITE), existential quantification, etc. for BDDs, and product, sum, ITE and additive quantification for ADDs. The reader is referred to (Bryant, 1986; Bahar et al., 1997) for more details on these decision diagrams.
We denote the set of leaves of a decision diagram (DD), its root, the vertices of the DAG, the set of parents of a vertex, and the value of a leaf in the usual way. A path from one node to another in a DD is a sequence of nodes in which consecutive nodes are related by parenthood in the DAG. We also use notation for the set of all paths from a given node to the root, extended to sets of nodes by taking unions. A special set represents all paths from all leaves to the root of a DD. Our notational setup is briefly summarized in Tab. 1.
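The path counts used later for bottom-up sampling can be computed by a simple top-down dynamic program. A sketch over an explicit DAG follows; the node names and the dictionary representation are ours, not the paper's:

```python
# Each DD node is named by a string; 'dag' maps a node to its children.
# In a reduced DD, edges may skip variable levels; 'level' records each
# node's depth in the variable order, so skipped levels can be accounted
# for (each skipped level doubles the number of variable assignments
# whose paths pass through the edge).
dag = {
    "root": ["n1", "leaf0"],   # root tests the first variable
    "n1": ["leaf0", "leaf1"],
    "leaf0": [],
    "leaf1": [],
}
level = {"root": 0, "n1": 1, "leaf0": 2, "leaf1": 2}

def paths_to_root(dag, level):
    # count[v] = number of variable assignments whose path reaches v
    count = {"root": 1}
    # process nodes in topological (here: increasing level) order
    for v in sorted(dag, key=lambda u: level[u]):
        for child in dag[v]:
            skipped = level[child] - level[v] - 1  # levels jumped over
            count[child] = count.get(child, 0) + count[v] * 2 ** skipped
    return count

counts = paths_to_root(dag, level)
print(counts)
```

With two variables, the counts at the leaves sum to 4, one per assignment; this is the quantity the sampler later uses to weight a bottom-up walk.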
3. Related Work
We did not find any earlier work on sampling traces of sequential circuits with provable uniformity guarantees. As mentioned earlier, constrained random verification tools (2; Spear, 2008; Iman and Joshi, 2007; 3; Yehia, 2014) permit values of selected inputs to be chosen uniformly (or with specified bias) from a constrained set at some steps of simulation. Nevertheless, as shown in Section 1, this does not necessarily yield uniform traces.
Arenas et al. (Arenas et al., 2019) gave a fully polynomial randomized approximation scheme for approximately counting words of a given length accepted by a Nondeterministic Finite Automaton (NFA). Using Jerrum et al.'s reduction from approximate counting to sampling (Jerrum et al., 1986), this yields an algorithm for sampling words of an NFA. Apart from the obvious difference of sampling words vs. sampling traces, Arenas et al.'s technique requires the state-transition diagram of the NFA to be represented explicitly, while our focus is on transition systems that implicitly encode large state-transition diagrams.
Given a transition system, sampling traces of a given length can be achieved by sampling satisfying assignments of the propositional formula obtained by “unrolling” the transition relation once per step. Techniques for sampling models of propositional formulas, viz. (Achlioptas et al., 2018; Sharma et al., 2018; Gupta et al., 2019) for uniform sampling and (Chakraborty et al., 2013, 2014, 2015) for almost-uniform sampling, can therefore be used to sample traces. The primary bottleneck in this approach is that the number of propositional variables grows linearly with both the trace length and the number of Boolean state variables. We compare our tool with the state-of-the-art samplers WAPS (Gupta et al., 2019) and that of (Chakraborty et al., 2015), and show that our approach performs significantly better.
4. Algorithms
For clarity, we assume that the length N of traces is a power of 2; the case when N is not a power of 2 is discussed later. A naive approach would be to use a single BDD to represent all traces of length N, by appropriately unrolling the transition system, and then sample traces from the BDD. Such monolithic representations, however, are known to blow up (Dudek et al., 2019). Therefore, we use a sequence of ADDs, where the i-th ADD represents the counts of length-2^i paths between different states of the transition system. Each ADD is constructed from the previous one by a technique similar to iterative squaring (Burch et al., 1990b, a). A trace is sampled by recursively sampling states from each ADD according to the weights on the leaves.
The detailed algorithm for constructing these ADDs is presented in Algorithm 1. We assume that the transition relation is defined over two copies of the state variables, and that additional copies are also available for intermediate states. In each step of the for loop on line 2, the current ADD is squared to obtain the next ADD, after additively abstracting out the intermediate copy of the state variables in line 4. Each resulting ADD represents the counts of traces from one state to another that pass through a given intermediate state at the halfway point. Note that the two operands of the product in line 3 are the same ADD, but with variables renamed. Finally, in line 5, we take the product of the topmost ADD with the characteristic functions of the initial and final states, represented as ADDs. Although Algorithm 1 correctly computes all the ADDs, in practice we found that it often scaled poorly for trace lengths beyond a few tens. On closer scrutiny, we found that this was because the first ADD (and the other ADDs derived from it) encoded information about transitions from states unreachable in the corresponding number of steps (and hence of no interest to us). Therefore, we had to aggressively optimize the ADD computations by restricting (see (Coudert et al., 1990)) each ADD with an over-approximation of the set of reachable states relevant to that ADD. We discuss this optimization in detail in Sec. 5. Once the ADDs are constructed, the sampling of the states of the trace is done by Algorithm 2. The first and last states of the trace are sampled from the topmost ADD in a call to Algorithm 4 in line 2. Then Algorithm 3 is recursively called to sample the first and second halves of the trace in lines 3 and 4. In each recursive call, Algorithm 3 invokes the procedure in Algorithm 4 to sample the state at the midpoint of the current segment of the trace under consideration, and recurses on each of the two halves thus generated.
In the sampling procedure (Algorithm 4), the topmost ADD is used as-is for sampling (lines 1-2), while the other ADDs are first simplified by substituting the values of the previously sampled states provided as inputs (lines 3-4). The role of the rest of the algorithm is to sample a path from a leaf to the root in a bottom-up fashion, with probability proportional to the value of the leaf. Towards this end, a leaf is first sampled in lines 5-8. We assume access to a procedure that takes as input a list of elements and their corresponding weights, and returns a random element from the list with probability proportional to its weight. Once a leaf is chosen, we traverse up the DAG in the loop on line 9. This is done by iteratively sampling a parent with probability proportional to the number of paths reaching the parent from the root (lines 10-12). The number of paths from a node to the root can be easily computed by dynamic programming. If some levels are skipped between the current node and its parent, then the number of paths reaching the current node from the parent is scaled up by a factor of 2 per skipped level (line 12). This is because each skipped level contributes a factor of 2 to the number of paths reaching the root. Once a parent is sampled, the value of the corresponding state variable is updated in the trace in lines 13-17, where a helper procedure is assumed to return the index of the state (in the trace) and the index of the state variable (in the set of state variables) corresponding to the parent node. This can be implemented by maintaining a map between the state variables and the variable order in the DD. The random values for variables in the skipped levels between the parent and the current node are sampled in lines 18 and 19.
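To make the squaring-and-midpoint-sampling scheme concrete, here is an explicit-state sketch in which each ADD is replaced by a plain count matrix, with exact big-integer arithmetic standing in for ADD leaves. All names are ours; no ADD machinery or restrict optimization is modeled:

```python
import random

def matmul(A, B):
    # exact integer matrix product: C[s][t] = sum_u A[s][u] * B[u][t]
    n = len(A)
    return [[sum(A[s][u] * B[u][t] for u in range(n)) for t in range(n)]
            for s in range(n)]

def build_counts(T, m):
    # M[i][s][t] = number of length-2^i traces from s to t (iterative squaring)
    M = [T]
    for _ in range(m):
        M.append(matmul(M[-1], M[-1]))
    return M

def fill(trace, lo, hi, i, M, rng):
    # sample the midpoint of the 2^i-step segment trace[lo..hi], then recurse
    if i == 0:
        return
    s, t, mid, n = trace[lo], trace[hi], (lo + hi) // 2, len(M[0])
    w = [M[i - 1][s][u] * M[i - 1][u][t] for u in range(n)]
    trace[mid] = rng.choices(range(n), weights=w)[0]
    fill(trace, lo, mid, i - 1, M, rng)
    fill(trace, mid, hi, i - 1, M, rng)

def sample_trace(T, m, init, final, rng=random):
    # uniformly sample a length-2^m trace from a state in init to one in final
    M = build_counts(T, m)
    ends = [(s, t) for s in init for t in final]
    s, t = rng.choices(ends, weights=[M[m][s][t] for s, t in ends])[0]
    trace = [None] * (2 ** m + 1)
    trace[0], trace[2 ** m] = s, t
    fill(trace, 0, 2 ** m, m, M, rng)
    return trace

# Two-state example: 0 -> {0, 1}, 1 -> {0}; three length-2 traces from state 0
T = [[1, 1], [1, 0]]
print(sample_trace(T, 1, init=[0], final=[0, 1]))
```

If the 0/1 entries of the transition matrix are replaced by positive weights, the same code performs multiplicative-weight sampling, since the matrix products accumulate products of transition weights along segments.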
Non-power-of-2 trace lengths
When the trace length N is not a power of two, we modify the given sequential circuit so that the distribution of traces of the modified circuit, for the padded power-of-2 length, is identical to the distribution of length-N prefixes of these traces. Conceptually, the modification is depicted in Fig. 2. Here, the “Saturate-at-N” counter counts up from 0 to N and then stays locked at N. Once the count reaches N, the next state and current state of the original circuit are forced to be identical, thanks to the multiplexer. Therefore, the modified circuit’s trace, when projected on the latches of the original circuit, behaves exactly like a trace of the original circuit up to N steps. Subsequently, the projection remains stuck at the state reached after N steps. Hence, by using the modified circuit and by choosing the padded length to be the smallest power of 2 that is at least N, we can assume w.l.o.g. that the length of a trace to be sampled is always a power of 2.
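In the explicit-state view, the same construction amounts to augmenting each state with a saturating counter; a sketch with our own encoding, not the paper's circuit:

```python
def pad_system(T, N):
    # States of the padded system are pairs (s, c) with a saturate-at-N
    # counter c; once c == N the original-state component is frozen.
    n = len(T)
    idx = lambda s, c: s * (N + 1) + c
    size = n * (N + 1)
    Tp = [[0] * size for _ in range(size)]
    for s in range(n):
        for c in range(N):
            for t in range(n):
                Tp[idx(s, c)][idx(t, c + 1)] = T[s][t]  # count up
        Tp[idx(s, N)][idx(s, N)] = 1                    # locked: self-loop
    return Tp

# Length-3 traces of the original system correspond one-to-one to
# length-4 (= next power of 2) traces of the padded system from (s, 0).
T = [[1, 1], [1, 0]]
Tp = pad_system(T, 3)
print(len(Tp))  # 8 states: 2 original states x 4 counter values
```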
Weighted Sampling
A salient feature of Algorithms 1-4 is that the same framework can be used for weighted sampling (instead of uniform sampling) as defined in Section 2, with one small modification: if the input to Algorithm 1 is an ADD instead of a BDD, where the values of the leaves are the weights of the corresponding transitions, then it can be shown that the framework will sample a trace with probability proportional to its weight, where the weight of a trace is defined multiplicatively as in Section 2.
5. Improved Iterative Squaring
In this section, we present a more efficient version of Alg. 1. To see where gains in efficiency can be made, note that the ADDs generated using Alg. 1 encode transitions that are never used during sampling. For instance, a given ADD in the sequence is only used by the sampling procedure for sampling midpoint states, given previously sampled states at the endpoints of the segment under consideration. Thus, it only needs to capture transitions out of states reachable from the initial set in the step counts at which the sampler can actually query it. However, the ADDs constructed by Alg. 1 also contain information about transitions from states not reachable in those many steps from the initial set. This information is clearly superfluous and only serves to increase the size of the ADDs. Such information is present in all the ADDs, and exists because the iterative squaring framework of Alg. 1 squares all transitions in the loop on lines 2-4, regardless of the initial state, final state and reachability conditions. We give an improved squaring framework, presented in Algs. 5 and 6. The idea is to first compute (over-approximations of) the sets of states reachable in exactly the relevant numbers of steps from the initial set (Alg. 6). We then restrict each ADD by the over-approximations of only those reachable state sets it depends on (Alg. 5).
The sets computed in Alg. 6 are used for restricting the current-state and next-state variable sets of the respective ADDs. The (over-approximate) set of states reachable after exactly a given number of steps from the initial state is computed in line 5, starting from the initial set, by taking the (over-approximate) image, under the transition relation, of the previously computed reachable set. Computing an exact image is often difficult for large benchmarks, hence an over-approximation of the image can be used. The literature contains a wide spectrum of heuristic techniques that can be used to trade off space against computation time. Once a reachable set is computed, we disjoin it into the appropriate restriction sets in lines 7-11. The special case of the initial reachable set is handled separately in lines 12-13. After the restriction sets are computed, we use them to restrict the ADDs in lines 3-4 of Alg. 5. The restrict operation is the one proposed in (Coudert et al., 1990): the restriction of a function agrees with the function wherever the restricting set holds, and is unconstrained otherwise. This operation can be more efficient than conjunction, and is sufficient for our purposes since we explicitly enforce the initial-state condition in line 7 of Alg. 5.
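An explicit-state analogue of this optimization can be sketched as follows, with matrices standing in for ADDs and exact reachability standing in for the over-approximate image; the encoding is ours:

```python
def restrict_counts(M, T, init, m):
    # M[i][s][t] = number of length-2^i traces from s to t (from iterative
    # squaring). The sampler only queries M[i] with s reachable from init in
    # a multiple of 2^i steps, so rows for other states can be zeroed out --
    # a matrix analogue of the BDD/ADD restrict operation.
    n = len(T)
    reach = {0: set(init)}
    for j in range(1, 2 ** m + 1):          # exact-step reachability
        reach[j] = {t for s in reach[j - 1] for t in range(n) if T[s][t]}
    for i in range(m + 1):
        allowed = set()
        for j in range(0, 2 ** m, 2 ** i):  # multiples of 2^i below 2^m
            allowed |= reach[j]
        for s in range(n):
            if s not in allowed:
                M[i][s] = [0] * n           # superfluous rows dropped
    return M
```

Columns could be pruned analogously using the states that can reach the final set; the sketch prunes rows only.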
6. Analysis
6.1. Hardness of Counting/Sampling Traces
Counting and sampling satisfying assignments of an arbitrary Boolean formula can be easily reduced to counting and sampling, respectively, of traces of a transition system. From classical results on counting and sampling in (Valiant, 1979; Stockmeyer, 1983; Jerrum et al., 1986; Bellare et al., 2000), it follows that counting traces is #P-hard, and that uniformly sampling traces can be solved in probabilistic polynomial time with access to an NP oracle.
To see how the reduction works, suppose the support of the formula has n variables. We construct a transition system over these n variables as state variables, with the transition function defined so that one designated next-state bit is determined by the value of the formula on the current state, regardless of the input, while the rest of the next-state bits are forced to constants. Choosing the initial states to be all states and the target states to be those with the designated bit set, it is easy to see that counting/sampling short traces of this transition system effectively counts/samples satisfying assignments of the formula.
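A brute-force sketch of one such reduction follows; the encoding is our own, and the paper's exact construction has specifics not reproduced here:

```python
from itertools import product

def count_models(phi, n):
    # reference: direct model count of the formula
    return sum(phi(x) for x in product([0, 1], repeat=n))

def count_traces(phi, n):
    # Transition system over n+1 state bits (x, out): from any state,
    # regardless of input, the successor is (0, ..., 0, phi(x)).
    # Final states are those with out == 1. A length-1 trace from (x, 0)
    # ends in a final state iff phi(x) == 1, so counting such traces
    # counts satisfying assignments.
    traces = 0
    for x in product([0, 1], repeat=n):
        succ = (0,) * n + (phi(x),)
        if succ[-1] == 1:          # trace ends in a final state
            traces += 1
    return traces

phi = lambda a, b, c: (a or b) and not c   # example formula
print(count_models(lambda x: phi(*x), 3))  # 3
print(count_traces(lambda x: phi(*x), 3))  # 3
```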
6.2. Random Walks and Uniform Traces
It is natural to ask if uniform trace sampling can be achieved by a Markovian random walk, wherein the outgoing transition from a state is chosen according to a probability distribution specific to the state. Unfortunately, we show below that this cannot always be done. Since uniform sampling is a special case of weighted sampling, the impossibility result holds for weighted trace sampling too.
Consider the transition system in Fig. 1. We have seen in Section 1 that there are 7 traces of length 4. Hence a uniform sampler would generate each of these traces with probability 1/7. Suppose the probability of each state transition is fixed, depending only on the pair of states involved. For uniform sampling, the outgoing probabilities at each state must sum to 1, and the product of the transition probabilities along each of the 7 traces must equal 1/7. Writing down these constraints for pairs of traces that share transitions yields a system of equations whose solution forces one of the transition probabilities outside the interval [0, 1]. However, this is not a valid probability measure. Therefore, it is impossible to uniformly sample traces of this transition system by performing a Markovian random walk.
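The same phenomenon can be checked numerically on a small stand-in system (again not the paper's Fig. 1): with states A and B, transitions A -> {A, B} and B -> {A}, the length-2 traces are AAA, AAB, ABA. A Markovian walk has a single parameter q = P(A -> A), and no q makes all three traces equally likely:

```python
# Traces of length 2 from A: AAA (prob q*q), AAB (q*(1-q)), ABA (1-q).
# Uniformity needs 1-q = 1/3, i.e. q = 2/3, but then q*q = 4/9 != 1/3.
def trace_probs(q):
    return {"AAA": q * q, "AAB": q * (1 - q), "ABA": 1 - q}

best = min((max(p.values()) - min(p.values()), q)
           for q in (i / 1000 for i in range(1001))
           for p in [trace_probs(q)])
print(best)  # the spread never reaches 0: no q gives a uniform distribution
```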
6.3. Correctness of Algorithms
We now turn to proving the correctness of the algorithms presented in the previous sections. We first prove the correctness of the improved iterative squaring framework (Sec. 5). Alg. 6 (lines 8-10) ensures that the first restriction set is computed as a disjunction of reachable sets for the step counts given in the first column of the corresponding row of Tab. 2, while the second restriction set is computed from the reachable sets for the step counts given in the second column of that row. Therefore, to show the correctness of Algs. 5 and 6, we show in Lemma 6.1 that the current-state and next-state variable sets of each ADD will only be instantiated with (over-approximations of) sets of states reachable in the numbers of steps given in the appropriate column of Tab. 2.
Lemma 6.1 ().
Proof.
We proceed by induction, from the topmost ADD down to the first. The base case follows from the fact that the topmost ADD is used exactly once, with its current-state variables used only for sampling the initial state and its next-state variables used only for sampling the final state. The former condition is satisfied by the limits of the for-loop in line 6 of Alg. 6, while the latter condition is satisfied by lines 12-13 of Alg. 6. This completes the base case.
Now assume that the lemma holds at some level. We will prove that it holds at the next lower level as well. First note that an ADD at a given level is used for sampling the midpoint state of a segment, given the states at the segment's endpoints. Thereafter, the ADD at the next lower level is used in two cases: (1) for sampling the midpoint of the first half, given the left endpoint and the sampled midpoint; and (2) for sampling the midpoint of the second half, given the sampled midpoint and the right endpoint. Thus, in case (1), the current-state variables of the lower ADD are instantiated with the same states as the current-state variables of the higher ADD, while in case (2) they are instantiated with the sampled midpoint states. Hence, the states instantiating the current-state variables of the lower ADD are the union of these two sets. The values in Tab. 2 reflect this fact, and by our inductive assumption the corresponding restriction sets were computed correctly. This proves the claim for the current-state variables, for the step counts given in column 1 of the corresponding row of Tab. 2. To complete the inductive argument, we still need to show the claim for the next-state variables, for the step counts given in column 2. To see this, first note that the next-state variables of an ADD will only be instantiated with states reachable, in the number of steps spanned by that ADD, from the states instantiating its current-state variables. This is reflected in Tab. 2: for instance, in row 3, the entries in column 2 are exactly the sets of states reachable, in the corresponding number of steps, from the respective entries in column 1. Since we showed that the column-1 sets have been computed correctly, this completes the proof. ∎
Let the count function denote the number of traces of a given length starting in one state and ending in another. We use the fact that the sampling procedure ensures that the parent of a node is sampled independently of the path from an ADD leaf chosen so far. Conditional independence also holds for whole traces; given the states at two indices in a trace, the states within the trace segment delineated by the indices are sampled independently of the states outside the trace segment. The following lemmas characterize the behavior of the sampling framework (Algs. 2-4).
Lemma 6.2 ().
For each level, the ADD computed by Alg. 1 is such that, for every pair of states, we have
Proof.
We will prove by induction on .
Base case: We have the claim by definition. From line 3 of Alg. 1, we then have the claimed equality.
Induction step: Assume the lemma holds up to some , i.e. . After execution of line 4 of Alg. 1, we will have . Then in the next iteration of the loop after line 3, we will have . ∎
Lemma 6.3 ().
Let Z denote the random path from a leaf to the root of ADD (see Alg. 4) chosen by . Then
(1) 
Proof.
The leaf is sampled with probability proportional to its value. Thereafter, each parent is sampled with probability proportional to its path count to the root. Substituting these probabilities into the product along the sampled path and telescoping gives the lemma. ∎
In the next two lemmas, the indices refer to trace indices passed as arguments to the sampling procedures.
Lemma 6.4 ().
Suppose is invoked with , and . Let denote the random state returned by for . Then for all , we have
Proof.
We note that for any ADD other than the topmost one, we reduce the ADD by substituting the previously sampled state values in line 4 of Alg. 4. In the resultant ADD, each path from the root to a leaf yields a valuation of the remaining state variables. Therefore, if a path in the reduced ADD corresponds to some state, its probability is given by Eqn. 1. We now need to prove that the R.H.S. of Eqn. 1 is the same as the desired conditional probability expression. In Eqn. 1, the numerator is the appropriate trace count, by Lemma 6.2. The denominator of Eqn. 1 is which is