I. Introduction
Boolean functional synthesis is the problem of constructing a Boolean function from a Boolean specification that describes a relation between input and output variables [19, 12, 2, 35]. This problem has been explored in a number of settings, including circuit design [20], QBF solving [27], and reactive synthesis [36], and several tools have been developed for its solution. Nevertheless, the scalability of Boolean functional synthesis methods remains a concern as the number of variables and the size of the formula grow. This is not surprising, since Boolean functional synthesis is in fact coNP^{NP}-hard.
A standard practice for handling the problem of scalability is based on decomposing the given formula into smaller subspecifications and synthesizing each component separately [19, 2, 35]. The most common form of such decomposition, called factorization, is when the formula is represented as a conjunction of constraints, in which each conjunct can be seen as a subspecification [19, 35]. The main challenge in this approach is that most factors cannot be synthesized entirely separately, due to the dependencies created by shared input and output variables. The ways to meet this challenge are usually to either merge factors that share variables [35] or to perform additional computations in order to combine the functions synthesized for different factors [19]. All of these approaches result in additional work that must be performed during synthesis.
In this work, we propose an alternative decomposition framework, which follows naturally from the fact that variables in the specification are separated into input and output variables. This idea was originally inspired by [11], which explores the notion of sequential relational decomposition, in which a relation is decomposed into two by introducing an intermediate domain. Unlike factorization, this form of decomposition allows the two components to be synthesized completely independently. That work, however, shows that decomposition is hard in general, and if the relation is given as a Boolean circuit, decomposition is NEXPTIME-complete. Furthermore, there is no guarantee that synthesizing the two components independently would be easier than synthesizing the original specification, since the synthesis of one component might ignore useful information given by the other component.
We instead suggest a more relaxed notion of decomposition for specifications described as CNF formulas, in which every clause is split into an input and an output clause and the independent analyses of the input/output components "cooperate" to synthesize a function for the entire specification. Based on this concept, we describe a novel synthesis algorithm for CNF formulas called the "Back-and-Forth" algorithm, in which, rather than synthesizing the input and output components entirely independently, we share information back and forth between the two components to guide the synthesis. More specifically, our algorithm alternates between SAT calls that follow the input-component structure analysis and MaxSAT calls that follow the output-component structure analysis. Thus, this approach builds on recent progress in SAT and MaxSAT solving [30, 21]. A notable consequence of our method is that, as the number of SAT calls depends on the structure of the input component, for specifications with certain well-defined input structure we can perform synthesis in P^{NP}, compared to the general coNP^{NP}-hardness mentioned above. An additional advantage of our algorithm is that it constructs the synthesized function as a decision list [29]. Compared to other data structures for representing Boolean functions, such as ROBDDs or AIGs, decision lists have significant benefits in terms of explainability, allowing domain specialists to validate and analyze their behavior (see the discussion in Section VI for more details).
We experimentally evaluate the Back-and-Forth algorithm on a suite of standard synthesis benchmarks, comparing its performance with that of state-of-the-art synthesis tools. Although these tools perform very well on many families of benchmarks, our results show that the Back-and-Forth algorithm is able to handle classes of benchmarks that these tools are unable to synthesize, indicating that it belongs in a portfolio of synthesis algorithms.
II. Related Work
Constructing explicit representations of implicitly specified functions is a fundamental problem of interest to both theoreticians and practitioners. In the contexts of Boolean functional synthesis and certified QBF solving, such functions are also called Skolem functions [8, 19, 14]. Boole [9] and Löwenheim [22] studied variants of this problem when computing most general unifiers in resolution-based proofs. Unfortunately, their algorithms, though elegant in theory, do not scale well in practice [23]. The close relation between Skolem functions and proof objects in specialized QBF proof systems has been explored in [8, 14]. One of the earliest applications of Boolean functional synthesis has been logic synthesis; see [34] for a survey. More recently, Boolean functional synthesis has found applications in diverse areas such as temporal strategy synthesis [3, 16, 36], certified QBF solving [7, 28, 6, 26], automated program synthesis [33, 31], circuit repair and debugging [18], and the like. This has resulted in a new generation of Boolean functional synthesis tools, cf. [14, 19, 2, 1, 12, 35, 28, 27], that are able to synthesize functions from significantly larger relational specifications than what was possible a decade ago.
Recent tools for Boolean functional synthesis can be broadly categorized based on the techniques they employ. Given a specification φ(X, Y), where X denotes the inputs and Y denotes the outputs, the work of [14] extracts Skolem functions for Y in terms of X from a proof of validity of ∀X. ∃Y. φ(X, Y) expressed in a specific format. The efficiency of this technique crucially depends on the existence and size of a proof in the required format. Incremental determinization [27] is a highly effective synthesis technique that accepts as input a CNF representation of a specification and builds on several successful heuristics used in modern conflict-driven clause-learning (CDCL) SAT solvers [30]. In [12], the composition-based synthesis approach of [17] is adapted and new heuristics are proposed for synthesizing Skolem functions from an ROBDD representation of the specification. The technique has been further improved in [35] to work with factored specifications represented as implicitly conjoined ROBDDs. CEGAR-based techniques that use modern SAT solvers as black boxes [19, 2, 1] have recently been shown to scale well on several classes of large benchmarks. The idea behind these techniques is to start with an efficiently computable initial estimate of the Skolem functions, and to use a SAT solver to test whether the estimates are correct. A satisfying assignment returned by the solver provides a counterexample to the correctness of the function estimates, and can be used to iteratively refine them. In [1], it is shown that transforming the representation of the specification into a special negation normal form allows one to efficiently synthesize Skolem functions.
Both ROBDD- and CEGAR-based approaches make use of decomposition techniques to improve performance, the most common of which is factorization [19, 35]. In this method, every conjunct of a conjunctive specification is considered individually. The main drawback of this approach is that the dependencies between conjuncts limit how much each of them can be analyzed independently of the others, requiring either partially combining components, as in [35], or going through a process of refinement of the results [19]. This issue motivates the search for alternative notions of decomposition for synthesis problems. Our approach is loosely inspired by the idea of sequential relational decomposition explored in depth in [11]. A more direct application of this idea to synthesis might still be possible, but requires further exploration. In addition to the above techniques, templates or sketches have been used to synthesize functions when information about the possible functional forms is available a priori [33, 32].
As is clear from the above, several orthogonal techniques have been found to be useful for the Boolean functional synthesis problem. Nevertheless, there remain difficult corners, where the specification is stated simply, and yet finding Skolem functions that satisfy the specification has turned out to be hard for all state-of-the-art tools. Our goal in this paper is to present a new technique and algorithm for this problem that does not necessarily outperform existing techniques on all benchmarks, but certainly outperforms them on instances in some of these difficult corners. We envisage our technique being added to the existing repertoire of techniques in a portfolio Skolem-function synthesizer, to expand the range of problems that can be solved.
III. Preliminaries
III-A. Boolean Functional Synthesis
A specification for the Boolean functional synthesis problem is a (quantifier-free) Boolean formula φ(X, Y) over input variables X and output variables Y. Note that φ can be interpreted as a relation R(φ) ⊆ 2^X × 2^Y, where 2^X is the set of all assignments to X and 2^Y is the set of all assignments to Y. With that in mind, we denote by Dom(φ) and Img(φ) the domain and image of the relation represented by φ. We also use Img(φ, x) to denote the image of a specific element x ∈ 2^X. If Dom(φ) = 2^X, then we say that φ is realizable.
Two Boolean formulas φ_1 and φ_2 are said to be logically equivalent, denoted by φ_1 ≡ φ_2, if they have the same solution space; that is, for every assignment σ to the variables, σ ⊨ φ_1 iff σ ⊨ φ_2. Unless stated otherwise, all Boolean formulas mentioned in this work are quantifier-free.
We say that a partial function f : 2^X → 2^Y implements a relation φ(X, Y) if for every x ∈ Dom(φ) we have that f(x) ∈ Img(φ, x). Such an f is also called a Skolem function for φ. Note that if φ is realizable, then f is a total function. Finally, we define the Boolean-synthesis problem as follows:
Problem 1.
Given a specification φ(X, Y), construct a partial function f : 2^X → 2^Y that implements φ.
III-B. Decision Lists
Our choice of representation of Skolem functions in this work is inspired by the idea that we can represent an arbitrary Boolean function by a decision list [29]. A decision list is an expression of the form "if c_1 then d_1, else if c_2 then d_2, …, else d_k", where each c_i is a formula in terms of the input variables X and each d_i is an assignment to the output variables Y. The length of the list corresponds to the number of decisions k. Clearly, for a specification with n input variables we can always synthesize as an implementation a decision list of length 2^n, where for every possible assignment of X we choose an assignment of Y that satisfies the specification. Many specifications, however, can be implemented by significantly smaller decision lists, by taking advantage of the fact that multiple inputs can be mapped to the same output. Our analysis identifies and exploits these cases.
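As a minimal illustration (ours, not the paper's), a decision list can be modeled as a sequence of (condition, output-assignment) pairs, where the first condition matched by the input determines the output:

```python
# A sketch of a decision list as a sequence of (condition, output)
# pairs: conditions are predicates over an input assignment, and the
# first matching condition wins.

def evaluate(decision_list, x):
    """Return the output assignment chosen by the first matching decision."""
    for condition, output in decision_list:
        if condition(x):
            return output
    return None  # partial function: no decision covers x

# Toy spec: y = x1 AND x2, implemented as a decision list of length 2.
dl = [
    (lambda x: x["x1"] and x["x2"], {"y": True}),
    (lambda x: True,                {"y": False}),  # default branch
]
```

Note that the length-2 list above is much shorter than the 2^n worst-case bound, because all inputs with the same required output share one decision.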
Despite being a natural representation, decision lists might not be appropriate for a physical implementation of the synthesized function as a circuit. In this case, it might make sense to collect the decisions into a more compact representation, such as an ROBDD.
III-C. Conjunctive Normal Form
A Boolean formula φ is in conjunctive normal form (CNF) if φ is a conjunction of clauses w_1 ∧ … ∧ w_m, where every clause w_i is a disjunction of literals (a variable or its negation). A subset W of the clauses of a CNF formula φ is satisfiable if there exists an assignment σ to the variables in φ such that σ ⊨ w for every clause w ∈ W. Similarly, a subset W of the clauses of φ is all-falsifiable if there exists an assignment σ such that σ ⊭ w for every clause w ∈ W. A subset of clauses W is a maximal satisfiable subset (MSS) if W is satisfiable and every superset W' ⊋ W is unsatisfiable. Similarly, W is a maximal falsifiable subset (MFS) if W is all-falsifiable and every superset W' ⊋ W is not all-falsifiable. For more information on MSS and MFS, refer to [15].
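For intuition, the two notions can be enumerated by brute force on tiny formulas. The sketch below (ours, purely illustrative; real tools use SAT/MaxSAT machinery) represents clauses as frozensets of signed integers, DIMACS-style:

```python
from itertools import product

# Brute-force MSS/MFS enumeration for tiny CNF formulas.

def satisfies(assignment, clause):
    # assignment: dict var -> bool; literal v>0 means var v is True
    return any(assignment[abs(l)] == (l > 0) for l in clause)

def maximal_subsets(clauses, variables, want_satisfied):
    """Subsets simultaneously satisfied (MSS) or falsified (MFS)
    by some assignment, maximal under set inclusion."""
    feasible = set()
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        chosen = frozenset(c for c in clauses
                           if satisfies(a, c) == want_satisfied)
        feasible.add(chosen)
    return [s for s in feasible if not any(s < t for t in feasible)]

# Example: (x1 OR x2) AND (~x1) AND (~x2)
clauses = [frozenset({1, 2}), frozenset({-1}), frozenset({-2})]
mss = maximal_subsets(clauses, [1, 2], want_satisfied=True)
mfs = maximal_subsets(clauses, [1, 2], want_satisfied=False)
```

Here the formula has three MSSs (any two clauses are jointly satisfiable) but only two MFSs, since ¬x1 and ¬x2 can be falsified together while (x1 ∨ x2) conflicts with falsifying either.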
IV. Synthesis via Input-Output Separation
In this section, we present a novel algorithm for Boolean functional synthesis from CNF specifications. Our approach is based on a separation of every clause into an input part and an output part. First, we describe how a decision list implementing the specification can be constructed by enumerating MFSs of the input clauses, or similarly by enumerating MSSs of the output clauses. Then, we show how we can benefit from alternating between the two: the MFSs can be used to avoid useless MSSs, while the MSSs can be used to cover multiple MFSs at the same time without enumerating all of them.
Given a CNF formula φ(X, Y), assume φ = w_1 ∧ … ∧ w_m, where w_1, …, w_m are clauses over X and Y. Let I(w_i) denote the X-part of clause w_i, that is, the disjunction of all literals in w_i over X. Similarly, let O(w_i) be the Y-part of w_i, the disjunction of all literals in w_i over Y. We call I(φ) = {I(w_1), …, I(w_m)} and O(φ) = {O(w_1), …, O(w_m)} the sets of input and output clauses of the specification, respectively.
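The clause split itself is a simple syntactic operation. A possible sketch (ours; literals as signed ints, input variables given as a set of indices):

```python
# Splitting a clause w into its input part I(w) and output part O(w).

def split_clause(clause, X):
    """Partition a clause's literals by whether their variable is an input."""
    inp = frozenset(l for l in clause if abs(l) in X)
    out = frozenset(l for l in clause if abs(l) not in X)
    return inp, out

# Example: clause (x1 OR ~x2 OR y3) with inputs X = {1, 2}
inp, out = split_clause(frozenset({1, -2, 3}), X={1, 2})
```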
In the following sections, we describe how to perform separate analyses of the input component I(φ) and the output component O(φ), and then how to combine these analyses into a single synthesis algorithm that alternates between the two components.
IV-A. Analysis of the Input Component
In this subsection we assume that the specification φ is realizable. First, consider a single assignment x to the input variables X. Let F(x) ⊆ I(φ) be the subset of input clauses that x falsifies. For a set A ⊆ I(φ) of input clauses, let O(A) be the corresponding set of output clauses, and let G(x) = O(F(x)). Note that x ⊨ I(w) for every clause w with I(w) ∉ F(x). Therefore G(x) is the subset of output clauses that must be satisfied in order to satisfy φ when x is the input assignment.
A key observation is that for two different input assignments x_1 and x_2, if F(x_1) ⊆ F(x_2), then G(x_1) ⊆ G(x_2), and therefore every output assignment that satisfies the specification for x_2 also satisfies the specification for x_1. Hence, it is enough to consider only assignments for X that falsify a maximal (with respect to inclusion) set of input clauses. This leads to the following lemma:
Lemma 1.
Let A be an MFS of I(φ), and let y be an assignment that satisfies O(A). Then: (1) for every assignment x such that F(x) ⊆ A, the assignment (x, y) satisfies φ; and (2) there is no assignment x such that F(x) ⊋ A.
Proof.
(1) For every clause w with I(w) ∈ F(x), since F(x) ⊆ A, we have that O(w) is in O(A) and therefore is satisfied by y. Therefore, every clause in φ that is not satisfied by x is satisfied by y. Note that (2) follows from A being maximal. ∎
From Lemma 1 and our assumption that is realizable, we can conclude the following.
Corollary 1.
φ can be implemented by a decision list of length equal to the number of MFSs of I(φ), where each c_i in the decision list is of size linear in the size of the specification.
Proof.
Construct c_i by taking the conjunction of all input clauses not contained in the i-th MFS A_i. Then, c_i is satisfied exactly by those assignments x such that F(x) is a subset of A_i. Then, set the corresponding output assignment d_i to an arbitrary satisfying assignment of O(A_i). ∎
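This construction — a condition from the input clauses outside the MFS, an output from a satisfying assignment of the matching output clauses — can be sketched as follows. This is our illustration, with exhaustive search standing in for the SAT call that finds the output assignment; clauses are given as (input-part, output-part) pairs of signed-int literal sets:

```python
from itertools import product

def satisfies(a, clause):
    return any(a.get(abs(l)) == (l > 0) for l in clause)

def decision_for_mfs(mfs, pairs, out_vars):
    """One decision: c_i = input clauses outside the MFS, d_i = an
    assignment satisfying the MFS's output clauses (brute force)."""
    cond = [i for (i, o) in pairs if i not in mfs]        # c_i
    for bits in product([False, True], repeat=len(out_vars)):
        y = dict(zip(out_vars, bits))
        if all(satisfies(y, o) for (i, o) in pairs if i in mfs):
            return cond, y                                 # d_i
    return cond, None  # O(A_i) unsatisfiable: spec not realizable

# Spec: (x1 OR y1) AND (~x1 OR ~y1), input var 1, output var 2.
pairs = [(frozenset({1}), frozenset({2})),
         (frozenset({-1}), frozenset({-2}))]
# MFS { {x1} }: inputs falsifying x1 must get y1 = True.
cond, y = decision_for_mfs({frozenset({1})}, pairs, out_vars=[2])
```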
Example 1.
Let φ be a CNF specification. We first construct the input clauses I(φ) and output clauses O(φ). Suppose I(φ) has three MFSs: A_1, A_2, and A_3. From these MFSs we can construct a decision list implementing φ in the way described above. Note that this decision list necessarily covers every possible input assignment:
Note that we require φ to be realizable because otherwise we cannot guarantee that O(A) will be satisfiable for every MFS A of the input clauses. If O(A) is unsatisfiable, however, it is not enough to simply remove the corresponding decision from the decision list, because there might be a subset A' ⊊ A for which O(A') is satisfiable.
This is, to our knowledge, the first time that MFSs are used for synthesis purposes. An advantage of enumerating MFSs is that finding an MFS can be done easily, in a precise sense discussed below. One way to do this is through the conflict graph of the set of input clauses [13]. Given a set of clauses W, the conflict graph of W is the graph in which every vertex corresponds to a clause in W, and there is an edge between two vertices iff the corresponding clauses contain a complementary pair of literals (that is, the same variable appears in positive form in one clause and in negative form in the other). The complement of the conflict graph is called a consensus graph [13].
Since two clauses can be falsified at the same time iff there is no edge between them in the conflict graph (equivalently, there is an edge between them in the consensus graph), there is a one-to-one correspondence between MFSs of the set of clauses, maximal independent sets (MIS) in the conflict graph, and maximal cliques in the consensus graph. Therefore, we can enumerate the MFSs of a set of clauses by either enumerating MISs in the conflict graph or maximal cliques in the consensus graph. The benefit of this reduction is that maximal cliques display so-called polynomial-time listability, meaning that finding a single maximal clique can be performed in polynomial time, and therefore enumeration takes time polynomial in the number of maximal cliques [15].
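The conflict-graph view can be made concrete with a small sketch (ours; brute-force MIS enumeration for illustration only — practical enumeration would use a dedicated clique/MIS algorithm):

```python
from itertools import combinations

# Vertices are clauses; edges join clauses with complementary literals.
# MFSs then correspond one-to-one to maximal independent sets (MIS).

def conflict_edges(clauses):
    return {(a, b) for a, b in combinations(clauses, 2)
            if any(-l in b for l in a)}

def is_independent(subset, edges):
    return not any((a, b) in edges or (b, a) in edges
                   for a, b in combinations(subset, 2))

def maximal_independent_sets(clauses, edges):
    ind = [frozenset(s)
           for r in range(len(clauses) + 1)
           for s in combinations(clauses, r)
           if is_independent(s, edges)]
    return [s for s in ind if not any(s < t for t in ind)]

# Input clauses (x1), (~x1), (~x2): only (x1)/(~x1) conflict.
clauses = [frozenset({1}), frozenset({-1}), frozenset({-2})]
mis = maximal_independent_sets(clauses, conflict_edges(clauses))
```

Here the two MISs, {(x1), (¬x2)} and {(¬x1), (¬x2)}, are exactly the two MFSs of this clause set.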
This relation between the set of MFSs and maximal cliques implies that the size of the smallest decision list that implements a given specification is upper bounded by the number of maximal cliques in the consensus graph of the input clauses. Therefore, we have the following result.
Theorem 1.
Synthesis can be performed in P^{NP} for specifications for which the consensus graph of I(φ) has a polynomial number of maximal cliques (as is the case, for example, for planar or chordal graphs).
Proof.
Given a specification φ, construct the consensus graph of the input component, enumerate its maximal cliques, and for each one use a SAT solver to obtain a corresponding satisfying assignment for the output clauses. Since the number of maximal cliques is polynomial, only a polynomial number of SAT calls is required. ∎
Theorem 1 demonstrates an improvement relative to the general coNP^{NP}-hardness of synthesis. Moreover, constructing the consensus graph of the input component is easy, as is testing for certain graph properties, such as planarity, that ensure a small number of maximal cliques. Therefore, Theorem 1 provides an elegant method of deciding whether synthesis can be performed efficiently in practice before even beginning the synthesis process.
To summarize this section, the analysis of the input component provides two insights. First, a decision list implementing the specification can be constructed from the list of MFSs of the input clauses. Second, analyzing the graph structure of the input component allows us to identify classes of specifications for which synthesis can be performed more efficiently. Note, however, that this analysis does not take into account the properties of the output component, and as such the decision list produced by ignoring the output component may be longer than necessary. With that in mind, the next section presents a complementary analysis of the output component that can help to produce a smaller decision list.
IV-B. Analysis of the Output Component
For the analysis of the output component, consider the set G(x), defined in the previous subsection, of output clauses that must be satisfied when x is the input assignment. Then for every two input assignments x_1 and x_2, if G(x_1) ⊆ G(x_2), every output assignment that satisfies the specification for x_2 also satisfies the specification for x_1. Therefore, when constructing the decision list it is enough to consider only those satisfiable subsets of O(φ) that are maximal with respect to inclusion. Similarly to Lemma 1 in the previous section, this insight allows us to state the following lemma:
Lemma 2.
Let B be an MSS of O(φ) and let y be an assignment that satisfies B. Then: (1) for every assignment x such that G(x) ⊆ B, the assignment (x, y) satisfies φ; and (2) for every assignment x such that G(x) ⊋ B, there is no y' such that the assignment (x, y') satisfies φ.
Proof.
(1) Since y satisfies every clause in B, it must be that y also satisfies every clause in G(x). Therefore, for every clause w in φ, either I(w) is satisfied by x (and therefore I(w) ∉ F(x)) or O(w) is satisfied by y. Therefore (x, y) satisfies φ. (2) Since B is maximal, in this case G(x) must be unsatisfiable. Therefore there is no y' that can satisfy all the clauses that x does not already satisfy. ∎
Therefore, similarly to the analysis of the input component, we have:
Corollary 2.
φ can be implemented by a decision list of length equal to the number of MSSs of O(φ), where each c_i in the decision list is of size linear in the size of the specification.
Proof.
Construct c_i by taking the conjunction of all input clauses I(w) such that O(w) is not contained in the i-th MSS B_i. Then, c_i is satisfied exactly by those assignments x such that G(x) is a subset of B_i. Then, set the corresponding output assignment d_i to an arbitrary satisfying assignment of B_i. ∎
Example 2.
Let φ, I(φ), and O(φ) be the same as in Example 1. Suppose O(φ) has three MSSs: B_1, B_2, and B_3. From these MSSs we can construct a decision list implementing φ in the way described above. Note that some decisions in the list might be redundant:
Unlike the input analysis, the output analysis does not require the specification to be realizable to produce the correct answer: for every input x for which an output y exists, G(x) will be contained in some MSS, and therefore x will be covered by the decision list. On the other hand, we do not care about the case where an input x has no corresponding output y. Note, however, that unlike the input component, we do not have here a simple graph structure that can be exploited to obtain the list of MSSs, and finding an MSS is clearly NP-hard. Therefore, we are unlikely to be able to efficiently identify instances where the number of MSSs is polynomial.
More important, however, is that taking into account only the output component and ignoring the input component may also lead to a large decision list, one that includes many MSSs that would never be activated by any input. This fact emphasizes the drawbacks of independent synthesis of the components, and motivates the development of an algorithm that combines the input and output analyses to produce a decision list that is smaller than either of the ones produced by each analysis individually.
IV-C. Alternating between Input and Output Components
Our next goal is to combine the input and output analyses obtained so far into a synthesis procedure that constructs a decision list of length upper-bounded by the minimum of the number of MFSs of the input clauses and the number of MSSs of the output clauses. Due to the restrictions of the input analysis, if the specification is unrealizable the procedure terminates without producing a decision list. Extending the synthesis to unrealizable specifications is left for future work. We first state the following lemma:
Lemma 3.
If φ is realizable, then for every MFS A of I(φ), we have O(A) ⊆ B for some MSS B of O(φ).
Proof.
For every MFS A, since A is all-falsifiable, there exists an input assignment x such that F(x) = A. Then, since φ is realizable, O(A) is satisfiable, and therefore is contained in some MSS. ∎
Given an MFS A for the input clauses, we say that an MSS B for the output clauses covers A if O(A) ⊆ B. Lemma 3 says that for every MFS A, there exists at least one MSS B that covers A. Therefore, instead of producing a satisfying assignment for O(A), we can produce a satisfying assignment for B. In fact, such a satisfying assignment also takes care of every other MFS covered by B, making it unnecessary to generate them.
The above insight gives rise to Algorithm 1, which we call the "Back-and-Forth" algorithm. In this algorithm, we maintain a list L of MSSs that is initially empty. At every iteration of the algorithm, we produce a new MFS that is not covered by the MSSs already in L. Then, we find an MSS that covers this new MFS. If no such MSS exists, the specification is unrealizable, and so the algorithm emits an error message and terminates. Otherwise, we add this MSS to L. After all the MFSs have been covered, we construct a decision list from the obtained list of MSSs in the same way as described in Section IV-B: c_i is a formula that is satisfied exactly when G(x) is a subset of the i-th MSS, and the corresponding output assignment d_i is a satisfying assignment for that MSS.
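The loop above can be sketched end-to-end on a tiny specification. The code below is our illustration: exhaustive search stands in for the SAT oracle (finding an uncovered falsified set) and for the MaxSAT oracle (extending it to a covering MSS), and the MFS-maximization step is elided for brevity:

```python
from itertools import product

# Clauses are (input_part, output_part) pairs of signed-int literals.

def satisfies(a, clause):
    return any(a.get(abs(l)) == (l > 0) for l in clause)

def uncovered_falsified_set(pairs, in_vars, covers):
    """F(x) for some input x whose required output clauses are not
    contained in any MSS in `covers` (the SAT-oracle step; the paper
    additionally extends this set to a maximal one)."""
    for bits in product([False, True], repeat=len(in_vars)):
        x = dict(zip(in_vars, bits))
        f = frozenset(i for (i, o) in pairs if not satisfies(x, i))
        needed = [o for (i, o) in pairs if i in f]
        if not any(all(o in b for o in needed) for b in covers):
            return f
    return None  # every input is already covered

def covering_mss(pairs, out_vars, f):
    """MaxSAT stand-in: a maximum satisfiable set of output clauses
    containing O(f) (the hard clauses), plus a witness assignment."""
    hard = [o for (i, o) in pairs if i in f]
    best = None
    for bits in product([False, True], repeat=len(out_vars)):
        y = dict(zip(out_vars, bits))
        if all(satisfies(y, o) for o in hard):
            sat = frozenset(o for (i, o) in pairs if satisfies(y, o))
            if best is None or len(sat) > len(best[0]):
                best = (sat, y)
    return best  # None would mean the specification is unrealizable

def back_and_forth(pairs, in_vars, out_vars):
    decisions, covers = [], []
    while True:
        f = uncovered_falsified_set(pairs, in_vars, covers)
        if f is None:
            return decisions
        mss, y = covering_mss(pairs, out_vars, f)
        covers.append(mss)
        cond = [i for (i, o) in pairs if o not in mss]  # c_i, as in IV-B
        decisions.append((cond, y))

# x1 <-> y1 as CNF: (~x1 OR y1) AND (x1 OR ~y1); input 1, output 2.
pairs = [(frozenset({-1}), frozenset({2})),
         (frozenset({1}), frozenset({-2}))]
dl = back_and_forth(pairs, in_vars=[1], out_vars=[2])
```

On this specification the loop terminates after two iterations, producing one decision per value of x1.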
Example 3.
Let φ, I(φ), and O(φ) be the same as in Examples 1 and 2. In the first iteration, we generate the MFS A_1. Then, we expand O(A_1) into the MSS B_1 and add B_1 to L. Note that B_1 also covers, besides A_1, the MFS A_2, and therefore this MFS will not need to be generated. The only remaining MFS is A_3. O(A_3) is already an MSS, so we add it to L. Since all MFSs have been covered, the procedure terminates. Note that we did not need to add the remaining MSS of O(φ) to L, since no MFS is covered by it. From L, we can now construct a decision list as described earlier:
Implementation details
The key steps of Algorithm 1 are the generation of the MFS in line 3 and of the MSS in line 4. These steps are similar to the input and output analyses in Sections IV-A and IV-B. Since, however, we use communication between the input and output components, we have additional constraints on the MFS and MSS being generated. At each step, the generated MFS must not be covered by the previously-generated MSSs, and the generated MSS must cover the most recently generated MFS.
While generating an arbitrary MFS can be done in polynomial time, we prove that adding the restriction that the MFS must not be covered by a previous MSS makes MFS generation an NP-complete problem (see the appendix for the formal theorem and proof). Therefore, we implement the MFS generation in the following way. First, we use a SAT solver as an NP oracle to find a (not necessarily maximal) all-falsifiable subset of I(φ) not covered by the previous MSSs. Then, we extend this subset to an MFS by iterating over the remaining input clauses and at each step adding to the growing set a clause that does not conflict with the clauses already present in that set. This process of obtaining an MFS is easier to implement when we use the conflict graph representation of I(φ). Given the previously generated MSSs B_1, …, B_k and the conflict graph G = (V, E), we use the following SAT query to generate an all-falsifiable subset:

⋀_{j=1..k} ( ⋁_{i : O(w_i) ∉ B_j} s_i ) ∧ ⋀_{(w_i, w_l) ∈ E} (¬s_i ∨ ¬s_l)
We use variable s_i to indicate whether clause I(w_i) is present in the all-falsifiable subset. The first conjunction encodes that, for every previous MSS, the subset must include a clause not covered by that MSS. The second conjunction expresses that if two clauses conflict with each other, they cannot both be added to the subset. Note that whenever we generate a new MFS, we only need to add extra clauses of the first form to this query, allowing us to employ the incremental capabilities of SAT solvers.
After extending the subset produced by the SAT solver to an MFS A, we have to generate a new MSS that covers A. For that, we use a partial MaxSAT solver as an oracle. In a partial MaxSAT problem, some clauses are set as hard clauses and others are set as soft clauses [4]. The solver then returns an assignment that satisfies all hard clauses and the maximum possible number of soft clauses. We call the MaxSAT solver on the set of output clauses O(φ), where the clauses in O(A) are set as hard clauses and all other clauses are set as soft clauses. This way, the MaxSAT solver is guaranteed to return a satisfiable set of clauses containing O(A) of maximum size. Since a satisfiable subset of maximum size is necessarily maximal, the set of satisfied clauses returned by the MaxSAT solver is an MSS, as desired.
Analysis and Correctness
Since exactly one new MFS and one new MSS are generated at every iteration, the number of iterations in Algorithm 1 is upper bounded by the minimum of the number of MFSs of I(φ) and the number of MSSs of O(φ). Yet, since Algorithm 1 does not generate redundant MFSs and MSSs, the number of iterations, and thus the size of the decision list, can be much smaller.
We now formalize and prove the correctness of Algorithm 1.
Lemma 4.
For a realizable specification φ, let DL be the decision list produced by Algorithm 1. Then: (1) for every x such that x ⊨ c_i, the assignment (x, d_i) satisfies φ; and (2) for every x there is at least one c_i in DL such that x ⊨ c_i.
Proof.
(1) Let B_i be the i-th MSS generated by the algorithm. Then, by construction, x ⊨ c_i iff G(x) ⊆ B_i, and d_i is a satisfying assignment of B_i. Therefore, if x ⊨ c_i then d_i satisfies G(x), and so (x, d_i) satisfies φ.
(2) For every x, there exists an MFS A such that F(x) ⊆ A. If A was generated by the algorithm, then an MSS that covers A was added to the MSS list. If A was not generated by the algorithm, it must be because there was already a previously generated MSS that covers A. Either way, there is an MSS B_i in the list that covers A, and since F(x) ⊆ A, we have G(x) ⊆ B_i. Therefore, the corresponding c_i in the decision list is such that x ⊨ c_i. ∎
From Lemma 4 we obtain the following corollary.
Corollary 3.
Given a realizable specification φ, the decision list produced by Algorithm 1 implements φ.
It is worth noting that if the number of MFSs is small, as discussed in Section IV-A, then purely enumerating MFSs as in Section IV-A can be theoretically faster than using Algorithm 1. That is because finding an MFS can be done in polynomial time, while Algorithm 1 requires calls to SAT and MaxSAT solvers. In practice, however, we observed that the Back-and-Forth algorithm often avoids generating a large number of redundant MFSs, which makes up for the extra complexity in generating each MFS. Still, for specifications that are known to have a small number of MFSs, restricting the analysis to the input component as in Section IV-A can be sufficient.
IV-D. Partitioning the Specification into Distinct Output Variables
Some of the cases in the back-and-forth analysis that cause the number of MFSs or MSSs to be exponential can be simplified by partitioning the specification into sets of clauses that do not share output variables. As an example, consider the specification for the identity function over input variables x_1, …, x_n and output variables y_1, …, y_n:

φ = ⋀_{i=1..n} (x_i ↔ y_i),

or in CNF form:

φ = ⋀_{i=1..n} (x_i ∨ ¬y_i) ∧ (¬x_i ∨ y_i).

It is easy to see that both the number of MFSs and the number of MSSs for this formula are 2^n. Each output variable, however, does not appear in the same clause with any other output variable. Therefore, we can consider each pair of clauses (x_i ∨ ¬y_i) ∧ (¬x_i ∨ y_i) as a separate specification and synthesize it independently as a decision list of size 2. As such, the total number of MFSs and MSSs grows linearly with n.
Therefore, we propose the following preprocessing step:

1) Given the specification φ, construct a graph with a vertex for each clause and an edge between two vertices iff the corresponding clauses share an output variable.

2) Separate the graph into connected components C_1, …, C_k. Note that the C_i are completely disjoint in terms of output variables.

3) For every C_i, define a subspecification φ_i by taking only the clauses of φ whose corresponding vertex is in C_i.

4) Call Algorithm 1 on each subspecification φ_i. This gives us a decision list DL_i for φ_i that decides on an assignment for only the output variables in φ_i.
Since the φ_i have disjoint sets of output variables, every DL_i decides on an assignment for a different partition of the output variables. Therefore, given an input x we can produce a corresponding output by simply evaluating each DL_i independently on x and combining the results.
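The grouping step can be sketched with a small union-find over output variables (our illustration; clauses with no output literals, which constrain only inputs, are kept in their own singleton groups here for simplicity):

```python
# Group clauses into connected components of the
# "shares an output variable" graph, via union-find.

def partition_by_outputs(clauses, out_vars):
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(a, b):
        parent[find(a)] = find(b)

    # Output variables co-occurring in a clause belong together.
    for c in clauses:
        ys = [abs(l) for l in c if abs(l) in out_vars]
        for a, b in zip(ys, ys[1:]):
            union(a, b)
    groups = {}
    for idx, c in enumerate(clauses):
        ys = [abs(l) for l in c if abs(l) in out_vars]
        key = find(ys[0]) if ys else ("no_output", idx)
        groups.setdefault(key, []).append(c)
    return list(groups.values())

# Identity spec with inputs x1, x2 (vars 1, 2) and outputs y1, y2
# (vars 3, 4): the clauses split into one pair per output variable.
clauses = [frozenset({1, -3}), frozenset({-1, 3}),
           frozenset({2, -4}), frozenset({-2, 4})]
parts = partition_by_outputs(clauses, out_vars={3, 4})
```

On the identity example this yields two subspecifications of two clauses each, matching the linear growth discussed above.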
V. Experimental Evaluation
In order to evaluate the performance of the Back-and-Forth synthesis algorithm, we ran it on benchmarks from the 2QBF track of the QBFEVAL'16 QBF-solving competition [25]. This track is composed of QBF benchmarks of the form ∀X. ∃Y. φ(X, Y), where φ is a CNF formula. We can see these benchmarks as synthesis problems asking whether we can synthesize a Skolem function for the existential variables Y in terms of the universal variables X such that the formula φ is satisfied. For this experimental evaluation we used only those benchmarks that are realizable, since adjusting the Back-and-Forth algorithm to handle unrealizable benchmarks is future work. The benchmarks can be classified into seven families:
MutexP (7 instances), Qshifter (6 instances), RankingFunctions (49 instances), ReductionFinding (34 instances), SortingNetworks (22 instances), Tree (5 instances), and FixpointDetection (93 instances). Because benchmarks in the same family tend to have similar properties, it makes sense to evaluate performance over each family, rather than over specific instances. We compared the running time of the Back-and-Forth algorithm on these benchmarks with that of three state-of-the-art tools that employ different synthesis approaches: the CDCL-based CADET [27], the ROBDD-based RSynth [35], and the CEGAR-based BFSS [1]. Since the Back-and-Forth algorithm, CADET, and RSynth are all sequential algorithms, to ensure a fair comparison of computational effort, the version of BFSS used was compiled with the MiniSAT SAT solver [10] instead of the parallelized UniGen sampler used in [1]. We leave for future work the exploration of the performance of the different tools in a parallel setting.
Our implementation of the Back-and-Forth algorithm used the Glucose SAT solver [5], based on MiniSAT, and the Open-WBO MaxSAT solver [24]. The implementation also used the partitioning described in Section IV-D. All experiments were executed in the DAVinCI cluster at Rice University, consisting of 192 Westmere nodes of 12 processor cores each, running at 2.83 GHz with 4 GB of RAM per core, and 6 Sandy Bridge nodes of 16 processor cores each, running at 2.2 GHz with 8 GB of RAM per core. Our algorithm has not been parallelized, so the cluster was used solely to run multiple experiments simultaneously. Each instance had a timeout of 8 hours.
Figure 1 shows for each family the percentage of instances each tool was able to solve in the time limit. We can divide the results into three parts:
In the RankingFunctions and FixpointDetection families the Back-and-Forth algorithm timed out on almost all instances, only being able to solve the easiest instances of FixpointDetection. CADET, on the other hand, performed very well, being able to solve all instances. RSynth and BFSS also outperformed the Back-and-Forth algorithm, although they did not perform as well as CADET.
The Tree, MutexP, and Qshifter families had almost all instances solved by the Back-and-Forth algorithm in under 45 seconds (except for the two hardest instances of Qshifter, which timed out), in many cases outperforming RSynth or BFSS. Even so, CADET still performed the best in these families, solving all instances faster than our algorithm.
Lastly, ReductionFinding and SortingNetworks seem to be the most challenging families for existing tools, with CADET only being able to solve two instances in total, RSynth one, and BFSS none. In contrast, our Back-and-Forth algorithm solved 13 instances in ReductionFinding and 6 in SortingNetworks. Furthermore, as can be seen in Figure 2, every instance that was solved by the other tools was also solved by the Back-and-Forth algorithm, which was faster by over an order of magnitude.
In summary, the Back-and-Forth algorithm performed competitively in 5 out of 7 families, and was strictly superior in 2 out of 7 families. Due to the difficulty of analyzing CNF formulas, the exact reason why the algorithm performs well in these particular families and not in others remains an open question, to be explored in future work. Still, the results suggest that the Back-and-Forth algorithm can serve as a good complement to modern synthesis tools, performing well exactly in the cases in which these tools struggle the most, and therefore it would be a good candidate for membership in a portfolio of synthesis algorithms.
VI Discussion
A recurrent observation in recent evaluations [19, 2, 1, 35] of Boolean functional synthesis tools has been that no single tool or algorithm dominates the others in all classes of benchmarks. To build industry-strength Boolean functional solvers, it is therefore inevitable that a portfolio approach be adopted. Since decomposition-based techniques (beyond factored specifications) have not been used in existing tools so far, our original motivation was to develop a decomposition-centric framework for Boolean functional synthesis that complements (rather than dominates) the strengths of existing tools. As our experiments with the Back-and-Forth algorithm show, we have been able to take the first few steps in this direction by successfully solving some classes of benchmarks that state-of-the-art tools choke on. While we have tried to understand features of these benchmarks that make them particularly amenable to our technique, a lot more work remains to be done to elucidate this relation clearly.
Yet another motivation for exploring a decomposition-centric synthesis approach was to be able to generate Skolem functions in a format that lends itself to easy independent validation by domain experts. Interestingly, despite the singular importance of this aspect, it has been largely ignored by existing Boolean functional synthesis tools, most of which construct a circuit representation of the function using an acyclic-graph data structure such as an ROBDD or an And-Inverter Graph. While these are known to be efficient representations of Boolean functions, they are not amenable to easy validation by a domain expert, especially when their sizes are large, often requiring a satisfiability solver to check that the generated Skolem functions indeed satisfy the specifications. Synthesizing functions as decision lists is a natural and well-studied choice for meeting this objective. Along with each decision in the decision list, we can also identify the clauses that contribute to the generation of the outputs (these are clauses whose input components are falsified by the decision), thereby providing clues about which part of the specification is responsible for the outputs generated in a particular branch of the decision-list representation. Our work shows that decomposition-based techniques lend themselves easily to such representations.
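As an illustration of how such a decision-list representation supports validation by inspection, the following sketch (with invented names and a toy list; the actual synthesized lists are over CNF clauses) evaluates an ordered list of guarded output assignments, where the first matching guard produces the output:

```python
# Hedged sketch of a decision list in the sense of Rivest [29]: an ordered
# list of (guard, output-assignment) pairs over the input variables,
# evaluated top to bottom; the first guard that holds fixes the outputs.

def eval_decision_list(dlist, inputs):
    """dlist: list of (guard, outputs); guard is a predicate over the input
    assignment, outputs is a dict of output-variable values. The last guard
    is assumed to be a catch-all."""
    for guard, outputs in dlist:
        if guard(inputs):
            return outputs
    raise ValueError("decision list has no matching branch")

# Toy list for one output y: "if x1 then y=0; else if x2 then y=1; else y=0".
dl = [
    (lambda a: a["x1"], {"y": False}),
    (lambda a: a["x2"], {"y": True}),
    (lambda a: True,    {"y": False}),
]
print(eval_decision_list(dl, {"x1": False, "x2": True}))  # {'y': True}
```

Because each branch is a readable condition/output pair, a domain expert can audit individual branches without running a satisfiability check on a monolithic circuit.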
In order to be consistent with performance comparison experiments reported in the literature, all specifications used in our evaluation were prenex CNF (PCNF) formulas taken from the QBFEVAL’16 benchmark suite. While this certainly presents challenging instances of Boolean functional synthesis, PCNF is not a natural choice for representing specifications in several important application areas. For example, the industry standard (IEC 1131-3) for reactive programs for programmable logic controllers (PLC) includes a set of languages that allow the user to specify combinations of outputs based on different combinations of input conditions. The same is also true in the specification of several bus protocols like the VME Bus or AMBA Bus. Scenario-based specifications such as these are much more amenable to our decomposition-based approach, since there is a natural separation of input and output components of the specification. In addition, with such specifications, it is meaningful to analyze the structure of dependence between the input and output components, and exploit structural properties (viz. the size of the MIS in the conflict graph, as explained in Section IV) in synthesis. We believe that as we look beyond PCNF representations of specifications, techniques like those presented in this paper will be even more useful in a portfolio approach to synthesis.
In our experimental evaluation, we chose CADET as a representative of the state of the art in Boolean synthesis stemming from the QBF community. This is due to its focus on 2QBF (which suffices for Boolean synthesis of realizable specifications) and its performance on recent QBFEVAL competitions. Another certifying QBF solver, CAQE [28], uses techniques that are similar to the clause splitting used in our algorithm. But CAQE targets QBF instances with arbitrary quantifier alternation, requiring additional mechanisms for handling these cases, and furthermore does not perform the same analysis as here, based on MFS and MSS. Due to their similarities, it would be interesting to perform a comparison between the two algorithms in the future.
Finally, the techniques presented in this work are clearly not the only ways to achieve synthesis via decomposition, and there exists scope for significant innovation and creativity, both in the manner in which a specification is decomposed, and in the way the decomposition is exploited to arrive at an efficient synthesis algorithm. One example lies in identifying algorithms for sequential decomposition, as presented in [11], which are applicable to a synthesis context. In summary, synthesis based on input-output decomposition presents uncharted territory that deserves systematic exploration in order to complement the strengths of existing synthesis tools.
Acknowledgment
We thank Assaf Marron for useful discussions, and the anonymous reviewers for their suggestions.
References
 [1] S. Akshay, S. Chakraborty, S. Goel, S. Kulal, and S. Shah. What’s Hard About Boolean Functional Synthesis? In Computer Aided Verification  30th International Conference, CAV 2018, pages 251–269, 2018.
 [2] S. Akshay, S. Chakraborty, A. K. John, and S. Shah. Towards Parallel Boolean Functional Synthesis. In Tools and Algorithms for the Construction and Analysis of Systems  23rd International Conference, TACAS 2017, pages 337–353, 2017.
 [3] R. Alur, P. Madhusudan, and W. Nam. Symbolic Computational Techniques for Solving Games. STTT, 7(2):118–128, 2005.
 [4] C. Ansótegui, M. L. Bonet, and J. Levy. Solving (Weighted) Partial MaxSAT through Satisfiability Testing. In Theory and Applications of Satisfiability Testing  SAT 2009  12th International Conference, pages 427–440, 2009.
 [5] G. Audemard and L. Simon. Predicting Learnt Clauses Quality in Modern SAT Solvers. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, IJCAI 2009, pages 399–404, 2009.
 [6] V. Balabanov and J.-H. R. Jiang. Unified QBF Certification and Its Applications. Form. Methods Syst. Des., 41(1):45–65, Aug. 2012.
 [7] V. Balabanov and J. R. Jiang. Resolution Proofs and Skolem Functions in QBF Evaluation and Applications. In Computer Aided Verification  23rd International Conference, CAV 2011, pages 149–164, 2011.
 [8] V. Balabanov, M. Widl, and J. R. Jiang. QBF Resolution Systems and Their Proof Complexities. In Theory and Applications of Satisfiability Testing  SAT 2014  17th International Conference, pages 154–169, 2014.
 [9] G. Boole. The Mathematical Analysis of Logic. Philosophical Library, 1847.
 [10] N. Eén and N. Sörensson. An Extensible SATsolver. In Theory and Applications of Satisfiability Testing  SAT 2003  6th International Conference, pages 502–518, 2003.
 [11] D. Fried, A. Legay, J. Ouaknine, and M. Y. Vardi. Sequential Relational Decomposition. In Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2018, pages 432–441, 2018.
 [12] D. Fried, L. M. Tabajara, and M. Y. Vardi. BDDBased Boolean Functional Synthesis. In Computer Aided Verification  28th International Conference, CAV 2016, pages 402–421, 2016.
 [13] R. Ganian and S. Szeider. New Width Parameters for Model Counting. In Theory and Applications of Satisfiability Testing  SAT 2017  20th International Conference, pages 38–52, 2017.
 [14] M. Heule, M. Seidl, and A. Biere. Efficient Extraction of Skolem Functions from QRAT Proofs. In Formal Methods in ComputerAided Design, FMCAD 2014, pages 107–114, 2014.
 [15] A. Ignatiev, A. Morgado, J. Planes, and J. MarquesSilva. Maximal Falsifiability  Definitions, Algorithms, and Applications. In Logic for Programming, Artificial Intelligence, and Reasoning  19th International Conference, LPAR19, pages 439–456, 2013.
 [16] S. Jacobs, R. Bloem, R. Brenguier, R. Könighofer, G. A. Pérez, J. Raskin, L. Ryzhyk, O. Sankur, M. Seidl, L. Tentrup, and A. Walker. The Second Reactive Synthesis Competition (SYNTCOMP 2015). In Proceedings Fourth Workshop on Synthesis, SYNT 2015, pages 27–57, 2015.
 [17] J. R. Jiang. Quantifier Elimination via Functional Composition. In Computer Aided Verification, 21st International Conference, CAV 2009, pages 383–397, 2009.
 [18] S. Jo, T. Matsumoto, and M. Fujita. SATBased Automatic Rectification and Debugging of Combinational Circuits with LUT Insertions. In Proceedings of the 2012 IEEE 21st Asian Test Symposium, ATS ’12, pages 19–24. IEEE Computer Society, 2012.
 [19] A. K. John, S. Shah, S. Chakraborty, A. Trivedi, and S. Akshay. Skolem Functions for Factored Formulas. In Formal Methods in ComputerAided Design, FMCAD 2015, pages 73–80, 2015.
 [20] J. H. Kukula and T. R. Shiple. Building Circuits from Relations. In Computer Aided Verification, 12th International Conference, CAV 2000, pages 113–123, 2000.
 [21] C. M. Li and F. Manyà. MaxSAT, Hard and Soft Constraints. In Handbook of Satisfiability, pages 613–631. 2009.
 [22] L. Löwenheim. Über die Auflösung von Gleichungen im logischen Gebietekalkül. Math. Ann., 68:169–207, 1910.
 [23] E. Macii, G. Odasso, and M. Poncino. Comparing Different Boolean Unification Algorithms. In Proceedings of 32nd Asilomar Conference on Signals, Systems and Computers, pages 17–29, 2006.
 [24] R. Martins, V. M. Manquinho, and I. Lynce. Open-WBO: A Modular MaxSAT Solver. In Theory and Applications of Satisfiability Testing  SAT 2014  17th International Conference, pages 438–445, 2014.
 [25] M. Narizzano, L. Pulina, and A. Tacchella. The QBFEVAL web portal. In Logics in Artificial Intelligence, pages 494–497. Springer Berlin Heidelberg, 2006.
 [26] A. Niemetz, M. Preiner, F. Lonsing, M. Seidl, and A. Biere. ResolutionBased Certificate Extraction for QBF  (Tool Presentation). In Theory and Applications of Satisfiability Testing  SAT 2012  15th International Conference, pages 430–435, 2012.
 [27] M. N. Rabe and S. A. Seshia. Incremental Determinization. In Theory and Applications of Satisfiability Testing  SAT 2016  19th International Conference, pages 375–392, 2016.
 [28] M. N. Rabe and L. Tentrup. CAQE: A Certifying QBF Solver. In Formal Methods in ComputerAided Design, FMCAD 2015, pages 136–143, 2015.
 [29] R. L. Rivest. Learning Decision Lists. Machine Learning, 2(3):229–246, 1987.
 [30] J. P. M. Silva, I. Lynce, and S. Malik. ConflictDriven Clause Learning SAT Solvers. In Handbook of Satisfiability, pages 131–153. 2009.
 [31] A. SolarLezama. Program Sketching. STTT, 15(56):475–495, 2013.
 [32] A. SolarLezama, R. M. Rabbah, R. Bodík, and K. Ebcioglu. Programming by Sketching for Bitstreaming Programs. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 281–294, 2005.
 [33] S. Srivastava, S. Gulwani, and J. S. Foster. TemplateBased Program Verification and Program Synthesis. STTT, 15(56):497–518, 2013.
 [34] L. M. Tabajara. BDDBased Boolean Synthesis. Master’s thesis, Rice University, 2018.
 [35] L. M. Tabajara and M. Y. Vardi. Factored Boolean Functional Synthesis. In Formal Methods in Computer Aided Design, FMCAD 2017, pages 124–131, 2017.
 [36] S. Zhu, L. M. Tabajara, J. Li, G. Pu, and M. Y. Vardi. Symbolic LTLf Synthesis. In Proceedings of the TwentySixth International Joint Conference on Artificial Intelligence, IJCAI 2017, pages 1362–1369, 2017.
Appendix
A On Synthesis via Sequential Decomposition
As a first attempt at synthesis via decomposition, we explored the use of a decomposition method called sequential decomposition, described in [11], for the purpose of synthesis. In sequential decomposition the specification φ(X, Y) is split into an input part and an output part by adding fresh intermediate variables Z, defining a domain that serves to communicate between the input domain X and the output domain Y. This intermediate domain should be introduced in such a way as to preserve exactly every input/output pair in φ. In addition, to preserve the independence of the two parts, as described in [11], we would like each part to be synthesized independently and then recomposed into an implementation for the entire specification. Therefore we define the following.
Definition 1.
Let f(X, Z) and g(Z, Y) be Boolean formulas. Then (f, g) is called a good decomposition of φ(X, Y) if 1. φ(X, Y) ≡ ∃Z. (f(X, Z) ∧ g(Z, Y)); and 2. for every input x, f(x, Z) is satisfiable.
Property (1) guarantees that, for every input assignment x and output assignment y, (x, y) satisfies φ if and only if there exists an intermediate assignment z such that (x, z) satisfies f and (z, y) satisfies g. Property (2) guarantees that for all implementations F of f and G of g, their composition G ∘ F is well-defined and is an implementation of φ. Such a decomposition attains a complete separation of the inputs and outputs of φ, in the sense that no direct knowledge of the output variables is necessary to synthesize f, nor of the input variables to synthesize g.
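As a minimal illustration of this composition, the following Python sketch (all names and the toy specification are invented for illustration) composes an implementation of the input part with an implementation of the output part:

```python
# Illustrative composition underlying a good decomposition: F implements the
# input part (inputs -> intermediate domain), G implements the output part
# (intermediate domain -> outputs); their composition maps inputs to outputs.

def compose(F, G):
    """Return the function x -> G(F(x))."""
    return lambda x: G(F(x))

# Toy instance: the intermediate value records which case the input falls in.
F = lambda x: {"case": x["x1"] or x["x2"]}   # input part, knows nothing of y
G = lambda z: {"y": not z["case"]}           # output part, knows nothing of x
impl = compose(F, G)
print(impl({"x1": False, "x2": False}))  # {'y': True}
```

Note that F and G can be synthesized by entirely separate procedures, which is the point of the decomposition.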
We now state the following theorem describing synthesis by sequential decomposition.
Theorem 2.
Let φ(X, Y) be a specification, where X are the input variables and Y are the output variables. If f(X, Z) and g(Z, Y) form a good decomposition of φ, then for every implementation F of f and G of g, G ∘ F implements φ.
Proof.
Since φ(X, Y) ≡ ∃Z. (f(X, Z) ∧ g(Z, Y)), we have that if (x, z) satisfies f and (z, y) satisfies g then (x, y) satisfies φ. Let F and G be implementations of f and g, respectively. Let x be an input. Since (f, g) is a good decomposition of φ, f(x, Z) is satisfiable. Since F is an implementation of f, (x, F(x)) satisfies f. Furthermore, since (x, F(x)) satisfies f, g(F(x), Y) is satisfiable. Then, since G is an implementation of g, (F(x), G(F(x))) satisfies g. Since (x, F(x)) satisfies f and (F(x), G(F(x))) satisfies g, (x, G(F(x))) satisfies φ. Therefore, G ∘ F is an implementation of φ. ∎
Theorem 2 describes a clean condition for a decomposition that allows synthesis of each component independently. We now provide a concrete example that follows this framework. Specifically we introduce a type of decomposition that satisfies the requirements of a good decomposition according to Definition 1. This decomposition can be applied to all specifications in CNF, and is based on the same concepts used in the input and output analyses in Section IV.
The CNF decomposition of a CNF formula φ(X, Y) is a pair (f, g) constructed as follows. Write each clause C of φ as C_X ∨ C_Y, where C_X contains the literals of C over the input variables X and C_Y the literals over the output variables Y, and introduce a fresh intermediate variable z_C for each clause C. Then f(X, Z) = ⋀_C (z_C ↔ ¬C_X) and g(Z, Y) = ⋀_C (z_C → C_Y).
The idea behind the CNF decomposition is to focus on which clauses can be made true or false by the assignment to the input variables. This leads to a natural and very simple decomposition: in fact, f is already a function from X to Z on its own, and hence the synthesis of f is trivial: just assign every z_C to ¬C_X. Intuitively, this decomposition works by grouping assignments of the input variables into individual variables. Specifically, note that z_C is only assigned to true if that is absolutely necessary, that is, when C_X is not satisfied and we must satisfy C_Y instead. As such, we abstract away all the input assignments that make the same variables z_C true. Therefore, we only need to concern ourselves with synthesizing the outputs from g.
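The trivial input component described here, which sets each intermediate variable to true exactly when the input part of its clause is falsified, can be sketched as follows (the clause encoding and all names are illustrative assumptions):

```python
# Illustrative sketch of the CNF decomposition's input component: for each
# clause C, the fresh variable z_C is set to true exactly when the input
# part of C is falsified, signalling that the output component must then
# satisfy the output part of C.

def input_component(clauses_x, inputs):
    """clauses_x: list of input parts, one per clause; each is a list of
    (var, polarity) literals over the input variables. Returns the
    intermediate assignment: z_i is true iff input part i is falsified."""
    return {
        f"z{i}": not any(inputs[v] == pol for v, pol in cx)
        for i, cx in enumerate(clauses_x)
    }

# phi = (x1 or y1) and (not x1 or y2): input parts are [x1] and [not x1].
z = input_component([[("x1", True)], [("x1", False)]], {"x1": True})
print(z)  # {'z0': False, 'z1': True}
```

Here only the second clause's output part needs to be satisfied, so the output component may concentrate on y2.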
We now prove that this decomposition meets the criteria for a good decomposition according to Definition 1.
Theorem 3.
If f and g are given by the CNF decomposition of a CNF formula φ, then (f, g) is a good decomposition of φ.
Proof.
We first prove that φ(X, Y) ≡ ∃Z. (f(X, Z) ∧ g(Z, Y)).
Next, we prove that f(x, Z) is satisfiable for every input x. Assume an arbitrary input x; there exists