A Theory of Formal Synthesis via Inductive Learning

by   Susmit Jha, et al.
berkeley college
United Technologies

Formal synthesis is the process of generating a program satisfying a high-level formal specification. In recent times, effective formal synthesis methods have been proposed based on the use of inductive learning. We refer to this class of methods that learn programs from examples as formal inductive synthesis. In this paper, we present a theoretical framework for formal inductive synthesis. We discuss how formal inductive synthesis differs from traditional machine learning. We then describe oracle-guided inductive synthesis (OGIS), a framework that captures a family of synthesizers that operate by iteratively querying an oracle. An instance of OGIS that has had much practical impact is counterexample-guided inductive synthesis (CEGIS). We present a theoretical characterization of CEGIS for learning any program that computes a recursive language. In particular, we analyze the relative power of CEGIS variants where the types of counterexamples generated by the oracle vary. We also consider the impact of bounded versus unbounded memory available to the learning algorithm. In the special case where the universe of candidate programs is finite, we relate the speed of convergence to the notion of teaching dimension studied in machine learning theory. Altogether, the results of the paper take a first step towards a theoretical foundation for the emerging field of formal inductive synthesis.




1 Introduction

The field of formal methods has made enormous strides in recent decades. Formal verification techniques such as model checking [16, 49, 17] and theorem proving (see, e.g. [47, 38, 24]) are used routinely in the computer-aided design of integrated circuits and have been widely applied to find bugs in software, analyze models of embedded systems, and find security vulnerabilities in programs and protocols. At the heart of many of these advances are computational reasoning engines such as Boolean satisfiability (SAT) solvers [43], Binary Decision Diagrams (BDDs) [14], and satisfiability modulo theories (SMT) solvers [9]. Alongside these advances, there has been a growing interest in the synthesis of programs or systems from formal specifications with correctness guarantees. We refer to this area as formal synthesis. Starting with the seminal work of Manna and Waldinger on deductive program synthesis [44] and Pnueli and Rosner on reactive synthesis from temporal logic [48], there have been several advances that have made formal synthesis practical in specific application domains such as robotics, online education, and end-user programming.

Algorithmic approaches to formal synthesis range over a wide spectrum, from deductive synthesis to inductive synthesis. In deductive synthesis (e.g., [44]), a program is synthesized by constructively proving a theorem, employing logical inference and constraint solving. On the other hand, inductive synthesis [21, 59, 54] seeks to find a program matching a set of input-output examples. At a high level, it is thus an instance of learning from examples, also termed as inductive inference or machine learning [7, 45]. Many current approaches to synthesis blend induction and deduction in the sense that even as they generalize from examples, deductive procedures are used in the process of generalization (see [53, 36] for a detailed exposition). Even so, the term “inductive synthesis” is typically used to refer to all of them. We will refer to these methods as formal inductive synthesis to place an emphasis on correctness of the synthesized artifact. These synthesizers generalize from examples by searching a restricted space of programs. In machine learning, this restricted space is called the concept class, and each element of that space is often called a candidate concept. The concept class is usually specified syntactically. It has been recognized that this syntax guidance, also termed as a structure hypothesis, can be crucial in helping the synthesizer converge quickly to the target concept [57, 53, 2].

The fields of formal inductive synthesis and machine learning have the same high-level goal: to develop algorithmic techniques for synthesizing a concept (function, program, or classifier) from observations (examples, queries, etc.). However, there are also important differences in the problem formulations and techniques used in the two fields. We identify some of the main differences below:

  • Concept Classes: In traditional machine learning, the classes of concepts to be synthesized tend to be specialized, such as linear functions or half-spaces [63], convex polytopes [27], neural networks of specific forms [10], Boolean formulas in fixed, bounded syntactic forms [28], and decision trees [50]. However, in formal synthesis, the target concepts tend to be fairly general programs or automata with constraints or finite bounds imposed mainly to ensure tractability of synthesis.

  • Learning Algorithms: In traditional machine learning, just as concept classes tend to be specialized, so also are the learning algorithms for those classes [45]. In contrast, in formal inductive synthesis, the trend is towards using general-purpose decision procedures such as SAT solvers, SMT solvers, and model checkers that are not specifically designed for inductive learning.

  • Exact vs. Approximate Learning: In formal inductive synthesis, there is a strong emphasis on exactly learning the target concept; i.e., the learner seeks to find a concept that is consistent with all positive examples but not with any negative example. The labels for examples are typically assumed to be correct. Moreover, the learned concept should satisfy a formal specification. In contrast, the emphasis in traditional machine learning is on techniques that perform approximate learning, where input data can be noisy, some amount of misclassification can be tolerated, there is no formal specification, and the overall goal is to optimize a cost function (e.g., capturing classification error).

  • Emphasis on Oracle-Guidance: In formal inductive synthesis, there is a strong emphasis on learning in the presence of an oracle, which is typically implemented using a general-purpose decision procedure or sometimes even a human user. Moreover, it is often the case that the design of this oracle is part of the design of the synthesizer. In contrast, in traditional machine learning, the use of oracles is rare, and instead the learner typically selects examples from a corpus, often drawing examples independently from an underlying probability distribution. Even when oracles are used, they are assumed to be black boxes that the learner has no control over.

In this paper, we take first steps towards a theoretical framework and analysis of formal inductive synthesis. As noted above, most instances of inductive synthesis in the literature rely on an oracle that answers different types of queries. In order to capture these various synthesis methods in a unifying framework, we formalize the notion of oracle-guided inductive synthesis (OGIS). While we defer a detailed treatment of OGIS to Section 2, we point out here three key dimensions in which OGIS techniques differ from each other:

  • Characteristics of concept class: The concept class for synthesis may have different characteristics depending on the application domain. For instance, the class of programs from which the synthesizer must generate the correct one may be finite, as in the synthesis of bitvector programs [57, 32, 26], or infinite, as in the synthesis of guards for hybrid automata [33, 35]. In the former case, termination is easily guaranteed, but it is not so simple for the case of infinite-size concept classes.

  • Query types: Different applications may impose differing constraints on the capabilities of the oracle. In some cases, the oracle may provide only positive examples. When verification engines are used as oracles, as is typical in formal synthesis, the oracle may provide both positive examples and counterexamples — negative examples which refute candidate programs. More fine-grained properties of queries are also possible — for instance, an oracle may permit queries that request not just any counterexample, but one that is “minimum” according to some cost function.

  • Resources available to the learning engine: As noted above, the learning algorithms in formal inductive synthesis tend to be general-purpose decision procedures. Even so, for tractability, certain constraints may be placed on the resources available to the decision procedure, such as time or memory available. For example, one may limit the decision procedure to use a finite amount of memory, such as imposing an upper bound on the number of (learned) clauses for a SAT solver.

We conduct a theoretical study of OGIS by examining the impact of variations along the above three dimensions. Our work has a particular focus on counterexample-guided inductive synthesis (CEGIS) [57], a particularly popular and effective instantiation of the OGIS framework. When the concept class is of infinite size, termination of CEGIS is not guaranteed. We study the relative strength of different versions of CEGIS with regard to their termination guarantees. The versions vary based on the type of counterexamples one can obtain from the oracle. We also analyze the impact of finite versus infinite memory available to the learning algorithm to store examples and hypothesized programs/concepts. Finally, when the concept class is of finite size, even though termination of CEGIS is guaranteed, the speed of termination can still be an issue. In this case, we draw a connection between the number of counterexamples needed by a CEGIS procedure and the notion of teaching dimension [22] previously introduced in the machine learning literature.

To summarize, we make the following specific contributions in this paper:

  • We define the formal inductive synthesis problem and propose a class of solution techniques termed Oracle-Guided Inductive Synthesis (OGIS). We illustrate how OGIS generalizes instances of concept learning in machine learning/artificial intelligence as well as synthesis techniques developed using formal methods. We provide examples of synthesis techniques from the literature and show how they can be represented as instantiations of OGIS.


  • We perform a theoretical comparison of different instantiations of the OGIS paradigm in terms of their synthesis power. The synthesis power of an OGIS technique is defined as the class of concepts/programs (from an infinite concept class) that can be synthesized using that technique. We establish the following specific novel theoretical results:

    • For learning engines that can use unbounded memory, synthesis engines using an oracle that provides arbitrary counterexamples have the same power as those using an oracle that provides minimal counterexamples. Both are strictly more powerful than engines using an oracle whose counterexamples are bounded by the size of the positive examples.

    • For learning engines that use bounded memory, the power of synthesis engines using arbitrary counterexamples or minimal counterexamples is still the same. The power of synthesis engines using counterexamples bounded by positive examples is not comparable to those using arbitrary/minimal counterexamples. Contrary to intuition, using counterexamples bounded by positive examples allows one to synthesize programs from program classes which cannot be synthesized using arbitrary or minimal counterexamples.

  • For finite concept classes, we prove the NP-hardness of solving the formal inductive synthesis problem over finite domains for a large class of techniques. We also show that the teaching dimension [22] of the concept class is a lower bound on the number of counterexamples needed for a CEGIS technique to terminate (on an arbitrary program from that class).

The rest of the paper is organized as follows. We first present the Oracle-Guided Inductive Synthesis (OGIS) paradigm in Section 2. We discuss related work in Section 3. We present the notation and definitions used for theoretical analysis in Section 4, followed by the theoretical results and their proofs in Section 5 and Section 6. We summarize our results and discuss open problems in Section 7. A preliminary version of this paper appeared in the SYNT 2014 workshop [34].

2 Oracle-Guided Inductive Synthesis: OGIS

We begin by defining some basic terms and notation. Following standard terminology in the machine learning theory community [5], we define a concept $c$ as a set of examples drawn from a domain of examples $E$. In other words, $c \subseteq E$. An example can be viewed as an input-output behavior of a program; for example, a (pre, post) state for a terminating program, or an input-output trace for a reactive program. Thus, in this paper, we ignore syntactic issues in representing concepts and model them in terms of their semantics, as a set of behaviors. The set of all possible concepts is termed the concept class, denoted by $C$. Thus, $C \subseteq 2^E$. The concept class may either be specified in the original synthesis problem or arise as a result of a structure hypothesis that restricts the space of candidate concepts. Depending on the application domain, $E$ can be finite or infinite. The concept class $C$ can also be finite or infinite. Note that it is possible to place (syntactic) restrictions on the concepts so that $C$ is finite even when $E$ is infinite.

One key distinguishing characteristic between traditional machine learning and formal inductive synthesis is the presence of a formal specification in the latter. We define a specification $\Phi$ as a set of “correct” concepts, i.e., $\Phi \subseteq C \subseteq 2^E$. Any example $x \in E$ such that there is a concept $c \in \Phi$ where $x \in c$ is called a positive example. Likewise, an example $x$ that is not contained in any $c \in \Phi$ is a negative example. We will write $x \vdash \Phi$ to denote that $x$ is a positive example.

Note that standard practice in formal methods is to define a specification as a set of examples, i.e., $\Phi \subseteq E$. This is consistent with most properties that are trace properties, where $\Phi$ represents the set of allowed behaviors (traces, (pre, post) states, etc.) of the program. However, certain practical properties of systems, e.g., certain security policies, are not trace properties (see, e.g., [18]), and therefore we use the more general definition of a specification.

We now define what it means for a concept to satisfy $\Phi$. Given a concept $c \in C$, we say that $c$ satisfies $\Phi$ iff $c \in \Phi$. If we have a complete specification, it means that $\Phi$ is a singleton set comprising only a single allowed concept. In general, $\Phi$ is likely to be a partial specification that allows for multiple correct concepts.
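The definitions above can be made concrete on a small finite domain. The following is a minimal sketch, not taken from the paper: the toy domain `E`, the concept class `C`, the specification `Phi`, and the helper names `satisfies` and `is_positive` are all illustrative assumptions. Concepts are modeled semantically, as sets of examples.

```python
from itertools import chain, combinations

# Hypothetical toy domain of examples E (an assumption for illustration).
E = frozenset(range(4))

def powerset(s):
    """All subsets of s, each as a frozenset."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(sub) for sub in subsets]

# Concept class C: a structure hypothesis restricting concepts to size <= 2.
C = [c for c in powerset(E) if len(c) <= 2]

# Partial specification Phi: a set of "correct" concepts, here those
# that contain the example 1 and exclude the example 3.
Phi = [c for c in C if 1 in c and 3 not in c]

def satisfies(c, spec):
    """A concept satisfies the specification iff it is one of its members."""
    return c in spec

def is_positive(x, spec):
    """An example is positive iff some correct concept contains it."""
    return any(x in c for c in spec)
```

In this sketch `Phi` admits three correct concepts, so it is a partial specification; the example 3 is negative because no correct concept contains it.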

We now present a first definition of the formal inductive synthesis problem:

Given a concept class $C$ and a domain of examples $E$, the formal inductive synthesis problem is to find, using only a subset of examples from $E$, a target concept $c \in C$ that satisfies a specification $\Phi \subseteq C$.

This definition is reasonable in cases where only elements of $E$ can be accessed by the synthesis engine, which is the common case in the use of machine learning methods. However, existing formal verification and synthesis methods can use a somewhat richer set of inputs, including Boolean answers to equivalence (verification) queries with respect to the specification $\Phi$, as well as verification queries with respect to other constructed specifications. Moreover, the synthesis engine typically does not directly access or manipulate the specification $\Phi$. In order to formalize this richer source of inputs as well as the indirect access to $\Phi$, we introduce the concept of an oracle interface.

Definition 2.1

An oracle interface $O$ is a subset of $Q \times R$ where $Q$ is a set of query types, $R$ is a corresponding set of response types, and $O$ defines which pairs of query and response types are semantically well-formed.

A simple example of an oracle interface is one that defines a single query that returns positive examples. In this case, the synthesis problem is to learn a correct program from purely positive examples. The more common case in machine learning (of classifiers) is to have an oracle that supports two kinds of queries, one that returns positive examples and another that returns negative examples. As we will see in Sec. 2.1, there are richer types of queries that are commonly used in formal synthesis. For now, we will leave the sets of query types and response types abstract.

With this notion of an oracle interface, we now introduce our definition of formal inductive synthesis (FIS):

Definition 2.2

Given a concept class $C$, a domain of examples $E$, a specification $\Phi$, and an oracle interface $O$, the formal inductive synthesis problem is to find, using only the oracle interface $O$ to access examples from $E$ and the specification $\Phi$, a target concept $c \in C$ that satisfies $\Phi$.

Thus, an instance of FIS is defined in terms of the tuple $\langle C, E, \Phi, O \rangle$. We next introduce a family of solution techniques for the FIS problem. An FIS problem instance defines an oracle interface, and a solution technique for that problem instance can access the domain and the specification only through that interface. For example, the interface might include a query type which provides the specification as response. Another query type could admit positive examples from the domain as valid responses.

2.1 OGIS: A family of synthesizers

Oracle-guided inductive synthesis (OGIS) is an approach to solve the formal inductive synthesis problem defined above, encompassing a family of synthesis algorithms.

Figure 1: Oracle Guided Inductive Synthesis

As illustrated in Figure 1, OGIS comprises two key components: an inductive learning engine (also sometimes referred to as a “Learner”) and an oracle (also referred to as a “Teacher”). The interaction between the learner and the oracle is in the form of a dialogue comprising queries and responses. The oracle is defined by the types of queries that it can answer, and the properties of its responses. Synthesis is thus an iterative process: at each step, the learner formulates and sends a query to the oracle, and the oracle sends its response. For formal synthesis, the oracle is also tasked with determining whether the learner has found a correct target concept. Thus, the oracle implicitly or explicitly maintains the specification and can report to the learner when it has terminated with a correct concept.

We first formalize the notions of learner and oracle. Let be a set of instances of queries of types , and be a set of instances of responses of types . We allow both and to include a special element indicating the absence of a query or response. An element is said to conform to an oracle interface if is of type , is of type and . A valid dialogue pair for an oracle interface , denoted , is a (query,response) pair such that , and conforms to . The set of valid dialogue pairs for an oracle interface is denoted by and denotes the set of valid dialogue sequences — finite sequences of valid dialogue pairs. If is a valid dialogue sequence, denotes a sub-sequence of of length and denotes the -th dialogue in the sequence.

Definition 2.3

An oracle is a mapping . A learner is a mapping .

Given these definitions, we can now define the OGIS approach formally.

Definition 2.4

Given a FIS , a set of query types , and a set of response types , an oracle-guided inductive synthesis (OGIS) procedure (engine) is a tuple , comprising an oracle and a learner , where is a set of instances of queries of types , is a set of instances of responses of types , and is a valid dialogue set for .

In other words, an OGIS engine comprises an oracle that maps a “dialogue history” and a current query to a response, and a learner that, given a dialogue history, outputs a hypothesized concept along with a new query. Upon convergence, the final concept output by the learner is the output of the OGIS procedure.
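The learner/oracle dialogue described above can be sketched as a simple loop. This is an illustrative toy under stated assumptions, not the paper's formal construction: the threshold concept class, the names `ogis`, `learner`, and `oracle`, the `("correctness", c)` query encoding, and the `("YES"/"NO", counterexample)` response encoding are all hypothetical. The oracle holds the target privately; the learner sees only the dialogue history.

```python
DOMAIN = range(8)

def concept(t):
    """Threshold concept: all examples x with x >= t (toy assumption)."""
    return frozenset(x for x in DOMAIN if x >= t)

CLASS = [concept(t) for t in range(9)]   # finite concept class
TARGET = concept(5)                      # complete specification, held by the oracle

def oracle(history, query):
    """Maps (dialogue history, query) to a response; answers correctness queries."""
    _kind, c = query
    diff = sorted(c ^ TARGET)
    return ("YES", None) if not diff else ("NO", diff[0])

def learner(history):
    """Maps a dialogue history to a (hypothesized concept, new query) pair."""
    labeled = []
    for (_kind, c), (answer, x) in history:
        if answer == "NO":
            # The counterexample carries no label; the learner infers it from
            # whether its own earlier hypothesis contained the example.
            labeled.append((x, x not in c))
    for c in CLASS:
        if all((x in c) == positive for x, positive in labeled):
            return c, ("correctness", c)
    raise ValueError("no concept in the class is consistent with the history")

def ogis(learner, oracle, max_steps=100):
    """Iterate the dialogue until the oracle confirms the hypothesis."""
    history = []
    for _ in range(max_steps):
        hypothesis, query = learner(history)
        response = oracle(history, query)
        history.append((query, response))
        if response[0] == "YES":
            return hypothesis, history
    return None, history

result, dialogue = ogis(learner, oracle)
```

Inferring labels from the learner's own hypothesis works here only because the toy specification is a single target concept; in the general case a separate membership query may be needed to label a counterexample.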

We also formalize the definition of when an OGIS engine solves an FIS problem.

Definition 2.5

A dialogue sequence corresponding to procedure is such that is where for some query and some concept , and .

The procedure is said to solve the FIS problem with dialogue sequence if there exists an such that , and satisfies , and for all , , that is, the procedure converges to a concept that satisfies .

The procedure is said to solve the FIS problem if there exists a dialogue sequence with which it solves that problem.

The convergence and computational complexity of an OGIS procedure is determined by the nature of the FIS problem along with three factors: (i) the complexity of each invocation of the learner; (ii) the complexity of each invocation of the oracle; and (iii) the number of iterations (queries, examples) of the loop before convergence. We term the first two factors learner complexity and oracle complexity, and the third sample complexity. Sometimes, in OGIS procedures, oracle complexity is ignored, so that we simply count calls to the oracle rather than the time spent in each call.

An OGIS procedure is defined by properties of the learner and the oracle. Relevant properties of the learner include (i) its inductive bias that restricts its search to a particular family of concepts and a search strategy over this space, and (ii) resource constraints, such as finite or infinite memory. Relevant properties of the oracle include the types of queries it supports and of the responses it generates. We list below the common query and response types. In each case, the query type is given in square brackets as a template comprising a query name along with the types of the formal arguments to that query, e.g., examples $x$ or concepts $c$. An instance of each of these types is formed by substituting specific arguments (examples, concepts, etc.) for the formal arguments.

  • Membership query: [] The learner selects an example $x$ and asks “Is $x$ positive or negative?” The oracle responds with a label for $x$, indicating whether $x$ is a positive or negative example.

  • Positive witness query: [] The learner asks the oracle “Give me a positive example”. The oracle responds with a positive example $x$, if one exists, and with the special element indicating the absence of a response otherwise.

  • Negative witness query: [] The learner asks the oracle “Give me a negative example”. The oracle responds with a negative example $x$, if one exists, and with the special element indicating the absence of a response otherwise.

  • Correctness (Verification) query: [] The learner proposes a candidate concept $c$ and asks “Is $c$ correct?” (i.e., “does it satisfy $\Phi$?”). If so, the oracle responds “YES” (and the synthesis can terminate). If not, i.e., $c \notin \Phi$, the oracle responds “NO” and provides a counterexample $x$. Here $x$ is an example such that either $x \in c$ but $x$ is a negative example, or $x \notin c$ and there exists some other concept $c' \in \Phi$ containing $x$.

    For the special case of trace properties, the correctness query can take on specific forms. One form is termed the equivalence query, where the counterexample is in the symmetric difference of the single correct target concept and $c$. The other is termed the subsumption query, where the counterexample is a negative example present in $c$, and is used when $\Phi$ is a partial specification admitting several correct concepts.

    It is important to note that, in the general case, a verification query does not, by itself, specify any label for a counterexample. One may need an additional membership query to generate a label for a counterexample.

  • Crafted Correctness (Verification) query: [] As noted earlier, oracles used in formal inductive synthesis tend to be general-purpose decision procedures. Thus, they can usually answer not only verification queries with respect to the specification $\Phi$ for the overall FIS problem, but also verification queries for specifications crafted by the learner. We refer to this class of queries as crafted correctness/verification queries. The learner asks “Does $c'$ satisfy $\Phi'$?” for a crafted specification $\Phi'$ and a crafted concept $c'$.

    As with the correctness query, one can define as special cases a crafted equivalence query type and a crafted subsumption query type.

  • Distinguishing input query: [] In this query, the learner supplies a finite set of examples $X$ and a concept $c$, where $X \subseteq c$, and asks “Does there exist another concept $c'$ s.t. $c' \neq c$ and $X \subseteq c'$?” If so, the oracle responds “YES” and provides both $c'$ and an example $x$ in the symmetric difference of $c$ and $c'$. The example $x$ forms a so-called “distinguishing input” that differentiates the two concepts $c$ and $c'$. If no such $c'$ exists, the oracle responds “NO”.

    The distinguishing input query has been found useful in scenarios where it is computationally hard to check correctness using the specification , such as in malware deobfuscation [32].
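The query types listed above can be illustrated with a brute-force oracle over a finite domain. This is a minimal sketch under stated assumptions: the class name `ToyOracle`, its method names, and the tuple-shaped responses are hypothetical, and practical oracles are decision procedures such as SAT or SMT solvers rather than enumerators.

```python
class ToyOracle:
    """Brute-force oracle over a finite domain (illustrative assumption)."""

    def __init__(self, domain, concept_class, spec):
        self.domain = sorted(domain)
        self.concept_class = list(concept_class)
        self.spec = list(spec)          # the set of correct concepts

    def _positive(self, x):
        # x is positive iff some correct concept contains it
        return any(x in c for c in self.spec)

    def membership(self, x):
        """Label for x: True if positive, False if negative."""
        return self._positive(x)

    def positive_witness(self):
        """Some positive example, or None if there is none."""
        return next((x for x in self.domain if self._positive(x)), None)

    def negative_witness(self):
        """Some negative example, or None if there is none."""
        return next((x for x in self.domain if not self._positive(x)), None)

    def correctness(self, c):
        """("YES", None) if c is correct, else ("NO", counterexample)."""
        if c in self.spec:
            return ("YES", None)
        for x in self.domain:
            # x is in c but negative, or x is positive but missing from c
            if (x in c) != self._positive(x):
                return ("NO", x)
        return ("NO", None)  # incorrect, but no single example refutes c

    def distinguishing_input(self, examples, c):
        """Look for another concept containing `examples` and differing from c."""
        for other in self.concept_class:
            if other != c and examples <= other:
                return ("YES", other, min(c ^ other))
        return ("NO", None, None)


def threshold(t, size=6):
    return frozenset(x for x in range(size) if x >= t)

oracle = ToyOracle(range(6), [threshold(t) for t in range(7)], [threshold(3)])
```

Note that `correctness` returns an unlabeled counterexample, matching the remark above that a separate membership query may be needed to obtain its label.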

The query/response types listed above are not meant to be exhaustive. Any subset of such types can form an oracle interface. We note here that, in the machine learning theory community, there have been thorough studies of query-based learning; see Angluin’s review paper [6] for details. However, in our formalization of OGIS, new query types become possible due to the previously identified key differences with traditional machine learning, including the general-purpose nature of oracle implementations and the ability to exert control over the responses provided by the oracle. Moreover, as we will see, our theoretical analysis raises the following questions that are pertinent in the setting of formal synthesis where the learner and oracle are typically implemented as general-purpose decision procedures:

  • Oracle design: When multiple valid responses can be made to a query, which ones are better, in terms of convergence to a correct concept (convergence and complexity)?

  • Learner design: How do resource constraints on the learner or its choice of search strategy affect convergence to a correct concept?

2.2 Examples of OGIS

We now take four example synthesis techniques previously presented in the literature and illustrate how they instantiate the OGIS paradigm. These techniques mainly differ in the oracle interface that they employ.

Example 2.1

Query-based learning of automata [4]:
Angluin’s classic work on learning deterministic finite automata (DFAs) from membership and equivalence queries [4] is an instance of OGIS whose oracle interface comprises membership and equivalence queries. The learner is a custom-designed algorithm called L*, whereas the oracle is treated as a black box that answers the membership and equivalence queries; in particular, no assumptions are made about the form of counterexamples. Several variants of L* have found use in the formal verification literature; see [19] for more information.

Example 2.2

Counterexample-guided inductive synthesis (CEGIS) [57]:
CEGIS was originally proposed as an algorithmic method for program synthesis where the specification is given as a reference program and the concept class is defined using a partial program, also referred to as a “sketch” [57]. It has since proved very versatile, also applying to partial specifications (via more general correctness queries, see, e.g., [37]), and other ways of providing syntax guidance; see [2] for a more detailed treatment. In CEGIS, the learner (synthesizer) interacts with a verifier that can take in a candidate program and a specification, and verify whether the program satisfies the specification, providing a counterexample when it does not. In CEGIS, the learner is typically implemented on top of a general-purpose decision procedure such as a SAT solver, SMT solver, or model checker. The oracle (verifier) is also implemented similarly. Thus, CEGIS is an instance of OGIS.

Many general-purpose verifiers support not only correctness queries with respect to the original specification, but also crafted correctness queries. Moreover, they can also answer membership queries, which are special cases of the verification problem where the specification is checked on a single input/output behavior. We term an instantiation of CEGIS with these additional query types generalized CEGIS; its oracle interface adds crafted correctness and membership queries. We will restrict our attention in this paper to the standard CEGIS.
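The CEGIS loop described in this example can be sketched on a toy sketch-based synthesis problem. Everything concrete here is an illustrative assumption rather than the construction of [57]: the reference program, the two-hole sketch, and the function names are hypothetical, and both the synthesizer and the verifier enumerate a finite space where a practical implementation would call a SAT or SMT solver.

```python
from itertools import product

INPUTS = range(16)   # finite input domain (4-bit values)

def reference(x):
    """Specification given as a reference program (toy assumption)."""
    return (5 * x + 3) % 16

def candidate(s, k):
    """Sketch with two holes: a shift amount s and a constant k."""
    return lambda x: ((x << s) + x + k) % 16

def verify(prog):
    """Verifier: return a counterexample input, or None if prog is correct."""
    for x in INPUTS:
        if prog(x) != reference(x):
            return x
    return None

def synthesize(examples):
    """Learner: first hole assignment consistent with all examples so far."""
    for s, k in product(range(4), range(4)):
        prog = candidate(s, k)
        if all(prog(x) == y for x, y in examples):
            return (s, k)
    return None

def cegis():
    examples = []   # input-output pairs accumulated from counterexamples
    while True:
        holes = synthesize(examples)
        if holes is None:
            return None, examples          # the sketch cannot be completed
        cex = verify(candidate(*holes))
        if cex is None:
            return holes, examples         # verified against the reference
        # Label the counterexample by running the reference program on it.
        examples.append((cex, reference(cex)))

holes, examples = cegis()
```

Because the candidate space is finite and each counterexample rules out at least the current candidate, this loop terminates; that argument does not carry over to the infinite concept classes studied in this paper.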

Example 2.3

Oracle-guided program synthesis using distinguishing inputs [32]:
Our third example is an approach to program synthesis that uses distinguishing inputs when a complete specification is either unavailable or expensive to verify a candidate program against [32]. In this case, distinguishing input queries, combined with witness and membership queries, provide a way to quickly generate a corpus of examples that rule out incorrect programs. Only when there is a single program consistent with these examples does a correctness query need to be made to ascertain its correctness. Thus, the oracle interface comprises distinguishing input, witness, membership, and correctness queries, with correctness queries being used sparingly. The learner and the oracle are implemented using SMT solving.

Example 2.4

ICE learning for invariant synthesis [20]:
Recently, an approach to invariant generation has been proposed that uses learning from “implications, counterexamples, and examples” (ICE): positive and negative examples (states) coupled with queries to a solver to determine whether a hypothesized invariant is inductive. When the latter query is answered in the negative, it is accompanied by a counterexample that is in the form of a pair of states (seemingly different from the positive/negative examples that are single states), and which also does not indicate any specific +/- label for these states. However, ICE learning is also an instance of OGIS, when one observes that correctness queries in general do not provide labels. Thus, the oracle interface comprises positive witness, negative witness, and correctness queries. Additionally, one can lift the domain of examples from single states to pairs of states, and define a corresponding concept class equivalent to the class of candidate invariants.

Our focus in this paper is on CEGIS, one of the most popular and effective instantiations of the OGIS framework. We describe next the variations of CEGIS of which we perform a theoretical analysis.

2.3 Counterexample-Guided Inductive Synthesis (CEGIS)

Consider the CEGIS instantiation of the OGIS framework. In this paper, we consider a general setting where the concept class is the set of programs corresponding to the set of recursive (decidable) languages; thus, it is infinite. The domain of examples is also infinite. We choose such an expressive concept class and domain because we want to compare how the power of CEGIS varies as we vary the oracle and learner. More specifically, we vary the nature of responses from the oracle to correctness and witness queries, and the memory available to the learner.

For the oracle, we consider four different types of counterexamples that the oracle can provide in response to a correctness query. Recall that in formal synthesis, oracles are general-purpose verifiers or decision procedures whose internal heuristics may determine the type of counterexample obtained. Each type describes a different oracle and hence, a different flavor of CEGIS. Our goal is to compare these synthesis techniques and establish whether one type of counterexample allows the synthesizer to successfully learn more programs than the other. The four kinds of counterexamples considered in this paper are as follows:

  • Arbitrary counterexamples: This is the “standard” CEGIS technique (denoted ) that makes no assumptions on the form of the counterexample obtained from the oracle. Note however that our focus is on an infinite concept class, whereas most practical instantiations of CEGIS have focused on finite concept classes; thus, convergence is no longer guaranteed in our setting. This version of CEGIS serves as the baseline for comparison against other synthesis techniques.

  • Minimal counterexamples: For any total order on the domain of examples, we require that the validation engine provide a counterexample which is minimal. This choice of counterexamples is motivated by the literature on debugging. Significant effort has been made on improving validation engines to produce counterexamples which aid debugging by localizing the error [46, 15]. The use of counterexamples in CEGIS is conceptually an iterative repair process and hence, it is natural to extend successful error localization and debugging techniques to inductive synthesis.

  • Constant-bounded counterexamples: Here the “size” of the counterexamples produced by the validation engine is bounded by a constant. This is motivated by the use of bounds in formal verification such as bounded model checking [11] and bug-finding in concurrent programs [8] using bounds on context switches.

  • Positive-bounded counterexamples: Here the counterexample produced by the validation engine must be smaller than a previously seen positive example. This is motivated by the industrial practice of validation by simulation, where the system is often simulated to a finite length to discover bugs. The length of simulation often depends on the traces which illustrate known positive behaviors. It is expected that errors will show up if the system is simulated up to the length of the largest positive trace. Mutation-based software testing and symbolic execution also have a similar flavor, where a sample correct execution is mutated to find bugs.

In addition to the above variations to the oracle, we also consider two kinds of learners that differ based on their ability to store examples and counterexamples:

  • Infinite memory: In the typical setting of CEGIS, the learner is not assumed to have any memory bound, allowing the learner to store as many examples and counterexamples as needed. Note that, for an infinite domain, this set of examples can grow unbounded.

  • Finite memory: A more practical setting is one where the learner only has a finite amount of memory, and therefore can only store a finite representation of examples or hypothesized programs. This notion of finite memory is similar to that used classically for language learning from examples [64]. We give the first theoretical results on the power of CEGIS and its variants, for general program synthesis, in this restricted setting.

We introduce notation to refer to these variants in a more compact manner. The synthesis engine using arbitrary counterexamples and with infinite memory is motivated by algorithm [57] and hence, we refer to it as in the paper even though it was used for finite programs in that original work. The variant of which is restricted to use finite memory is referred to as . Similarly, the synthesis engine using minimal counterexamples and infinite memory is called minimal counterexample guided inductive synthesis (). The variant of the approach using finite memory is referred to as . The synthesis engine using counterexamples which are smaller than a fixed constant is called a constant bounded counterexample guided inductive synthesis, and is termed if the memory is not finite and if the memory is finite. The synthesis engine using counterexamples which are smaller than the largest positive example is called positive-history bounded counterexample guided inductive synthesis, and is termed if the memory is not finite and if the memory is finite.

For the class of programs corresponding to the set of recursive languages, our focus is on learning in the limit, that is, whether the synthesis technique converges to the correct program or not (see Definition 4.14 in Sec. 4 for a formal definition). This question is non-trivial since our concept class is not finite. We do not discuss computational complexity of synthesis, and the impact of different types of counterexamples on the speed of convergence in this paper. Investigating the computational complexity for concept classes for which synthesis is guaranteed to terminate is left as a topic for future research.

We also present an initial complexity analysis for in case of finite concept classes. The decidability question for finite class of programs is trivial since convergence is guaranteed as long as the queries provide new examples or some new information about the target program. But the speed at which the synthesis approach converges remains relevant even for finite class of programs. We show that the complexity of these techniques is related to well-studied notions in learning theory such as the Vapnik-Chervonenkis dimension [13] and the teaching dimension [22].

3 Background and Related Work

In this section, we contrast the contributions of this paper with the most closely related work and also provide some relevant background.

3.1 Formal Synthesis

The past decade has seen an explosion of work in program synthesis (e.g., [56, 57, 32, 58, 39, 60]). Moreover, there has been a realization that many of the trickiest steps in formal verification involve synthesis of artifacts such as inductive invariants, ranking functions, assumptions, etc. [53, 25]. Most of these efforts have focused on solution techniques for specific synthesis problems. There are two main unifying characteristics across most of these efforts: (i) syntactic restrictions on the space of programs/artifacts to be synthesized in the form of templates, sketches, component libraries, etc., and (ii) the use of inductive synthesis from examples. The recent work on syntax-guided synthesis (SyGuS) [2] is an attempt to capture these disparate efforts in a common theoretical formalism. While SyGuS is about formalizing the synthesis problem, the present paper focuses on formalizing common ideas in the solution techniques. Specifically, we present OGIS as a unifying formalism for different solution techniques, along with a theoretical analysis of different variants of CEGIS, the most common instantiation of OGIS. In this sense, it is complementary to the SyGuS effort.

3.2 Machine Learning Theory

Another related area is the field of machine learning, particularly the theoretical literature. In Section 1, we outlined some of the key differences between the fields of formal inductive synthesis and that of machine learning. Here we focus on the sub-field of query-based learning that is the closest to the OGIS framework. The reader is referred to Angluin’s excellent papers on the topic for more background [5, 6].

A major difference between the query-based learning literature and our work is in the treatment of oracles, specifically, how much control one has over the oracle that answers queries. In query-based learning, the oracles are treated as black boxes that answer particular types of queries and only need to provide one valid response to a query. Moreover, it is typical in the query-based learning literature for the oracle to be specified a priori as part of the problem formulation. In contrast, in our framework, designing a synthesis procedure involves also designing or selecting an oracle. The second major difference is that the query-based learning literature focuses on specific concept classes and proves convergence and complexity results for those classes. In contrast, our work proves results that are generally applicable to programs corresponding to recursive languages.

3.3 Learning of Formal Languages

The problem of learning a formal language from examples is a classic one. We cover here some relevant background material.

Gold [21] considered the problem of learning formal languages from examples. Similar techniques have been studied elsewhere in the literature [31, 65, 12, 3]. The examples are provided to the learner as an infinite stream. The learner is assumed to have unbounded memory and can store all the examples. This model is unrealistic in a practical setting but provides useful theoretical understanding of inductive learning of formal languages. Gold defined a class of languages to be identifiable in the limit if there is a learning procedure which identifies the grammar of the target language from the class of languages using a stream of input strings. The languages learnt using only positive examples were called text learnable and the languages which require both positive and negative examples were termed informant learnable. None of the standard classes of formal languages are identifiable in the limit from text, that is, from only positive examples [21]. This includes regular languages, context-free languages and context-sensitive languages.

A detailed survey of classical results in learning from positive examples is presented by Lange et al. [41]. The results summarize learning power with different limitations such as the inputs having certain noise, that is, a string not in the target language might be provided as a positive example with a small probability. Learning using positive as well as negative examples has also been well-studied in the literature. A detailed survey is presented in [29] and [40]. Lange and Zilles [42] relate Angluin-style query-based learning with Gold-style learning. They establish that any query learner using superset queries can be simulated by a Gold-style learner receiving only positive data. But there are concepts learnable using subset queries but not Gold-style learnable from positive data only. Learning with equivalence queries coincides with Gold’s model of limit learning from positive and negative examples, while learning with membership queries equals finite learning from positive data and negative data. In contrast to this line of work, we present a general framework to learn programs or languages, and Angluin-style or Gold-style approaches can be instantiated in this framework. Our theoretical analysis focuses on varying the oracle and the nature of the counterexamples produced by it to examine the impact of using different types of counterexamples obtainable from verification or testing tools.

3.4 Learning vs. Teaching

We also study the complexity of synthesizing programs from a finite class of programs. This part of our work is related to previous work on the complexity of teaching in exact learning of concepts by Goldman and Kearns [22]. Informally, the teaching dimension of a concept class is the minimum number of instances a teacher must reveal to uniquely identify any target concept from the class. Exact bounds on teaching dimensions for specific concept classes such as orthogonal rectangles, monotonic decision trees, monomials, binary relations and total orders have been previously presented in literature [22, 23]. Shinohara et al. [55] also introduced a notion of teachability in which a concept class is teachable by examples if there exists a polynomial size sample under which all consistent learners will exactly identify the target concept. Salzberg et al. [52] also consider a model of learning with a helpful teacher. Their model requires that any teacher using a particular algorithm such as the nearest-neighbor algorithm learns the target concept. This work assumes that the teacher knows the algorithm used by the learner. We do not make any assumption on the inductive learning technique used by the synthesis engine. Our goal is to obtain bounds on the number of examples that need to be provided by the oracle to synthesize the correct program by relating our framework to the literature on teaching.
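To make the informal definition of teaching dimension concrete, the following brute-force sketch computes it for a small finite concept class. It is only an illustration: the names `teaching_set_size` and `teaching_dimension`, and the representation of concepts as Python `frozenset`s over a finite domain, are assumptions made here and do not come from the paper.

```python
from itertools import combinations

def teaching_set_size(target, concepts, domain):
    """Size of a smallest sample (labeled by `target`) that rules out every other concept."""
    others = [c for c in concepts if c != target]
    for k in range(len(domain) + 1):
        for sample in combinations(domain, k):
            # The sample teaches `target` if each other concept disagrees
            # with `target` on at least one instance in the sample.
            if all(any((x in c) != (x in target) for x in sample) for c in others):
                return k
    raise ValueError("some concept cannot be distinguished from the target on this domain")

def teaching_dimension(concepts, domain):
    """Worst case, over all targets, of the smallest teaching set size."""
    return max(teaching_set_size(c, concepts, domain) for c in concepts)

# Hypothetical example: the empty concept plus all singletons over {0, 1, 2}.
domain = [0, 1, 2]
concepts = [frozenset(), frozenset({0}), frozenset({1}), frozenset({2})]
print(teaching_dimension(concepts, domain))
```

In this example a singleton is taught by one positive instance, whereas the empty concept requires showing every instance as negative, so the worst case over targets determines the dimension.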

4 Theoretical Analysis of CEGIS: Preliminaries

Our presentation of formal inductive synthesis and OGIS so far has not used a particular representation of a concept class or specification. In this section, we begin our theoretical formalization of the counterexample-guided inductive synthesis (CEGIS) technique, for which such a choice is necessary. We precisely define the formal inductive synthesis problem for concepts that correspond to recursive languages. We restrict our attention to the case when the specification is partial and is a trace property, i.e., the specification is defined by a single formal language. This assumption, which is the typical case in formal verification and synthesis, also simplifies notation and proofs. Most of our results extend to the case of more general specifications; we will make suitable additional remarks about the general case where needed. For ease of reference, the major definitions and frequently used notation are summarized in Table 1.

4.1 Basic Notation

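We use $\mathbb{N}$ to denote the set of natural numbers. $\mathbb{N}_i \subset \mathbb{N}$ denotes the subset of natural numbers $\mathbb{N}_i = \{ n \mid n < i \}$. Consider a set $S \subset \mathbb{N}$. $\min(S)$ denotes the minimal element in $S$. The union of the sets $S_1$ and $S_2$ is denoted by $S_1 \cup S_2$ and their intersection by $S_1 \cap S_2$. $S_1 \setminus S_2$ denotes the set minus operation, with the resultant set containing all elements in $S_1$ and not in $S_2$.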

We denote the set as . A sequence is a mapping from to . We denote a prefix of length of a sequence by . So, of length is a mapping from to . is an empty sequence also denoted by for brevity. The set of natural numbers appearing in the sequence is defined using a function , where . The set of sequences is denoted by .

Languages and Programs: We also use standard definitions from computability theory which relate languages and programs [51]. A set $L$ of natural numbers is called a computable or recursive language if there is a program, that is, a computable, total function $P$ such that for any natural number $n$,

$$P(n) = \begin{cases} 1 & \text{if } n \in L \\ 0 & \text{otherwise} \end{cases}$$

We say that identifies the language . Let denote the language identified by the program . The mapping is not necessarily one-to-one and hence, syntactically different programs might identify the same language. In formal synthesis, we do not distinguish between syntactically different programs that satisfy the specification. Additionally, in this paper, we restrict our discussion to recursive languages because it includes many interesting and natural classes of languages that correspond to programs and functions of various kinds, including regular, context free, context sensitive, and pattern languages.

Given a sequence of non-empty languages , is said to be an indexed family of languages if and only if for all languages , there exists a recursive function such that and for some . Practical applications of program synthesis often consider a family of candidate programs which contain syntactically different programs that are semantically equivalent, that is, they have the same set of behaviors. Formally, in practice program synthesis techniques permit picking such that and for all where the set represents the syntactically different but semantically equivalent programs that produce output on an input if and only if the input natural number belongs to . Intuitively, a function defines an encoding of the space of candidate programs similar to encodings proposed in the literature such as those on program sketching [57] and component interconnection encoding [32]. In the case of formal synthesis where we have a specification , we are only interested in finding a single program satisfying . In the general case, comprises a set of allowed languages, and the task of synthesis is to find a program identifying some element of this set. In the case of partial specifications that are trace properties, comprises subsets of a single target language . Any program identifying some subset of is a valid solution, and usually positive examples are used to rule out programs identifying “uninteresting” subsets of . Thus, going forward, we will define the task of program synthesis as one of identifying the corresponding correct language .

Ordering of elements in the languages: A language corresponds to a set of program behaviors. We model this set in an abstract manner, only assuming the presence of a total order over this set, without prescribing any specific ordering relation. Thus, languages are modeled as sets of natural numbers. While such an assumption might seem restrictive, we argue that this is not the case in the setting of CEGIS, where the ordering relation is used specifically to model the oracle’s preference for returning specific kinds of counterexamples. The assumption ensures that the preferential selection of examples from the language by an oracle is deterministic. For example, consider the case where elements of a language are input/output traces. We can construct a totally ordered set of all possible input/output traces using the length of the trace as the primary ordering metric and the lexicographic ordering as the secondary ordering metric. Thus, an oracle producing the smallest counterexample would produce a unique trace which is shortest in length and, among those, lexicographically the smallest. The exact choice of ordering is orthogonal to the results presented in our paper, and using the natural numbers allows us to greatly simplify notation.
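The length-then-lexicographic ordering on traces described above can be sketched as a sort key. This is a minimal illustration, assuming traces are represented as Python strings and languages as finite sets; `trace_key` and `smallest_counterexample` are hypothetical names introduced here.

```python
def trace_key(trace):
    """Total order on traces: shorter first, ties broken lexicographically."""
    return (len(trace), trace)

def smallest_counterexample(candidate_traces, target_traces):
    """Least trace (under trace_key) allowed by the candidate but not by the target."""
    bad = candidate_traces - target_traces
    return min(bad, key=trace_key) if bad else None

# Hypothetical traces over the alphabet {a, b, c}.
candidate = {"ab", "b", "aa", "abc"}
target = {"ab"}
print(smallest_counterexample(candidate, target))
```

Because the key is a total order on distinct traces, the oracle's choice is deterministic, which is the only property of the ordering that the analysis relies on.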

4.2 CEGIS Definitions

We now specialize the definitions from Sec. 2 for the case of CEGIS. The family of languages defines the concept class for synthesis. The domain for synthesis is the set of natural numbers and the examples are . Recall that we restrict our attention to the special case where the specification is captured by a single target language, i.e., comprising all permitted program behaviors. Therefore, the formal inductive synthesis (FIS) problem defined in Section 2 (Definition 2.2) can be restricted for this setting as follows:

Definition 4.1

Given a language class , a domain of examples , the specification defined by a target language , and an oracle interface , the problem of formal inductive synthesis of languages (and the associated programs) is to identify a language in using only the oracle interface .

Counterexample-guided inductive synthesis (CEGIS) is a solution to the problem of formal inductive synthesis of languages where the oracle interface is defined as follows.

Definition 4.2

A counterexample-guided inductive synthesis (CEGIS) oracle interface where where , , , and the specification is defined as subsets of a target language . The positive witness query returns a positive example , and the correctness query takes as argument a candidate language and either returns a counterexample showing that the candidate language is incorrect or returns if .

Symbol Meaning Symbol Meaning
natural numbers natural numbers less than
minimal element in set set minus
set intersection set union
sequence of numbers empty sequence
sequence of length th element of sequence
language (a subset of ) complement of language
program for language corresponding to
natural numbers in set of sequences
family of languages family of programs
transcript counterexample transcript
synthesis engine inductive learning engine
verification oracle for minimal counterexample oracle
bounded counterexample oracle positive bounded counterexample oracle
language identified by
inf memory cegis engine language identified by
finite memory cegis engine
with with
with with
with with
Table 1: Frequently used notation in the paper

We now define the verification oracle in CEGIS that produces arbitrary counterexamples, as well as its three other variants which generate particular kinds of counterexamples instead of arbitrary counterexamples.

Definition 4.3

A verifier for language is a mapping from to such that if and only if , and otherwise.

Remark: For more general specifications that are a set of languages, the definition of changes in a natural way: it returns if and only if and otherwise returns an example that is in the intersection of the symmetric differences of each language and the candidate language .

We define a minimal counterexample generating verifier below. The counterexamples are minimal with respect to an ordering of members in the language. In practice, if traces are used as examples, then a minimal counterexample can be a trace of minimal length.

Definition 4.4

A verifier for a language is a mapping from to such that
if and only if , and otherwise.

Next, we consider another variant of counterexamples, namely (constant) bounded counterexamples. Bounded model-checking [11] returns a counterexample trace for an incorrect design if it can find a counterexample of length less than the specified constant bound. It fails to find a counterexample for an incorrect design if no counterexample exists with length less than the given bound. Verification of concurrent programs by bounding the number of context switches [8] is another example of a bounded verification technique. This motivates the definition of a verifier which returns counterexamples bounded by a constant .

Definition 4.5

A verifier is a mapping from to such that where for the given bound , and otherwise.

The last variant of counterexamples is positive bounded counterexamples. The verifier for generating positive bounded counterexamples is also provided with the transcript seen so far by the synthesis engine. The verifier generates a counterexample smaller than the largest positive example in the transcript. If there is no counterexample smaller than the largest positive example in the transcript, then the verifier does not return any counterexample. This is motivated by the practice of mutating correct traces to find bugs in programs and designs. The counterexamples in these techniques are bounded by the size of positive examples (traces) seen so far. (Note that we can extend this definition to include counterexamples of size bounded by that of the largest positive example seen so far plus a constant. The proof arguments given in Sec. 5 continue to work with only minor modifications.)

Definition 4.6

A verifier is a mapping from to such that where for some , and otherwise.
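The four verifier variants of Definitions 4.3 to 4.6 can be illustrated with a small sketch. It rests on simplifying assumptions that are not part of the paper: languages are finite Python sets of natural numbers (the paper treats infinite recursive languages), a counterexample is taken to be an element of the candidate that lies outside the target (the partial-specification case), `None` stands for "no counterexample", and all function names are hypothetical.

```python
import random

def counterexamples(candidate, target):
    """Elements the candidate admits but the target language forbids."""
    return candidate - target

def check(candidate, target, rng=random):
    """Arbitrary counterexample: any witness, or None if there is none."""
    bad = sorted(counterexamples(candidate, target))
    return rng.choice(bad) if bad else None

def min_check(candidate, target):
    """Minimal counterexample under the usual order on natural numbers."""
    bad = counterexamples(candidate, target)
    return min(bad) if bad else None

def bounded_check(candidate, target, bound):
    """Counterexample strictly below a constant bound, or None if none exists."""
    bad = [x for x in counterexamples(candidate, target) if x < bound]
    # Any element of `bad` would qualify; min is chosen only for determinism.
    return min(bad) if bad else None

def positive_bounded_check(candidate, target, transcript):
    """Counterexample strictly below the largest positive example seen so far."""
    positives = [x for x in transcript if x is not None]
    if not positives:
        return None
    return bounded_check(candidate, target, max(positives))

target = {0, 2, 4, 6}
candidate = {0, 1, 2, 3, 9}
print(min_check(candidate, target), bounded_check(candidate, target, 1))
```

Note that the two bounded variants may report no counterexample even for an incorrect candidate, which is the source of their reduced power in the analysis that follows.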

The sequence of responses of the positive witness query is called the transcript, and the sequence of the responses to the correctness queries is called the counterexample sequence. The positive witness queries can be answered by the oracle sampling examples from the target language. Our work uses the standard model for language learning in the limit [21], where the learner has access to an infinite stream of positive examples from the target language. This is also realistic in practical CEGIS settings for infinite concept classes (e.g. [37]) where more behaviors can be sampled over time. We formalize these terms below.

Definition 4.7

A transcript for a specification language is a sequence with . denotes the prefix of the transcript of length . denotes the -th element of the transcript.

Definition 4.8

A counterexample sequence for a specification language from a correctness query is a sequence with , where denotes the prefix of the counterexample sequence of length , denotes the -th element of the counterexample sequence, and is the argument of the -th invocation of the query .

We now define the oracle for counterexample guided inductive synthesis. We drop the queries in the dialogue since there are only two kinds of queries and instead only use the sequence of responses: transcript and the counterexample sequence . The oracle also receives as input the current candidate language to be used as the argument of the query. The overall response of the oracle is a pair of elements in .

Definition 4.9

An oracle for counterexample-guided inductive synthesis (CEGIS oracle) is a mapping such that where is the response to positive witness query and is the response to correctness query . The oracle can use any of the four verifiers presented earlier to generate the counterexamples for the correctness query. The oracle using is called , the one using is called , the one using is called and the one using is called .

We make the following reasonable assumptions on the oracle. First, the oracle is assumed to be consistent: it does not provide the same example both as a positive example (via a positive witness query) and as a negative example (as a counterexample). Second, the oracle is assumed to be non-redundant: it does not repeat any positive examples that it may have previously provided to the learner; for a finite target language, once the oracle exhausts all positive examples, it will return .

The learner is simplified to be a mapping from the sequence of responses to a candidate program.

Definition 4.10

An infinite memory learner is a function such that where includes all positive examples in and excludes all examples in . (This holds due to the specialization of to a partial specification, and as a trace property. For general , the learner need not exclude all counterexamples.) is a predefined constant representing an initial guess of the language, which, for example, could be .

We now define a finite memory learner which cannot take the unbounded sequence of responses as argument. The finite memory learner instead uses the previous candidate program to summarize the response sequence. We assume that languages are encoded in terms of a finite representation such as a program that identifies that language. Such an iterative learner only needs finite memory.

Definition 4.11

A finite memory learner is a recursive function such that for all , , where includes all positive examples in and excludes all examples in . We define to be the initial guess of the language, which for example, could be .

The synthesis engine using infinite memory can now be defined as follows.

Definition 4.12

An infinite memory engine is a pair comprising a CEGIS oracle and an infinite memory learner , where for any transcript and counterexample sequence , and for all , if then .

A synthesis engine with finite memory cannot store unbounded infinite transcripts. So, the bounded memory synthesis engine uses a finite memory learner .

Definition 4.13

A finite memory engine is a tuple comprising a CEGIS oracle and a finite memory learner where, for any transcript and counterexample sequence , and for all , if then .

Similar to Definition 2.4, the convergence of a counterexample-guided synthesis engine is defined as follows:

Definition 4.14

We say that converges to using transcript , written
, if and only if there exists such that for all ,

This notion of convergence is standard in language learning in the limit [21]. For the case of general specifications , as given in Definition 2.4, the synthesizer must converge to some language in .
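The interaction between oracle and infinite memory learner, and the notion of convergence, can be sketched on a toy instance. The sketch below is not the paper's construction: it assumes a finite, explicitly listed language family, finite languages represented as `frozenset`s, a minimal-counterexample oracle, and a learner that returns the first language in the family consistent with all data seen so far. All names are hypothetical.

```python
def min_check(candidate, target):
    """Minimal element admitted by the candidate but outside the target, else None."""
    bad = candidate - target
    return min(bad) if bad else None

def learner(family, positives, negatives, initial):
    """Infinite-memory learner: first language containing all positives and no counterexample."""
    for lang in family:
        if positives <= lang and not (negatives & lang):
            return lang
    return initial

def cegis(family, target, transcript, initial=frozenset()):
    """Run one learner step per transcript element and record every candidate."""
    positives, negatives = set(), set()
    candidate = initial
    history = [candidate]
    for example in transcript:
        cex = min_check(candidate, target)  # response to the correctness query
        if example is not None:
            positives.add(example)          # response to the positive witness query
        if cex is not None:
            negatives.add(cex)
        candidate = learner(family, positives, negatives, initial)
        history.append(candidate)
    return history

family = [frozenset(range(8)), frozenset({0, 2, 4, 6}), frozenset({0, 4})]
target = frozenset({0, 2, 4, 6})
history = cegis(family, target, transcript=[0, 2, 4, 6])
print([sorted(lang) for lang in history])
```

In this run the candidate stops changing after the second step and equals the target from then on, which is the finite analogue of convergence on a transcript: beyond some index, every later candidate is the same correct language.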

Definition 4.15

identifies a language if and only if for all transcripts for the specification language , converges to , that is, .

As per Definition 4.7, a transcript is an infinite sequence of examples which contains all the elements in the target language. Definition 4.14 requires the synthesis engine to converge to the correct language after consuming a finite part of the transcript. This notion of convergence is standard in the literature on language learning in the limit [21]. (In this framework, a synthesis engine is only required to converge to the correct concept, without being required to recognize that it has converged and terminate. For a finite concept or language, termination can be trivially guaranteed when the oracle is assumed to be non-redundant and does not repeat examples: the synthesis engine terminates when no new counterexample or positive example is possible, a criterion adopted in many implementations of CEGIS for finite target concepts.) Definition 4.15 requires that, for any transcript, the synthesis engine must identify the language using only a finite number of positive examples and counterexamples even if the language is infinite.

We extend Definition 4.15 to general specifications as follows: identifies a specification if it identifies some language in . As noted before, this section focuses on the case of a partial specification that is a trace property. In this case, comprises all subsets of a target language . Since Definition 4.7 defines a transcript as comprising all positive examples in and Definition 4.15 requires convergence for all possible transcripts, the two notions of identifying and identifying coincide. We therefore focus in Sec. 5 purely on language identification with the observation that our results carry over to the case of “specification identification”.

Definition 4.16

identifies a language family if and only if identifies every language .

The above definition extends to families of specifications in an exactly analogous manner.

We now formally define the set of language families that can be identified by the inductive synthesis engines. (By replacing “language family” with “specification family”, we get the analogous definitions for the general case.)

Definition 4.17


The convergence of synthesis engine to the correct language, identification condition for a language, and language family identified by a synthesis engine are defined similarly in other cases. The synthesis engines and the corresponding set of language families are listed in Table 2.

Learner / Oracle
Finite memory
Infinite memory
Table 2: Synthesis engines and corresponding sets of language families

5 Theoretical Analysis of CEGIS: Results

In this section, we present the theoretical results when the class of languages (programs) is infinite. We consider two axes of variation. We first consider the case in which the inductive learning technique has finite memory in Section 5.1, and then the case in which it has infinite memory in Section 5.2. For both cases, we consider the four kinds of counterexamples mentioned in Section 1 and Section 4; namely, arbitrary counterexamples, minimal counterexamples, constant bounded counterexamples and positive bounded counterexamples.

For simplicity, our proofs focus on the case of partial specifications that are trace properties, the common case in formal verification and synthesis. Thus, comprises subsets of a target specification language . However, many of the results given here extend to the case of general specifications. Most of our theorems show differences between language classes for CEGIS variants — i.e., theorems showing that there is a specification on which one variant of CEGIS converges while the other does not — and for these, it suffices to show such a difference for the more restricted class of partial specifications. The results also extend in the case of equality between language classes (e.g., Theorem 5.1) in certain cases; we make suitable remarks alongside.

5.1 Finite Memory Inductive Synthesis

We investigate the four language classes and identified by the synthesis engines , , and and establish relations between them. We show that , and .

5.1.1 Minimal vs. Arbitrary Counterexamples

We begin by showing that replacing a deductive verification engine which returns arbitrary counterexamples with a deductive verification engine which returns minimal counterexamples does not change the power of counterexample-guided inductive synthesis. The result is summarized in Theorem 5.1.

Theorem 5.1

The power of synthesis techniques using arbitrary counterexamples and those using minimal counterexamples are equivalent, that is, .

  • is a special case of in that a minimal counterexample reported by can be treated as arbitrary counterexample to simulate using . Thus, .

    The more interesting case to prove is . For a language , let converge to the correct language on transcript . We show that can simulate and also converge to on transcript . The proof idea is to show that a finite learner can simulate by making a finite number of calls to . Therefore, the learner sees the same counterexample sequence with as with and thus converges to the same language in both cases.

    Consider an arbitrary step of the dialogue between learner and verifier when a counterexample is returned. Let the arbitrary counterexample returned by verifier for a candidate language be , that is . Thus, is an upper bound on the minimal counterexample returned by . The latter can be recovered using the following characterization:

    The learner can thus compute the minimal counterexample with a bounded number of queries to the arbitrary-counterexample verifier. In the case of a totally ordered set (such as the natural numbers), this could be done more efficiently using binary search. At each stage of the iteration, the learner needs to store only the smallest counterexample returned so far. Thus, the work performed by the learner in each iteration to craft queries to the verifier can be done with finite memory, and the minimal counterexample can be computed using finitely many calls to the verifier.

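    Thus, an engine using arbitrary counterexamples can simulate one using minimal counterexamples by finding the minimal counterexample at each step, using the verifier iteratively as described above. This implies that every language class identified with minimal counterexamples is also identified with arbitrary counterexamples.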

Thus, a synthesis engine using minimal counterexamples converges to the correct language if and only if one using arbitrary counterexamples also converges to the correct language. So, using a deductive verifier that provides minimal counterexamples neither increases nor decreases the power of synthesis.
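The simulation used in the proof of Theorem 5.1 can be sketched in Python. This is only an illustrative sketch under assumptions that are not part of the text above: elements are natural numbers, a candidate language is a finite set, a counterexample is an element of the candidate that is not in the target language, and `make_arbitrary_verifier` is a hypothetical verifier that deliberately returns the largest counterexample. The learner stores only the smallest counterexample returned so far.

```python
from typing import Callable, FrozenSet, Optional

Language = FrozenSet[int]
Verifier = Callable[[Language], Optional[int]]


def make_arbitrary_verifier(target: Language) -> Verifier:
    """Hypothetical verifier returning an arbitrary counterexample.

    Assumption: a counterexample is an element of the candidate that is
    not in the target. This verifier always picks the largest one.
    """
    def check(candidate: Language) -> Optional[int]:
        bad = candidate - target
        return max(bad) if bad else None
    return check


def minimal_counterexample(candidate: Language, check: Verifier) -> Optional[int]:
    """Recover the minimal counterexample using only arbitrary ones."""
    smallest = check(candidate)
    if smallest is None:
        return None  # the candidate is consistent with the target
    while True:
        # Re-query with the candidate restricted to strictly smaller elements.
        restricted = frozenset(x for x in candidate if x < smallest)
        smaller = check(restricted)
        if smaller is None:
            return smallest  # nothing smaller is a counterexample
        smallest = smaller  # strictly decreases, so the loop terminates
```

Each query strictly lowers the stored value, so the number of queries is bounded by the number of candidate elements below the first counterexample.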

Remark: The above result (and its analog in Sec. 5.2) also holds in the case of general specifications when CEGIS is replaced by Generalized CEGIS. In particular, if either crafted correctness queries or membership queries are introduced, then it is easy to show that the engine using arbitrary counterexamples can simulate the engine using minimal counterexamples by mimicking each of its steps, recovering the same counterexample it used with suitable crafted correctness or membership queries. In this case, the former can converge to every language that the latter converges to, and hence identifies the same class of specifications.

5.1.2 Bounded vs. Arbitrary Counterexamples

We next investigate synthesis with constant-bounded counterexamples and compare its power with that of synthesis using arbitrary counterexamples. As intuitively expected, the former is strictly less powerful, as summarized in Theorem 5.2.

Theorem 5.2

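Synthesis techniques using bounded counterexamples are strictly less powerful than those using arbitrary counterexamples, that is, the class of languages identified with bounded counterexamples is strictly contained in the class identified with arbitrary counterexamples.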

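  • Since a bounded counterexample is also a counterexample, a learner with access to an arbitrary-counterexample verifier can simulate a bounded verifier by ignoring counterexamples that are larger than the specified bound. The bound is a fixed parameter and can be stored in the finite memory of the inductive learner. Thus, every language class identified with bounded counterexamples is also identified with arbitrary counterexamples.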

    We now describe a language class for which the corresponding languages cannot be identified using bounded counterexamples.

    Language Family 1

    : where is the bound used by the verifier producing constant bounded counterexamples.

    Language Family 1 is formed by lower bounding the elements by some fixed constant that is larger than the bound used by the verifier producing bounded counterexamples. Clearly, such a verifier would not return any counterexample for a language in this class, and hence synthesis with bounded counterexamples cannot identify languages in this class, while synthesis with arbitrary counterexamples can. So the containment is strict.
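The argument can be illustrated with a small Python sketch. It rests on assumptions that are not stated in the text: elements are natural numbers, each member of Language Family 1 is taken to be the set of naturals greater than some index above the bound, and a counterexample is an element of the candidate that is not in the target. The names `lower_bounded` and `bounded_check` are hypothetical.

```python
from typing import Callable, Optional

Language = Callable[[int], bool]  # membership predicate over natural numbers


def lower_bounded(i: int) -> Language:
    """Assumed shape of a Family 1 member: all naturals greater than i."""
    return lambda x: x > i


def bounded_check(candidate: Language, target: Language, bound: int) -> Optional[int]:
    """Return a counterexample no larger than `bound`, or None if none exists."""
    for x in range(bound + 1):
        if candidate(x) and not target(x):
            return x
    return None


BOUND = 10
family = [lower_bounded(i) for i in range(BOUND + 1, BOUND + 6)]

# Every (candidate, target) pair from the family gets the same answer,
# so the bounded verifier gives the learner nothing to distinguish them.
answers = {bounded_check(c, t, BOUND) for c in family for t in family}
```

In this sketch `answers` contains only `None`, even though candidates and targets with different indices do differ on elements above the bound.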

We next analyze synthesis with positive-history-bounded counterexamples, and show that it is neither equivalent to synthesis with arbitrary counterexamples nor contained in it. So, replacing a deductive verification engine which returns arbitrary counterexamples with a verification engine which returns counterexamples bounded by the history of positive examples has an impact on the power of the synthesis technique, but it does not strictly increase that power. Instead, the use of positive-history-bounded counterexamples allows languages from new classes to be identified, while languages from some classes that could be identified with arbitrary counterexamples can no longer be identified. The main result regarding the power of synthesis techniques using positive bounded counterexamples is summarized in Theorem 5.3.

Theorem 5.3

Synthesis techniques using arbitrary counterexamples and those using positive bounded counterexamples are not equivalent, and neither is more powerful than the other. In fact, each identifies some class of languages that the other cannot identify.

We prove this using the following two lemmas. Lemma 5.4 shows that there is a family of languages whose members can be identified using arbitrary counterexamples but not using positive bounded counterexamples. Lemma 5.5 shows that there is another family of languages whose members can be identified using positive bounded counterexamples but not using arbitrary counterexamples.

Lemma 5.4

There is a family of languages such that synthesis with positive bounded counterexamples cannot identify a language in the family, but synthesis with arbitrary counterexamples can.

  • Consider Language Family 2, formed by upper bounding the elements by some fixed constant. The target language (the one we want to identify) is a member of this family. In the rest of the proof, we also refer to this family as for brevity.

    Language Family 2

    such that

    If we obtain a transcript at any point in synthesis using positive bounded counterexamples, then for any intermediate language proposed by the learner, the verifier would always return no counterexample, since every counterexample would be larger than every positive example in the transcript. This is a consequence of the chosen languages, in which all counterexamples to a language are larger than any positive example of that language. So, synthesis with positive bounded counterexamples cannot identify the target language.

    But we can easily design a synthesis engine using arbitrary counterexamples that identifies the target language. The algorithm starts with the smallest language in the family as its initial guess. If there is no counterexample, the algorithm's next guess is the next language in the family. In each iteration, the algorithm moves to the next larger language as long as no counterexample is returned. When the verifier returns a counterexample for a guess, the algorithm stops and reports the previous guess as the correct language.

    Since the elements of each language are bounded by some fixed constant, the above synthesis procedure is guaranteed to terminate after finitely many iterations when identifying any language in the family. Further, the verifier did not return a counterexample for the reported guess, so that guess is contained in the target language. In the next iteration a counterexample was generated, so the next guess is not contained in the target language. Since the languages in the family form a monotonic chain, the target language is exactly the reported guess, and it is correctly identified in that iteration.

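    Thus, this family is identified using arbitrary counterexamples but not using positive bounded counterexamples.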
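The two halves of this proof can be sketched in Python. The sketch makes assumptions that are not in the text: elements are natural numbers, the member of Language Family 2 with index i is taken to be the set {0, ..., i}, and a counterexample is an element of the candidate that is not in the target. All function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Optional

Language = FrozenSet[int]


def family2(i: int) -> Language:
    """Assumed shape of a Family 2 member: the naturals 0, 1, ..., i."""
    return frozenset(range(i + 1))


def arbitrary_check(candidate: Language, target: Language) -> Optional[int]:
    """Return some element of the candidate outside the target, if any."""
    bad = candidate - target
    return min(bad) if bad else None


def positive_bounded_check(candidate: Language, target: Language,
                           positives: Iterable[int]) -> Optional[int]:
    """Only report counterexamples no larger than the largest positive example."""
    limit = max(positives, default=-1)
    bad = [x for x in candidate - target if x <= limit]
    return min(bad) if bad else None


def identify_with_arbitrary(target: Language) -> int:
    """Walk up the chain until a counterexample appears; report the previous index."""
    i = 0
    while arbitrary_check(family2(i), target) is None:
        i += 1
    return i - 1


target = family2(4)
learned_index = identify_with_arbitrary(target)
# With positive-bounded counterexamples every guess looks correct.
silent = all(positive_bounded_check(family2(j), target, target) is None
             for j in range(10))
```

Here `identify_with_arbitrary` stops on the first guess that overshoots the target, while `positive_bounded_check` never reports anything because every counterexample lies above every positive example.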

This shows that arbitrary counterexamples can be used to identify languages for which positive bounded counterexamples fail. Restricting the verifier to produce only counterexamples that are bounded by the positive examples seen so far does not strictly increase the power of synthesis.

We now show that this restriction enables identification of languages which cannot be identified using arbitrary counterexamples.

In the proof below, we construct a language which is not distinguishable using arbitrary counterexamples and instead, it relies on the verifier keeping a record of the largest positive example seen so far and restricting counterexamples to those below the largest positive example.

Lemma 5.5

There is a family of languages such that synthesis with arbitrary counterexamples cannot identify a language in the family, but synthesis with positive bounded counterexamples can.

  • Consider the language

    We now construct a family of languages which are finite subsets of and have at least one member of the form , that is,

    We now consider the language

    Now, let be the family of languages such that the smallest element member in the language is the same as the index of the language, that is,

    Now, we consider the following family of languages below.

    Language Family 3

    We refer to this language as in the rest of the proof for brevity. We show that there is a language in this family that cannot be identified using arbitrary counterexamples, whereas every language in the family can be identified using positive bounded counterexamples.

    The key intuition is as follows. If the examples seen by synthesis algorithm till some iteration are all of the form , then any synthesis technique cannot differentiate whether the language belongs to or . If the language belongs to , the synthesis engine would eventually obtain an example of the form (since each language in has at least one element of this kind and these languages are finite). While the synthesis technique using arbitrary counterexamples cannot recover the previous examples, the techniques with access to the verifier which produces positive bounded counterexamples can recover all the previous examples.

    We now specify a synthesis engine using positive bounded counterexamples which can identify languages in this family. The synthesis approach works in two steps.

    • Until an example is seen by the synthesis engine, let be the smallest member element seen so far in the transcript, the learner proposes as the language. If the target language , the learner would eventually identify the language since the minimal element will show up in the transcript. If the target language , then eventually, an example of the form will be seen since must have one such member element. And after such an example is seen in the transcript, the synthesis engine moves to second step.

    • After an example of the form is seen, the synthesis engine can now be sure that the language belongs to and is finite. Now, the learner can discover all the positive examples seen so far using the following trick. We first discover the upper bound on positive examples seen so far.

      Recall that are not in the target language since they are not in any of the languages in the to which the target language belongs. will return the only element in the proposed candidate language as a counterexample as long as there is some positive example seen previously such that . So, is the upper bound on all the positive examples seen so far. The learner can now construct singleton languages for such that . If a counterexample is returned by then is not in the target language. If no counterexample is returned, then is in the target language. This allows the synthesis engine to recover all the positive examples seen previously in finite steps. As we recover the positive examples, we run a Gold style algorithm for identifying finite languages [30] to converge to the correct language. Thus, the learner would identify the correct language using finite memory.
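The recovery trick in the second step can be sketched in Python. The encoding here is a simplified stand-in, not the one used in the proof: elements are natural numbers, target languages are assumed to contain only even numbers, and odd numbers play the role of elements known to be outside every target language. The names `make_pb_check` and `recover_positives` are hypothetical.

```python
from typing import Callable, FrozenSet, Optional, Set

Language = FrozenSet[int]
Verifier = Callable[[Language], Optional[int]]


def make_pb_check(target: Language, largest_positive: int) -> Verifier:
    """Positive-bounded verifier: counterexamples never exceed the largest
    positive example seen so far (held by the verifier, not the learner)."""
    def check(candidate: Language) -> Optional[int]:
        bad = [x for x in candidate - target if x <= largest_positive]
        return min(bad) if bad else None
    return check


def recover_positives(check: Verifier) -> Set[int]:
    """Recover every target element below the hidden positive-example bound."""
    # Step 1: probe singletons of known non-members (odd numbers, by
    # assumption). The first probe that draws no counterexample exceeds
    # every positive example seen so far.
    probe = 1
    while check(frozenset({probe})) is not None:
        probe += 2
    bound = probe
    # Step 2: below the bound, a singleton draws a counterexample exactly
    # when its element is not in the target language.
    return {x for x in range(0, bound, 2) if check(frozenset({x})) is None}


target = frozenset({2, 6, 8, 20})
check = make_pb_check(target, largest_positive=8)
recovered = recover_positives(check)
```

The learner never stores the transcript itself; it rebuilds the relevant part of the target from singleton queries, and the recovered set includes every positive example seen so far.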

    We now prove that does not identify this family of languages. Let us assume that . So, there is a synthesis engine which can identify languages in . So, must converge to any language after some finite transcript . Let us consider an extension of such that