In recent years, program synthesis has emerged as a promising technique for automating low-level aspects of programming (Gulwani et al., 2012; Solar-Lezama, 2013; Torlak and Bodík, 2014). Synthesis technology enables users to create programs by describing desired behavior with input-output examples (Osera and Zdancewic, 2015; Feser et al., 2015; Smith and Albarghouthi, 2016; Feng et al., 2017a; Feng et al., 2017b, 2018; Wang et al., 2018, 2017), natural language (Yaghmazadeh et al., 2017), and partial or complete formal specifications (Srivastava et al., 2010; Kneuss et al., 2013; Polikarpova et al., 2016; Inala et al., 2017; Qiu and Solar-Lezama, 2017). If the input is a formal specification, synthesis algorithms can create not only a program but also a proof that the program meets the given specification (Srivastava et al., 2010; Kneuss et al., 2013; Polikarpova et al., 2016; Qiu and Solar-Lezama, 2017).
One of the greatest challenges in software development is to write programs that are not only correct but also efficient with respect to memory usage, execution time, or domain-specific resource metrics. For this reason, automatically optimizing program performance has long been a goal of synthesis, and several existing techniques tackle this problem for low-level straight-line code (Schkufza et al., 2013; Phothilimthana et al., 2014; Sharma et al., 2015; Phothilimthana et al., 2016; Bornholt et al., 2016) or add efficient synchronization to concurrent programs (Cerný et al., 2011; Gupta et al., 2015; Cerný et al., 2015a; Ferles et al., 2018). However, the developed techniques are not applicable to recent advances in the synthesis of high-level looping or recursive programs manipulating custom data structures (Kneuss et al., 2013; Osera and Zdancewic, 2015; Feser et al., 2015; Polikarpova et al., 2016; Inala et al., 2017; Qiu and Solar-Lezama, 2017). These techniques lack the means to analyze and understand the resource usage of the synthesized programs. Consequently, they cannot take into account the program’s efficiency and simply return the first program that arises during the search and satisfies the functional specification.
In this work, we study the problem of synthesizing high-level recursive programs given both a functional specification of a program and a bound on its resource usage. A naive solution would be to first generate a program using conventional program synthesis and then use existing automatic static resource analyses (Hoffmann et al., 2012; Peng Wang, 2017; Cicek et al., 2017) to check whether its resource usage satisfies the bound. Note, however, that for recursive programs, both synthesis and resource analysis are undecidable in theory and expensive in practice. Instead, in this paper we propose resource-guided synthesis: an approach that tightly integrates program synthesis and resource analysis, and uses the resource bound to guide the synthesis process, generating programs that are efficient by construction.
In a nutshell, the idea of this work is to combine type-driven program synthesis, pioneered in the work on Synquid (Polikarpova et al., 2016), with type-based automatic amortized resource analysis (AARA) (Hofmann and Jost, 2003; Jost et al., 2010; Hoffmann et al., 2011, 2017) as implemented in Resource Aware ML (RaML) (Hoffmann, 2018). Type-driven synthesis and AARA are a perfect match because they are both based on decidable, constraint-based type systems that can be easily checked with off-the-shelf constraint solvers.
In Synquid, program specifications are written as refinement types (Vazou et al., 2013; Knowles and Flanagan, 2009). The key to efficient synthesis is round-trip type checking, which uses an SMT solver to aggressively prune the search space by rejecting partial programs that do not meet the specification (see Sec. 2.1). Until now, types have only been used in the context of synthesis to specify functional properties.
AARA is a type-based technique for automatically deriving symbolic resource bounds for functional programs. The idea is to add resource annotations to data types, in order to specify a potential function that maps values of that type to non-negative numbers. The type system ensures that the initial potential is sufficient to cover the cost of the evaluation. By a priori fixing the shape of the potential functions, type inference can be reduced to linear programming (see Sec. 2.2).
The Re² Type System
The first contribution of this paper is a new type system, which we dub Re² (for refinements and resources), that combines polymorphic refinement types with AARA (Sec. 3). Re² is a conservative extension of Synquid’s refinement type system and RaML’s affine type system with linear potential annotations. As a result, Re² can express logical assertions that are required for effectively specifying program synthesis problems. In addition, the type system features annotations of numeric sort in the same refinement language to express potential functions. Using such annotations, programmers can express precise resource bounds that go beyond the template potential functions of RaML.
The features that distinguish Re² from other refinement-based type systems for resource analysis (Peng Wang, 2017; Cicek et al., 2017; Radicek et al., 2018) are (1) the combination of logical and quantitative refinements and (2) the use of AARA, which simplifies resource constraints and naturally applies to non-monotone resources like memory that can become available during the execution. These features also pose nontrivial technical challenges: the interaction between substructural and dependent types is known to be tricky (Krishnaswami et al., 2015; Lago and Gaboardi, 2011), while polymorphism and higher-order functions are challenging for AARA (one solution is proposed by Jost et al. (2010), but their treatment of polymorphism is not fully formalized).
In addition to the design of Re², we prove the soundness of the type system with respect to a small-step cost semantics. In the formal development, we focus on a simple call-by-value functional language with Booleans and lists, where type refinements are restricted to linear inequalities over lengths of lists. However, we structure the formal development to emphasize that Re² can be extended with user-defined data types, more expressive refinements, or non-linear potential annotations. The proof strategy itself is a contribution of this paper. The type soundness of the logical refinement part of the system is inspired by TiML (Peng Wang, 2017). The main novelty is the soundness proof of the potential annotations using a small-step cost semantics instead of RaML’s big-step evaluation semantics.
Type-Driven Synthesis with Re²
The second contribution of this paper is a resource-guided synthesis algorithm based on Re². In Sec. 4, we first develop a system of synthesis rules that prescribe how to derive well-typed programs from types, and prove its soundness with respect to the Re² type system. We then show how to algorithmically derive programs using a combination of backtracking search and constraint solving. In particular, this requires solving a new form of constraints we call resource constraints, which are constrained linear inequalities over unknown numeric refinement terms. To solve resource constraints, we develop a custom solver based on counterexample-guided inductive synthesis (Solar-Lezama et al., 2006) and SMT (de Moura and Bjørner, 2008).
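To give a feel for the counterexample-guided loop mentioned above, the following is a toy sketch in Python, not the solver used in ReSyn. It assumes a hypothetical cost function over a single natural-number input and searches for unknown coefficients (c0, c1) of a linear bound c0 + c1*n; the function name, the bounded coefficient range, and the brute-force "verifier" that stands in for an SMT query are all illustrative assumptions.

```python
from typing import Callable, List, Optional, Tuple


def cegis_linear_bound(
    cost: Callable[[int], int],
    domain: range,
    max_coeff: int = 10,
) -> Optional[Tuple[int, int]]:
    """Find (c0, c1) such that c0 + c1*n >= cost(n) for every n in `domain`.

    Toy CEGIS loop: the synthesis step only looks at the counterexamples
    collected so far; the verification step scans the whole (finite) domain,
    standing in for a solver query.
    """
    examples: List[int] = []  # counterexamples collected so far
    while True:
        # Synthesis step: first candidate (ordered by c1, then c0) that is
        # consistent with all counterexamples seen so far.
        candidate = None
        for c1 in range(max_coeff + 1):
            for c0 in range(max_coeff + 1):
                if all(c0 + c1 * n >= cost(n) for n in examples):
                    candidate = (c0, c1)
                    break
            if candidate is not None:
                break
        if candidate is None:
            return None  # no bound exists in the bounded search space
        c0, c1 = candidate
        # Verification step: look for an input that violates the candidate.
        cex = next((n for n in domain if c0 + c1 * n < cost(n)), None)
        if cex is None:
            return candidate  # candidate holds on the whole domain
        examples.append(cex)
```

Each round either accepts the candidate or adds a fresh counterexample, so the loop terminates on a finite domain. A real solver would replace both the enumeration and the domain scan with symbolic queries.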
The ReSyn Synthesizer
The third contribution of this paper is the implementation and experimental evaluation of the first resource-aware synthesizer for recursive programs. We implemented our synthesis algorithm in a tool called ReSyn, which takes as input (1) a goal type that specifies the logical refinements and resource requirements of the program, and (2) types of components (i.e. library functions that the program may call). ReSyn then synthesizes a program that provably meets the specification (assuming the soundness of components).
To evaluate the scalability of the synthesis algorithm and the quality of the synthesized programs, we compare ReSyn with baseline Synquid on a variety of general-purpose data structure operations, such as eliminating duplicates from a list or computing common elements between two lists. The evaluation (Sec. 5) shows that ReSyn is able to synthesize programs that are asymptotically more efficient than those generated by Synquid. Moreover, the tool scales better than a naive combination of synthesis and resource analysis.
2. Background and Overview
This section provides the necessary background on type-driven program synthesis (Sec. 2.1) and automatic resource analysis (Sec. 2.2). We then describe and motivate their combination in Re² and showcase novel features of the type system (Sec. 2.3). Finally, we demonstrate how Re² can be used for resource-guided synthesis (Sec. 2.4).
2.1. Type-Driven Program Synthesis
Type-driven program synthesis (Polikarpova et al., 2016) is a technique for automatically generating functional programs from their high-level specifications expressed as refinement types (Knowles and Flanagan, 2009; Rondon et al., 2008). For example, a programmer might describe a function that computes the common elements between two lists using the following type signature:
  common :: l1:List a -> l2:List a -> {_v:List a | elems _v == elems l1 *set elems l2}
Here, the return type of common is refined with the predicate elems _v == elems l1 *set elems l2, which restricts the set of elements of the output list _v to be the intersection of the sets of elements of the two arguments. (Hereafter the bound variable of the refinement is always called _v and the binding is omitted.) Here elems is a user-defined logic-level function, also called measure (Kawaguchi et al., 2009; Vazou et al., 2013). In addition to the synthesis goal above, the synthesizer takes as input a component library: signatures of data constructors and functions it can use. In our example, the library includes the list constructors Nil and Cons and the function
  member :: x:a -> l:List a -> {Bool | _v = (x in elems l)}
which determines whether a given value is in the list. Given this goal and components, the type-driven synthesizer Synquid (Polikarpova et al., 2016) produces an implementation of common in Fig. 1.
The Synthesis Mechanism
Type-driven synthesis works by systematically exploring the space of programs that can be built from the component library and validating candidate programs against the goal type using a variant of liquid type inference (Rondon et al., 2008). To validate a program against a refinement type, liquid type inference generates a system of subtyping constraints over refinement types. The subtyping constraints are then reduced to implications between refinement predicates. For example, checking common xs l2 in line 3 of Fig. 1 against the goal type reduces to validating the following implication:
Since this formula belongs to a decidable theory of uninterpreted functions and arrays, its validity can be checked by an SMT solver (de Moura and Bjørner, 2008). In general, the generated implications may contain unknown predicates. In this case, type inference reduces to a system of constrained Horn clauses (Bjørner et al., 2015), which can be solved via predicate abstraction.
Synthesis and Program Efficiency
The program in Fig. 1 is correct, but not particularly efficient: it runs roughly in time $n \cdot m$, where $n$ is the length of l1 and $m$ is the length of l2, since it calls the member function (a linear scan) for every element of l1. The programmer might realize that keeping the input lists sorted would enable computing common elements in linear time by scanning the two lists in parallel. To communicate this intent to the synthesizer, they can define the type of (strictly) sorted lists by augmenting a traditional list definition with a simple refinement:
  data SList a where
    SNil :: SList a
    SCons :: x:a -> xs:SList {a | x < _v} -> SList a
This definition says that a sorted list is either empty, or is constructed from a head element x and a tail list xs, as long as xs is sorted and all its elements are larger than x. (Following Synquid, our language imposes an implicit constraint on all type variables to support equality and ordering. Hence, they cannot be instantiated with arrow types. This could be lifted by adding type classes.) Given an updated synthesis goal (where selems is a version of elems for SList)
  common’ :: l1:SList a -> l2:SList a -> {_v:List a | elems _v == selems l1 *set selems l2}
and a component library that includes List, SList, and < (but not member!), Synquid can synthesize the efficient program shown in Fig. 2.
However, if the programmer leaves the function member in the library, Synquid will synthesize the inefficient implementation in Fig. 1. In general, Synquid explores candidate programs in the order of size and returns the first one that satisfies the goal refinement type. This can lead to suboptimal solutions, especially as the component library grows larger and allows for many functionally correct programs. To avoid inefficient solutions, the synthesizer has to be aware of the resource usage of the candidate programs.
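The gap between the two strategies can be made concrete with a small Python transliteration. This is a sketch of the two ideas described above (a membership scan per element versus a parallel scan of sorted inputs), not the code of Fig. 1 or Fig. 2; the step counters and function names are illustrative assumptions.

```python
from typing import List, Tuple


def member(x: int, l: List[int]) -> Tuple[bool, int]:
    """Linear scan; returns (found, number of elements inspected)."""
    for i, y in enumerate(l):
        if y == x:
            return True, i + 1
    return False, len(l)


def common(l1: List[int], l2: List[int]) -> Tuple[List[int], int]:
    """Unsorted strategy: one `member` scan of l2 per element of l1."""
    out: List[int] = []
    steps = 0
    for x in l1:
        found, scanned = member(x, l2)
        steps += 1 + scanned  # one step for x plus the scan of l2
        if found:
            out.append(x)
    return out, steps


def common_sorted(l1: List[int], l2: List[int]) -> Tuple[List[int], int]:
    """Sorted strategy: walk both strictly sorted lists in parallel."""
    out: List[int] = []
    steps = i = j = 0
    while i < len(l1) and j < len(l2):
        steps += 1  # every iteration advances at least one of the lists
        if l1[i] < l2[j]:
            i += 1
        elif l2[j] < l1[i]:
            j += 1
        else:
            out.append(l1[i])
            i += 1
            j += 1
    return out, steps
```

Because every iteration of `common_sorted` advances at least one index, its step count never exceeds the sum of the two lengths, whereas `common` can take on the order of their product.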
2.2. Automatic Amortized Resource Analysis
To reason about the resource usage of programs we take inspiration from automatic amortized resource analysis (AARA) (Hofmann and Jost, 2003; Jost et al., 2010; Hoffmann et al., 2011, 2017). AARA is a state-of-the-art technique for automatically deriving symbolic resource bounds on functional programs, and is implemented for a subset of OCaml in Resource Aware ML (RaML) (Hoffmann et al., 2017; Hoffmann, 2018). For example, RaML is able to automatically derive worst-case bounds on the number of recursive calls for the functions common and common’. (In this section we assume for simplicity that the resource of interest is the number of recursive calls. Both AARA and our type system support user-defined cost metrics; see Sec. 3 for details.)
AARA is inspired by the potential method for manually analyzing the worst-case cost of a sequence of operations (Tarjan, 1985a). It uses annotated types to introduce potential functions that map program states to non-negative numbers. To derive a bound, we have to statically ensure that the potential at every program state is sufficient to cover the cost of the next transition and the potential of the following state. In this way, we ensure that the initial potential is an upper bound on the total cost.
The key to making this approach effective is to closely integrate the potential functions with data structures (Hofmann and Jost, 2003; Jost et al., 2010). For instance, in RaML the type stands for a list that contains one unit of potential for every element. This type defines the potential function . The potential can be used to pay for a recursive call (or, in general, cover resource usage) or to assign potential to other data structures.
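The per-step obligation of the potential method can be checked directly on a simple list traversal. The sketch below assumes a linear potential function Phi(l) = q * len(l) and a fixed cost per recursive call; both the function name and this particular cost model are illustrative assumptions, not RaML's implementation.

```python
from typing import List


def check_potential_invariant(xs: List[int], q: int, cost_per_call: int) -> bool:
    """Check Phi(before) >= cost + Phi(after) at every step of a traversal.

    Phi(l) = q * len(l); each step peels off the head of the list and
    pays `cost_per_call` for the recursive call on the tail.
    """

    def phi(l: List[int]) -> int:
        return q * len(l)

    rest = xs
    while rest:
        before, after = phi(rest), phi(rest[1:])
        if before < cost_per_call + after:
            return False  # this step cannot be paid for
        rest = rest[1:]
    return True
```

When the check succeeds, summing the per-step inequalities shows that the initial potential `q * len(xs)` is an upper bound on the total cost of the traversal.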
Potential annotations can be derived automatically by starting with a symbolic type derivation that contains fresh variables for the potential annotations of each type, and applying syntax-directed type rules that impose local constraints on the annotations. The integration of data structures and potential ensures that these constraints are linear even for polynomial potential annotations.
2.3. Bounding Resources with Re²
To reason about resource usage in type-driven synthesis, we integrate AARA’s potential annotations and refinement types into a novel type system that we call Re². In Re², a refinement type can be annotated with a potential term $\phi$ of numeric sort, which is drawn from the same logic as refinements. Intuitively, the type $R^{\phi}$ denotes values of refinement type $R$ with $\phi$ units of potential. In the rest of this section we illustrate features of Re² on a series of examples, and delay formal treatment to Sec. 3.
With potential annotations, users can specify that common’ must run in time at most $n + m$ (where $n$ and $m$ are the lengths of l1 and l2), by giving it the following type signature:
  common’ :: l1:SList a^1 -> l2:SList a^1 -> {_v:List a | elems _v == selems l1 *set selems l2}
This type assigns one unit of potential to every element of the arguments l1 and l2, and hence only allows making one recursive call per element of each list. Whenever resource annotations are omitted, the potential is implicitly zero: for example, the elements of the result carry no potential.
Our type checker uses the following reasoning to argue that this potential is sufficient to cover the efficient implementation in Fig. 2. Consider the recursive call in line 4, which has a cost of one. Pattern-matching l1 against SCons x xs transfers the potential from l1 to the binders, resulting in types x:a^1 and xs:SList a^1. The unit of potential associated with x can now be used to pay for the recursive call. Moreover, the types of the arguments, xs and l2, match the required type SList a^1, which guarantees that the potential stored in the tail and the second list are sufficient to cover the rest of the evaluation. Other recursive calls are checked in a similar manner.
Importantly, the inefficient implementation in Fig. 1 would not type-check against this signature. Assuming that member is soundly annotated with
  member :: x:a -> l:List a^1 -> {Bool | _v = (x in elems l)}
(requiring a unit of potential per element of l), the guard in line 2 consumes all the potential stored in l2; hence the occurrence of l2 in line 3 has a type whose elements carry no potential, which is not a subtype of the type required for the second argument (one unit of potential per element).
Dependent Potential Annotations
In combination with logical refinements and parametric polymorphism, this simple extension to Synquid’s type system turns out to be surprisingly powerful. Unlike in RaML, potential annotations in Re² can be dependent, i.e. mention program variables and the special variable _v. Dependent annotations can encode fine-grained bounds, which are out of reach for RaML. As one example, consider function range a b that builds a list of all integers between a and b; we can express that it takes at most b − a steps by giving the argument b a type whose potential annotation is that difference. As another example, consider insertion into a sorted list insert x xs; we can express that it takes at most as many steps as there are elements in xs that are smaller than x, by giving xs a type that only assigns potential to elements that are smaller than x. These fine-grained bounds are checked completely automatically in our system, by reduction to constraints in SMT-decidable theories.
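The two dependent bounds can be observed on small Python counterparts of these functions. This is a sketch under stated assumptions: the bound for range is taken to be the difference between its arguments, the half-open interval and the loop-based formulation (one iteration per recursive call of a functional version) are illustrative choices, and the names `range_list` and `insert_sorted` are hypothetical.

```python
from typing import List, Tuple


def range_list(a: int, b: int) -> Tuple[List[int], int]:
    """Integers from a (inclusive) to b (exclusive), plus the step count.

    Each loop iteration corresponds to one recursive call that shrinks
    the difference b - a by one.
    """
    out: List[int] = []
    steps = 0
    while a < b:
        out.append(a)
        a += 1
        steps += 1
    return out, steps


def insert_sorted(x: int, xs: List[int]) -> Tuple[List[int], int]:
    """Insert x into a sorted list, counting the elements passed over."""
    steps = 0
    i = 0
    while i < len(xs) and xs[i] < x:
        i += 1
        steps += 1  # only elements smaller than x are paid for
    return xs[:i] + [x] + xs[i:], steps
```

In both cases the step count depends on the values of the arguments rather than only on a list length, which is the kind of bound a dependent potential annotation is meant to capture.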
Another source of expressiveness in Re² is parametric polymorphism: since potential annotations are attached to types, type polymorphism gives us resource polymorphism for free. Consider two functions in Fig. 3, triple and tripleSlow, which implement two different ways to append a list l to two copies of itself. Both of them make use of a component function append, whose type indicates that it makes a linear traversal of its first argument. Intuitively, triple is more efficient than tripleSlow because in the former both calls to append traverse a list of length $n$ (the length of l), whereas in the latter the outer call traverses a list of length $2n$. This difference is reflected in the signatures of the two functions: tripleSlow requires three units of potential per list element, while triple only requires two.
Checking that tripleSlow satisfies this bound is somewhat nontrivial because the two applications of append must have different types: the outer application must return List Int, while the inner application must return List Int^1 (i.e. carry enough potential to be traversed by append). RaML’s monomorphic type system is unable to assign append a single general type that can be used at both call sites, so the function has to be reanalyzed at every (monomorphic) call site. Re², on the other hand, handles this example out of the box, since the type variable a in the type of append can be instantiated with Int for the outer occurrence and with Int^1 for the inner occurrence, yielding the type xs:List Int^2 -> ys:List Int^1 -> {List Int^1 | …}
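The difference between two and three units of potential per element can be reproduced by counting traversal steps. Fig. 3 is not reproduced in this text, so the argument order of the two nested calls below is an assumption inferred from the description (append traverses its first argument; the outer call of the slow variant traverses the doubled list); the names and the cost model are illustrative.

```python
from typing import List, Tuple


def append(xs: List[int], ys: List[int]) -> Tuple[List[int], int]:
    """Return (xs ++ ys, steps), paying one step per element of xs."""
    return xs + ys, len(xs)


def triple(l: List[int]) -> Tuple[List[int], int]:
    """Both calls traverse l itself, so the total cost is 2 * len(l)."""
    inner, c1 = append(l, l)
    out, c2 = append(l, inner)
    return out, c1 + c2


def triple_slow(l: List[int]) -> Tuple[List[int], int]:
    """The outer call traverses the doubled list: total cost 3 * len(l)."""
    inner, c1 = append(l, l)
    out, c2 = append(inner, l)
    return out, c1 + c2
```

Both functions compute the same list; only the amount of work charged to each element of the input differs.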
As a final example, consider the standard map function: map::(a -> b) -> List a -> List b Although this type has no potential annotations, it implicitly tells us something about the resource behavior of map: namely, that map applies a function to each list element at most once. This is because a can be instantiated with a type with an arbitrary amount of potential, and the only way to pay for this potential is with a list element (which also has type a).
2.4. Resource-guided Synthesis with ReSyn
We have extended Synquid with support for Re² types in a new program synthesizer ReSyn. Given a resource-annotated signature for common’ from Sec. 2.3 and a component library that includes member, ReSyn is able to synthesize the efficient implementation in Fig. 2. The key to efficient synthesis is type-checking each program candidate incrementally as it is being constructed, and discarding an ill-typed program prefix as early as possible. For example, while enumerating candidates for the function common’, we can safely discard the inefficient version from Fig. 1 even before constructing the second branch of the conditional (because the first branch together with the guard use up too many resources). Hence, as we explain in more detail in Sec. 4, a key technical challenge in ReSyn has been a tight integration of resources into Synquid’s round-trip type checking mechanism, which aggressively propagates type information top-down from the goal and solves constraints incrementally as they arise.
In addition to making the synthesizer resource-aware, Re² types also subsume and generalize Synquid’s termination checking mechanism. To avoid generating diverging functions, Synquid uses a simple termination metric (the tuple of the function’s arguments), and checks that this metric decreases at every recursive call. Using this metric, Synquid is not able to synthesize the function range from Sec. 2.3, because it requires a recursive call that decreases the difference between the arguments, b − a. In contrast, ReSyn need not reason explicitly about termination, since potential annotations already encode an upper bound on the number of recursive calls. Moreover, the flexibility of these annotations enables ReSyn to synthesize programs that require nontrivial termination metrics, such as range.
3. The Re² Type System
In this section, we define a subset of Re² as a formal calculus to prove type soundness. This subset includes Booleans that are refined by their values, and lists that are refined by their lengths. The programs in Sec. 1 and Sec. 2 use Synquid’s surface syntax. The gap from the surface language to the core calculus involves inductive types and refinement-level measures. The restriction to this subset in the technical development is only for brevity, and the proofs carry over to all the features of Synquid.
Fig. 4 presents the grammar of terms in Re² via abstract binding trees (Harper, 2016). The core language is basically the standard lambda calculus augmented with Booleans and lists. A value is either a Boolean constant, a list of values, or a function. Expressions in Re² are in A-normal form (Sabry and Felleisen, 1992), which means that syntactic forms occurring in non-tail position allow only atoms, i.e., variables and values; this restriction simplifies typing rules for applications, as we explain below. We identify a subset of atoms that are interpretable in the refinement logic. Intuitively, the value of an interpretable atom should be either a Boolean or a list. A dedicated syntactic form is introduced as a placeholder for unreachable code, e.g., the else-branch of a conditional whose predicate is always true.
The syntactic form tick(c, e) is used to specify resource usage: it is intended to cost c units of resource and then reduce to e. If the cost c is negative, then −c units of resource will become available in the system. tick terms support flexible user-defined cost metrics: for example, to count recursive calls, the programmer may wrap every such call in tick(1, ·); to keep track of memory consumption, they might wrap every data constructor in tick(c, ·), where c is the amount of memory that constructor allocates.
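The resource usage of a program is determined by a small-step operational cost semantics. The semantics is a standard one augmented with a resource parameter. A step in the evaluation judgment has the form $\langle e, q \rangle \mapsto \langle e', q' \rangle$, where $e$ and $e'$ are expressions and $q, q'$ are nonnegative integers. For example, the rule for tick states that $\langle \mathsf{tick}(c, e), q \rangle \mapsto \langle e, q - c \rangle$ whenever $q \geq c$. The multi-step evaluation relation $\mapsto^{*}$ is the reflexive transitive closure of $\mapsto$. The judgment $\langle e, q \rangle \mapsto^{*} \langle e', q' \rangle$ expresses that with $q$ units of available resources, $e$ evaluates to $e'$ without running out of resources and $q'$ resources are left. Intuitively, the high-water mark resource usage of an evaluation of $e$ to $e'$ is the minimal $q$ such that $\langle e, q \rangle \mapsto^{*} \langle e', q' \rangle$. For monotone resources like time, the cost is the sum of costs of all the evaluated expressions. In general, this net cost is invariant, that is, $p - p' = q - q'$ if $\langle e, p \rangle \mapsto^{n} \langle e', p' \rangle$ and $\langle e, q \rangle \mapsto^{n} \langle e', q' \rangle$, where $\mapsto^{n}$ is the relation obtained by self-composing $\mapsto$ $n$ times.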
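The distinction between net cost and high-water mark can be illustrated by abstracting an evaluation into the sequence of costs it consumes. The sketch below is a simplification and not the paper's semantics: it assumes that an evaluation is represented only by its list of (possibly negative) cost annotations, and the function names are hypothetical.

```python
from typing import List, Optional


def run_costs(costs: List[int], available: int) -> Optional[int]:
    """Consume the costs in order, starting with `available` resources.

    Returns the resources left over, or None if the evaluation gets stuck
    because a step costs more than what is currently available.
    """
    for c in costs:
        if available < c:
            return None  # stuck: not enough resources for this step
        available -= c  # a negative cost releases resources
    return available


def high_water_mark(costs: List[int]) -> int:
    """Minimal initial amount of resources that lets every step succeed."""
    peak = running = 0
    for c in costs:
        running += c
        peak = max(peak, running)
    return peak
```

For a cost sequence such as `[3, -2, 4, -5]`, the net cost is zero while the high-water mark is five: starting with five units succeeds and returns all five, starting with four gets stuck at the third step, and the difference between initial and remaining resources is the same for every sufficient starting amount.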
We now combine Synquid’s type system with AARA to reason about resource usage. Fig. 5 shows the syntax of the type system. Refinements are distinct from program terms and classified by sorts. Re²’s sorts include Booleans, natural numbers, and uninterpreted symbols. Refinements can be logical formulas and linear expressions, which may reference program variables. Logical refinements have the Boolean sort, while potential annotations have the natural-number sort. Re² interprets a variable of Boolean type as its value, a variable of list type as its length, and a variable whose type is a type variable as an uninterpreted symbol of a corresponding sort. We use the following interpretation to reflect interpretable atoms in the refinement logic:
We classify types into four categories. Base types include Booleans, lists and type variables. Type variables are annotated with a multiplicity , which denotes an upper bound on the number of usages of a variable like in bounded linear logic (Girard et al., 1992). For example, denotes a universal list whose elements can be used at most twice.
Refinement types are subset types and dependent arrow types. The inhabitants of the subset type are values of type that satisfy the refinement . The refinement is a logical predicate over program variables and a special value variable , which does not appear in the program and stands for the inhabitant itself. For example, is a type of , and represents Boolean lists of length at most 5. Dependent arrow types are function types whose return type may reference the formal argument . Like type variables, these function types are also annotated with a multiplicity restricting the number of times the function may be applied.
To apply the potential method of amortized analysis (Tarjan, 1985b), we need to define potentials with respect to the data structures in the program. We introduce resource-annotated types as a refinement type augmented with a potential annotation, written . Intuitively, assigns units of potential to values of the refinement type . The potential annotation may also reference the value variable . For example, describes Boolean lists with units of potential where is the length of . The same potential can be expressed by assigning units of potential to every element using the type .
Type schemas represent (possibly) polymorphic types. Note that the type quantifier can only appear outermost in a type.
Similar to Synquid, we introduce a notion of scalar types, which are resource-annotated base types refined by logical constraints. Intuitively, interpretable atoms are scalars and only allows the refinement-level logic to reason about values of scalar types. We will abbreviate as , as , as , and as .
In , the typing context is a sequence of variable bindings , type variables , path conditions , and free potentials . Our type system consists of five judgments: sorting, well-formedness, subtyping, sharing, and typing. We omit sorting and well-formedness rules and include them in Appendix A. The sorting judgment states that a refinement has a sort under a context . A type is said to be well-formed under a context , written , if every referenced variable in it is in the correct scope.
Fig. 6 presents selected typing rules for . The typing judgment states that the expression has type in context . The intuitive meaning is that if at least the amount of resources indicated by the potential in the context is available, then this suffices to evaluate to a value , and after the evaluation there are at least as many resources available as indicated by the potential in . The auxiliary typing judgment assigns base types to interpretable atoms. Atomic typing is useful in the rule (T-SimpAtom), which uses the interpretation to derive a most precise refinement type for interpretable atoms.
The subtyping judgment is defined in a standard way, with the extra requirement that the potential in should be greater than or equal to that in . Subtyping is often used to “forget” some program variables in the type to ensure the result type does not reference any locally introduced variable, e.g., the result type of cannot have in it and the result type of cannot reference or .
To reason about logical refinements, we introduce validity checking, written , to state that a logical refinement is always true under any instance of the context . The validity checking relation is established upon a denotational semantics for refinements. Validity checking in is decidable because it can be reduced to Presburger arithmetic. The full development of validity checking is included in Appendix B.
We reason about inductive invariants for lists in rule (T-MatL), using interpretation . In our formalization, lists are refined by their length; thus the invariants are: (i) has length , and (ii) the length of is the length of plus one. The type system can be easily enriched with more refinements and data types (e.g., the elements of a list are the union of its head and those of its tail) by updating the interpretation as well as the premises of rule (T-MatL).
Finally, notable are the two typing rules for applications: (T-App) and (T-App-SimpAtom). In the former case, the function return type does not mention , and hence can be directly used as the type of the application (this is the case e.g. for all higher-order applications, since our well-formedness rules prevent functions from appearing in refinements). In the latter case, mentions , but luckily any argument of a scalar type must be a simple atom , so we can substitute with its interpretation . The ability to derive precise types for dependent applications motivates the use of A-normal form in Re².
The rule (T-Consume-P) states that an expression is only well-typed in a context that contains a free potential term . To transform the context into this form, we can use the rule (S-Transfer) to transfer potential within the context between variable types and free potential terms, as long as we can prove that the total amount of potential remains the same. For example, the combination of (S-Transfer) and (S-Relax) allows us to derive both and (but not ).
The typing rules of form an affine type system (Walker, 2002). To use a program variable multiple times, we have to introduce explicit sharing to ensure that the program cannot gain potential. The sharing judgment means that in the context , the potential indicated by is apportioned into two parts to be associated with and . We extend this notion to context sharing, written , which states that has the same sequence of bindings as , but the potentials of type bindings in are shared point-wise, and the free potentials in the are also split. A special context sharing is used in the typing rules (T-Abs) and (T-Fix) for functions. The self-sharing indicates that the function can only reference potential-free free variables in the context. This is also used to ensure that the program cannot gain more potential through free variables by applying the same function multiple times.
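The bookkeeping behind sharing can be sketched numerically. The code below is an illustration only: it assumes that potential is a single natural number and that a valid sharing splits it into non-negative parts with the same total, and the function names are hypothetical.

```python
from typing import List, Tuple


def sharings(q: int) -> List[Tuple[int, int]]:
    """All ways to apportion q units of potential into two parts."""
    return [(q1, q - q1) for q1 in range(q + 1)]


def is_valid_sharing(q: int, parts: List[int]) -> bool:
    """Parts must be non-negative and add up to exactly q, so that no
    use of the shared variable can create potential out of nothing."""
    return all(p >= 0 for p in parts) and sum(parts) == q
```

For instance, two units of potential can be split across three uses of a variable as 1, 1, and 0, but not as 1, 1, and 1, since the latter would let the program gain a unit of potential.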
Restricting functions to be defined under potential-free contexts is undesirable in some situations. For example, a curried function of type might require nonzero units of potential on its first argument , which is not allowed by rule (T-Abs) or (T-Fix) on the inner function type . We introduce another rule (T-Abs-Lin) to relax the restriction. The rule associates a multiplicity with the function type, which denotes the number of times that the function could be applied. Instead of context self-sharing, we require the potential in the context to be enough for function applications. Note that in ReSyn’s surface syntax used in Sec. 2, every curried function type implicitly has multiplicity 1 on the inner function: .
Recall the function triple from Fig. 3, which can be written as follows in core syntax:
Next, we illustrate how uses the signature of append:
to justify the resource bound on triple. Suppose is a typing context that contains the signature of append. The argument is used three times, so we need to use sharing relations to apportion the potential of . We have , , and we assign , , and to the three occurrences of respectively in the order they appear in the program. To reason about , we instantiate append with , inferring its type as
and by (T-App-SimpAtom) we derive the following:
We then can typecheck with the same instantiation of append:
(where is the type of ). Finally, by subtyping and the following valid judgment in the refinement logic
we conclude .
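As a sanity check on this derivation, the cost of triple can be measured directly. The sketch below makes two assumptions that are not restated in this section: that triple is defined as `append l (append l l)`, and that the cost model charges one unit per recursive call of append. The explicit `counter` argument is a hypothetical device for counting those calls.

```python
def append(xs, ys, counter):
    """Recursive list append; each recursive call is assumed to cost one unit."""
    if not xs:
        return ys
    counter[0] += 1  # one unit per recursive call (assumed cost model)
    return [xs[0]] + append(xs[1:], ys, counter)


def triple(l, counter):
    """Assumed definition: append l (append l l)."""
    return append(l, append(l, l, counter), counter)


def triple_cost(l):
    """Run triple and return (result, units consumed)."""
    counter = [0]
    result = triple(l, counter)
    return result, counter[0]


for n in range(6):
    l = list(range(n))
    result, cost = triple_cost(l)
    assert result == l + l + l
    assert cost == 2 * n  # both appends traverse l once
```

Under these assumptions the measured cost is twice the length of the input, which is what two units of potential per element of the argument would pay for.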
The type soundness for is based on progress and preservation. The progress theorem states that if we derive a bound for an expression with the type system and resources are available, then can make a step if is not a value. In this way, progress shows that resource bounds are indeed bounds on the high-water mark of the resource usage, since states in the small-step semantics can be stuck based on resource usage if, for instance, and .
Theorem 1 (Progress).
If and , then either or there exist and such that .
Proof. By strengthening the assumption to where is a sequence of type variables and free potentials, and then by induction on . ∎
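The distinction between net consumption and the high-water mark can be made concrete with a toy model. The sketch below abstracts an execution as a sequence of tick costs, where a positive cost consumes resources and a negative cost releases them; this encoding and the function names are illustrative assumptions, not the paper's semantics.

```python
def high_water_mark(ticks):
    """Smallest initial resource amount that lets every tick proceed."""
    used = 0
    peak = 0
    for c in ticks:
        used += c
        peak = max(peak, used)
    return peak


def run(ticks, q):
    """Execute ticks in order starting with q units.
    Returns (steps_taken, remaining); stops early when a step is stuck."""
    for i, c in enumerate(ticks):
        if q < c:
            return i, q  # stuck: not enough resources for this step
        q -= c
    return len(ticks), q


ticks = [3, -2, 1]             # net consumption is 2, but the peak is 3
print(high_water_mark(ticks))  # 3
print(run(ticks, 2))           # (0, 2): stuck on the first step
print(run(ticks, 3))           # (3, 1): runs to completion
```

Starting with only the net consumption is not enough here; the run gets stuck unless the initial amount covers the peak, which is the quantity a sound bound must dominate.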
The preservation theorem accounts for resource consumption by relating the left over resources after a computation to the type judgment of the new term.
Theorem 2 (Preservation).
If , and , then .
Proof. By strengthening the assumption to where is a sequence of free potentials, and then by induction on , followed by inversion on the evaluation judgment . ∎
The proof of preservation makes use of the following crucial substitution lemma.
Lemma 3 (Substitution).
If , , and , then .
Proof. By induction on . ∎
Since we found the purely syntactic soundness statement about results of computations (they are well-typed values) somewhat unsatisfactory, we also introduced a denotational notion of consistency. For example, a list of values is consistent with , if and each value of the list is . We then show that well-typed values are consistent with their typing judgment.
Lemma 4 (Consistency).
If , then satisfies the conditions indicated by and is greater than or equal to the potential stored in with respect to .
As a result, we derive the following theorem.
Theorem 5 (Soundness).
If and then either
and is consistent with or
for every there is such that .
Complete proofs can be found in Appendix D.
Inductive Datatypes and Measures
We can generalize our development of list types to inductive types , where is the constructor name, is the element type that does not contain , and is the -element product type . The introduction rules and elimination rules are almost the same as (T-Nil), (T-Cons) and (T-MatL), respectively, except that we need to capture inductive invariants for each constructor in the rules correspondingly. In Synquid, these invariants are specified by inductive measures that map values to refinements. We can introduce new sorting rules for inductive types to embed values as their related measures in the refinement logic.
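To illustrate what an inductive measure looks like, the sketch below renders a list datatype and two measures in Python, with one case per constructor. This is an informal analogue, not Synquid syntax; the class and function names are chosen here for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Nil:
    pass


@dataclass(frozen=True)
class Cons:
    head: int
    tail: "Union[Nil, Cons]"


def length(xs):
    """Measure mapping a list to its length, defined per constructor."""
    if isinstance(xs, Nil):
        return 0
    return 1 + length(xs.tail)


def elems(xs):
    """Measure mapping a list to the set of its elements."""
    if isinstance(xs, Nil):
        return frozenset()
    return frozenset({xs.head}) | elems(xs.tail)


xs = Cons(1, Cons(2, Cons(1, Nil())))
print(length(xs))         # 3
print(sorted(elems(xs)))  # [1, 2]
```

Each constructor case states how the measure of the constructed value depends on the measures of its components, which is the kind of invariant the introduction and elimination rules need to track.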
Our type system infers upper bounds on resource usage. Recently, AARA has been generalized to verify constant-resource behavior (Ngo et al., 2017). A program is said to be constant-resource if its executions on inputs of the same size consume the same amount of resources. We can adapt the technique of Ngo et al. (2017) to by (i) changing the subtyping rules to keep potentials invariant (i.e. replacing with in (Sub-TVar), (Sub-Arrow), (Sub-Pot)), and (ii) changing the rule (Simp-Atom-Var) to require . Based on the modified type system, our synthesis algorithm can also synthesize constant-time implementations (see Sec. 5.2 for more details).
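The definition of constant-resource behavior can be checked by brute force on small inputs. The sketch below compares two hypothetical list-equality functions under an assumed cost model of one unit per element comparison, and enumerates all pairs of binary lists of a given length. It tests the property empirically and is not the type-based verification described above.

```python
from itertools import product


def compare_early_exit(xs, ys):
    """Return (equal, cost); stops at the first mismatch."""
    cost = 0
    for x, y in zip(xs, ys):
        cost += 1  # one unit per element comparison (assumed cost model)
        if x != y:
            return False, cost
    return True, cost


def compare_full(xs, ys):
    """Return (equal, cost); always traverses both lists completely."""
    cost = 0
    equal = True
    for x, y in zip(xs, ys):
        cost += 1
        equal = equal and (x == y)
    return equal, cost


def is_constant_resource(f, n):
    """True if f has the same cost on all pairs of binary lists of length n."""
    costs = {
        f(list(xs), list(ys))[1]
        for xs in product([0, 1], repeat=n)
        for ys in product([0, 1], repeat=n)
    }
    return len(costs) <= 1


print(is_constant_resource(compare_early_exit, 3))  # False
print(is_constant_resource(compare_full, 3))        # True
```

The early-exit version leaks the position of the first mismatch through its cost, while the full traversal consumes the same amount on every input of a given size.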
4. Type-Driven Synthesis with
In this section, we first show how to turn the type checking rules of into synthesis rules, and then leverage these rules to develop a synthesis algorithm.
4.1. Synthesis Rules
To express synthesis rules, we extend with a new syntactic form for expression templates. As shown in Fig. 7, templates are expressions that can contain holes in certain positions. The flat let form , where is a sequence of bindings, is shorthand for a nest of let-expressions ; we write to convert a flat let (without holes) back to the original syntax. We also extend the language of types with an unknown type , which is used to build partially defined goal types, as explained below.
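The conversion from a flat let back to nested let-expressions can be sketched as a right fold over the bindings. The tuple encoding of terms and the name `unflatten` are assumptions made for this illustration; expressions are left as opaque strings.

```python
def unflatten(bindings, body):
    """Convert a flat let, given as a list of (name, expr) bindings plus a body,
    into nested ("let", name, expr, body) terms."""
    result = body
    # Wrap from the innermost binding outwards so the first binding ends up outermost.
    for name, expr in reversed(bindings):
        result = ("let", name, expr, result)
    return result


flat = [("x1", "e1"), ("x2", "e2")]
print(unflatten(flat, "e"))
# ('let', 'x1', 'e1', ('let', 'x2', 'e2', 'e'))
```

With an empty sequence of bindings the body is returned unchanged, so the conversion is the identity on expressions that bind nothing.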