Synthesizing programs from input/output examples is a classic problem in artificial intelligence. We present a method for solving Programming By Example (PBE) problems by using a neural model to guide the search of a constraint logic programming system called miniKanren. Crucially, the neural model uses miniKanren's internal representation as input; miniKanren represents a PBE problem as recursive constraints imposed by the provided examples. We explore Recurrent Neural Network and Graph Neural Network models. We contribute a modified miniKanren, drivable by an external agent, available at https://github.com/xuexue/neuralkanren. We show that our neural-guided approach using constraints can synthesize programs faster in many cases, and importantly, can generalize to larger problems.
Program synthesis is a classic area of artificial intelligence that has captured the imagination of many computer scientists. Programming by Example (PBE) is one way to formulate program synthesis problems, where example input/output pairs specify a target program. In a sense, supervised learning can be considered program synthesis, but supervised learning via successful models like deep neural networks famously lacks interpretability. The clear interpretability of programs as code means that synthesized results can be compared, optimized, translated, and proved correct. The manipulability of code makes program synthesis continue to be relevant today.
Current state-of-the-art approaches use symbolic techniques developed by the programming languages community. These methods use rule-based, exhaustive search, often manually optimized by human experts. While these techniques excel for small problems, they tend not to scale. Recent works by the machine learning community explore a variety of statistical methods to solve PBE problems more quickly. Works generally fall under three categories: differentiable programming [1, 2, 3], direct synthesis [4, 5], and neural guided search [6, 7].
This work falls under neural guided search, where the machine learning model guides a symbolic search. We take integrating with a symbolic system further: we use its internal representation as input to the neural model. The symbolic system we use is a constraint logic programming system called miniKanren [8] (the name “Kanren” comes from the Japanese word for “relation”), chosen for its ability to encode synthesis problems that are difficult to express in other systems. Specifically, miniKanren does not rely on types, is able to complete partially specified programs, and has a straightforward implementation [9]. miniKanren searches for a candidate program that satisfies the recursive constraints imposed by the input/output examples. Our model uses these constraints to score candidate programs and guide miniKanren’s search.
Neural guided search using constraints is promising for several reasons. First, while symbolic approaches outperform statistical methods, they have not demonstrated an ability to scale to larger problems; neural guidance may help navigate the exponentially growing search space. Second, symbolic systems exploit the compositionality of synthesis problems: miniKanren’s constraints select portions of the input/output examples relevant to a subproblem, akin to having a symbolic attention mechanism. Third, constraint lengths are relatively stable even as we synthesize more complex programs; our approach should be able to generalize to programs larger than those seen in training.
To summarize, we contribute a novel form of neural guided synthesis, where we use a symbolic system’s internal representations to solve an auxiliary problem of constraint scoring using neural embeddings. We explore two models for scoring constraints: Recurrent Neural Network (RNN) and Graph Neural Network (GNN) [10]. We also present a “transparent” version of miniKanren with visibility into its internal constraints, available at https://github.com/xuexue/neuralkanren.
Our experiments focus on synthesizing programs in a subset of Lisp, and show that scoring constraints helps. More importantly, we test the generalizability of our approach on three families of synthesis problems. We compare against state-of-the-art systems λ² [11], Escher [12], Myth [13], and RobustFill [4]. We show that our approach has the potential to generalize to larger problems.
Programming by example (PBE) problems have a long history dating to the 1970s [14, 15]. Along the lines of early works in program synthesis, the programming languages community developed search techniques that enumerate possible programs, with pruning strategies based on types, consistency, and logical reasoning to improve the search. Several state-of-the-art methods are described in Table 1.
| Method | Direction | Search Strategy | Type Discipline |
|---|---|---|---|
| miniKanren [8, 16] | Top-down | Biased-Interleaving | Dynamic |
| λ² [11] | Top-down | Template Complexity | Static |
| Escher [12] | Bottom-up | Forward Search / Conditional Inference | Static |
| Myth [13] | Top-down | Iterative Deepening | Static |
The method λ² [11] is most similar to miniKanren, but specializes in numeric, statically-typed inputs and outputs. Escher [12] is built as an active learner, and relies on the presence of an oracle to supply outputs for new inputs that it chooses. Myth [13] searches for the smallest program satisfying a set of examples, and guarantees parsimony. These methods all use functional languages based on the λ-calculus as their target language, and aim to synthesize general, recursive functions.
Contributions by the machine learning community have grown in the last few years. Interestingly, while PBE problems can be thought of as a meta-learning problem, few works explore this relationship. Each synthesis problem can be thought of as a learning problem [17], so learning the synthesizer can be thought of as meta-learning. Instead, works generally fall under direct synthesis, differentiable programming, and neural guided synthesis.
In direct synthesis, the program is produced directly as a sequence or tree. One domain where this has been successful is string manipulation as applied to spreadsheet completion, as in FlashFill [18] and its descendants [5, 4, 19]. FlashFill [18] uses a combination of search and carefully crafted heuristics. Later works like [5] introduce a “Recursive-Reverse-Recursive Neural Network” to generate a program tree conditioned on input/output embeddings. More recently, RobustFill [4] uses bidirectional Long Short-Term Memory (LSTM) with attention to generate programs as sequences. Despite flattening the tree structure, RobustFill achieved much better results (92% vs 38%) on the FlashFill benchmark. While these approaches succeed in the practical domain of string manipulation, we are interested in exploring manipulations of richer data structures.
Differentiable programming involves building a differentiable interpreter, then backpropagating through the interpreter to learn a latent program. The goal is to infer correct outputs for new inputs. Work in differentiable programming began with the Neural Turing Machine [3], a neural architecture that augments neural networks with external memory and attention. Neural Programmer [1] and Neural Programmer-Interpreter [2] extend the work with reusable operations, and build programs compositionally. While differentiable approaches are appealing, [20] showed that this approach still underperforms discrete search-based techniques.
A recent line of work uses statistical techniques to guide a discrete search. For example, DeepCoder [6] uses an encoding of the input/output examples to predict functions that are likely to appear in the program, to prioritize programs containing those functions. More recently, [7] uses an LSTM to guide the symbolic search system PROSE (Microsoft Program Synthesis using Examples). The search uses a “branch and bound” technique. The neural model learns the choices that maximize the bounding function introduced in [18] and used for FlashFill problems. These approaches attempt to be search system agnostic, whereas we integrate deeply with one symbolic approach, taking advantage of its internal representation and compositional reasoning.
Other work in related domains shares similarities with our contribution. For example, [21] uses a constraint-based solver to sample terms in order to complete a program sketch, but is not concerned with synthesizing entire programs. Further, [22] implements differentiable logic programming to do fuzzy reasoning and induce soft inference rules. They use Prolog’s depth-first search as-is and learn constraint validation (approximate unification), whereas we learn the search strategy and use miniKanren’s constraint validation as-is.
This section describes the constraint logic programming language miniKanren and its use for program synthesis. Figure 1 summarizes the relationship between miniKanren and the neural agent.
The constraint logic programming language miniKanren uses the relational programming paradigm, where programmers write relations instead of functions. Relations are a generalization of functions: a function f with n parameters can be expressed as a relation R with n+1 parameters, e.g., (f x) = y implies (R x y). The notation (R x y) means that x and y are related by R.
In miniKanren queries, data flow is not directionally biased: any input to a relation can be unknown. For example, a query (R X y) where y is known and X is an unknown, called a logic variable, finds values X where X and y are related by R. In other words, given R and f defined as before, the query finds inputs X to f such that (f X) = y. This property allows the relational translation of a function to run computations in reverse [16]. We refer to such uses of relations containing logic variables as constraints.
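As a rough illustration of running a relation in either direction, the following Python sketch materializes a relation as a finite set of pairs and answers queries with either argument unknown. This is not how miniKanren works internally (miniKanren searches rather than enumerating a table); the names `make_relation` and `query`, and the use of `None` to mark an unknown, are assumptions made only for this example.

```python
def make_relation(f, domain):
    """Build the relation R = {(x, y) : f(x) = y} over a finite domain."""
    return {(x, f(x)) for x in domain}


def query(relation, x=None, y=None):
    """Return all (x, y) pairs consistent with the known arguments.

    None marks an argument as unknown, standing in for a logic variable.
    """
    return sorted(
        (a, b)
        for (a, b) in relation
        if (x is None or a == x) and (y is None or b == y)
    )


square = make_relation(lambda v: v * v, range(-3, 4))

forward = query(square, x=2)   # known input, unknown output
backward = query(square, y=4)  # unknown input, known output

print(forward)   # [(2, 4)]
print(backward)  # [(-2, 4), (2, 4)]
```

The backward query returns two answers, which mirrors the point that a relation, unlike a function, need not be deterministic in every direction.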
Figure 2 (partial unfolding of (evalo P I O)):
(evalo P I O)
  disj
    (evalo (quote A) I O)
    (evalo (car B) I O)
    (evalo (cdr C) I O)
    (evalo (cons D E) I O)
    (evalo (var F) I O)
    …
In this work, we are interested in using a relational form evalo of an interpreter eval to perform program synthesis. (In miniKanren convention, a relation is named after the corresponding function, with an ‘o’ at the end; Appendix A provides the definition of evalo used in our experiments.) In the functional computation (eval P I) = O, program P and input I are known, and the output O is the result to be computed. The same computation can be expressed relationally with (evalo P I O) where P and I are known and O is an unknown. We can also synthesize programs from inputs and outputs, expressed relationally with (evalo P I O) where P is unknown while I and O are known. While ordinary evaluation is deterministic, there may be many valid programs P for any pair of I and O. Multiple uses of evalo, involving the same P but different pairs I and O, can be combined in a conjunction, further constraining P. This is how PBE tasks are encoded using an implementation of evalo for the target synthesis language.
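To make the encoding concrete, here is a minimal Python sketch that treats synthesis as finding a program P consistent with a conjunction of input/output pairs. It uses brute-force enumeration by program size rather than miniKanren's constraint solving, and it covers only a simplified language (a single input variable, quote, car, cdr, cons, with no lambda or application). The tuple encoding of programs and pairs and the function names are assumptions made for this illustration.

```python
NIL = ()  # the empty list; a pair is encoded as a 2-tuple (head, tail)


def evaluate(prog, inp):
    """Evaluate a program on one input; return None if car/cdr hits a non-pair."""
    op = prog[0]
    if op == "var":
        return inp
    if op == "quote":
        return prog[1]
    if op in ("car", "cdr"):
        value = evaluate(prog[1], inp)
        if not (isinstance(value, tuple) and len(value) == 2):
            return None
        return value[0] if op == "car" else value[1]
    if op == "cons":
        head, tail = evaluate(prog[1], inp), evaluate(prog[2], inp)
        return None if head is None or tail is None else (head, tail)
    raise ValueError(f"unknown operator: {op}")


def programs(size, constants):
    """Yield every program with exactly `size` nodes."""
    if size == 1:
        yield ("var",)
        for c in constants:
            yield ("quote", c)
        return
    for sub in programs(size - 1, constants):
        yield ("car", sub)
        yield ("cdr", sub)
    for left in range(1, size - 1):
        for head in programs(left, constants):
            for tail in programs(size - 1 - left, constants):
                yield ("cons", head, tail)


def synthesize(examples, constants=("x",), max_size=6):
    """Return the first program, smallest first, that satisfies every example."""
    for size in range(1, max_size + 1):
        for prog in programs(size, constants):
            # Each example plays the role of one evalo constraint on the same P.
            if all(evaluate(prog, i) == o for i, o in examples):
                return prog
    return None


# Two of the input/output pairs listed for this program in Appendix B:
# (a) -> (a . x) and (#t . s) -> (#t . x)
examples = [(("a", NIL), ("a", "x")), ((True, "s"), (True, "x"))]
found = synthesize(examples)
print(found)  # ('cons', ('car', ('var',)), ('quote', 'x'))
```

Exhaustive enumeration of this kind grows exponentially with program size, which is the scaling problem that motivates guiding the search.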
A miniKanren program internally represents a query as a constraint tree built out of conjunctions, disjunctions, and calls to relations (constraints). A relation like evalo is recursive, that is, defined in terms of invocations of other constraints including itself. Search involves unfolding a recursive constraint by replacing the constraint with its definition in terms of other constraints. For example, in a Lisp interpreter, a program P can be a constant, a function call, or another expression. Unfolding reveals these possibilities as clauses of a disjunction that replaces evalo. Figure 2 shows a partial unfolding of (evalo P I O).
As we unfold more nodes, branches of the constraint tree constrain P to be more specific. We call a partial specification of P a “candidate” partial program. If at some point we find a fully specified P that satisfies all relevant constraints, then P is a solution to the PBE problem.
In Figure 3, we show portions of the constraint tree representing a PBE problem with two input/output pairs. Each of the gray boxes corresponds to a separate disjunct in the constraint tree, representing a candidate. Each disjunct is a conjunction of constraints, shown one on each line. A candidate is viable only if the entire conjunction can be satisfied. In the left column (a) certain “obviously” failing candidates like (quote M) are omitted from consideration. The right column (c) also shows the unfolding of the selected disjunct for (cons D E), where D is replaced by its possible values.
By default, miniKanren uses a biased interleaving search [16], alternating between disjuncts to unfold. The alternation is “biased” towards disjuncts that have more of their constraints already satisfied. This search is complete: if a solution exists, it will eventually be found, time and memory permitting.
Typical implementations of miniKanren represent constraint trees as “goals” [16] built from opaque, suspended computations. These suspensions entangle both constraint simplification and the implicit search policy, making it difficult to inspect a constraint tree and experiment with alternative search policies.
One of our contributions is a miniKanren implementation that represents the constraint tree as a transparent data structure. It provides an interface for choosing the next disjunct to unfold, making it possible to define custom search policies driven by external agents. Our implementation is available at https://github.com/xuexue/neuralkanren.
Like the standard miniKanren, this transparent version is implemented in Scheme. To interface with an external agent, we have implemented a Python interface that can drive the miniKanren process via stdin/stdout. Users start by submitting a query, then alternate between receiving constraint tree updates and choosing the next disjunct to unfold.
We present our neural guided synthesis approach summarized in Figure 3. To begin, miniKanren represents the PBE problem in terms of a disjunction of candidate partial programs, and the constraints that must be satisfied for the partial program to be consistent with the examples. A machine learning agent makes discrete choices amongst the possible candidates. The symbolic system then expands the chosen candidate, adding expansions of the candidate to the list of partial programs.
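The alternation between the agent and the symbolic system can be sketched as a simple loop. The Python below is a toy stand-in, not the interface of the transparent miniKanren: candidates are restricted to car/cdr accessor paths, `score` is any callable playing the role of the learned model, and the expansion and pruning rules are assumptions chosen to keep the example small.

```python
NIL = ()  # the empty list; a pair is encoded as a 2-tuple (head, tail)


def apply_path(path, value):
    """Apply a sequence of car/cdr steps to a value; None signals failure."""
    for step in path:
        if not (isinstance(value, tuple) and len(value) == 2):
            return None
        value = value[0] if step == "car" else value[1]
    return value


def guided_search(examples, score, max_steps=200):
    """Repeatedly let `score` pick a candidate, then expand the chosen one."""
    candidates = [()]  # start from the empty, unconstrained candidate
    for step in range(1, max_steps + 1):
        best = max(candidates, key=score)  # the agent's discrete choice
        candidates.remove(best)
        for child in (best + ("car",), best + ("cdr",)):  # symbolic expansion
            outputs = [apply_path(child, i) for i, _ in examples]
            if all(out == o for out, (_, o) in zip(outputs, examples)):
                return child, step
            if all(out is not None for out in outputs):
                candidates.append(child)  # keep only candidates that can still succeed
        if not candidates:
            return None, step
    return None, max_steps


# Input (a b c), output c. The placeholder score prefers shorter candidates.
examples = [(("a", ("b", ("c", NIL))), "c")]
path, steps = guided_search(examples, score=lambda p: -len(p))
print(path)  # ('cdr', 'cdr', 'car')
```

With the placeholder score the loop behaves like breadth-first search; a learned scorer would replace the lambda and aims to reach the solution in fewer steps.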
The machine learning model follows these steps:
Score each constraint. Each constraint embedding is scored independently, using a multilayer perceptron (MLP).
Pool scores together. We pool constraint scores for each candidate. We pool hierarchically using the structure of the constraint tree, max-pooling along a disjunction and average-pooling along a conjunction. We find that using average-pooling instead of min-pooling helps gradient flow. In Figure 3 there are no internal disjunctions.
Choose a candidate. We use a softmax distribution over candidates during training and choose greedily during test.
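The pooling and choice steps above can be sketched in a few lines of Python. The constraint scores here are made-up numbers standing in for MLP outputs, and the nested-tuple encoding of the constraint tree is an assumption for illustration only.

```python
import math


def pool(node):
    """Pool scores over a constraint tree.

    A node is either a number (the score of one constraint) or a tuple
    ("conj", children) / ("disj", children).
    """
    if isinstance(node, (int, float)):
        return float(node)
    kind, children = node
    scores = [pool(child) for child in children]
    if kind == "disj":
        return max(scores)            # max-pooling along a disjunction
    return sum(scores) / len(scores)  # average-pooling along a conjunction


def softmax(values):
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Three candidates, each a conjunction of constraint scores (made-up values).
candidates = [
    ("conj", [2.0, 0.0]),
    ("conj", [1.0, ("disj", [-1.0, 3.0])]),
    ("conj", [-1.0, -1.0]),
]

pooled = [pool(c) for c in candidates]
probs = softmax(pooled)  # distribution used for sampling during training
greedy = max(range(len(pooled)), key=pooled.__getitem__)  # test-time choice

print(pooled)  # [1.0, 2.0, -1.0]
print(greedy)  # 1
```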
Intuitively, the pooled score for each candidate represents the plausibility of constraints associated with a candidate partial program being satisfied. So in some sense we are learning a neural constraint satisfaction system in order to solve synthesis problems.
One way to embed the constraints is using an RNN operating on each constraint as a sequence. We use an RNN with bidirectional LSTM units [23] to score constraints, with each constraint separately tokenized and embedded. The tokenization process removes identifying information of logic variables, and treats all logic variables as the same token. While logic variable identity is important, since each constraint is embedded and scored separately, the logic variable identity is lost.
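The effect of this tokenization can be illustrated with a short Python sketch. It assumes constraints arrive as s-expression strings and that logic variables are printed in the `_.N` style used by common miniKanren implementations; both the input format and the `<VAR>` token name are assumptions for this example rather than details of our implementation.

```python
import re

TOKEN = re.compile(r"[()]|[^\s()]+")   # parentheses, or runs of other characters
LOGIC_VAR = re.compile(r"_\.\d+")      # assumed printed form of a logic variable


def tokenize(constraint):
    """Split a constraint into tokens, mapping every logic variable to <VAR>."""
    return [
        "<VAR>" if LOGIC_VAR.fullmatch(token) else token
        for token in TOKEN.findall(constraint)
    ]


tokens = tokenize("(evalo (cons _.0 _.1) (a . b) (a . _.2))")
print(tokens.count("<VAR>"))  # 3

# Constraints that differ only in variable identity become indistinguishable.
same = tokenize("(evalo _.0 x y)") == tokenize("(evalo _.7 x y)")
print(same)  # True
```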
We learn separate RNN weights for each relation (evalo, lookupo, etc). The particular set of constraint types differs depending on the target synthesis language.
In the RNN model, we lose considerable information by removing the identity of logic variables. Two constraints associated with a logic variable may independently be satisfiable, but may be obviously unsatisfiable together.
To address this, we use a GNN model that embeds all constraints simultaneously. The use of graph or tree structure to represent programs [24, 25] and constraints [26] is not unprecedented. An example graph structure is shown in Figure 4. Each constraint is represented as a tree, but since logic variable leaf nodes may be shared by multiple constraints, the constraint graph is in general a Directed Acyclic Graph (DAG). We do not include the constraint tree structure (disjunctions and conjunctions) in the graph structure since they are handled during pooling.
The specific type of GNN model we use is a Gated Graph Neural Network (GGNN) [27]. Each node has an initial embedding, which is refined through message passing along the edges. The final root node embedding of each constraint is taken to be the embedding representation of the constraint. Since the graph structure is a DAG, we use a synchronous message schedule for message passing.
One difference between our algorithm and a typical GGNN is the use of different node types. Each token in the constraint tree (e.g. evalo, cons, logic variable) has its own aggregation function and Gated Recurrent Unit weights. Further, the edge types will also follow the node type of the parent node. Most node types will have asymmetric children, so the edge type will also depend on the position of the child.
To summarize, the GNN model has the following steps:
Initialization of each node, depending on the node type and label. The initial embeddings are learned parameters of the model.
Upward Pass, which is ordered leaf-to-root, so that a node receives all messages from its children and updates its embedding before sending a message to its parents. Since a non-leaf node always has a fixed number of children, the merge function is parameterized as a multilayer perceptron (MLP) with a fixed size input.
Downward Pass, which is ordered root-to-leaf, so that a node receives all messages from its parents and updates its embedding before sending a message to its children. Nodes that are not logic variables will only have one parent, so no merge function is required. Constant embeddings are never updated. Logic variables can have multiple parents, so an average pooling is used as a merge function.
Repeat. The number of upward/downward passes is a hyperparameter. We end on an upward pass so that logic variable updates are reflected in the root node embeddings.
We extract the final embedding of the constraint root nodes for scoring, pooling, and choosing.
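The message schedule can be sketched in plain Python. This is only a structural illustration: plain averaging stands in for the learned MLP merge and the Gated Recurrent Unit updates, the initial embeddings are fixed made-up vectors rather than learned parameters, and the dictionary encoding of the graph is an assumption. It shows the ordering of the passes and how a logic variable shared by two constraints lets one constraint influence the root embedding of the other.

```python
def mean(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def leaf_to_root(nodes, roots):
    """Post-order traversal: every node appears after all of its children."""
    seen, order = set(), []

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for child in nodes[name]["children"]:
            visit(child)
        order.append(name)

    for root in roots:
        visit(root)
    return order


def embed(nodes, roots, init, rounds=1):
    """Run up, then `rounds` x (down, up); return the root embeddings."""
    emb = {name: list(init[spec["label"]]) for name, spec in nodes.items()}
    parents = {name: [] for name in nodes}
    for name, spec in nodes.items():
        for child in spec["children"]:
            parents[child].append(name)
    order = leaf_to_root(nodes, roots)

    def upward():
        for name in order:
            kids = nodes[name]["children"]
            if kids:  # stand-in for the learned merge and update
                emb[name] = mean([emb[name]] + [emb[k] for k in kids])

    def downward():
        for name in reversed(order):
            if nodes[name]["kind"] == "const" or not parents[name]:
                continue  # constants are never updated; roots have no parents
            message = mean([emb[p] for p in parents[name]])  # average over parents
            emb[name] = mean([emb[name], message])

    upward()
    for _ in range(rounds):
        downward()
        upward()  # end on an upward pass
    return {root: emb[root] for root in roots}


INIT = {"evalo": [1.0, 0.0], "var": [0.0, 0.0], "a": [0.0, 1.0], "b": [0.0, -1.0]}
shared = {
    "r1": {"kind": "rel", "label": "evalo", "children": ["v", "ca"]},
    "r2": {"kind": "rel", "label": "evalo", "children": ["v", "cb"]},
    "v": {"kind": "var", "label": "var", "children": []},
    "ca": {"kind": "const", "label": "a", "children": []},
    "cb": {"kind": "const", "label": "b", "children": []},
}
alone = {name: shared[name] for name in ("r1", "v", "ca")}

with_neighbor = embed(shared, ["r1", "r2"], INIT)["r1"]
by_itself = embed(alone, ["r1"], INIT)["r1"]
print(with_neighbor != by_itself)  # True
```

Because the variable node `v` is shared, the embedding of the first constraint changes when the second constraint is present, which is the information the RNN model cannot see.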
We note the similarity in the setup to a Reinforcement Learning problem. The candidates can be thought of as possible actions, the ML model as the policy, and miniKanren as the non-differentiable environment which produces the states or constraints. However, during training we have access to the ground-truth optimal action at each step, and therefore use a supervised cross-entropy loss.
We do use other techniques from the Reinforcement Learning literature. We use curriculum learning, beginning with simpler training problems. We generate training states by using the current model parameters to make action choices at least some of the time. We use scheduled sampling [28] with a linear schedule, to increase exploration and reduce teacher-forcing as training progresses. We use prioritized experience replay [29] to reduce correlation in a minibatch, and resample more difficult states. To prevent an exploring agent from becoming “stuck”, we abort episodes after 20 consecutive incorrect choices. For optimization we use RMSProp [30], with weight decay for regularization.
Importantly, we choose to expand two candidates per step during training, instead of the single candidate as described earlier. We find that expanding two candidates during training allows a better balance of exploration / exploitation during training, leading to a more robust model. During test time, we resume expanding one candidate per step, and use a greedy policy.
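The scheduled sampling and episode-abort rules can be sketched as follows. The schedule endpoints (0.0 and 1.0) and the simplification that model and optimal actions are given as precomputed lists are assumptions for this illustration; in the actual setup the next state depends on the action taken.

```python
import random


def model_choice_probability(progress, start=0.0, end=1.0):
    """Linear schedule for the probability of following the model's own choice.

    `progress` is the fraction of training completed, in [0, 1].
    The endpoints `start` and `end` are assumed values.
    """
    progress = min(max(progress, 0.0), 1.0)
    return start + (end - start) * progress


def roll_out(model_actions, optimal_actions, p_model, rng, max_wrong=20):
    """Collect (action taken, supervised label) pairs for one episode."""
    taken, wrong_streak = [], 0
    for model_action, optimal_action in zip(model_actions, optimal_actions):
        use_model = rng.random() < p_model
        action = model_action if use_model else optimal_action
        taken.append((action, optimal_action))  # the label is always the optimal action
        wrong_streak = wrong_streak + 1 if action != optimal_action else 0
        if wrong_streak >= max_wrong:
            break  # abort an episode that appears to be stuck
    return taken


rng = random.Random(0)
# A model that is always wrong, followed with probability 1, is cut off after 20 steps.
stuck = roll_out(["x"] * 50, ["y"] * 50, p_model=1.0, rng=rng)
# With probability 0 the roll-out is pure teacher forcing.
forced = roll_out(["x"] * 50, ["y"] * 50, p_model=0.0, rng=rng)
print(len(stuck), len(forced))  # 20 50
```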
Following the programming languages community, we focus on tree manipulation as a natural starting point towards expressive computation. We use a small subset of Lisp as our target language. This subset consists of cons, car, cdr, along with several constants and function application. The full grammar is shown in Figure 5.
We present two experiments. First, we test on programmatically generated synthesis problems held out from training. We compare two miniKanren search strategies that do not use a neural guide, three of our neural-guided models, and RobustFill with a generous beam size. Then, we test the generalizability of these approaches on three families of synthesis problems. In this second set of experiments we additionally compare against state-of-the-art systems λ², Escher, and Myth. All test experiments are run on an Intel i7-6700 3.40GHz CPU with 16GB RAM.
We programmatically generate training data by querying (evalo P I O) in miniKanren, where the program, inputs, and outputs are all unknown. We put several other restrictions on the inputs and outputs so that the examples are sufficiently expressive. When input/output expressions contain constants, we choose random constants to ensure variety. We use 500 generated problems for training, each with 5 input/output examples. In this section, we report results on 100 generated test problems. We report results for several symbolic and neural guided models. Sample generated problems are included in Appendix B.
We compare two variants of symbolic methods that use miniKanren. The “Naive” model uses biased-interleaving search, as described in [31]. The “+ Heuristic” model uses additional hand-tuned heuristics described in [16]. The neural guided models include the RNN+Constraints guided search described in Section 4.1 and the GNN+Constraints guided search in Section 4.2. The RNN model uses 2-layer bidirectional LSTMs with embedding size of 128. The GNN model uses a single up/down/up pass with embedding size 64 and message size 128. Increasing the number of passes did not yield improvements. Further, we compare against a baseline RNN model that does not take constraints as input: instead, it computes embeddings of the input, output, and the candidate partial program using an LSTM, then scores the concatenated embeddings using an MLP. This baseline model also uses 2-layer bidirectional LSTMs with embedding size of 128. All models use a 2-layer neural network with ReLU activation as the scoring function.
Table 2 reports the percentage of problems solved within 200 steps. The maximum time the RNN-Guided search used was 11 minutes, so we allow the symbolic models up to 30 minutes. The GNN-Guided search is significantly more computationally expensive, and the RNN baseline model (without constraints) is comparable to the RNN-Guided models (with constraints as inputs).
| Method | Percent Solved | Average Steps |
|---|---|---|
| Naive [31] | 27% | N/A |
| +Heuristics (Barliman) [16] | 82% | N/A |
| RNN-Guided (No Constraints) | 93% | 46.7 |
| GNN-Guided + Constraints | 88% | 44.5 |
| RNN-Guided + Constraints | 99% | 37.0 |
| RobustFill [4] beam 1000+ | 100% | N/A |
All three neural guided models performed better than symbolic methods in our tests, with the RNN+Constraints model solving all but one problem. The RNN model without constraints also performed reasonably, but took more steps on average than other models. RobustFill [4] Attention-C with large beam size solves one more problem than RNN+Constraints on a flattened representation of these problems. Exploration of beam size is in Appendix D. We defer comparison with other symbolic systems because problems in this section involve dynamically-typed, improper list construction.
In this experiment, we explore generalizability. We use the same model weights as above to synthesize three families of programs of varying complexity: Repeat(N) which repeats a token N times, DropLast(N) which drops the last element in an N element list, and BringToFront(N) which brings the last element to the front in an N element list. As a measure of how synthesis difficulty increases with N, Repeat(N) takes […] steps, DropLast(N) takes […] steps, and BringToFront(N) takes […] steps. The largest training program takes optimally 22 steps to synthesize. The number of optimal steps in synthesis correlates linearly with program size.
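The intended input/output behavior of the three families can be written down as reference functions. The Python below uses ordinary lists and is only a sketch of the semantics; the exact encoding of inputs and outputs used in the experiments, and the helper name `make_examples`, are assumptions for this illustration.

```python
def repeat(token, n):
    """Repeat(N): a list containing the token N times."""
    return [token] * n


def drop_last(items):
    """DropLast(N): the N element list without its last element."""
    return items[:-1]


def bring_to_front(items):
    """BringToFront(N): the last element moved to the front."""
    return items[-1:] + items[:-1]


def make_examples(fn, inputs):
    """Pair each input with the output the reference function produces."""
    return [(value, fn(value)) for value in inputs]


print(repeat("a", 3))                # ['a', 'a', 'a']
print(drop_last([1, 2, 3]))          # [1, 2]
print(bring_to_front([1, 2, 3]))     # [3, 1, 2]
print(make_examples(drop_last, [["a", "b", "c"], ["x", "y", "z"]]))
```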
We compare against state-of-the-art systems λ², Escher, and Myth. It is difficult to compare our models against other systems fairly, since these symbolic systems use type information, which provides an advantage. Further, λ² assumes advanced language constructs like fold that other methods do not. Escher is built as an active learner, and requires an “oracle” to provide outputs for additional inputs. We do not enable this functionality of Escher, and limit the number of input/output examples to 5 for all methods. We allow every method up to 30 minutes. We also compare against RobustFill Attention-C with a beam size of 5000, the largest beam size supported by our test hardware. Our model is further restricted to 200 steps for consistency with Section 5.1.
Note that if given the full 30 minutes, the RNN+Constraints model is able to synthesize DropLast(7) and BringToFront(6), and the GNN+Constraints model is also able to synthesize DropLast(7). Myth solves Repeat(N) much faster than our model, taking less than 15ms per problem, but fails on DropLast and BringToFront. Results are shown in Table 3.
In summary, the RNN+Constraints and GNN+Constraints models both solve problems much larger than those seen in training. The results suggest that using constraints helps generalization: though RobustFill performs best in Section 5.1, it does not generalize to larger problems out of distribution; though RNN+Constraints and RNN-without-constraints perform comparably in Section 5.1, the former shows better generalizability. This is consistent with the observation that as program sizes grow, the corresponding constraints grow more slowly.
| Method | Repeat(N) | DropLast(N) | BringToFront(N) |
|---|---|---|---|
| Naive [31] | 6 (time) | 2 (time) | (time) |
| +Heuristics [16] | 11 (time) | 3 (time) | (time) |
| RNN-Guided + Constraints | 20+ | 6 (time) | 5 (time) |
| GNN-Guided + Constraints | 20+ | 6 (time) | 6 (time) |
| RNN-Guided (no constraints) | 9 (time) | 3 (time) | 2 (time) |
| λ² [11] | 4 (memory) | 3 (error) | 3 (error) |
| Escher [12] | 10 (error) | 1 (oracle) | (oracle) |
| Myth [13] | 20+ | (error) | (error) |
| RobustFill [4] beam 1000 | 1 | 1 | (error) |
| RobustFill [4] beam 5000 | 3 | 1 | (error) |
We have built a neural guided synthesis model that works directly with miniKanren’s constraint representations, and a transparent implementation of miniKanren available at https://github.com/xuexue/neuralkanren. We have demonstrated the success of our approach on challenging tree manipulation and, more importantly, generalization tasks. These results indicate that our approach is a promising stepping stone towards more general computation.
Research reported in this publication was supported in part by the Natural Sciences and Engineering Research Council of Canada, and the National Center For Advancing Translational Sciences of the National Institutes of Health under Award Number OT2TR002517. R.L. was supported by Connaught International Scholarship. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
Sampling for Bayesian program learning. In Advances in Neural Information Processing Systems, pages 1297–1305, 2016.
We include below the code for the relational interpreter, written in miniKanren. For readability by a machine learning audience, our main paper renames the inputs to the relational interpreter: expr or expression is called P or program in the main paper, env or environment is called I or input, and value is called O or output.
    (define-relation (evalo expr env value)
      (conde  ;; conde creates a disjunction
        ((fresh (body)  ;; fresh creates new variables and a conjunction
           (== `(lambda ,body) expr)  ;; expr is a lambda definition
           (== `(closure ,body ,env) value)))
        ((== `(quote ,value) expr))  ;; expr is a literal constant
        ((fresh (a*)
           (== `(list . ,a*) expr)  ;; expr is a list construction
           (eval-listo a* env value)))
        ((fresh (index)
           (== `(var ,index) expr)  ;; expr is a variable
           (lookupo index env value)))
        ((fresh (rator rand arg env^ body)
           (== `(app ,rator ,rand) expr)  ;; expr is a function application
           (evalo rator env `(closure ,body ,env^))
           (evalo rand env arg)
           (evalo body `(,arg . ,env^) value)))
        ((fresh (a d va vd)
           (== `(cons ,a ,d) expr)  ;; expr is a cons operation
           (== `(,va . ,vd) value)
           (evalo a env va)
           (evalo d env vd)))
        ((fresh (c vd)
           (== `(car ,c) expr)  ;; expr is a car operation
           (evalo c env `(,value . ,vd))))
        ((fresh (c va)
           (== `(cdr ,c) expr)  ;; expr is a cdr operation
           (evalo c env `(,va . ,value))))))
Some examples of automatically generated problems are shown in Table A1. Variables in a function body are encoded using de Bruijn indices, so that (var ()) is looking up the 0th (and only) variable. The symbol . denotes a pair.
Program: (lambda (car (car (var ()))))
| Input | Output |
|---|---|
| ((b . #t)) | b |
| ((() . b) . a) | () |
| ((a . s) . 1) | a |
| (((y . 1)) . 1) | (y . 1) |
| ((b)) | b |

Program: (lambda (cons (car (var ())) (quote x)))
| Input | Output |
|---|---|
| (a) | (a . x) |
| (#t . s) | (#t . x) |
| ((1 . y) . y) | ((1 . y) . x) |
| ((y 1 . s) . 1) | ((y 1 . s) . x) |
| (((x . x)) . y) | (((x . x)) . x) |

Program: (lambda (quote x))
| Input | Output |
|---|---|
| y | x |
| () | x |
| #t | x |
| a | x |
| b | x |

Program: (lambda (cons (car (var ())) (car (car (cdr (var ()))))))
| Input | Output |
|---|---|
| (y (y . b) . y) | (y . y) |
| (x (1 . 1)) | (x . 1) |
| (x ((y . a) . x) . a) | (x y . a) |
| ((#f . #t) (#f . a) . 1) | ((#f . #t) . #f) |
| (a ((y #f . #f) . 1) . a) | (a y #f . #f) |

Program: (lambda (car (cdr (car (car (cdr (cdr (cdr (var ())))))))))
| Input | Output |
|---|---|
| (#f a () ((#f b . 1) . y) . #t) | b |
| (x #t y ((() (#t . a) . s))) | (#t . a) |
| (x b s ((#f (s 1 . b) . y)) . s) | (s 1 . b) |
| (b () #f ((b ((x . #t) . x))) . a) | ((x . #t) . x) |
| (1 #t a ((s (1 #t s . a) . x) . #t) . #t) | (1 #t s . a) |
Table A2 lists problems on which the methods failed. The single problem that RNN + Constraints failed to solve is a fairly complex problem. The problems that the GNN + Constraints failed to solve all include a complex list accessor portion. This actually makes sense: it is conceivable for multi-layer RNNs to be better at this kind of problem compared to a single-layer GNN. The RNN without constraints also fails at complex list accessor problems.
Method  Problem 

RNN + Constraints  (lambda (cons (cons (var ()) (var ())) (cons (var ()) (car (cdr (var ())))))) 
GNN + Constraints  (lambda (car (car (car (car (cdr (cdr (car (var ()))))))))) 
(lambda (car (car (car (cdr (car (cdr (car (var ())))))))))  
(lambda (car (car (car (cdr (cdr (cdr (car (var ())))))))))  
(lambda (car (car (cdr (car (car (var ())))))))  
(lambda (car (car (cdr (car (cdr (cdr (car (var ())))))))))  
(lambda (car (car (cdr (cdr (cdr (cdr (car (var ())))))))))  
(lambda (car (cdr (car (car (cdr (var ())))))))  
(lambda (car (cdr (car (cdr (cdr (car (var ()))))))))  
(lambda (car (cdr (cdr (car (car (cdr (var ()))))))))  
(lambda (car (cdr (cdr (cdr (cdr (car (car (var ())))))))))  
(lambda (car (cdr (cdr (cdr (cdr (car (cdr (var ())))))))))  
(lambda (cdr (cdr (car (car (var ()))))))  
RNN (No Constraints)  (lambda (cons (car (var ())) (cons (var ()) (cdr (car (var ())))))) 
(lambda (cdr (car (car (cdr (car (cdr (var ()))))))))  
(lambda (cdr (car (cdr (car (car (car (var ()))))))))  
(lambda (cdr (car (car (car (car (car (var ()))))))))  
(lambda (cdr (car (car (cdr (car (cdr (var ()))))))))  
(lambda (cdr (car (cdr (car (car (car (var ()))))))))  
(lambda (cdr (car (car (car (car (car (var ())))))))) 
To compare against RobustFill, we use a flattened representation of the problems shown in Section B, and use the Attention-C model with various beam sizes. For a given beam size, if any of the top generated programs within the beam is correct, we consider the synthesis a success. We report several figures in Table A3: column (a) shows the percent of test problems held out from training that were successfully solved (Table 2 in our paper), and column (b) shows the largest N for a family of synthesis problems for which synthesis succeeds (Table 3 in our paper).
| Model | (a) Test: % Solved | (b) Generalization: Repeat(N) | (b) Generalization: DropLast(N) | (b) Generalization: BringToFront(N) |
|---|---|---|---|---|
| RobustFill, Beam Size 1 | 56% | 0 | 0 | 0 |
| RobustFill, Beam Size 10 | 94% | 0 | 0 | 0 |
| RobustFill, Beam Size 100 | 99% | 1 | 0 | 0 |
| RobustFill, Beam Size 1000 | 100% | 1 | 1 | 0 |
| RobustFill, Beam Size 5000 | 100% | 3 | 1 | 0 |
| RNN-Guided + Constraints (Ours) | 99% | 20+ | 6 | 5 |
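The beam success criterion used for the RobustFill rows can be stated as a one-line check. In this Python sketch, `ranked_programs` stands for the decoder's outputs in rank order and `is_correct` for a checker that runs a program against the input/output examples; both names, and the toy stand-ins used to exercise the function, are assumptions for illustration.

```python
def beam_success(ranked_programs, is_correct, beam_size):
    """Synthesis succeeds if any of the top `beam_size` programs is correct."""
    return any(is_correct(program) for program in ranked_programs[:beam_size])


# Toy stand-ins: programs are strings, and only "p3" is treated as correct.
ranked = ["p1", "p2", "p3", "p4"]


def is_correct(program):
    return program == "p3"


print(beam_success(ranked, is_correct, beam_size=1))  # False
print(beam_success(ranked, is_correct, beam_size=3))  # True
```

Under this criterion the success rate can only stay the same or increase as the beam grows, which matches the trend in column (a).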