1 Introduction
Our long-term goal is to build an intelligent tutoring system that helps students improve their programming skills. In our experience with introductory programming courses, students who are learning to program for the first time often struggle to solve programming problems on their own. Manually providing guidance does not scale to the increasingly large number of students. To make matters worse, we found that even instructors sometimes make mistakes, and shy students are reluctant to ask questions. Motivated by this experience, we aim to build an automatic system that helps students improve their skills without human teachers.
In this paper, we present a key component of the system, which automatically generates complete programs from students’ incomplete programs. The inputs of the algorithm are a partial program with constraints on variables and constants, and input-output examples that specify the program’s behavior. The output is a complete program whose behavior matches all of the given input-output examples.
The key novelty of our algorithm is to combine enumerative program synthesis with program analysis techniques. The basic algorithm enumerates candidate programs in increasing size until it finds a solution. On its own, however, this enumeration is too slow to be used interactively by students because the search space of programs is huge. Our key idea for accelerating it is to perform static analysis alongside the enumerative search, in order to “statically” identify and prune out interim programs that eventually fail to be a solution. We formalize our pruning technique and its safety property.
The experimental results show that our algorithm is remarkably effective at synthesizing introductory imperative programs. We have implemented the algorithm in a tool, Simpl, and evaluated its performance on 30 programming tasks used in introductory courses. With our pruning technique, Simpl solves each problem in 6.6 seconds on average. Without the pruning, the baseline algorithm, which already adopts well-known optimization techniques, takes 165.5 seconds on average (a 25x slowdown).
We summarize our contributions below:

We present a new algorithm for synthesizing imperative programs from examples. To our knowledge, our work is the first to combine enumerative program synthesis and static analysis technologies.

We demonstrate the effectiveness of our algorithm on 30 real programming problems used in introductory courses. The results show that our algorithm quickly solves the problems, including ones that most beginner-level students have a hard time solving.

We provide a tool, Simpl, which is publicly available and open-sourced (footnote 1: hidden for double-blind reviewing).
Figure 1(a), Problem 1:

    reverse(n){
      r := 0;
      while ( n > 0 ){
        x := n % 10;
        r := r * 10;
        r := r + x;
        n := n / 10;
      };
      return r;
    }

Figure 1(b), Problem 2:

    count(n,a){
      while ( n > 0 ){
        t := n % 10;
        a[t] := a[t] + 1;
        n := n / 10;
      };
      return a;
    }

Figure 1(c), Problem 3:

    sum(n){
      r := 0;
      while ( n > 0 ){
        t := n;
        while (t > 0){
          r := r + t;
          t := t - 1;
        };
        n := n - 1;
      };
      return r;
    }

Figure 1(d), Problem 4:

    abssum(a, len){
      r := 0;
      i := 0;
      while (i < len){
        if ( a[i] < 0 ) { r := r - a[i]; }
        else { r := r + a[i]; }
        i := i + 1;
      };
      return r;
    }
2 Showcase
In this section, we showcase Simpl with four programming problems that most beginners find difficult to solve. To use Simpl, students need to provide (1) a partial program, (2) a set of input-output examples, and (3) resources that Simpl can use. The resources consist of a set of integers, a set of integer-type variables, and a set of array-type variables. The goal of Simpl is to complete the partial program with respect to the input-output examples, using only the given resources.
Problem 1 (Reversing integer)
The first problem is to write a function that reverses a given integer. For example, given integer 12, the function should return 21. Suppose a partial program is given as
where ? denotes holes that need to be completed. Suppose further that Simpl is provided with a set of input-output examples, a set of integers, and a set of integer variables.
Given this problem, Simpl produces the solution in Figure 1(a) in 2.5 seconds. Note that Simpl finds out that the integer ‘1’ is unnecessary, and the final program does not contain it. Also, Simpl does not require sophisticated examples, so it can be easily used by inexperienced students.
Problem 2 (Counting)
The next problem is to write a function that counts the number of occurrences of each digit in an integer. The program takes an integer and an array as inputs, where each element of the array is initially 0. As output, the program returns that array, but now the array element at each index $i$ stores the number of $i$s that occur in the given integer. For example, when the integer 220 is given together with an all-zero array, the output array should store 1 at index 0, 0 at index 1, and 2 at index 2; 0 occurs once, 1 does not occur, and 2 occurs twice in ‘220’. Suppose the partial program is given as
with a set of examples, a set of integers, a set of integer variables, and an array variable.
For this problem, Simpl produces the program in Figure 1(b) in 0.2 seconds. Note that Simpl uses a minimal set of resources; i is not used though it is given as usable.
Problem 3 (Sum of sum)
The third problem is to compute $1 + (1+2) + \cdots + (1+2+\cdots+n)$ for a given integer $n$. Suppose the partial program
is given with a set of examples, a set of integers, and a set of integer-type variables.
Then, Simpl produces the program in Figure 1(c) in 37.6 seconds. Note that Simpl newly introduced a nested loop, which is absent in the partial program.
Problem 4 (Absolute sum)
The last problem is to sum the absolute values of all the elements in a given array. We provide the partial program:
where the goal is to complete the condition and bodies of the if-statement. Given a set of input-output examples, an integer, a set of integer variables, and an array variable, Simpl produces the program in Figure 1(d) in 12.1 seconds.
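The four programs of Figure 1 can be checked against the behavior described above by transliterating them. The following Python sketch is such a transliteration and not part of Simpl; it assumes that `/` in the language is integer division on non-negative integers, and the names `sum_of_sums` and `length` are chosen here only to avoid shadowing Python built-ins.

```python
def reverse(n):
    # Figure 1(a): reverse the decimal digits of n.
    r = 0
    while n > 0:
        x = n % 10
        r = r * 10
        r = r + x
        n = n // 10
    return r


def count(n, a):
    # Figure 1(b): a[d] counts how often digit d occurs in n.
    a = list(a)  # work on a copy so the caller's list is untouched
    while n > 0:
        t = n % 10
        a[t] = a[t] + 1
        n = n // 10
    return a


def sum_of_sums(n):
    # Figure 1(c): 1 + (1+2) + ... + (1+2+...+n), using a nested loop.
    r = 0
    while n > 0:
        t = n
        while t > 0:
            r = r + t
            t = t - 1
        n = n - 1
    return r


def abssum(a, length):
    # Figure 1(d): sum of the absolute values of a[0..length-1].
    r = 0
    i = 0
    while i < length:
        if a[i] < 0:
            r = r - a[i]
        else:
            r = r + a[i]
        i = i + 1
    return r


print(reverse(12))              # 21
print(count(220, [0, 0, 0]))    # [1, 0, 2]
print(sum_of_sums(3))           # 10
print(abssum([-1, 2, -3], 3))   # 6
```

The array length 3 in the `count` call and the inputs to `sum_of_sums` and `abssum` are illustrative choices, not the examples used in the paper.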
3 Problem Definition
Language We designed an imperative language that is small yet expressive enough to deal with various programming problems in introductory courses. The syntax of the language is defined by the following grammar:
An l-value is a variable or an array reference. An arithmetic expression is an integer constant, an l-value, or a binary operation. A boolean expression is a boolean constant, a binary relation, a negation, or a logical conjunction or disjunction. Commands include assignment, skip, sequence, conditional statement, and while-loop.
A program is a command together with an input variable and an output variable. The input and output variables can be of either integer or array type. For brevity of presentation, we assume that the program takes a single input, but our implementation supports multiple input variables as well.
An unusual feature of the language is that it allows incomplete programs to be written. Whenever the programmer is uncertain, any arithmetic expression, boolean expression, or command can be left as a hole. The goal of our synthesis algorithm is to automatically complete such partial programs.
The semantics of the language is defined for programs without holes. The set of program variables is partitioned into integer-type and array-type variables. A memory state is a partial function from variables to values. A value is either an integer or an array of integers. An array is a sequence of integers, such as the array of integers 1, 2, and 3. We also use standard notation for the length of an array and for the element of an array at a given index.
The semantics of the language is defined by the functions:
These functions give the semantics of arithmetic expressions, boolean expressions, and commands, respectively. Figure 2 presents the denotational semantics, where fix is a fixed point operator. Note that the semantics for holes is undefined.
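To make the three semantic functions concrete, the sketch below implements a small interpreter in Python. It is an illustration under assumptions, not the definition in Figure 2: the tuple-based AST encoding, the operator sets in `ARITH` and `REL`, and the helper names `seq` and `run` are all hypothetical, and Python's `//` and `%` agree with the intended integer operations only on non-negative operands. Holes have no case, so evaluating one raises an error, mirroring the fact that their semantics is undefined.

```python
import operator

# Operator tables; the concrete operator sets are an assumption of this sketch.
ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul,
         "/": operator.floordiv, "%": operator.mod}
REL = {"<": operator.lt, ">": operator.gt, "==": operator.eq}


def eval_a(a, m):
    """Arithmetic expressions: constants, l-values, binary operations."""
    tag = a[0]
    if tag == "num":
        return a[1]
    if tag == "var":
        return m[a[1]]
    if tag == "idx":  # array reference x[a]
        return m[a[1]][eval_a(a[2], m)]
    if tag == "bin":
        return ARITH[a[1]](eval_a(a[2], m), eval_a(a[3], m))
    raise ValueError(f"no semantics for {tag!r}")  # e.g. a hole


def eval_b(b, m):
    """Boolean expressions: constants, relations, negation, and/or."""
    tag = b[0]
    if tag == "bool":
        return b[1]
    if tag == "rel":
        return REL[b[1]](eval_a(b[2], m), eval_a(b[3], m))
    if tag == "not":
        return not eval_b(b[1], m)
    if tag == "and":
        return eval_b(b[1], m) and eval_b(b[2], m)
    if tag == "or":
        return eval_b(b[1], m) or eval_b(b[2], m)
    raise ValueError(f"no semantics for {tag!r}")


def exec_c(c, m):
    """Commands map a memory state (a dict) to a new memory state."""
    tag = c[0]
    if tag == "skip":
        return m
    if tag == "assign":
        target, value = c[1], eval_a(c[2], m)
        new_m = dict(m)
        if target[0] == "var":
            new_m[target[1]] = value
        else:  # "idx": update one cell of a copied array
            cells = list(m[target[1]])
            cells[eval_a(target[2], m)] = value
            new_m[target[1]] = cells
        return new_m
    if tag == "seq":
        return exec_c(c[2], exec_c(c[1], m))
    if tag == "if":
        return exec_c(c[2] if eval_b(c[1], m) else c[3], m)
    if tag == "while":
        while eval_b(c[1], m):
            m = exec_c(c[2], m)
        return m
    raise ValueError(f"no semantics for {tag!r}")


def seq(*commands):
    """Right-nest several commands into binary "seq" nodes."""
    program = commands[-1]
    for command in reversed(commands[:-1]):
        program = ("seq", command, program)
    return program


def run(command, input_var, output_var, value):
    """Run a command on one input and read off the output variable."""
    return exec_c(command, {input_var: value})[output_var]


n, r, x = ("var", "n"), ("var", "r"), ("var", "x")
REVERSE = seq(
    ("assign", r, ("num", 0)),
    ("while", ("rel", ">", n, ("num", 0)), seq(
        ("assign", x, ("bin", "%", n, ("num", 10))),
        ("assign", r, ("bin", "*", r, ("num", 10))),
        ("assign", r, ("bin", "+", r, x)),
        ("assign", n, ("bin", "/", n, ("num", 10))),
    )),
)
print(run(REVERSE, "n", "r", 12))  # 21
```

The `while` case uses a Python loop instead of an explicit fixed point operator; for terminating programs the two coincide.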
Synthesis Problem A synthesis task is defined by the five components:
The five components are an incomplete program with holes, a set of input-output examples, a set of integers, a set of integer-type variables, and a set of array-type variables. The goal of our synthesis algorithm is to produce a complete command without holes such that

it uses only constants and variables from the given sets of integers and variables, and

it is consistent with every input-output example, that is, running it on each example input produces the corresponding example output.
4 Synthesis Algorithm
In this section, we present our synthesis algorithm that combines enumerative search with static analysis.
4.1 Synthesis as State-Search
We first reduce the synthesis task to a state-search problem. Given a synthesis task, the corresponding search problem is defined by a transition system that consists of a set of states, a transition relation, an initial state, and a set of solution states.

States: A state is a command possibly with holes, which is defined by the grammar in Section 3.

Initial state: The initial state is the partial command given in the synthesis task.

Transition relation: The transition relation determines the states that are immediately reachable from a state. The relation is defined as a set of inference rules in Figure 3. Intuitively, a hole can be replaced by an arbitrary expression (or command) of the same type. Given a state, its next states are all the states that are immediately reachable from it. Terminal states are states with no holes.

Solution states: A state is a solution iff it is a terminal state and it is consistent with all input-output examples.
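The transition relation can be illustrated on a much smaller language than the one in Section 3. The Python sketch below handles only arithmetic expressions with holes; the tuple encoding, the two operators, and the choice to always expand the leftmost hole are assumptions of this sketch and not the inference rules of Figure 3. Expanding only the leftmost hole still reaches every terminal state, because each hole is eventually filled.

```python
HOLE = ("hole",)
OPERATORS = ("+", "*")  # hypothetical operator set for this sketch


def is_terminal(e):
    """A state is terminal when it contains no holes."""
    if e == HOLE:
        return False
    if e[0] == "bin":
        return is_terminal(e[2]) and is_terminal(e[3])
    return True  # constants and variables


def next_states(e, ints, variables):
    """All states obtained by filling the leftmost hole of e in one step."""
    if e == HOLE:
        leaves = [("num", k) for k in ints] + [("var", v) for v in variables]
        # A hole may also become a binary operation with two fresh holes.
        return leaves + [("bin", op, HOLE, HOLE) for op in OPERATORS]
    if e[0] == "bin":
        _, op, left, right = e
        if not is_terminal(left):
            return [("bin", op, new_left, right)
                    for new_left in next_states(left, ints, variables)]
        return [("bin", op, left, new_right)
                for new_right in next_states(right, ints, variables)]
    return []  # terminal leaves have no successors


for state in next_states(("bin", "+", ("var", "x"), HOLE), [1], ["x"]):
    print(state)
```

With one integer and one variable as resources, a single hole has four successors: the constant, the variable, and one binary operation per operator.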
4.2 Baseline Search Algorithm
Algorithm 1 shows the basic architecture of our enumerative search algorithm. The algorithm initializes the workset with the initial state (line 1). Then, it picks a state with the smallest size and removes the state from the workset (line 3). If the state is a solution state, the algorithm terminates and returns it (line 5). For a nonterminal state, the algorithm attempts to prune the state by invoking the pruning function (line 7). If pruning fails, the next states of the state are added to the workset and the loop repeats. The details of our pruning technique are described in Section 4.3. For the moment, assume that pruning always fails.
The baseline algorithm implicitly performs two well-known optimization techniques. First, it maintains previously explored states and never reconsiders them. Second, and more importantly, it normalizes states so that semantically equivalent programs are also syntactically the same: before pushing a state to the workset, we first normalize it. To do so, we use four code optimization techniques: constant propagation, copy propagation, dead code elimination, and expression simplification [Aho et al., 1986]. These two techniques significantly improve the speed of enumerative search.
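The workset loop described above can be sketched in a few lines of Python. This is a toy instance under stated assumptions and not Algorithm 1 itself: it searches over arithmetic expressions in a single input variable instead of imperative commands, it omits normalization, and all names (`synthesize`, `next_states`, the `prune` parameter, `max_steps`) are hypothetical. The default `prune` never prunes, which corresponds to assuming that pruning always fails.

```python
import heapq
import itertools

HOLE = ("hole",)
OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def size(e):
    """Number of AST nodes; the search explores smaller states first."""
    return 1 + (size(e[2]) + size(e[3]) if e[0] == "bin" else 0)


def is_terminal(e):
    if e == HOLE:
        return False
    return e[0] != "bin" or (is_terminal(e[2]) and is_terminal(e[3]))


def next_states(e, ints, variables):
    """Fill the leftmost hole in every possible way (one transition step)."""
    if e == HOLE:
        leaves = [("num", k) for k in ints] + [("var", v) for v in variables]
        return leaves + [("bin", op, HOLE, HOLE) for op in OPS]
    if e[0] == "bin":
        _, op, left, right = e
        if not is_terminal(left):
            return [("bin", op, l, right)
                    for l in next_states(left, ints, variables)]
        return [("bin", op, left, r)
                for r in next_states(right, ints, variables)]
    return []


def evaluate(e, env):
    if e[0] == "num":
        return e[1]
    if e[0] == "var":
        return env[e[1]]
    return OPS[e[1]](evaluate(e[2], env), evaluate(e[3], env))


def is_solution(e, examples, input_var):
    return is_terminal(e) and all(
        evaluate(e, {input_var: i}) == o for i, o in examples)


def synthesize(initial, examples, ints, variables, input_var="x",
               prune=lambda state: False, max_steps=100_000):
    tie = itertools.count()  # tie-breaker so states are never compared
    workset = [(size(initial), next(tie), initial)]
    seen = {initial}         # previously explored states
    for _ in range(max_steps):
        if not workset:
            return None
        _, _, state = heapq.heappop(workset)  # a smallest state
        if is_solution(state, examples, input_var):
            return state
        if is_terminal(state) or prune(state):
            continue
        for nxt in next_states(state, ints, variables):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(workset, (size(nxt), next(tie), nxt))
    return None


examples = [(1, 3), (2, 5), (3, 7)]
result = synthesize(HOLE, examples, ints=[1, 2], variables=["x"])
print(result, size(result))
```

Because states are popped in order of size, the first solution found is a smallest one. Passing a `prune` function that returns True for a state discards the whole subtree below it, which is where the static analysis of Section 4.3 plugs in.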
In addition, the algorithm considers terminating programs only. Our language has unrestricted loops, so the basic algorithm may synthesize nonterminating programs. To exclude them from the search space, we use syntactic heuristics to detect potentially nonterminating loops. The heuristics are: 1) only boolean expressions of certain restricted forms are allowed in loop conditions, 2) the last statement of the loop body must increase (or decrease) the induction variable, and 3) the variables in the loop condition are not defined elsewhere in the loop.
4.3 Pruning with Static Analysis
Now we present the main contribution of this paper, pruning with static analysis. Static analysis allows us to safely identify states that eventually fail to be a solution. We first define the notion of failure states.
Definition 1.
A state is a failure state iff every terminal state reachable from it is not a solution.
Our goal is to detect as many failure states as possible. We observed two typical cases of failure states that often show up during the baseline search algorithm.
Example 1.
Consider the program in Figure 4(a) and an input-output example. When the program is executed with the example input, no matter how the hole gets instantiated, the output value is no less than 2 at the return statement. Therefore, the program cannot satisfy the example, whose output is less than 2.
Example 2.
Consider the program in Figure 4(b) and an input-output example. Here, we do not know the exact values of $x$ and $r$, but we know that $r = x \times 10$ must hold at the end of the program. However, there exists no integer $x$ that makes $r$ equal to the example output, and we conclude that the partial program is a failure state.
Figure 4(a):

    example1(n){
      r := 0;
      while (n > 0){
        r := n + 1;
        n := ?;
      };
      return r;
    }

Figure 4(b):

    example2(n) {
      r := 0;
      while (n > 0){
        ?;
        r := x * 10;
        n := n / 10;
      };
      return r;
    }
Static Analysis We designed a static analysis that aims to effectively identify these two types of failure states. To do so, our analysis combines numeric and symbolic analyses; the numeric analysis is designed to detect the cases of Example 1 and the symbolic analysis for the cases of Example 2. The abstract domain of the analysis is defined as follows:
An abstract memory state maps variables to abstract values. An abstract value is a pair of an interval and a symbolic value. The domain of intervals is standard [Cousot and Cousot, 1977].
For the symbolic analysis, we use a flat domain of symbolic expressions.
A symbolic expression is a constant, a symbol, or a binary operation with symbolic expressions. We introduce one symbol for each integer-type variable in the program. The symbolic domain is flat: distinct symbolic expressions are incomparable, and each of them lies between the bottom and top elements. An abstraction function transforms concrete values to abstract values.
The abstract semantics is defined in Figure 5 by the functions:
The functions use an abstract boolean lattice in place of concrete boolean values.
Intuitively, the abstract semantics overapproximates the concrete semantics of all terminal states that are reachable from the current state. This is done by defining a sound semantics for holes, which abstracts every expression or command that may fill them. An exception is that integer variables get assigned symbols, rather than the top value, in order to generate symbolic constraints on integer variables.
In our analysis, array elements are abstracted into a single element. Hence, the definitions for array accesses and array updates do not involve the index. Because an abstract array cell may represent multiple concrete cells, arrays are weakly updated by joining ($\sqcup$) old and new values.
For while-loops, the analysis performs a sound fixed point computation. If the computation does not reach a fixed point after a fixed number of iterations, we apply widening for the infinite interval domain in order to guarantee the termination of the analysis. We use the standard widening operator of [Cousot and Cousot, 1977]. Figure 5 relies on two auxiliary functions for this purpose: a post-fixed point operator and a sound abstraction of a concrete function.
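The interval part of the analysis rests on three standard operations: join, widening, and the weak update of an abstract array cell. The Python sketch below shows them on intervals encoded as `(lo, hi)` pairs with `None` as bottom; the encoding, the function names, and the `widen_after` threshold are assumptions of this sketch and do not reproduce the definitions in Figure 5.

```python
import math

BOTTOM = None  # the empty interval


def join(a, b):
    """Least upper bound of two intervals."""
    if a is BOTTOM:
        return b
    if b is BOTTOM:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))


def widen(old, new):
    """Standard interval widening: unstable bounds jump to infinity."""
    if old is BOTTOM:
        return new
    if new is BOTTOM:
        return old
    lo = old[0] if old[0] <= new[0] else -math.inf
    hi = old[1] if old[1] >= new[1] else math.inf
    return (lo, hi)


def weak_update(memory, array, value):
    """One abstract cell stands for all elements, so join instead of overwrite."""
    updated = dict(memory)
    updated[array] = join(memory.get(array, BOTTOM), value)
    return updated


def post_fixpoint(step, start, widen_after=3, limit=1000):
    """Iterate until step(x) is covered by x, widening after a few rounds."""
    current = start
    for iteration in range(limit):
        candidate = join(current, step(current))
        if candidate == current:
            return current
        if iteration >= widen_after:
            current = widen(current, candidate)
        else:
            current = candidate
    raise RuntimeError("no post-fixed point within the iteration limit")


# A counter that starts at 0 and is incremented in a loop of unknown length.
counter = post_fixpoint(lambda iv: (iv[0], iv[1] + 1), (0, 0))
print(counter)  # (0, inf)

# Writing 1 into some cell of an all-zero array keeps 0 as a possible value.
memory = weak_update({"a": (0, 0)}, "a", (1, 1))
print(memory["a"])  # (0, 1)
```

Without widening, the counter example would climb through (0, 1), (0, 2), and so on forever; widening trades precision of the upper bound for guaranteed termination.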
Pruning Next we describe how we prune with the static analysis. Suppose we are given a set of examples and a state with input and output variables. For each example, we first run the static analysis with the example input and obtain the analysis result.
We only consider the case when the analysis result is not bottom (when it is bottom, the program is semantically ill-formed and therefore we just prune out the state). Then, we obtain the interval abstraction of the output from the analysis result and generate constraints.
The first (resp., second) conjunct of the constraints means that the interval (resp., symbolic) analysis result must overapproximate the output example. We prune out a state iff the generated constraints are unsatisfiable for some example:
Definition 2.
The predicate is defined as follows:
The unsatisfiability can be easily checked, for instance, with an off-the-shelf SMT solver. Our pruning is safe:
Theorem 1 (Safety).
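Every state pruned out by the predicate of Definition 2 is a failure state in the sense of Definition 1.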
That is, we prune out a state only when it is a failure state, which formally guarantees that the search algorithm with our pruning finds a solution if and only if the baseline algorithm (Section 4.2) does so.
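The pruning test can be sketched as follows. This Python fragment is a simplified stand-in for the SMT-based check and makes several assumptions: the abstract output is given as a pair of an interval and an optional symbolic value, only three symbolic shapes are recognized (`const`, `sym`, and a hypothetical `mul_const` for a constant times a symbol), and anything else is treated as satisfiable so that the check never prunes without certainty. The expected output 1 used below is an illustrative value, since the concrete examples of Examples 1 and 2 are not reproduced here.

```python
import math


def interval_admits(interval, output):
    """First conjunct: the output must lie inside the interval."""
    lo, hi = interval
    return lo <= output <= hi


def symbolic_admits(sym, output):
    """Second conjunct: can the symbolic value equal the output?"""
    if sym is None:            # top: no symbolic information
        return True
    kind = sym[0]
    if kind == "const":
        return sym[1] == output
    if kind == "sym":          # an unconstrained integer symbol
        return True
    if kind == "mul_const":    # ("mul_const", k, name) stands for k * name
        k = sym[1]
        return output == 0 if k == 0 else output % k == 0
    return True                # unknown shape: stay conservative


def prune(abstract_outputs, expected_outputs):
    """Prune iff the constraints are unsatisfiable for some example."""
    return any(
        not (interval_admits(interval, out) and symbolic_admits(sym, out))
        for (interval, sym), out in zip(abstract_outputs, expected_outputs)
    )


# Example 1 style: the analysis says the output is at least 2.
print(prune([((2, math.inf), None)], [1]))                                # True
# Example 2 style: the output equals 10 * x for some integer x.
print(prune([((-math.inf, math.inf), ("mul_const", 10, "x"))], [1]))      # True
print(prune([((-math.inf, math.inf), ("mul_const", 10, "x"))], [20]))     # False
```

Returning True only when a conjunct is definitely violated is what keeps such a check on the safe side: a state that might still lead to a solution is never discarded.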
5 Evaluation
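Table 1: Benchmark problems and running times in seconds.

| Domain | No | Description | IVars | AVars | Ints | Exs | Base | Base+Opt | Ours |
|---|---|---|---|---|---|---|---|---|---|
| Integer | 1 | Given , return . | 2 | 0 | 2 | 4 | 0.0 | 0.0 | 0.0 |
| Integer | 2 | Given n, return n!! (i.e., double factorial). | 3 | 0 | 3 | 4 | 0.0 | 0.0 | 0.0 |
| Integer | 3 | Given , return . | 3 | 0 | 2 | 4 | 0.1 | 0.0 | 0.0 |
| Integer | 4 | Given , return . | 4 | 0 | 2 | 3 | 122.4 | 18.1 | 0.3 |
| Integer | 5 | Given , return . | 4 | 0 | 2 | 3 | 102.9 | 13.6 | 0.2 |
| Integer | 6 | Given and , return . | 4 | 0 | 2 | 4 | 0.7 | 0.1 | 0.1 |
| Integer | 7 | Given and , return . | 3 | 0 | 2 | 3 | 0.2 | 0.0 | 0.0 |
| Integer | 8 | Given and , return . | 3 | 0 | 2 | 3 | 0.2 | 0.0 | 0.1 |
| Integer | 9 | Count the number of digits of an integer. | 3 | 0 | 3 | 3 | 0.0 | 0.0 | 0.0 |
| Integer | 10 | Sum the digits of an integer. | 3 | 0 | 3 | 4 | 5.2 | 2.2 | 1.3 |
| Integer | 11 | Calculate the product of the digits of an integer. | 3 | 0 | 3 | 3 | 0.7 | 2.3 | 0.3 |
| Integer | 12 | Count the number of binary digits of an integer. | 2 | 0 | 3 | 3 | 0.0 | 0.0 | 0.0 |
| Integer | 13 | Find the n-th Fibonacci number. | 3 | 0 | 3 | 4 | 98.7 | 13.9 | 2.6 |
| Integer | 14 | Given , return . | 3 | 0 | 2 | 4 | timeout | 324.9 | 37.6 |
| Integer | 15 | Given , return . | 3 | 0 | 2 | 4 | timeout | 316.6 | 86.9 |
| Integer | 16 | Reverse a given integer. | 3 | 0 | 3 | 3 | timeout | 367.3 | 2.5 |
| Array | 17 | Find the sum of all elements of an array. | 3 | 1 | 2 | 2 | 8.1 | 3.6 | 0.9 |
| Array | 18 | Find the product of all elements of an array. | 3 | 1 | 2 | 2 | 7.6 | 3.9 | 0.9 |
| Array | 19 | Sum two arrays of the same length into one array. | 3 | 2 | 2 | 2 | 44.6 | 29.9 | 0.2 |
| Array | 20 | Multiply two arrays of the same length into one array. | 3 | 2 | 2 | 2 | 47.4 | 26.4 | 0.3 |
| Array | 21 | Cube each element of an array. | 3 | 1 | 1 | 2 | 1283.3 | 716.1 | 13.0 |
| Array | 22 | Raise each element to the 4th power. | 3 | 1 | 1 | 2 | 1265.8 | 715.5 | 13.0 |
| Array | 23 | Find a maximum element. | 3 | 1 | 2 | 2 | 0.9 | 0.7 | 0.4 |
| Array | 24 | Find a minimum element. | 3 | 1 | 2 | 2 | 0.8 | 0.3 | 0.1 |
| Array | 25 | Add 1 to each element. | 2 | 1 | 1 | 3 | 0.3 | 0.0 | 0.0 |
| Array | 26 | Find the sum of the squares of the elements. | 3 | 1 | 2 | 2 | 2700.0 | 186.2 | 11.5 |
| Array | 27 | Find the product of the squares of the elements. | 3 | 1 | 1 | 2 | 1709.8 | 1040.3 | 12.6 |
| Array | 28 | Sum the products of matching elements of two arrays. | 3 | 2 | 1 | 3 | 20.5 | 38.7 | 1.5 |
| Array | 29 | Sum the absolute values of the elements. | 2 | 1 | 1 | 2 | 45.0 | 50.5 | 12.1 |
| Array | 30 | Count the number of each element. | 3 | 1 | 3 | 2 | 238.9 | 1094.1 | 0.2 |
| Average | | | | | | | 616.8 | 165.5 | 6.6 |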
Experimental setup To evaluate our synthesis algorithm, we gathered 30 introductory-level problems from several online forums (e.g., http://www.codeforwin.in); the problems are listed in Table 1. The problems consist of tasks manipulating integers and arrays. Some problems are nontrivial for novice students to solve; they require students to come up with various control structures such as nested loops and combinations of loops and conditional statements. The partial programs we used are similar to those shown in Section 2; they have one boolean expression hole and one or two command holes. For each benchmark, we report the number of integer variables (IVars), array variables (AVars), integer constants (Ints), and examples (Exs) provided. All benchmark problems are publicly available with our tool. Experiments were conducted on a MacBook Pro with an Intel Core i7 and 16GB of memory.
Baseline Algorithm Table 1 shows the performance of our algorithm. The column “Base” shows the running time of our baseline algorithm that performs enumerative search without state normalization. In that case, the average runtime was longer than 616 seconds, and three of the benchmarks timed out (> 1 hour). The column “Base+Opt” reports the performance of the baseline with normalization. With normalization, the baseline solves all benchmark problems and is more than 3.7 times faster on average, although normalization slows down some cases because of its runtime overhead.
Pruning Effectiveness On top of “Base+Opt”, we applied our static-analysis-guided pruning technique (the column “Ours”). The results show that our pruning technique is remarkably effective. It reduces the average time to 6.6 seconds, improving the speed of “Base+Opt” by 25 times. Note that Simpl is able to synthesize the desired programs from a few examples (Exs), requiring at most 4 examples.
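The reported speedups follow directly from the averages in Table 1; the short calculation below spells out the arithmetic. The last ratio, comparing “Base” with “Ours”, is not stated in the text and is included only for orientation.

```latex
\[
\frac{616.8}{165.5} \approx 3.73, \qquad
\frac{165.5}{6.6} \approx 25.1, \qquad
\frac{616.8}{6.6} \approx 93.5
\]
```

The “Base” average appears to count each of the three timeouts as 3600 seconds: the 27 finished runs sum to 7704.1 seconds, and (7704.1 + 3 × 3600) / 30 ≈ 616.8.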
6 Related Work
Computer-aided education Recently, program synthesis technology has revolutionized computer-aided education. For instance, the technology has been used in automatic problem generation [Singh et al., 2012; Ahmed et al., 2013; Alvin et al., 2014; Polozov et al., 2015], automatic grading [Alur et al., 2013], and automatic solution generation [Gulwani et al., 2011].
Our work uses program synthesis for an automated programming education system. A large amount of work has been done to automate programming education [Adam and Laurent, 1980; Soloway et al., 1981; Farrell et al., 1984; Johnson and Soloway, 1984; Murray, 1989; Singh et al., 2013; Gulwani et al., 2014; Kaleeswaran et al., 2016; Kim et al., 2016], focusing primarily on providing feedback on students’ programming submissions. Our system, Simpl, has the following advantages over prior work:

Feedback on incomplete programs: Existing systems produce feedback only for complete programs; they cannot help students who do not know how to proceed further. In this case, Simpl can help by automatically generating solutions starting from incomplete solutions.

No burden on instructor: Existing systems require manual effort from the instructor. For example, the system in [Singh et al., 2013] needs a correct implementation and a set of correction rules manually designed by the instructor. On the other hand, Simpl does not require anything from the instructor.
An exception is [Farrell et al., 1984], where an automatic LISP feedback system is presented. However, the system produces feedback by relying on ad hoc rules.
Programming by example Our work differs from prior programming-by-example (PBE) techniques in two ways. First, to our knowledge, our work is the first to synthesize imperative programs with loops. Most PBE approaches focus on domain-specific languages for string transformation [Gulwani, 2011; Kini and Gulwani, 2015; Raza et al., 2015; Manshadi et al., 2013; Wu and Knoblock, 2015], number transformation [Singh and Gulwani, 2012], XML transformation [Raza et al., 2014], and extraction of relational data [Le and Gulwani, 2014]. Several others have studied synthesis of functional programs [Albarghouthi et al., 2013; Osera and Zdancewic, 2015; Frankle et al., 2016]. Second, our algorithm differs from prior work in that we combine semantics-based static analysis technology with enumerative program synthesis. Existing enumerative synthesis techniques use pruning techniques such as type systems [Osera and Zdancewic, 2015; Frankle et al., 2016] and deductions [Feser et al., 2015], which are not applicable to our setting.
7 Conclusion
In this paper, we have shown that combining enumerative synthesis and static analysis is a promising way of synthesizing introductory imperative programs. The enumerative search allows us to find the smallest possible, and therefore general, program, while the semantics-based static analysis dramatically accelerates the process in a safe way. We demonstrated the effectiveness on 30 real programming problems gathered from online forums.
References
 [Adam and Laurent, 1980] Anne Adam and Jean-Pierre Laurent. Laura, a system to debug student programs. Artificial Intelligence, 15(1-2), November 1980.
 [Ahmed et al., 2013] Umair Z. Ahmed, Sumit Gulwani, and Amey Karkare. Automatically generating problems and solutions for natural deduction. In IJCAI, 2013.
 [Aho et al., 1986] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1986.
 [Albarghouthi et al., 2013] Aws Albarghouthi, Sumit Gulwani, and Zachary Kincaid. Recursive program synthesis. In CAV, 2013.
 [Alur et al., 2013] Rajeev Alur, Loris D’Antoni, Sumit Gulwani, Dileep Kini, and Mahesh Viswanathan. Automated grading of DFA constructions. In IJCAI, 2013.
 [Alvin et al., 2014] Chris Alvin, Sumit Gulwani, Rupak Majumdar, and Supratik Mukhopadhyay. Synthesis of geometry proof problems. In AAAI, 2014.
 [Cousot and Cousot, 1977] Patrick Cousot and Radhia Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, 1977.
 [Farrell et al., 1984] Robert G. Farrell, John R. Anderson, and Brian J. Reiser. An interactive computer-based tutor for LISP. In AAAI, 1984.
 [Feser et al., 2015] John K. Feser, Swarat Chaudhuri, and Isil Dillig. Synthesizing data structure transformations from input-output examples. In PLDI, 2015.
 [Frankle et al., 2016] Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic. Example-directed synthesis: A type-theoretic interpretation. In POPL, 2016.
 [Gulwani et al., 2011] Sumit Gulwani, Vijay Anand Korthikanti, and Ashish Tiwari. Synthesizing geometry constructions. In PLDI, 2011.
 [Gulwani et al., 2014] Sumit Gulwani, Ivan Radiček, and Florian Zuleger. Feedback generation for performance problems in introductory programming assignments. In FSE, 2014.
 [Gulwani, 2011] Sumit Gulwani. Automating string processing in spreadsheets using input-output examples. In POPL, 2011.
 [Johnson and Soloway, 1984] W. Lewis Johnson and Elliot Soloway. Proust: Knowledge-based program understanding. In ICSE, 1984.
 [Kaleeswaran et al., 2016] Shalini Kaleeswaran, Anirudh Santhiar, Aditya Kanade, and Sumit Gulwani. Semi-supervised verified feedback generation. In FSE, 2016.
 [Kim et al., 2016] Dohyeong Kim, Yonghwi Kwon, Peng Liu, I. Luk Kim, David Mitchel Perry, Xiangyu Zhang, and Gustavo Rodriguez-Rivera. Apex: Automatic programming assignment error explanation. In OOPSLA, 2016.
 [Kini and Gulwani, 2015] Dileep Kini and Sumit Gulwani. Flashnormalize: Programming by examples for text normalization. In IJCAI, 2015.
 [Le and Gulwani, 2014] Vu Le and Sumit Gulwani. Flashextract: A framework for data extraction by examples. In PLDI, 2014.
 [Manshadi et al., 2013] Mehdi Manshadi, Daniel Gildea, and James Allen. Integrating programming by example and natural language programming. In AAAI, 2013.
 [Murray, 1989] William R. Murray. Automatic Program Debugging for Intelligent Tutoring Systems. Morgan Kaufmann Publishers Inc., 1989.
 [Osera and Zdancewic, 2015] Peter-Michael Osera and Steve Zdancewic. Type-and-example-directed program synthesis. In PLDI, 2015.
 [Polozov et al., 2015] Oleksandr Polozov, Eleanor O’Rourke, Adam M. Smith, Luke Zettlemoyer, Sumit Gulwani, and Zoran Popovic. Personalized mathematical word problem generation. In IJCAI, 2015.
 [Raza et al., 2014] Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. Programming by example using least general generalizations. In AAAI, 2014.
 [Raza et al., 2015] Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. Compositional program synthesis from natural language and examples. In IJCAI, 2015.
 [Singh and Gulwani, 2012] Rishabh Singh and Sumit Gulwani. Synthesizing number transformations from input-output examples. In CAV, 2012.
 [Singh et al., 2012] Rohit Singh, Sumit Gulwani, and Sriram Rajamani. Automatically generating algebra problems. In AAAI, 2012.
 [Singh et al., 2013] Rishabh Singh, Sumit Gulwani, and Armando Solar-Lezama. Automated feedback generation for introductory programming assignments. In PLDI, 2013.
 [Soloway et al., 1981] Elliot M. Soloway, Beverly Woolf, Eric Rubin, and Paul Barth. Meno-II: An intelligent tutoring system for novice programmers. In IJCAI. Morgan Kaufmann Publishers Inc., 1981.
 [Wu and Knoblock, 2015] Bo Wu and Craig A. Knoblock. An iterative approach to synthesize data transformation programs. In IJCAI, 2015.