1. Introduction
Program synthesis involves automatically assembling a program from simpler components. It is analogous to searching the entire space created by all possible permutations of those components, looking for solutions that satisfy given requirements. Many search strategies (such as enumerative, deduction-based, constraint-solving, and stochastic) have been proposed to address this challenge (Polozov and Gulwani, 2015; Solar-Lezama, 2008; Srivastava et al., 2013; Menon et al., 2013; Liang et al., 2010; Balog et al., 2016; Feng et al., 2018).
In this work, we propose an evolution-based search strategy, implemented in the Automatic Algorithm Discoverer (AAD). AAD can synthesize programs of relatively high complexity (including loops, nested blocks, nested function calls, etc.) using a subset of Python as its grammar, and can generate executable Python code. In this paper we use AAD to discover algorithmic solutions to array/vector problems.
Evolutionary algorithms use a fitness (objective) function to pick the fittest individuals from a population (Koza, 1992, 1994; Johnson, 2007; Katz and Peled, 2008). The traits of the fittest individuals can recombine (crossover) to form the next generation. However, designing an effective fitness function can be challenging for complex problems (Ivancevic and Ivancevic, 2007; Chaudhuri et al., 2016; Gulwani et al., 2017). We propose an alternative way to guide evolution without a fitness function: grouping several potentially related problems together. We call this Problem Guided Evolution (PGE); it is analogous to the way we teach children to solve complex problems. For instance, to help discover an algorithm for finding the area of a polygon, we may ask a student to first find a way to calculate the area of a triangle. That is, simpler problems guide solutions to more complex ones. Notably, PGE does not require knowing the exact algorithm nor the exact constituents of a solution, but only a few potentially related problems. In AAD, PGE allows more complex solutions to be derived through (i) composition (combining simpler solutions) and (ii) mutation of already discovered ones.
Grouping related problems for PGE, like designing a fitness function, is not automatic and currently requires human insight. However, PGE can be a more natural way to attack complex problems. As a concrete example, Figure 1 shows code that AAD produced to sort a non-empty array in ascending order (SortAsc). To solve this, we grouped ten potentially related problems together to guide evolution: min, max, first/last index, reverse, remove first/last, is-in-array, and sort ascending/descending. AAD used solutions it had found itself for some of those problems to discover an algorithm for sorting: first find the minimum of the input array, append that minimum to a new list, remove the minimum from the input array, and repeat these steps until the entire input array is processed. Though neither the most elegant nor the most efficient implementation, a machine discovering an algorithm for sorting starting from a basic subset of Python illustrates the capabilities of AAD and the utility of PGE.
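The sorting strategy described above amounts to a selection-sort-style algorithm. The following is a hand-written sketch of that idea, not AAD's generated code (which appears in Figure 1):

```python
def sort_asc(arr):
    # Work on a copy so the caller's list is not modified.
    remaining = arr.copy()
    result = []
    while remaining:            # repeat until the entire input is processed
        m = min(remaining)      # find the minimum of what is left
        result.append(m)        # append it to the new list
        remaining.remove(m)     # remove it from the input
    return result
```

In AAD's discovered solution, the `min` step is itself a call to a previously discovered solution for the Min problem.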
Overall, this paper makes the following contributions:

Use of Problem Guided Evolution to eliminate the need for objective (fitness) functions in evolutionary algorithms.

Use of multiple evolutionary strategies (such as diverse environments & solutions, cross-pollination, and ganged evolution), and evaluation of their effectiveness via a wide range of experiments.

Application of AAD to solve 29 array/vector problems in the general-purpose Python language, demonstrating that evolutionary algorithms are capable of solving complex, state-of-the-art problems.

Support for loops, enabling discovery of algorithms that accept any (non-zero) input size.

Mapping of the inherently parallel evolutionary process to HPC hardware and techniques.
We find that PGE and related evolutionary strategies are effective at discovering solutions to our array/vector problems. Among other findings, notable is the adaptability of AAD to constrained environments and inputs, as well as its ability to find creative solutions.
The rest of the paper is organized as follows. In Section 2 we present the design details of AAD. Specifically, we discuss: (i) the three constituent parts of AAD, with special emphasis on the Evolver and its three phases that construct a solution, (ii) the evolutionary strategies employed by AAD and their similarities to biological evolution, and (iii) the engineering challenges we faced and their solutions, as well as our HPC-oriented approach in the AAD implementation. Section 3 presents our experimental setup, and Section 4 discusses the results of putting AAD to the test in a broad range of experiments encompassing a variety of problems. Lastly, we present related work (Section 5) and a discussion and future work (Section 6), and conclude the paper in Section 7.
2. Design
As shown in Figure 2, AAD consists of three components: (i) a Problem Generator (ProbGen) to generate a problem, (ii) a Solution Generator (SolGen) to come up with potential solutions (programs), and (iii) a Checker to check those solutions.
2.1. Problem Generator (ProbGen)
Each problem we want solved starts with its own ProbGen. ProbGen is responsible for: (i) specifying the number and types of inputs and outputs, and (ii) generating inputs for a given problem size. For instance, for maximum finding (Max), ProbGen specifies that Max takes one array as its input and produces a number as its output. In addition, when requested to generate inputs for a problem of size n, it produces an input array filled with n numbers.
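A ProbGen for Max might look like the following sketch (class and method names are illustrative, not AAD's actual code; the value range follows the ±200 bound used in Section 3):

```python
import random

class MaxProbGen:
    """Illustrative Problem Generator for the Max problem."""
    # (i) input/output specification: one ARR in, one NUM out
    input_types = ["ARR"]
    output_type = "NUM"

    def generate_input(self, size):
        # (ii) produce an input array filled with `size` numbers
        return [random.randint(-200, 200) for _ in range(size)]
```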
2.2. Checker
Checker is responsible for accepting or rejecting a solution generated for a given problem. Checker executes the generated program with input(s) generated by ProbGen, and produces output. The Checker contains logic to either accept or reject the output, as in (Jha et al., 2010). Therefore, a Checker is specific to a given ProbGen, and both go hand in hand.
A Checker does not necessarily need an implementation of the algorithm it seeks to discover, although some problems do require an alternative implementation. For instance, the Checker for the problem “Sort” does not have to sort the input array. Rather, it can compare every pair of adjacent elements in the output array and check whether the two elements are in the expected order. As soon as it detects an out-of-order pair, it can declare failure. If every pair of elements is in order, and the output array contains exactly the same elements as the input array (which can be checked by removing matching elements), the Checker can accept the solution.
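The adjacent-pair check and the matching-element removal described above can be sketched as follows (a hypothetical helper, not AAD's actual Checker code):

```python
def check_sort_output(input_arr, output_arr):
    """Accept output_arr as a sort of input_arr, without sorting anything."""
    # 1) every adjacent pair must be in ascending order
    for a, b in zip(output_arr, output_arr[1:]):
        if a > b:
            return False  # out-of-order pair: reject immediately
    # 2) output must contain exactly the input's elements:
    #    remove matching elements one by one
    leftover = list(input_arr)
    for x in output_arr:
        if x in leftover:
            leftover.remove(x)
        else:
            return False  # extra element not present in the input
    return not leftover   # any leftover means a missing element
```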
For some problems grounded in the physical world, the input and output data may be obtained empirically. For instance, to develop an algorithm that predicts future temperatures at a specific place, historical temperature data may be used, or data may be gathered using sensors. In other words, the physical world can function as a Checker for some of the models (algorithms) we want to discover.
2.3. The Solution Generator (SolGen)
SolGen primarily consists of two components: (i) an Expression/Idiom Store, and (ii) an Evolver.
2.3.1. Expression/Idiom Store (ExpStore)
SolGen constructs source programs using a grammar, as in (Gulwani, 2011; Lee et al., 2018; Torlak and Bodik, 2013; Alur et al., 2013). The subset of Python grammar AAD uses is stored in ExpStore, and is given in Table 1.
In AAD, grammar rules are augmented with type information, as in (Polikarpova et al., 2016; Osera and Zdancewic, 2015; Frankle et al., 2016). AAD supports four types: numbers (NUM), Booleans (BOOL), arrays (ARR), and arrays-of-arrays (AoA), which can model matrices. Further, each operand of an expression is marked as a Consumer (read-only), a Producer (write-only), or a ProdCon (read-modify-write). With this additional information, a typical binary addition rule becomes:
NUM (Prod) = NUM (Cons) + NUM (Cons)
In Table 1, producer operands are italicized and ProdCon operands are underlined; all remaining operands are consumers. Some auxiliary grammar rules (e.g., for concatenating statements and for function declarations) are omitted for brevity.
When AAD uses a statement with a BLOCK (a code block), it inserts at least one randomly selected expression into the BLOCK as a heuristic. We call such a construct an idiom. In addition, when an If Stmt is inserted, an additional expression producing a BOOL is inserted before it, to make sure that there is an expression producing a BOOL for the If Stmt to consume. Consequently, every For Stmt and If Stmt is inserted as an idiom. Reduction is another such idiom. Idioms allow for faster construction of useful programs.
In ExpStore, the operands of expressions are generally not given any identifier names. However, in idioms, if two expressions have a producer-consumer relationship (e.g., the BOOL expression in the If idiom), we assign the common operand a common integer identifier to link the producer and the consumer.
Table 1. The subset of Python grammar stored in ExpStore.

Expr/Stmt        Representation
Arithmetic       NUM = NUM op NUM,   op in {+, -, *, //}
Compare          BOOL = NUM op NUM,  op in {<, >, <=, >=, ==, !=}
Head/Tail        NUM = ARR[0]
                 NUM = ARR[-1]
Pop (Head/Tail)  NUM = ARR.pop(0)
                 NUM = ARR.pop()
Pop at Ind       NUM = ARR.pop(NUM)
Append           ARR.append(NUM)
New Array        ARR = list()
Constant         NUM = 0
                 NUM = 1
Array copy       ARR = ARR.copy()
AoA copy         AoA = AoA.copy()
Func Arg         ANY_TYPE = param
Assign Stmt      NUM = NUM
Return Stmt      return ANY_TYPE
Reduction        NUM += NUM
If Stmt          if BOOL: BLOCK
For Stmt         for NUM in ARR: BLOCK
                 for NUM, NUM in enumerate(ARR): BLOCK
                 for ARR in AoA: BLOCK
                 for NUM, ARR in enumerate(AoA): BLOCK
BLOCK            BEGIN Statements END
For Stmts allow us to iterate over two types of data structures: ARR or AoA. The latter type is used in a context-sensitive way, if and only if the function has a parameter of type AoA. For each of those two types, AAD allows enumerated and non-enumerated for-loops. In Python, enumerated iterative loops provide both the index and the item at that index, making them more versatile. However, such a statement leads to two producers, one for the element and one for the index.
In AAD, the Expr class representing expressions is structured so that it can represent multiple Producers, Consumers, and/or ProdCons. A function is modeled as a sequence of Expr objects.
For a BLOCK, we add BEGIN and END delimiters (dummy Expr objects), which are useful in analysis. They are not emitted as part of generated Python code, though they decide indentation of emitted code.
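A minimal sketch of how expressions with typed, role-annotated operands might be modeled follows. All class and field names here are assumptions for illustration; the paper does not show the actual Expr class:

```python
class Operand:
    def __init__(self, typ, role):
        self.typ = typ     # NUM, BOOL, ARR, or AoA
        self.role = role   # "Prod", "Cons", or "ProdCon"
        self.ident = None  # integer ID assigned during building/linking

class Expr:
    """Illustrative expression with typed producer/consumer operands."""
    def __init__(self, template, operands):
        self.template = template  # e.g. "NUM = NUM + NUM"
        self.operands = operands

# The binary addition rule from the text: one producer, two consumers.
add_rule = Expr("NUM = NUM + NUM",
                [Operand("NUM", "Prod"),
                 Operand("NUM", "Cons"),
                 Operand("NUM", "Cons")])
```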
Since function calls are expressions, ExpStore can include calls to library functions. However, to be as minimalistic as possible, we use library calls needed only for basic array access – pop head/tail or a given index, and append to tail.
At first glance, ProdCon operands may appear as an unnecessary complication. However, Python’s library functions like append() modify the source operand, and so does the Reduction expression in Table 1. ProdCon operands allow modeling these succinctly. In program synthesis such operands provide a less obvious but important benefit because they reduce the total number of operands in an expression and reduce the number of unique producers in a program, leading to a reduction of search space.
More expressions and statements can be added to ExpStore as needed, but Table 1 lists only those used for the current results. Notably, the grammar shown leads to regular control flow and avoids statements like break and continue, though adding them is readily supported.
2.3.2. Evolver
The Evolver is responsible for combining expressions and idioms to assemble a program (a function), which can potentially solve the problem presented by ProbGen. The Evolver constructs the solution function (SolFunc) in three phases.
Phase 1: Building of SolFunc
First, the Evolver initializes SolFunc so that the input argument expressions are at the top of the function and the return statement is at the bottom, as shown in Figure 3. Building SolFunc then boils down to inserting items from ExpStore between the input arguments and the return statement. Second, the Evolver builds SolFunc bottom-up, starting from the return statement. If the return statement consumes a value of type T, the Evolver randomly picks an expression, E, from ExpStore that has a producer (or a ProdCon) of type T and inserts it above the return statement. Now E has its own consumer operands, for which producers must be found. Consequently, the Evolver picks another expression at random from ExpStore to produce each source operand of E, and inserts it randomly somewhere above E, but below the function arguments. Instead of selecting an expression, the Evolver may randomly choose to insert an idiom from the ExpStore.
If the input arguments are all of a single type, T, then just below the input arguments the Evolver inserts an expression consuming an operand of type T and producing a type other than T. For instance, if the sole input argument is an array, the Evolver inserts a randomly picked expression that consumes an array and produces a number. This heuristic ensures that values of both common types, ARR and NUM, are generated at the top of the function, allowing any later expressions to consume values of those types. Additionally, local copies are made of incoming ARR function parameters to prevent them from being modified within the function (see Figure 3).
Phase 2: Linking Producers and Consumers
In Phase 1, the Evolver gives each producer operand a unique integer ID when an expression is inserted into SolFunc. In Phase 2, consumer (and ProdCon) operands have to be assigned one of those IDs, linking a producer and a consumer. The Evolver starts this process at the bottom of SolFunc, with the return statement. The return statement, R, has only one consumer, of type T. The Evolver searches all expressions above R for one producing an operand of type T. Out of all such expressions, the Evolver picks one at random and assigns the ID of that producer (or ProdCon) to the consumer, thereby linking the two. The Evolver continues this process, from the bottom to the top of SolFunc, until every consumer operand is linked with a producer (or ProdCon). One producer can be consumed by one or more consumers (or ProdCons).
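The bottom-up linking procedure can be sketched as follows, using a simplified dictionary representation of expressions (this representation is an assumption for illustration; AAD's Expr objects are richer):

```python
import random

def link_consumers(exprs):
    """Phase 2 sketch: walk SolFunc bottom-up; for each consumer operand,
    pick a same-typed producer from the expressions above it."""
    for i in range(len(exprs) - 1, -1, -1):
        for opnd in exprs[i]["consumes"]:
            # producers strictly above the current expression
            candidates = [e["prod_id"] for e in exprs[:i]
                          if e["prod_type"] == opnd["type"]]
            if not candidates:
                return False          # no producer available: linking fails
            opnd["id"] = random.choice(candidates)
    return True
```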
Since AAD supports block nesting, while linking producers and consumers the Evolver has to make sure scoping rules are met. For instance, in Figure 3, producer num_13 is in a more deeply nested block than its consumer, the return statement, violating scoping rules. There are several ways to fix this; we chose to alias num_13 with another producer (e.g., num_11) at the same level as the consumer, causing AAD to emit num_11 instead of num_13 for all operands in all expressions. If there are multiple consumers of a producer, the producer must be at the same level as the outermost consumer.
The linking phase also opportunistically detects dead expressions, which are those with producers that are never consumed. A rigorous attempt is not made to detect and remove all dead code, because the Python interpreter executing the produced code can do this itself.
At the end of Phase 2, SolFunc is complete and can be compiled and executed.
Phase 3: Operator & Function Call Mutation
The completed SolFunc can be optionally mutated in Phase 3. The first four expressions in Table 1 are designed to capture multiple operations. For instance, the first grammar rule, for binary Arithmetic operations, captures four operations: +, -, *, and // (integer division).
As a result, we can easily change an expression from an addition to a multiplication without rebuilding SolFunc or relinking producers and consumers. In Phase 3, the Evolver randomly changes these operations. In addition to operators, Phase 3 can mutate an existing function call (e.g., Max) to another compatible call (e.g., Min), with the same type of argument(s) and the return type.
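Operator mutation can be sketched as follows: the operator field is swapped in place, without rebuilding SolFunc or re-linking operands (the dictionary representation of an expression is illustrative):

```python
import random

ARITH_OPS = ["+", "-", "*", "//"]  # the four arithmetic operations from Table 1

def mutate_operator(expr):
    """Phase 3 sketch: swap the operator of an arithmetic expression,
    leaving its operands and their producer-consumer links untouched."""
    if expr["kind"] == "arith":
        expr["op"] = random.choice([o for o in ARITH_OPS if o != expr["op"]])
    return expr
```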
2.4. Checking Output
Once SolFunc is built (and mutated), it is executed to produce output using Python’s exec() function. The output is checked with the Checker, which either accepts or rejects the output. If the first output is accepted, the same SolFunc is tested with more inputs of different sizes, generated using ProbGen. If all those tests are accepted by the Checker, the SolFunc is declared a solution for the problem. The above three phases constitute one evolutionary step.
2.5. Evolutionary Strategies
This section describes evolutionary strategies used by AAD.
2.5.1. Composition
AAD uses self-discovered solutions to simpler problems to compose more complex solutions. To this end, AAD evolves an entire group of problems at once, as shown in Figure 4(a). Once an acceptable SolFunc is found for one problem in the group, it can be called by the others through an appropriate function call added to ExpStore. Since a function call is an expression, when a SolFunc is accepted by the Checker, AAD creates a function call expression for it with appropriate parameters. AAD uses the input-output description given by ProbGen to determine the type of each parameter and the return type. By AAD's convention, the input parameters are always read-only (consumers). However, a function like Remove(ARR, NUM) modifies its first parameter. We allow such functions to be created as well, by allowing ProbGen to identify the first parameter as a ProdCon and omit a separate return value (i.e., omit a separate producer). When emitting code, AAD emits such a function with the same identifier for the return value and the first argument, as in arr1 = Remove(arr1, num1).
Function composition has a profound effect on reducing the size of the search space. For instance, assume we allow S statements in SolFunc and the ExpStore contains E items to pick from. Since each statement in SolFunc can be filled in E ways (with repetition), there are E^S unique functions we can create. This is the size of the search space. If a problem requires two functions of size S (one function calling the other), without function composition we may need up to 2S statements to solve it. That increases the search space to E^(2S) possibilities. In contrast, if we have the additional function available as a call, we have E+1 expressions in the ExpStore. Therefore, the number of unique SolFuncs we can create becomes (E+1)^S. For non-trivial cases, (E+1)^S is far smaller than E^(2S).
Therefore, function composition is much more effective at reducing search space than allowing more statements in SolFunc.
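The arithmetic above can be checked with concrete numbers (the values of E and S here are assumptions for illustration, not parameters from the paper):

```python
# Illustrative numbers: E = 20 items in ExpStore, S = 6 statements per SolFunc.
E, S = 20, 6

# Without composition: up to 2S statements, each fillable in E ways.
space_without = E ** (2 * S)

# With composition: one extra callable item (E + 1), only S statements needed.
space_with = (E + 1) ** S

# The composed search space is tens of millions of times smaller.
ratio = space_without // space_with
```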
Although genetic mutation and recombination get the most attention, composition can also be seen in biological cell evolution, where it is called endosymbiosis (Purves et al., 2003, p. 77). For instance, it is widely believed that the mitochondria present in eukaryotic cells (like animal or plant cells) were captured from the primitive environment, where they existed independently as more primitive prokaryotic cells (like bacteria). However, for mitochondria to evolve as prokaryotes, they must have solved some natural challenge (problem). In fact, they solved the problem of energy production on their own and serve as the 'power plant' of the eukaryotic cell. This shows that having many problems to solve is a key to evolution. This is the main insight used by AAD: if we want to solve larger problems, there should be many other, simpler problems present to guide evolution.
2.5.2. Ganged Evolution
Related problems, which have the same number and types of inputs and outputs (e.g., Min and Max, or SortAsc and SortDesc), are ganged together into a single gang (see Figure 4(a)). Once a SolFunc is generated for one of the problems in the gang, at the end of Phase 2 or 3, it is tested on all problems in the gang. This allows solutions to be found faster because a potential solution (SolFunc) may satisfy one of the many problems in the gang.
2.5.3. Solution Mutation
Instead of building a SolFunc from scratch in Phase 1, the Evolver sometimes picks an existing solution to another problem in the same gang and attempts to mutate it by sending it through Phases 1, 2, and 3. In AAD, solution mutation is a trade-off: if we pick an already built solution for mutation, we lose the opportunity to build a fresh SolFunc in that step. Notice that solution mutation is different from the operator mutation described under Phase 3.
Natural evolution also takes advantage of multiple, related problems present in an environment to test new solutions or adapt existing ones for new purposes, as in the case of evolution of birds’ feathers, which are believed to have first served a thermoregulatory or a display function (Purves et al., 2003, p. 671).
2.5.4. CrossPollination Among Ranks
AAD creates multiple concurrent processes (called ranks) and assigns problems to each of them to solve. Evolution happens in multiple ranks in isolation, in periods called epochs, as shown in Figure 4(b). Pseudocode for the highlevel evolution algorithm for one epoch is shown in Figure 5.
At the end of an epoch, ranks synchronize and exchange solutions, as shown in Figure 4(b): solutions discovered by each rank are sent to the master rank, which collects them and distributes all of them to all ranks for use in the next epoch. In the next epoch, a rank may therefore receive solutions discovered by any rank in the previous epoch (see Figure 4(b)).
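The collect-and-redistribute step can be sketched as follows (a simplified, single-process illustration of the exchange, not AAD's actual inter-process code):

```python
def cross_pollinate(per_rank_solutions):
    """End-of-epoch sketch: the master collects the solutions each rank
    found and redistributes the union to every rank."""
    pool = {}  # master's collection: problem name -> list of solutions
    for rank_sols in per_rank_solutions:
        for problem, sols in rank_sols.items():
            pool.setdefault(problem, []).extend(sols)
    # every rank starts the next epoch with the full pool
    return [dict(pool) for _ in per_rank_solutions]
```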
2.5.5. Diverse Environments & Solutions
In AAD, each rank maintains its own copy of ExpStore, and AAD allows some of the non-essential expressions in Table 1 to be randomly removed from a rank's ExpStore. In the current setup, binary Arithmetic operations are added to a rank 80% of the time, Pop at Ind expressions 20% of the time, and the Reduction idiom 10% of the time. Further, when a For Stmt is inserted, 20% of the time we use an enumerated For Stmt instead of a non-enumerated one. Moreover, when a rank receives a solution to a problem, a function call for it is added to ExpStore only 80% of the time. These random omissions of expressions, including function calls, help create diverse environments with respect to the expressions available in the ExpStore.
Additionally, AAD allows multiple solutions (e.g., 100) to coexist for a given problem. For a solved problem, a rank may receive any one of these existing solutions, picked at random. This allows solution mutation to start from different solutions. Further, currently, 20% of the ranks do not receive a solution for a given problem, even when one exists, forcing them to find their own solutions. This is illustrated in Figure 4(b) with an empty circle, where Rank R3 does not receive a solution for problem B. Both of these strategies increase diversity of solutions.
These strategies are inspired by biological evolution, which uses diverse environments resulting from different temperatures, salinity, humidity, pressure, etc., to come up with different organisms. Similarly, evolution depends on diversity of individuals (solutions) within a population.
2.6. Engineering Challenges
Since AAD supports loops and conditional statements and uses function composition, it faces several challenges that simpler program synthesizers do not. We take a practical engineering approach to these challenges, as outlined below.
2.6.1. Exceptions
Although AAD produces syntactically correct programs, many runtime exceptions are possible, for reasons ranging from division by zero to popping from an out-of-bounds index. Fortunately, Python provides a robust exception handling framework, and AAD catches all exceptions arising from Python's built-in exec() function used to execute SolFunc.
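A minimal sketch of this pattern, in which any runtime exception simply rejects the candidate (function and variable names are illustrative, not AAD's actual code):

```python
def run_solfunc(source, args):
    """Execute generated source with exec() and treat any runtime
    exception as a rejected candidate."""
    namespace = {}
    try:
        exec(source, namespace)             # define SolFunc in `namespace`
        return namespace["SolFunc"](*args)  # run it on the test input
    except Exception:
        return None  # e.g. ZeroDivisionError, IndexError: candidate rejected
```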
Allowing exceptions in the first place does not make AAD less robust. As an engineering analogy, out-of-order processors introduce unintended exceptions, such as page faults and divide-by-zero exceptions, through speculative actions taken by the processor itself (Bringmann et al., 1993; Dwyer and Torng, 1992; Wall, 1994). However, processors have mechanisms to detect such violations and correct themselves, thereby remaining robust.
2.6.2. Infinite Loops
Even natural evolution may cause infinite repetition, as seen with cancerous cell division. Instead of trying to detect infinite repetition (loops), we use a timeout to terminate programs that do not finish within a given time period (e.g., 1 second), using Python's signal module to set a SIGALRM signal. Although this may appear to be a crude solution, it is a well-established engineering technique used in complex systems like microprocessors and spacecraft, which use various watchdog timers (Weaver and Austin, 2001; Wang and Patel, 2006; LaBel and Gates, 1996) to recover from a multitude of unpredictable situations like deadlocks, livelocks (starvation), soft errors, etc. For instance, when an execution core of a microprocessor issues a memory read request that gets dropped in the memory system due to an unexpected condition, the core may time out and reissue the request.
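The SIGALRM-based watchdog can be sketched as follows (an illustrative wrapper, not AAD's code; SIGALRM is available only on Unix-like systems, and must be set from the main thread):

```python
import signal

def run_with_timeout(fn, seconds=1):
    """Run fn(); abort and return None if it exceeds `seconds`."""
    def _timeout(signum, frame):
        raise TimeoutError
    old = signal.signal(signal.SIGALRM, _timeout)
    signal.alarm(seconds)            # schedule SIGALRM
    try:
        return fn()
    except TimeoutError:
        return None                  # candidate discarded as non-terminating
    finally:
        signal.alarm(0)              # cancel any pending alarm
        signal.signal(signal.SIGALRM, old)
```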
However, there is a cost to this approach: timeouts waste valuable processing time. Therefore, we use one heuristic to prevent a common cause of infinite loops. When we iterate over an array using a for loop, allowing the array to grow by appending more items during iteration can cause an infinite loop. In Python, this can be easily prevented by iterating over a tuple made from the array, since tuples are immutable.
2.6.3. Infinite Recursion
Since cycles in the call graph can lead to infinite recursion, AAD disallows any recursion. Since recursion detection is well understood (Brent, 1980; Gammie, 2015), AAD detects and discards programs with recursion. When we construct SolFunc, we do not let SolFunc call itself. When SolFunc calls an already built function F, we do not allow F to call SolFunc either. Therefore, there is no cycle between SolFunc and F. We repeat this cycle detection process for every function we use.
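Checking whether adding a call edge would close a cycle can be sketched with a simple depth-first search (an illustrative helper, not AAD's implementation):

```python
def creates_cycle(call_graph, caller, callee):
    """Return True if adding caller -> callee would close a cycle,
    i.e. if callee can already reach caller in the call graph."""
    stack, seen = [callee], set()
    while stack:
        f = stack.pop()
        if f == caller:
            return True            # callee reaches caller: cycle
        if f not in seen:
            seen.add(f)
            stack.extend(call_graph.get(f, []))
    return False
```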
2.7. Parallelization and HPC
The parallelization of the evolutionary search process naturally lends itself to High-Performance Computing (HPC), where a large number of processors and nodes are used to solve a problem. Therefore, we designed AAD as a multi-process application, using Python's multiprocessing module, to take advantage of multiple processors on a single node. In addition, as shown in Figure 4(b), AAD can take a checkpoint of solutions discovered since the last checkpoint. With distributed file systems available on HPC clusters, this allows solutions found on one node to be exchanged with other nodes, which read the checkpoint dumped by that node. A node can read any available checkpoint at the start of an epoch and does not wait for any specific checkpoint, thereby avoiding synchronization. A checkpoint is produced using Python's pickle module.
However, parallel processing introduces other engineering challenges, including load balancing. Due to the diverse environments and the random nature of evolutionary decisions, some ranks may take longer to finish an epoch than others. Similarly, a large number of timeouts due to infinite loops can also increase the execution time of a rank. To mitigate the latter problem, a rank counts the total number of timeouts and, if it exceeds a threshold, ends the epoch early.
Strategies like discarding of solutions and early termination of search are made possible by the nondeterministic evolutionary nature of AAD. Evolution does not put much value on a single individual or even a single environment, but mainly depends on the continuation of the process itself. As long as evolution continues, a solution may be found one way or the other.
3. Methodology
The entire AAD framework is written in Python and is about 6,700 lines of code (see Appendix E), including all problem generators and checkers, blank lines and comments.
We use 3 groups of 10 problems each (described in Appendix A) to study the effectiveness of the evolutionary strategies described in Section 2.5. All problems are listed in Section 4.1.1. Group-A consists of typical list-manipulation problems (e.g., min, sort, reverse, index, remove), Group-B represents basic vector-processing problems (e.g., dot-product, matrix-vector multiply), and Group-C consists of basic spreadsheet problems (e.g., sum, sumif, countif, average). One entire group is evolved in a given run. We run 112 concurrent processes on a 4-socket Intel Xeon Platinum 8180 server with a total of 112 physical cores. To minimize artifacts due to randomness, we do 10 runs per experiment for each group.
The metric used to evaluate different strategies is steps. Within a single step, we can send a SolFunc through Phases 2 and 3 any number of times, to re-link and re-mutate. Currently, we do this twice, leading to 4 different variants. Since we simulate 112 ranks concurrently, each step reported accounts for 448 distinct SolFuncs. On the above server, on average, one step on a rank takes about 7 ms per gang. Runtime, however, is not an objective of this paper, and we make no claims regarding runtime nor attempt to compare the runtime of AAD to that of prior work.
For each problem, when a solution is found on a rank, it is reported by writing the solution and relevant statistics to a log file created for each rank. The first solution found for a problem across all ranks is identified by post-processing the log files, and its step count is reported as the time-to-solution for that problem for that run. For an experiment with 10 runs, for a given problem, we report the average of all such step counts across all 10 runs.
For each run, we simulate a maximum of 100 epochs, with each epoch containing 2,000 steps. If at least one solution is found for all problems in a group before 100 epochs, we end the run and dump a checkpoint. For reporting purposes, AAD can rank solutions by reading checkpoints from one or more runs to find the least complex ones. Although there are many strategies to rank (Gvero et al., 2013; Perelman et al., 2012; Singh and Gulwani, 2015), we use a simple heuristic that assigns different weights to different structures – e.g., 50 for a function call, 20 for a For Stmt, 10 for an If Stmt, and 2 for any other statement. Ranking for all other purposes is left as future work.
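The weighting heuristic can be sketched as follows (the statement-kind labels are illustrative; only the weights come from the text):

```python
# Weights from the ranking heuristic: lower total = less complex solution.
WEIGHTS = {"call": 50, "for": 20, "if": 10, "other": 2}

def complexity_score(statement_kinds):
    """Sum the weight of each statement kind in a solution."""
    return sum(WEIGHTS.get(kind, WEIGHTS["other"]) for kind in statement_kinds)
```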
AAD supports several tunable parameters. Currently, we allow a maximum of 12 statements in a SolFunc, with 2 additional statements if the SolFunc is mutated from an existing solution. We allow up to 100 different solutions to be found for a single problem. Default parameters used to create diverse environments were given in Section 2.5.5. All values were picked as reasonable guesses; tuning them is left as a future study.
There are no restrictions placed on the inputs of problems other than that all arrays must be non-empty. Problem generators usually generate input arrays of sizes 1 to 200, randomly filled with integers from -200 to +200.
4. Results and Analysis
In this section, we present results showing the effectiveness of different evolutionary strategies discussed in Section 2.5, along with insights from code generated by AAD.
4.1. Evolutionary Strategies
Table 2. Caller-callee relationships for Group-A problems. Each entry is the percentage of the caller's solutions that invoke a callee. Callee columns in the original, in order: Max, Min, SortDesc, SortAsc, ReverseArr, RemoveL, RemoveF, LastIndOf, FirstIndOf, IsInArr; zero entries are omitted, so each row lists only its non-zero percentages in column order.

Caller      Non-zero callee percentages (in column order)
Max         (none)
Min         (none)
SortDesc    57, 14, 43, 43, 14, 57
SortAsc     4, 8, 94, 94, 6, 8, 18
ReverseArr  (none)
RemoveL     2, 1, 26, 22, 79, 10
RemoveF     1, 1, 59, 61, 43, 14
LastIndOf   2
FirstIndOf  2
IsInArr     2, 3, 3, 1, 18, 61, 3
4.1.1. Composition
Table 2 shows caller-callee relationships for Group-A problems (see Appendix B for all groups). For instance, the row for SortDesc shows that SortDesc called Max in 57% of its solutions, Min in 14% of its solutions, and so on. Functions Min, Max, and ReverseArr did not call any other function. All other functions depended on one or more functions to arrive at a solution, underscoring the importance of function composition. Some of the function dependencies may be unexpected, as discussed in Section 4.2.1.
Table 3. Baseline step counts for all three problem groups, and the effect of disabling each of four evolutionary strategies (Exp1-Exp4). Each Exp entry is the ratio of the steps taken without that strategy to the baseline steps; entries for experiments in which no run found a solution are omitted, so some rows list fewer than four ratios (remaining ratios appear in experiment order).

Problem      Baseline   Exp ratios (in experiment order)
Group-A
Max          98         0.9, 1.1, 0.9, 1.1
Min          90         0.8, 2.6, 0.9, 1.0
SortDesc     59929      1.2
SortAsc      60177      1.2
ReverseArr   160        0.9, 1185.9, 1.2, 1.5
RemoveL      2599       3.1, 0.8
RemoveF      4050       5.2, 0.8
LastIndOf    511        1.3, 0.7, 2.1
FirstIndOf   17077      1.8, 2.5, 0.7
IsInArr      269        0.8, 0.9, 1.4
Group-B
AddArrays    2349       1.0, 0.7, 2.7, 1.3
MultArrays   18797      0.3, 10.6, 7.5, 3.1
Sum          9          2.6, 0.9, 1.1, 1.1
SumOfSq      9373       7.9, 11.0, 1.8
DotProd      18290      7.8, 2.9
MatVecMult   54476      2.1
AddToArr     232        0.7, 0.8, 1.8, 1.8
SubFromArr   2995       1.9, 4.5, 0.7
ScaleArr     3682       0.5, 2.7, 0.7
ScaledSum    29         269.0, 2.0, 0.8
Group-C
CountEQ      91345      1.5, 1.9
CountLT      109444     1.2, 1.3, 1.6
CountGT      109431     1.2, 1.3, 1.6
SumIfLT      25109      5.7, 5.7, 5.4, 1.7
SumIfGT      25065      5.7, 5.7, 4.6, 1.7
SumIfEQ      25001      5.7, 8.0, 4.5, 1.7
ScaledAvg    7548       1.1, 0.8
Sum          7          18.6, 0.6, 1.6, 2.3
Len          28         0.5, 1.1, 0.5, 0.2
Avg          891        1.2, 1.2
Table 3 lists all three groups of problems with their baseline performance in terms of step count, and compares the effectiveness of four evolutionary strategies against that baseline. For the four experiments shown, each entry gives the ratio between the number of steps taken without the corresponding strategy and the baseline step count. Baseline step counts span a wide range due to the varying complexity of the problems. Since the step count for each entry is averaged across multiple runs, when a run does not produce a solution for a given problem, the maximum step count (200,000) is used for that run (Lee et al., 2018). If none of the runs for a given experiment produces a solution for a given problem, the entry is left blank to indicate that fact.
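The averaging rule can be stated concretely. This is our sketch; the helper name is an assumption, while the 200,000-step cap comes from the text.

```python
MAX_STEPS = 200_000  # substituted for runs that find no solution

def mean_steps(per_run_steps):
    # per_run_steps: step count per run, or None when the run found no solution.
    counts = [MAX_STEPS if s is None else s for s in per_run_steps]
    return sum(counts) / len(counts)
```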
Step counts for Exp1 are obtained by disabling the addition of function calls to ExpStore for solutions found (i.e., disabling composition). This increases the time to solution by a significant factor for some problems (e.g., SumOfSq, ScaledSum) and makes it impossible to find any solutions for others (e.g., SortAsc, DotProd), underscoring the importance of composition, especially for more complex problems. However, not adding function calls to ExpStore shrinks the search space and hence can speed up solutions for some simpler problems (e.g., Len, IsInArr) that do not need to depend on other functions.
4.1.2. Operator Mutation
Exp2 disables operator mutation by disabling Phase 3, leading to increased solution times and, especially in GroupA, problems that go altogether unsolved. Solutions that depend on an alternative operation (e.g., multiplication instead of addition) cannot be found when operator mutation is eliminated. However, having fewer (or later) solutions reduces the search space by reducing function composition and hence can speed up solutions to problems that do not depend on operator mutation (e.g., Sum).
4.1.3. Cross-Pollination
Exp3 in Table 3 disables cross-pollination among ranks by running a single epoch of 200,000 steps, thereby avoiding any exchange of solutions among ranks. Many problems that depend on others (e.g., SortAsc, SortDesc) are severely affected by the lack of cross-pollination.
4.1.4. Diversity
Exp4 reduces the number of solutions maintained for a given problem from 100 to 1, thereby decreasing the diversity of solutions for a given problem. This negatively affects several problems (e.g., MultArrays, DotProd), demonstrating the importance of solution diversity. Similarly, for GroupC, Exp5 in Table 5 shows the effect of adding all optional expressions described in Section 2.5.5 100% of the time, which decreases the diversity of environments while increasing the expressions available to every rank. This shows that for many problems in GroupC, adding optional expressions randomly is a better choice than always adding them to every rank. However, as before, simpler problems (e.g., Len) can benefit from the late discovery of other solutions.
Table 4. Parent-child relationships for GroupC problems. Rows are problems; parents are among CountEQ, CountLT, CountGT, SumIfLT, SumIfGT, SumIfEQ, ScaledAvg, Sum, Len, Avg; entries are percentages of solutions mutated from a parent.

CountEQ     38  33  1   6   19
CountLT     43  33  2   3   16
CountGT     43  36  1   3   15
SumIfLT     2   41  56
SumIfGT     1   1   43  53  1
SumIfEQ     1   42  53  1
ScaledAvg   22  25  34
Sum         -
Len         5
Avg         -
4.1.5. Solution Mutation
Table 4 gives the percentage of solutions that are mutated from another solution (a parent) for GroupC problems, where this is most common (see Appendix C for all groups). It should be noted that mutation can happen in either direction on different ranks. For instance, some ranks may first come up with Min (or receive Min as an already solved problem from another rank), and Max may mutate out of it. On other ranks, Min may mutate out of Max.
Table 5. Ratios of step counts to the baseline for GroupC problems under Exp5 and Exp6.

       CountEQ  CountLT  CountGT  SumIfLT  SumIfGT  SumIfEQ  ScaledAvg  Sum  Len  Avg
Exp5   1.8      1.4      1.4      2.0      2.0      2.0      2.1        1.3  0.4  1.5
Exp6   2.0      1.6      1.6      3.7      5.8      2.2      0.9        0.6  0.4  0.9
Exp6 in Table 5 shows the effects of disabling solution mutation for GroupC problems, which depend on parents as given in Table 4. Problems that mutate from parents show increased time to solution, while problems that do not (e.g., Len) see a speedup, as in previous experiments.
Table 6. Steps to solution for AddArrays and MultArrays when evolved separately vs. together (Exp7).

           AddArrays  MultArrays  Total
Separate   2023       7866        9889
Together   1746       6105        6105
4.1.6. Ganged Evolution
To show the effectiveness of ganged evolution, we picked two problems that belong to the same gang, evolved them one at a time, and compared the results to evolving them together, as shown with Exp7 in Table 6. To isolate the effects of ganged evolution, we disabled function composition and solution mutation, and picked two problems that can evolve without other functions. The results show that evolving them separately takes about 1.6X as many steps (9,889) as evolving them together (6,105). The Total column reports the total steps to find solutions for both problems in each case.
It should be emphasized that not every evolutionary strategy is important for every problem. Some simple problems can evolve directly from the grammar itself, and they are often hurt by the advanced strategies used. However, as these results show, many complex problems cannot find a solution within a reasonable time limit without these strategies.
4.2. Code Examples and Insights
Being able to generate complex code is an important result of AAD. Besides SortAsc (from GroupA), shown in Section 1, in this section we show one example each from GroupB and GroupC. The code generated for all 29 problems is shown in Appendix D. Since AAD finds multiple solutions for a problem, we discuss only one of them, usually the least complex one.
The solution for MatVecMult (Figure 6) performs dot products (DotProd) between the row vectors of the matrix (arg0) and the input vector (arg1), appending the results to a new result vector. DotProd in turn depends on the sum (Sum) of two arrays multiplied together (MultArrays). MatVecMult performs a linear transformation, which is the basis of linear algebra, and hence this discovery by AAD is particularly noteworthy.
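The composition this solution embodies can be restated with plain names. The code below is our sketch of the dependency chain, not AAD's generated output.

```python
def mult_arrays(a, b):
    # MultArrays: elementwise product of two equal-length arrays.
    return [x * y for x, y in zip(a, b)]

def dot_prod(a, b):
    # DotProd: Sum of MultArrays.
    return sum(mult_arrays(a, b))

def mat_vec_mult(matrix, vec):
    # MatVecMult: DotProd of each row with the vector, appended in order.
    return [dot_prod(row, vec) for row in matrix]
```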
From GroupC, the solution for CountEQ, which counts the number of times an element occurs in an array, is shown below. The algorithm is somewhat circuitous: it first appends matching elements to a new list and then finds the length of that list using Len. This is an example of a non-obvious algorithm, although it is not the most efficient solution.
def Len(arg0):
    arr_10 = arg0.copy()
    num_15 = 0
    num_14 = 1
    for num_12 in tuple(arr_10):
        num_15 = num_14 + num_15
    return (num_15)

def CountEQ(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_14 = list()
    for num_15 in tuple(arr_10):
        bool_17 = (num_11 == num_15)
        if (bool_17):
            arr_14.append(num_15)
            arr_10.append(num_15)
    num_12 = Len(arr_14)
    return (num_12)
It should be noted that the code generated by AAD often contains redundant operations: copying, appending, and popping the same element; calling functions whose results are never used; if-statements with always-True or always-False conditions; etc. Some of these can be easily eliminated with standard compiler techniques, which are out of scope for the current version of AAD.
4.2.1. Outside-of-the-box Solutions
This section describes some unexpected programs illustrating both strengths and weaknesses of AAD. Since AAD uses a limited number of input combinations generated by a ProbGen to check the validity of a program, any solution generated for a problem is only as good as the ProbGen and the Checker used. This is both a weakness and a strength, depending on the application. If an application demands rigorous validation, it is the responsibility of the ProbGen and the Checker to cover all cases, including corner cases. Writing such verification logic can be quite demanding, which is an obvious weakness. On the other hand, when an application needs to take advantage of peculiarities of the input, or needs to come up with solutions in constrained environments, AAD can show remarkable adaptability.
As a simple example, the grammar we use does not provide the truth values True and False as constants. Although initially this was an omission on our part, we realized that AAD was actually generating these values when they were needed, using an expression of the form bool_10 = (num_11 == num_11). Similarly, we initially forgot to include the constant values 0 and 1 in the grammar. AAD overcame that difficulty by subtracting the same value from itself to generate zero, and by dividing the same value (e.g., the last value of an array) by itself to generate the constant 1. Although the latter approach is not safe because the last value of an array could be zero, AAD used it in cases where the ProbGen did not put a zero at the end of an array. To defeat this trick, we changed the ProbGen to generate an array of all zeros for some problem sizes. AAD then generated the constant 1 by taking advantage of Len, dividing the length of the array by itself, because we always use non-empty arrays. This shows that the SolGen is in an adversarial relationship with the ProbGen/Checker, trying to defeat the latter duo by exploiting any opportunity or weakness present, similar to bacteria adapting to antibiotics in biological evolution.
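The idioms described above can be checked directly. They are restated here by us with plain names; AAD's actual expressions use its own variable naming.

```python
def make_true(x):
    # Truth value from any value, as in bool_10 = (num_11 == num_11).
    return x == x

def make_zero(x):
    # Constant 0 by subtracting a value from itself.
    return x - x

def make_one(arr):
    # Constant 1 by dividing the length of a non-empty array by itself.
    return len(arr) // len(arr)
```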
Another such example is FirstIndOf, which is used for finding the index of a given element in an array. If multiple matching elements are present, the function returns the index of the first one found. Most programmers would write a for-loop that iterates over the array looking for the element and, when a match is found, would break out of the loop using a break or a return statement. However, the current grammar used by AAD does not have a break statement and does not use return statements in the middle of a function. One solution AAD came up with for this challenge is given below. First, it goes through the array, popping each element from the front. If a match is found, its index is appended at the end of the same array. Once it has gone through the loop, all that remains in the array are the appended indices of the matching elements. The function then returns the head of the remaining array, which is the index of the first match. In other situations, we have also seen it calling the Min function to find the minimum value of the remaining array.
def FirstIndOf(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    for num_12, num_13 in enumerate(tuple(arr_10)):
        num_17 = arr_10.pop(0)
        bool_16 = (num_17 == num_11)
        if (bool_16):
            arr_10.append(num_12)
    num_13 = arr_10[0]
    return (num_13)
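The pop-and-append trick can be verified with a plain-named restatement of the listing's logic (our sketch, not AAD output):

```python
def first_ind_of(arr, x):
    work = arr.copy()
    # Iterate once per original element, popping each from the front;
    # when the popped element matches x, append its index at the back.
    for i, _ in enumerate(tuple(work)):
        if work.pop(0) == x:
            work.append(i)
    # Only appended match indices remain; the head is the first match.
    return work[0]
```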
FirstIndOf is usually defined with a precondition, viz., that the element is always present (e.g., index() in Python). With our ProbGen/Checker, the behavior of this function is not defined if arg1 is not present in the array. We observed the IsInArr function exploiting this undefined return value of a specific FirstIndOf solution to arrive at a solution that always passed our Checker but was incorrect in a rare situation the Checker did not check for. Although such a solution is problematic in many uses, it illustrates the ability to take advantage of deficiencies (or features) present in the inputs or the Checker. This could be quite valuable for an autonomous system adapting to (or taking advantage of) the vulnerabilities of an adversary.
The following solution for IsInArr is a prime example of outside-of-the-box thinking. This code finds whether a given element (arg1) is in a given array (arg0). Most programmers would write a loop that goes over the array and looks for a match. However, the following program does not contain any loops, which appeared to be a bug until we realized what it was doing. It first appends arg1 to the end of the array. Then it calls RemoveF, which removes the first element matching arg1. Then it pops the last element of the array and checks whether it matches the arg1 it appended. If it matches, RemoveF must have removed another element equal to arg1 from the array; in that case, a matching element was already present and IsInArr must return True, as it does here. If the popped element does not match arg1, RemoveF removed the arg1 that was appended; in that case, there was no prior matching element and IsInArr must return False, as is the case.
def IsInArr(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_10.append(num_11)
    arr_10 = RemoveF(arr_10, num_11)
    arr_18 = arr_10.copy()
    num_13 = arr_18.pop()
    bool_12 = (num_13 == num_11)
    return (bool_12)
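RemoveF itself is not shown in the listing. Assuming it removes the first element equal to the given value, which is its role as described above, the trick can be exercised with a plain-named sketch:

```python
def remove_f(arr, x):
    # Assumed behavior of RemoveF: copy with the first occurrence of x removed.
    out = arr.copy()
    if x in out:
        out.remove(x)
    return out

def is_in_arr(arr, x):
    # The loop-free trick: append x, remove the FIRST x, then check whether
    # the appended x survived at the end of the array.
    work = arr.copy()
    work.append(x)
    work = remove_f(work, x)
    return work.pop() == x
```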
If any person were to come up with this solution for IsInArr, we would have labeled him or her creative, an outside-of-the-box thinker. It is hard to believe that a machine is capable of this level of logical reasoning, although AAD stumbled upon it without any reasoning at all. This capability presents us with a new opportunity for AAD: to use it as an outside-of-the-box thinker that comes up with alternative solutions we would not normally think of. After all, creative thinking does not seem to be the prerogative of humans alone.
5. Related Work
Program synthesis is an active and challenging research area that seeks to automate programming. Many approaches have been proposed over the years in an effort to generate programs that conform as accurately as possible to the user-expressed intent. In (Gulwani et al., 2017), Gulwani, Polozov, and Singh present an excellent survey of the program synthesis problem, its applications, principles, and proposed solutions. In (Polozov, 2018), this survey is extended to include more recent advances spanning 2017 and 2018. Below, we review related work in the area, discuss typical problem domains and challenges in program synthesis, and position our work among prior research.
Program synthesis approaches have targeted the automatic generation of solutions for problems in domains such as data wrangling (Gulwani, 2011; Singh and Gulwani, 2012, 2016; Le and Gulwani, 2014), graphics (Chugh et al., 2016; Hempel and Chugh, 2016), code repair (Singh et al., 2013; Jobstmann et al., 2005; Nguyen et al., 2013), superoptimization (Phothilimthana et al., 2016; Massalin, 1987; Joshi et al., 2002; Bansal and Aiken, 2008), and others. In these cases, solutions are sought for restricted target problem domains such as string manipulation (Gulwani, 2011), bit-vector programs (Jha et al., 2010), optimized code implementations at the ISA level (Phothilimthana et al., 2016), etc. Moreover, many program synthesis approaches are restricted to straight-line code fragments (Phothilimthana et al., 2016; Bansal and Aiken, 2008; Balog et al., 2016). While there exist works that target loop-based code, they are either restricted to SIMD code (Barthe et al., 2013) or to synthesis of loop bodies (e.g., within templates (Srivastava et al., 2013) or sketches (Solar-Lezama et al., 2007)); the majority of related work, unlike AAD, focuses on loop-free programs (Jha et al., 2010; Gulwani et al., 2011). However, program synthesis tools like SYNQUID (Polikarpova et al., 2016), MYTH (Osera and Zdancewic, 2015), and Leon (Kneuss et al., 2013) can generate recursive and/or higher-order programs that can be as expressive as loop-enabled approaches, in frameworks with formal specifications. AAD enables similar capabilities for the general-purpose Python language with its support for loops, function composition, and complex control flow.
Two of the main challenges in program synthesis are the intractability of the program search space and accurately expressing user intent. Many search techniques have been proposed to address the former: enumerative, deduction-based (Polozov and Gulwani, 2015), and constraint solving (Solar-Lezama, 2008; Srivastava et al., 2013), as well as stochastic search techniques (Menon et al., 2013; Koza, 1994). Stochastic search techniques include many modern approaches that employ machine learning (Menon et al., 2013; Liang et al., 2010) and neural program synthesis (Balog et al., 2016; Feng et al., 2018; Vijayakumar et al., 2018; Zhang et al., 2018). In comparison, AAD uses a modified evolutionary approach that relies on PGE without requiring a fitness function. As far as expressing user intent is concerned, different program synthesis techniques use formal logical specifications (mostly deduction-based techniques), informal natural language descriptions, and templates and sketches, among others. In AAD, we provide the specification in the form of a program (called a Checker) along with test inputs. This is similar to oracle-guided synthesis (Jha et al., 2010), where an oracle produces the correct output, and to reference implementations in SKETCH (Solar-Lezama, 2008).
To show the effectiveness of AAD, we use array-based problems similar to the benchmarks (e.g., array-search) in the SyGuS-Comp competition (syg, [n. d.]). However, solvers such as Sketch-based, enumerative, or stochastic ones do not scale up to large array sizes (Alur et al., 2013), although it may be conceptually possible to extend Sketch-based templates to express the grammar we use in order to support large arrays. AAD's support for loops allows it to handle input arrays of any (nonzero) size, and hence to be as effective as frameworks that support recursion and/or higher-order operators.
Overall, our work complements and builds upon prior approaches that use composition. Bladek and Krawiec (Bladek and Krawiec, 2016) propose a similar genetic programming approach of simultaneously synthesizing multiple functions, and briefly explain the concept with four simple examples (last, patch, splice, splitAt). AAD extends this to much more complex problems with PGE and the associated evolutionary strategies. Although other works, like SYNQUID (Polikarpova et al., 2016), utilize components in the process of program synthesis, PGE does not require the user to specify any underlying order (dependence) between the constituent components/functions. Moreover, not specifying dependencies allows AAD to discover outside-of-the-box solutions. Compared to other works (e.g., deduction-based approaches), program equivalence in AAD is not formally proven. This is typical in similar approaches of counterexample-guided synthesis, where "the best validation oracle for most domains and specifications tends to be the user" (Gulwani et al., 2017), who can inspect the program under consideration. We also emphasize that formal verification is not a prerequisite for many useful applications, especially knowledge discovery for AI (e.g., for a robot to find a way to sort objects for packing). Human knowledge in general is inductive in nature. After all, biological evolution produced organisms as complex and intelligent as humans without anyone writing a formal specification.
6. Discussion & Future Work
This section discusses limitations of AAD, alternative ways of guiding evolution, and potential applications of AAD.
6.1. Limitations of AAD
A large search space is a challenge for search-based synthesizers (Gulwani et al., 2017). For AAD, this is especially true due to the addition of function calls as new solutions are discovered. AAD depends on guiding to address this challenge, and Section 6.2 outlines several possible ways of guiding.
The problems solved in this paper mostly require regular control flow (except FirstIndOf and RemoveF). Programs that require complicated control flow may take considerably more time to be discovered. Similarly, algorithms that depend on a very specific value (e.g., if x > 3.5) are hard for AAD to discover, unless those values are present. AAD is more suitable for performing permutations and combinations of already available inputs, with straightforward numerical processing. A solution to the above deficiency is proper library support. For instance, although it may be difficult for AAD to produce an algorithm for the FFT (Fast Fourier Transform) on its own, it should be able to call an FFT implementation in a library and use that to solve other problems.
6.2. Guiding the Hand of Evolution
Grouping problems for PGE can be achieved in several ways. First, we can imagine humans (domain experts) doing the guiding. For instance, future ‘programmers’ or scientists could simply suggest that AAD use problems A and B to come up with a solution for problem C (e.g., "try using dot product to come up with an algorithm for matrix-vector multiply"). Notice that this is quite analogous to the way we teach children to discover solutions to problems on their own ("try using a screwdriver instead of a hammer"). Similarly, a researcher who wants to come up with a hypothesis, or a programmer who wants to come up with a heuristic, may be able to make some suggestions and let AAD discover an algorithm, especially a non-obvious one, based on that guidance. If the Checker is based on past data or sensor data from the physical environment, this strategy could be applied to many real-world problems without having to write a Checker or a ProbGen, as we discussed in Section 2.2. This would be an entirely new way to "program" computers and build scientific models, and we intend to pursue this further.
Second, we can imagine other AI programs doing the guiding, especially in restricted domains where AI systems can guess the components of a solution based on the domain, but not the exact algorithm (Balog et al., 2016).
6.3. Other Potential Applications
Conceptually, AAD can also be used for program translation. If we have a routine written in C, assembly, or even binary, we can execute that routine as a Checker for AAD to produce code in Python (or a similar language). This is akin to a machine learning an algorithm just by observing how another one behaves (i.e., how it responds to inputs). Incidentally, the Python code shown in this paper can be considered Python-to-Python translation, since the Checker is a different Python implementation.
AAD could be more than a program synthesizer: it could be used to acquire intrinsic knowledge for machines. The caller-callee graph (Table 2) and the parent-child graph (Table 4) capture inherent relationships between different problems. For instance, we can see that min/max is related to sorting, and dot product to matrix-vector multiplication. These relationships are discovered by AAD itself and can be thought of as one representation of associative memory among actions, similar to what human brains construct (e.g., getting ready in the morning is associated with brushing teeth, dressing up, etc.). Since AAD allows incremental expansion of knowledge by introducing more and more problems, with a proper guiding mechanism we may be able to guide autonomous systems to acquire a large number of skills (algorithms) and build a knowledge representation on their own, the same way we guide our children to acquire a large body of skills and knowledge by presenting them with many problems and challenges throughout their childhood.
7. Conclusion
We presented AAD, an evolutionary framework for synthesizing programs of high complexity. Using a basic subset of the Python language as grammar, AAD allowed us to synthesize code for 29 array/vector problems, ranging from min, max, and reverse to more challenging problems like sorting and matrix-vector multiplication, without input size restrictions. AAD's use of problem-guided evolution (PGE) and related evolutionary strategies made this possible. We evaluated the effectiveness of these strategies and presented evidence of the outside-of-the-box problem-solving skills of AAD. We also demonstrated how to use HPC techniques to deal with the challenges posed by complex requirements. Overall, we show that evolutionary algorithms with PGE are capable of solving problems of similar or higher complexity compared to the state of the art.
References
 syg ([n. d.]) [n. d.]. Syntax-Guided Synthesis Competition. http://www.sygus.org/.
 Alur et al. (2013) R. Alur, R. Bodik, G. Juniwal, M. M. K. Martin, M. Raghothaman, S. A. Seshia, R. Singh, A. Solar-Lezama, E. Torlak, and A. Udupa. 2013. Syntax-guided synthesis. In 2013 Formal Methods in Computer-Aided Design. 1–8.
 Balog et al. (2016) Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow. 2016. DeepCoder: Learning to Write Programs. CoRR abs/1611.01989 (2016). arXiv:1611.01989
 Bansal and Aiken (2008) Sorav Bansal and Alex Aiken. 2008. Binary Translation Using Peephole Superoptimizers. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI’08). USENIX Association, Berkeley, CA, USA, 177–192.
 Barthe et al. (2013) Gilles Barthe, Juan Manuel Crespo, Sumit Gulwani, Cesar Kunz, and Mark Marron. 2013. From Relational Verification to SIMD Loop Synthesis. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP ’13). ACM, New York, NY, USA, 123–134.

 Bladek and Krawiec (2016) Iwo Bladek and Krzysztof Krawiec. 2016. Simultaneous Synthesis of Multiple Functions Using Genetic Programming with Scaffolding. In Proceedings of the 2016 Genetic and Evolutionary Computation Conference Companion (GECCO '16 Companion). ACM, New York, NY, USA, 97–98. https://doi.org/10.1145/2908961.2908992
 Brent (1980) Richard P Brent. 1980. An improved Monte Carlo factorization algorithm. BIT Numerical Mathematics 20, 2 (1980), 176–184.
 Bringmann et al. (1993) Roger A Bringmann, Scott A Mahlke, Richard E Hank, John C Gyllenhaal, and Wen-mei W. Hwu. 1993. Speculative execution exception recovery using write-back suppression. In Proceedings of the 26th Annual International Symposium on Microarchitecture. IEEE, 214–223.
 Chaudhuri et al. (2016) A. Chaudhuri, K. Mandaviya, P. Badelia, and S.K. Ghosh. 2016. Optical Character Recognition Systems for Different Languages with Soft Computing. Springer International Publishing, 53.
 Chugh et al. (2016) Ravi Chugh, Brian Hempel, Mitchell Spradlin, and Jacob Albers. 2016. Programmatic and Direct Manipulation, Together at Last. SIGPLAN Not. 51, 6 (June 2016), 341–354.
 Dwyer and Torng (1992) Harry Dwyer and Hwa C Torng. 1992. An out-of-order superscalar processor with speculative execution and fast, precise interrupts. ACM SIGMICRO Newsletter 23, 12 (1992), 272–281.
 Feng et al. (2018) Yu Feng, Ruben Martins, Osbert Bastani, and Isil Dillig. 2018. Program Synthesis Using Conflict-driven Learning. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). ACM, New York, NY, USA, 420–435.
 Frankle et al. (2016) Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic. 2016. Example-directed Synthesis: A Type-theoretic Interpretation. SIGPLAN Not. 51, 1 (Jan. 2016), 802–815. https://doi.org/10.1145/2914770.2837629
 Gammie (2015) Peter Gammie. 2015. The Tortoise and Hare Algorithm. Archive of Formal Proofs (Nov. 2015). http://isa-afp.org/entries/TortoiseHare.html, Formal proof development.
 Gulwani (2011) Sumit Gulwani. 2011. Automating String Processing in Spreadsheets Using Input-output Examples. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '11). ACM, New York, NY, USA, 317–330.
 Gulwani et al. (2011) Sumit Gulwani, Susmit Jha, Ashish Tiwari, and Ramarathnam Venkatesan. 2011. Synthesis of Loop-free Programs. SIGPLAN Not. 46, 6 (June 2011), 62–73.
 Gulwani et al. (2017) Sumit Gulwani, Oleksandr Polozov, and Rishabh Singh. 2017. Program Synthesis. Foundations and Trends® in Programming Languages 4, 1-2 (2017), 1–119.
 Gvero et al. (2013) Tihomir Gvero, Viktor Kuncak, Ivan Kuraj, and Ruzica Piskac. 2013. Complete Completion Using Types and Weights. SIGPLAN Not. 48, 6 (June 2013), 27–38.
 Hempel and Chugh (2016) Brian Hempel and Ravi Chugh. 2016. Semi-Automated SVG Programming via Direct Manipulation. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology (UIST '16). ACM, New York, NY, USA, 379–390.
 Ivancevic and Ivancevic (2007) V.G. Ivancevic and T.T. Ivancevic. 2007. Computational Mind: A Complex Dynamics Perspective. Springer Berlin Heidelberg, 243.
 Jha et al. (2010) Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari. 2010. Oracle-guided Component-based Program Synthesis. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1 (ICSE '10). ACM, New York, NY, USA, 215–224.
 Jobstmann et al. (2005) Barbara Jobstmann, Andreas Griesmayer, and Roderick Bloem. 2005. Program Repair As a Game. In Proceedings of the 17th International Conference on Computer Aided Verification (CAV'05). Springer-Verlag, Berlin, Heidelberg, 226–238.
 Johnson (2007) Colin G. Johnson. 2007. Genetic Programming with Fitness Based on Model Checking. In Genetic Programming, Marc Ebner, Michael O’Neill, Anikó Ekárt, Leonardo Vanneschi, and Anna Isabel EsparciaAlcázar (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 114–124.
 Joshi et al. (2002) Rajeev Joshi, Greg Nelson, and Keith Randall. 2002. Denali: A Goal-directed Superoptimizer. SIGPLAN Not. 37, 5 (May 2002), 304–314.
 Katz and Peled (2008) Gal Katz and Doron Peled. 2008. Genetic Programming and Model Checking: Synthesizing New Mutual Exclusion Algorithms. In Automated Technology for Verification and Analysis, Sungdeok (Steve) Cha, JinYoung Choi, Moonzoo Kim, Insup Lee, and Mahesh Viswanathan (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 33–47.
 Kneuss et al. (2013) Etienne Kneuss, Ivan Kuraj, Viktor Kuncak, and Philippe Suter. 2013. Synthesis Modulo Recursive Functions. SIGPLAN Not. 48, 10 (Oct. 2013), 407–426. https://doi.org/10.1145/2544173.2509555
 Koza (1992) John R. Koza. 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA.
 Koza (1994) John R. Koza. 1994. Genetic programming as a means for programming computers by natural selection. Statistics and Computing 4, 2 (01 Jun 1994), 87–112.
 LaBel and Gates (1996) K. A. LaBel and M. M. Gates. 1996. Single-event-effect mitigation from a system perspective. IEEE Transactions on Nuclear Science 43, 2 (April 1996), 654–660.
 Le and Gulwani (2014) Vu Le and Sumit Gulwani. 2014. FlashExtract: A Framework for Data Extraction by Examples. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 542–553.
 Lee et al. (2018) Woosuk Lee, Kihong Heo, Rajeev Alur, and Mayur Naik. 2018. Accelerating searchbased program synthesis using learned probabilistic models. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, 436–449.
 Liang et al. (2010) Percy Liang, Michael I. Jordan, and Dan Klein. 2010. Learning Programs: A Hierarchical Bayesian Approach. In Proceedings of the 27th International Conference on International Conference on Machine Learning (ICML’10). Omnipress, USA, 639–646.
 Massalin (1987) Henry Massalin. 1987. Superoptimizer: A Look at the Smallest Program. In Proceedings of the Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS II). IEEE Computer Society Press, Los Alamitos, CA, USA, 122–126.
 Menon et al. (2013) Aditya Krishna Menon, Omer Tamuz, Sumit Gulwani, Butler Lampson, and Adam Kalai. 2013. A Machine Learning Framework for Programming by Example. Int'l Conf. Machine Learning, Vol. 28. 187–195.
 Nguyen et al. (2013) H. D. T. Nguyen, D. Qi, A. Roychoudhury, and S. Chandra. 2013. SemFix: Program repair via semantic analysis. In 2013 35th International Conference on Software Engineering (ICSE). 772–781.
 Osera and Zdancewic (2015) Peter-Michael Osera and Steve Zdancewic. 2015. Type-and-example-directed Program Synthesis. SIGPLAN Not. 50, 6 (June 2015), 619–630.
 Perelman et al. (2012) Daniel Perelman, Sumit Gulwani, Thomas Ball, and Dan Grossman. 2012. Type-directed Completion of Partial Expressions. SIGPLAN Not. 47, 6 (June 2012), 275–286.
 Phothilimthana et al. (2016) Phitchaya Mangpo Phothilimthana, Aditya Thakur, Rastislav Bodik, and Dinakar Dhurjati. 2016. Scaling Up Superoptimization. SIGPLAN Not. 51, 4 (March 2016), 297–310.
 Polikarpova et al. (2016) Nadia Polikarpova, Ivan Kuraj, and Armando Solar-Lezama. 2016. Program Synthesis from Polymorphic Refinement Types. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16). ACM, New York, NY, USA, 522–538.
 Polozov (2018) Alex Polozov. 2018. Program Synthesis in 2017-18. https://alexpolozov.com/blog/programsynthesis2018/.
 Polozov and Gulwani (2015) Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A Framework for Inductive Program Synthesis. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2015). ACM, New York, NY, USA, 107–126.
 Purves et al. (2003) William K. Purves, David E Sadava, Gordon H. Orians, and H. Craig Heller. 2003. Life: the Science of Biology, 7th Edition. Sinauer Associates and W. H. Freeman.
 Singh and Gulwani (2012) Rishabh Singh and Sumit Gulwani. 2012. Synthesizing Number Transformations from Input-output Examples. In Proceedings of the 24th International Conference on Computer Aided Verification (CAV’12). Springer-Verlag, Berlin, Heidelberg, 634–651.
 Singh and Gulwani (2015) Rishabh Singh and Sumit Gulwani. 2015. Predicting a Correct Program in Programming by Example. In Computer Aided Verification, Daniel Kroening and Corina S. Păsăreanu (Eds.). Springer International Publishing, Cham, 398–414.
 Singh and Gulwani (2016) Rishabh Singh and Sumit Gulwani. 2016. Transforming Spreadsheet Data Types Using Examples. SIGPLAN Not. 51, 1 (Jan. 2016), 343–356.
 Singh et al. (2013) Rishabh Singh, Sumit Gulwani, and Armando SolarLezama. 2013. Automated Feedback Generation for Introductory Programming Assignments. SIGPLAN Not. 48, 6 (June 2013), 15–26.
 Solar-Lezama (2008) Armando Solar-Lezama. 2008. Program Synthesis by Sketching. Ph.D. Dissertation. Berkeley, CA, USA. Advisor(s) Bodik, Rastislav. AAI3353225.
 Solar-Lezama et al. (2007) Armando Solar-Lezama, Gilad Arnold, Liviu Tancau, Rastislav Bodik, Vijay Saraswat, and Sanjit Seshia. 2007. Sketching Stencils. SIGPLAN Not. 42, 6 (June 2007), 167–178.
 Srivastava et al. (2013) Saurabh Srivastava, Sumit Gulwani, and Jeffrey S. Foster. 2013. Template-based program verification and program synthesis. International Journal on Software Tools for Technology Transfer 15, 5 (01 Oct 2013), 497–518.
 Torlak and Bodik (2013) Emina Torlak and Rastislav Bodik. 2013. Growing Solver-aided Languages with Rosette. In Proceedings of the 2013 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software (Onward! 2013). ACM, New York, NY, USA, 135–152.
 Vijayakumar et al. (2018) Ashwin J. Vijayakumar, Abhishek Mohta, Oleksandr Polozov, Dhruv Batra, Prateek Jain, and Sumit Gulwani. 2018. Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples. CoRR abs/1804.01186 (2018). arXiv:1804.01186
 Wall (1994) David W Wall. 1994. Speculative execution and instruction-level parallelism. Technical report, Western Research Laboratory (1994).
 Wang and Patel (2006) N. J. Wang and S. J. Patel. 2006. ReStore: Symptom-Based Soft Error Detection in Microprocessors. IEEE Transactions on Dependable and Secure Computing 3, 3 (July 2006), 188–201.
 Weaver and Austin (2001) C. Weaver and T. Austin. 2001. A fault tolerant approach to microprocessor design. In 2001 International Conference on Dependable Systems and Networks. 411–420.

 Zhang et al. (2018) Lisa Zhang, Gregory Rosenblatt, Ethan Fetaya, Renjie Liao, William E. Byrd, Raquel Urtasun, and Richard Zemel. 2018. Leveraging Constraint Logic Programming for Neural Guided Program Synthesis. https://openreview.net/forum?id=HJIHtIJvz
Appendix A: Problem Definitions
Table 1 lists all three groups of problems used in this work.
Problem  Description 

GroupA  
Max  NUM = Max(ARR). Returns maximum of an array 
Min  NUM = Min(ARR). Returns minimum of an array 
SortDesc  ARR = SortDesc(ARR). Returns a sorted array in descending order 
SortAsc  ARR = SortAsc(ARR). Returns a sorted array in ascending order 
ReverseArr  ARR = ReverseArr(ARR). Returns a reversed array 
RemoveL  ARR = RemoveL(ARR, NUM). Removes the last occurrence of given number* in array 
RemoveF  ARR = RemoveF(ARR, NUM). Removes the first occurrence of given number* in array 
LastIndOf  NUM = LastIndOf(ARR, NUM). Returns the last index of given number* in array 
FirstIndOf  NUM = FirstIndOf(ARR, NUM). Returns the first index of given number* in array 
IsInArr  BOOL = IsInArr(ARR, NUM). Returns whether a given number is in array 
GroupB  
AddArrays  ARR = AddArrays(ARR, ARR). Adds corresponding elements of two arrays together 
MultArrays  ARR = MultArrays(ARR, ARR). Multiplies corresponding elements of two arrays together 
Sum  NUM = Sum(ARR). Returns sum of elements of an array 
SumOfSq  NUM = SumOfSq(ARR). Returns sum of each element squared in array 
DotProd  NUM = DotProd(ARR, ARR). Returns dot product of two arrays 
MatVecMult  ARR = MatVecMult(AoA, ARR). Returns result of a matrixvector multiply 
AddToArr  ARR = AddToArr(ARR, NUM). Adds a number to each element of an array 
SubFromArr  ARR = SubFromArr(ARR, NUM). Subtracts a number from each element of an array 
ScaleArr  ARR = ScaleArr(ARR, NUM). Multiplies each element of an array by a number 
ScaledSum  NUM = ScaledSum(ARR, NUM). Multiplies each element of an array by a number and sums the result 
GroupC  
CountEQ  NUM = CountEQ(ARR, NUM). Returns number of elements equal to a given number in array 
CountLT  NUM = CountLT(ARR, NUM). Returns number of elements less than a given number in array 
CountGT  NUM = CountGT(ARR, NUM). Returns number of elements greater than a given number in array 
SumIfLT  NUM = SumIfLT(ARR, NUM). Returns the sum of elements less than a given number in array 
SumIfGT  NUM = SumIfGT(ARR, NUM). Returns the sum of elements greater than a given number in array 
SumIfEQ  NUM = SumIfEQ(ARR, NUM). Returns the sum of elements equal to a given number in array 
ScaledAvg  NUM = ScaledAvg(ARR, NUM). Returns the average of array elements scaled by a number 
Sum  NUM = Sum(ARR). Returns sum of elements of an array 
Len  NUM = Len(ARR). Returns the length of array 
Avg  NUM = Avg(ARR). Returns the average of array 
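To make the Table 1 signatures concrete, the intended semantics of a sampling of problems can be written as one-line plain-Python references. These are our own hand-written sketches for cross-checking, not AAD-discovered solutions; only the names mirror Table 1:

```python
# Plain-Python reference semantics for a sampling of Table 1 problems.
# Hand-written cross-checks, not AAD output.

def Max(arr):              return max(arr)
def SortDesc(arr):         return sorted(arr, reverse=True)
def ReverseArr(arr):       return arr[::-1]
def FirstIndOf(arr, num):  return arr.index(num)
def IsInArr(arr, num):     return num in arr
def DotProd(a, b):         return sum(x * y for x, y in zip(a, b))
def ScaleArr(arr, num):    return [x * num for x in arr]
def CountLT(arr, num):     return sum(1 for x in arr if x < num)
def SumIfGT(arr, num):     return sum(x for x in arr if x > num)
def Avg(arr):              return sum(arr) // len(arr)  # integer average, as in Appendix D
```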
Appendix B: Composition Graphs
Table 2 lists caller-callee relationships for all three groups of problems.
GroupA (row = caller, column = callee)  Max  Min  SortDesc  SortAsc  ReverseArr  RemoveL  RemoveF  LastIndOf  FirstIndOf  IsInArr 
Max  
Min  
SortDesc  57  14  43  43  14  57  
SortAsc  4  8  94  94  6  8  18  
ReverseArr  
RemoveL  2  1  26  22  79  10  
RemoveF  1  1  59  61  43  14  
LastIndOf  2  
FirstIndOf  2  
IsInArr  2  3  3  1  18  61  3  
GroupB (row = caller, column = callee)  AddArrays  MultArrays  Sum  SumOfSq  DotProd  MatVecMult  AddToArr  SubFromArr  ScaleArr  ScaledSum 
AddArrays  2  
MultArrays  
Sum  
SumOfSq  3  32  29  70  3  2  2  5  
DotProd  100  93  1  1  1  8  
MatVecMult  100  13  
AddToArr  46  2  
SubFromArr  2  1  1  2  86  2  
ScaleArr  49  
ScaledSum  100  1  
GroupC (row = caller, column = callee)  CountEQ  CountLT  CountGT  SumIfLT  SumIfGT  SumIfEQ  ScaledAvg  Sum  Len  Avg 
CountEQ  1  1  99  
CountLT  100  
CountGT  100  
SumIfLT  100  
SumIfGT  100  
SumIfEQ  100  
ScaledAvg  12  12  88  
Sum  
Len  4  
Avg  37  64  64 
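Each entry in Table 2 can be read as the percentage of discovered solutions to a row's problem whose call tree reaches the column's problem. A tally of this kind can be sketched as follows; the `solutions` data layout and the figures in it are hypothetical illustrations, not actual AAD internals:

```python
from collections import Counter

# Hypothetical layout: for each problem, the set of callees appearing in each
# of its discovered solutions (illustrative data only).
solutions = {
    "DotProd": [{"MultArrays", "Sum"}, {"MultArrays", "Sum"}, {"MultArrays"}],
    "SumOfSq": [{"DotProd"}, {"MultArrays", "Sum"}],
}

def composition_row(problem):
    """Percentage of `problem` solutions that call each other problem."""
    sols = solutions[problem]
    counts = Counter(callee for s in sols for callee in s)
    return {callee: round(100 * n / len(sols)) for callee, n in counts.items()}

row = composition_row("DotProd")
```

With the toy data above, `row` maps MultArrays to 100 and Sum to 67, i.e. every DotProd solution calls MultArrays and two of three also call Sum.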
Appendix C: Parent-Child Graphs
Table 3 lists parent-child relationships for all three groups of problems.
GroupA  Max  Min  SortDesc  SortAsc  ReverseArr  RemoveL  RemoveF  LastIndOf  FirstIndOf  IsInArr 
Max  66  
Min  68  
SortDesc  
SortAsc  
ReverseArr  
RemoveL  
RemoveF  
LastIndOf  77  
FirstIndOf  99  
IsInArr  
GroupB  AddArrays  MultArrays  Sum  SumOfSq  DotProd  MatVecMult  AddToArr  SubFromArr  ScaleArr  ScaledSum 
AddArrays  
MultArrays  
Sum  
SumOfSq  
DotProd  
MatVecMult  
AddToArr  
SubFromArr  
ScaleArr  15  13  
ScaledSum  
GroupC  CountEQ  CountLT  CountGT  SumIfLT  SumIfGT  SumIfEQ  ScaledAvg  Sum  Len  Avg 
CountEQ  38  33  1  6  19  
CountLT  43  33  2  3  16  
CountGT  43  36  1  3  15  
SumIfLT  2  41  56  
SumIfGT  1  1  43  53  1  
SumIfEQ  1  42  53  1  
ScaledAvg  22  25  34  
Sum  
Len  5  
Avg 
Appendix D: Code Examples
Code for each problem is given below. For brevity, only the main function is listed (i.e., the entire call tree is not shown for each solution). Note that many other solutions are often found in addition to the one shown.
1. GroupA Problems
def Max(arg0):
    arr_10 = arg0.copy()
    num_11 = arr_10[0]
    for num_12 in tuple(arr_10):
        bool_14 = (num_12 > num_11)
        if (bool_14):
            num_11 = num_12
    num_12 = num_11
    return (num_12)
def Min(arg0):
    arr_10 = arg0.copy()
    num_11 = arr_10[0]
    for num_12 in tuple(arr_10):
        bool_14 = (num_12 < num_11)
        if (bool_14):
            num_11 = num_12
    num_12 = num_11
    return (num_12)
def SortDesc(arg0):
    arr_10 = arg0.copy()
    arr_15 = list()
    for num_13 in tuple(arr_10):
        num_16 = Max(arr_10)
        arr_15.append(num_16)
        arr_10 = RemoveF(arr_10, num_16)
    return (arr_15)
def SortAsc(arg0):
    arr_10 = arg0.copy()
    arr_19 = SortDesc(arr_10)
    arr_13 = ReverseArr(arr_19)
    arr_12 = arr_13.copy()
    return (arr_12)
def ReverseArr(arg0):
    arr_10 = arg0.copy()
    arr_17 = list()
    for num_12 in tuple(arr_10):
        num_13 = arr_10.pop()
        arr_17.append(num_13)
    return (arr_17)
def RemoveL(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    num_15 = LastIndOf(arr_10, num_11)
    num_14 = arr_10.pop(num_15)
    bool_12 = (num_11 < num_14)
    return (arr_10)
def RemoveF(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    num_17 = FirstIndOf(arr_10, num_11)
    num_16 = arr_10.pop(num_17)
    bool_12 = (num_17 < num_16)
    return (arr_10)
def LastIndOf(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    for num_14, num_15 in enumerate(tuple(arr_10)):
        bool_17 = (num_15 == num_11)
        if (bool_17):
            arr_10.append(num_14)
    num_12 = arr_10.pop()
    return (num_12)
def FirstIndOf(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_17 = list()
    for num_13, num_14 in enumerate(tuple(arr_10)):
        bool_16 = (num_14 == num_11)
        if (bool_16):
            arr_17.append(num_13)
    num_12 = arr_17[0]
    return (num_12)
def IsInArr(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    for num_12 in tuple(arr_10):
        bool_14 = (num_12 == num_11)
        if (bool_14):
            num_11 = arr_10[-1]  # mark as found: the final comparison against the last element succeeds
    bool_13 = (num_12 == num_11)
    return (bool_13)
2. GroupB Problems
def AddArrays(arg0, arg1):
    arr_10 = arg0.copy()
    arr_11 = arg1.copy()
    for num_13 in tuple(arr_10):
        num_14 = arr_11.pop(0)
        num_15 = num_13 + num_14
        arr_11.append(num_15)
    return (arr_11)
def MultArrays(arg0, arg1):
    arr_10 = arg0.copy()
    arr_11 = arg1.copy()
    for num_13 in tuple(arr_10):
        num_14 = arr_11.pop(0)
        num_16 = arr_10.pop(0)
        num_15 = num_16 * num_14
        arr_11.append(num_15)
    return (arr_11)
def Sum(arg0):
    arr_10 = arg0.copy()
    num_14 = 0
    for num_12 in tuple(arr_10):
        num_14 = num_12 + num_14
    return (num_14)
def SumOfSq(arg0):
    arr_10 = arg0.copy()
    arr_14 = arr_10.copy()
    num_12 = DotProd(arr_14, arr_14)
    return (num_12)
def DotProd(arg0, arg1):
    arr_10 = arg0.copy()
    arr_11 = arg1.copy()
    arr_14 = MultArrays(arr_10, arr_11)
    num_13 = Sum(arr_14)
    return (num_13)
def MatVecMult(arg0, arg1):
    arr_of_arr10 = arg0
    arr_11 = arg1.copy()
    arr_16 = list()
    for arr_15 in tuple(arr_of_arr10):
        num_17 = DotProd(arr_11, arr_15)
        arr_16.append(num_17)
    arr_14 = arr_16.copy()
    return (arr_14)
def AddToArr(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    for num_12 in tuple(arr_10):
        num_15 = arr_10.pop(0)
        num_14 = num_11 + num_15
        arr_10.append(num_14)
    return (arr_10)
def SubFromArr(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    for num_12 in tuple(arr_10):
        num_13 = arr_10.pop(0)
        num_14 = num_13 - num_11
        arr_10.append(num_14)
    return (arr_10)
def ScaleArr(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_15 = list()
    for num_12 in tuple(arr_10):
        num_14 = num_12 * num_11
        arr_15.append(num_14)
    return (arr_15)
def ScaledSum(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    num_13 = Sum(arr_10)
    num_12 = num_13 * num_11
    return (num_12)
3. GroupC Problems
def CountEQ(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_14 = list()
    for num_15 in tuple(arr_10):
        bool_17 = (num_11 == num_15)
        if (bool_17):
            arr_14.append(num_15)
            arr_10.append(num_15)
    num_12 = Len(arr_14)
    return (num_12)
def CountLT(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_14 = list()
    for num_15 in tuple(arr_10):
        bool_17 = (num_11 > num_15)
        if (bool_17):
            arr_14.append(num_15)
            arr_10.append(num_15)
    num_12 = Len(arr_14)
    return (num_12)
def CountGT(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_14 = list()
    for num_15 in tuple(arr_10):
        bool_17 = (num_15 > num_11)
        if (bool_17):
            arr_14.append(num_15)
            arr_10.append(num_11)
    num_12 = Len(arr_14)
    return (num_12)
def SumIfLT(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_14 = list()
    for num_13 in tuple(arr_10):
        bool_15 = (num_11 <= num_13)
        if (bool_15):
            num_13 = 0
        arr_14.append(num_13)
    num_12 = Sum(arr_14)
    return (num_12)
def SumIfGT(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_14 = list()
    for num_13 in tuple(arr_10):
        bool_15 = (num_11 >= num_13)
        if (bool_15):
            num_13 = 0
        arr_14.append(num_13)
    num_12 = Sum(arr_14)
    return (num_12)
def SumIfEQ(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    arr_14 = list()
    for num_13 in tuple(arr_10):
        bool_15 = (num_11 != num_13)
        if (bool_15):
            num_13 = 0
        arr_14.append(num_13)
    num_12 = Sum(arr_14)
    return (num_12)
def ScaledAvg(arg0, arg1):
    arr_10 = arg0.copy()
    num_11 = arg1
    for num_13 in tuple(arr_10):
        num_14 = arr_10.pop(0)
        num_15 = num_14 * num_11
        arr_10.append(num_15)
    num_12 = Avg(arr_10)
    return (num_12)
def Sum(arg0):
    arr_10 = arg0.copy()
    num_17 = 0
    for num_12 in tuple(arr_10):
        num_17 = num_12 + num_17
    return (num_17)
def Len(arg0):
    arr_10 = arg0.copy()
    for num_14, num_15 in enumerate(tuple(arr_10)):
        pass
    num_13 = 1
    num_12 = num_13 + num_14
    return (num_12)
def Avg(arg0):
    arr_10 = arg0.copy()
    arr_20 = arr_10.copy()
    num_18 = Sum(arr_20)
    num_17 = Len(arr_10)
    num_13 = num_18 // num_17
    num_12 = num_13
    return (num_12)
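The Sum, Len, and Avg solutions above depend only on one another, so they can be exercised verbatim as a quick sanity check (note the integer division in Avg, and that Len fails on an empty array because num_14 is never bound):

```python
def Sum(arg0):
    arr_10 = arg0.copy()
    num_17 = 0
    for num_12 in tuple(arr_10):
        num_17 = num_12 + num_17
    return (num_17)

def Len(arg0):
    arr_10 = arg0.copy()
    for num_14, num_15 in enumerate(tuple(arr_10)):
        pass  # num_14 ends up holding the last index
    num_13 = 1
    num_12 = num_13 + num_14
    return (num_12)

def Avg(arg0):
    arr_10 = arg0.copy()
    arr_20 = arr_10.copy()
    num_18 = Sum(arr_20)
    num_17 = Len(arr_10)
    num_13 = num_18 // num_17  # integer average
    num_12 = num_13
    return (num_12)

print(Sum([1, 2, 3]), Len([4, 5]), Avg([2, 4, 6]))  # 6 2 4
```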
Appendix E: Source Code & Result Files
1. Source Code
Please contact the authors for the source code (main.py). It can be run with Python 3.6 or later using the following command:
python main.py groupID
where groupID is 1, 2, or 3 for GroupA, GroupB, and GroupC, respectively. The run prints progress on stdout and, at the end, generates a report (containing least-complex results, callees, parents, a stat record for each solution, etc.) in the current directory, a checkpoint in the ./chkpts directory, and a detailed log file for each rank in the ./log.nodename directory. On machines with fewer cores (we used 112), the number of epochs must be increased proportionately for all solutions to be found.
After one or more such runs, the resulting checkpoints can be read and least-complex solutions constructed for reporting purposes by running
python main.py groupID -1
Note that the least-complex code produced in this step is composed from the least-complex result found for each problem, and the composed code is not currently validated by a Checker. It is for reporting purposes only and must be inspected by the user.
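Since the composed code is not passed through a Checker, a user who wants more than visual inspection can wrap it in a small input/output harness of their own. The sketch below illustrates the idea only; the `check` helper, the oracle, and the `composed_SortDesc` stand-in are hypothetical and not part of main.py:

```python
import random

def check(candidate, oracle, arg_maker, trials=100):
    """Compare a composed solution against a trusted oracle on random inputs."""
    for _ in range(trials):
        args = arg_maker()
        # Copy list arguments so the candidate cannot mutate the oracle's inputs.
        cand_args = [a.copy() if isinstance(a, list) else a for a in args]
        if candidate(*cand_args) != oracle(*args):
            return False
    return True

def composed_SortDesc(arr):
    """Stand-in for code composed from a checkpoint (illustrative only)."""
    out = []
    work = arr.copy()
    while work:
        m = max(work)
        out.append(m)
        work.remove(m)
    return out

ok = check(composed_SortDesc,
           lambda arr: sorted(arr, reverse=True),
           lambda: ([random.randint(-9, 9) for _ in range(random.randint(1, 8))],))
```

Here `ok` is True when the composed candidate agrees with Python's `sorted` on all random trials.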
2. Result Files
Please contact the authors for the result files.