Smooth interoperability between interpreters from different vendors and the various polyfills is ensured by the common ECMAScript standard and its test suite. While there is a formally verified reference interpreter for the core language, which closely follows the natural-language specification (Bodin et al., 2014), all fully-fledged implementations in browsers and other systems rely on test suites to ensure conformance. The main mechanism for validating conformance to the ECMAScript standard is Test262 (ECMA International, 2017), a manually curated test suite with the goal of covering all observable behavior of the ECMAScript specification.
We present a methodology for the automated generation of conformance tests from polyfills. We employ differential testing across multiple implementations to compensate for the lack of testing oracles (§ 3).
Overall, we believe that this can lower the bar for maintaining standardization test suites like Test262 in the future. New language features are regularly implemented in polyfills before standardization, and our approach allows generating corresponding tests as a byproduct.
Objects are maps from string property names to values. Objects are constructed dynamically and have no pre-set structure. An object literal constructs a new object with a single property, which can be assigned to a variable x; a subsequent property assignment leaves x with two properties set. Values of any type can be assigned to object properties, including other objects, arrays, and functions. For example, the following code creates a new function and assigns it to a property on x:
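The original listings are not shown; the following is an illustrative sketch (the property names are our own):

```js
var x = { a: 1 };     // construct an object with a single property
x.b = 2;              // x now has two properties, a and b

x.f = function () {   // functions are ordinary values
  console.log(this.a);
};
```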
When a function attached to an object is executed, the containing object is passed as the this argument to the function call. For example, calling x.f() in the sketch above will print 1, since this refers to x.
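Continuing the sketch:

```js
x.f();   // prints 1: `this` is bound to x at the call site
```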
The language also allows object-oriented programming. Classes are constructed dynamically through constructor functions and the new keyword. When new is used, a fresh object is created and the constructor is executed with the this value equal to the new object. The resulting object is returned after construction. For example, the following code defines a new class A and then creates an instance of it:
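A sketch of the elided listing (the constructor body and argument are illustrative):

```js
function A(v) {     // constructors are ordinary functions
  this.x = v;       // `this` is the freshly created object
}

var a = new A(2);   // a.x === 2
```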
Note that there is no distinction between class constructors and other functions. In our example, since A is a function, we are also allowed to execute it without the new keyword. This allows created objects to call other class constructors to simulate inheritance. For example, the following code creates a class constructor B, which uses the A constructor to make sure its instances have the same properties:
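A sketch of the elided listing:

```js
function B(v) {
  A.call(this, v);   // run A's constructor on the new B instance
  this.y = v + 1;    // then add B's own properties
}

var b = new B(2);    // b.x === 2 and b.y === 3
```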
Assigning all properties of a new object in the constructor can make managing code difficult, so the language also includes object prototypes. If we want a value to be added to every instance of A, then we can add it to the prototype object which exists as a property of every function. For example, once we execute an assignment to a property of A.prototype, that property will exist in any new instance of A. These prototypes can be chained together, forming inheritance chains. For example, the following code defines an extension of A, and will print the inherited value since it is found on the chained prototype:
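A sketch of the elided listing (the property z and its value are illustrative):

```js
A.prototype.z = 3;   // z is now visible on every instance of A

function C(v) {
  A.call(this, v);
}
C.prototype = Object.create(A.prototype);   // chain C's prototype to A's

console.log(new C(2).z);   // prints 3, found via the prototype chain
```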
2.2. Dynamic Symbolic Execution
Dynamic symbolic execution (DSE) is an automated test generation approach based on constraint solving and has been shown to be effective at bug-finding (Cadar et al., 2008; Godefroid et al., 2008; Bounimova et al., 2013). DSE generates new test cases for a program through repeated executions. In DSE, some inputs to a program are marked as symbolic while others are fixed. The DSE engine then generates a series of assignments for the symbolic values, each of which exercises a unique control-flow path through the program. For example, when analyzing the following program, we begin by replacing the input x with the symbol X:
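The original listing is not shown; this sketch reconstructs it in a form consistent with the concrete and symbolic values discussed below (the constants are illustrative):

```js
function example(x) {   // line 1: the input x is replaced by the symbol X
  var y = x * 2;        // line 2: y is 2X symbolically
  if (y > 40) {         // line 3: the first branching condition
    if (y <= 20) {      // line 4: unreachable, since y > 40 here
      throw new Error("unreachable");
    }
  }
}
```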
When executing the program, we maintain a symbolic state in addition to the concrete state. The concrete state drives test execution, while the symbolic state tracks constraints on the symbols in the program. To begin the analysis, we execute the test harness with an initial concrete assignment for the symbolic inputs. For our example, we pick the initial assignment x = 0.
With our test setup and our initial test case selected, we are now ready to symbolically execute the program. When operations involve symbolic operands, we compute the concrete result using the concrete values of the operands and use a symbolic interpreter to generate the resulting symbol. We see this on line 2 of the sketch, where execution with our initial test case yields a concrete value of 0 for y and a symbolic value of 2X. We now reach line 3, the first branching condition in the program. In DSE, we use the concrete state to decide which branch to follow for the current test execution, and we also develop a symbolic path condition (PC), a symbolic representation of the conditional operations which drove us down the branches we followed. On line 3 we follow the false branch and do not enter the if body, since the concrete value of y is 0; at this step we update our path condition with ¬(2X > 40). After this, our first test case terminates.
Upon termination, the DSE engine uses the PC and an SMT solver to find alternate assignments for the symbolic inputs. We find these alternate assignments by negating the conditional operations in the PC so that the next test case will take the opposite route at that branching point. We now try to find an alternative assignment for x which will follow the true branch on line 3. We query the SMT solver to decide whether there is any assignment for X where 2X > 40, and the SMT solver gives us the input x = 25, our new test case.
Since we have identified a new test case, we now re-execute our program with the new concrete assignment for x. During this execution we follow the true branch on line 3 and reach line 4 with the path condition 2X > 40. On line 4, we check whether y ≤ 20. In this test case, y has a concrete value of 50 and a symbolic value of 2X. Since 50 is greater than 20, we take the else path and update the PC with ¬(2X ≤ 20), leading to test case termination.
We now use the SMT solver to decide whether there is an assignment for X which explores the true branch on line 4. We take the PC and negate the last constraint, resulting in a query asking the SMT solver whether there is a feasible assignment for X such that 2X > 40 ∧ 2X ≤ 20. Here, the SMT solver tells us that there is no feasible assignment for X, so we know that the true branch on line 4 is unreachable. Since there are no new test cases for our program, our DSE is now complete and we have explored all feasible control flows contingent on our symbol X. In general, there will be an impractically large (possibly infinite) number of test cases to execute, so instead of exhausting all test cases, we repeatedly execute new test cases until we reach a time limit or a predefined coverage goal. Therefore, DSE cannot in general be used for software verification, but it is ideally suited to generating high-coverage test suites fully automatically.
3. Conformance Testing using Polyfills
We generate new test cases by dynamic symbolic execution of polyfills. Analysis of these polyfills generates inputs that explore the intricacies of built-in specifications, but we do not have a ground truth for the correct behavior of a test case. To solve this problem, we use a suite of interpreters and have them vote on the correct answer. This acts as an oracle to identify when an implementation is incorrect and only requires manual intervention when two or more implementations diverge.
We split our implementation into two components: the test case generator and the test case executor. The test case generator uses ExpoSE to generate new test cases. The test case executor executes a test suite extracted from the symbolic executions and checks whether each of our selected interpreters behaves correctly.
3.3. Test Case Generation
We generate new test cases by symbolically executing polyfills using ExpoSE. Figure 1 provides an overview of the architecture. We begin by supplying the test apparatus with a target built-in and the number of arguments the method expects. The apparatus then constructs a series of symbolic inputs to use as arguments, including a symbolic value for this. ExpoSE then analyzes the generated test harness and begins to output a series of test cases. We also execute each new test case concretely, in an uninstrumented interpreter, to mitigate any errors in ExpoSE. We forward the results of the concrete and symbolic executions to the path verifier, a tool that double-checks that the concrete and the symbolic results are identical. If they are not, then the test case is discarded. Otherwise, we add the test case to the generated test suite, and the symbolic path condition is used to generate new test cases. We use an object-aware type encoding when finding alternate test cases to explore more of our target polyfills (§ 4).
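A minimal sketch of such a harness for a one-argument built-in, assuming an ExpoSE-style S$.symbol(name, initialValue) helper (the exact interface may differ):

```js
var S$ = require('S$');

var base = S$.symbol('Base', []);          // symbolic `this` value
var arg0 = S$.symbol('Arg0', undefined);   // symbolic first argument

Array.prototype.findIndex.call(base, arg0);
```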
3.4. Test Case Executor
The second component in our design is the test case executor. Our automatically generated test cases do not have a predetermined expected result, because the result found during symbolic execution may come from a flawed implementation. Instead of using predetermined test case results, we use a consensus-based approach to detect incorrect implementations, illustrated in Figure 2. We execute each test case in several different interpreters. Each interpreter has a different interface, so we generate a compatible test through a test translator that takes a test input and returns a program compatible with a specific engine. For polyfills, we inject the target method into a Node.js instance, replacing any existing implementation. We then execute each of these programs and collect the output.
Once the test case has been executed by each implementation, we pass the results to a voting mechanism. The voting mechanism looks for implementations whose behavior diverges from the others. If the outcome of a built-in call diverges either in exception type or in result, then we say that the interpreters disagree and raise an error. Specifically, we say that an implementation disagrees if either of the following two conditions holds:
It throws an exception when the others do not, or its exception type differs from the other implementations.
Its output differs from that of the others.
We do not compare the exact text of exceptions because it is not mandated by the ECMAScript specification. If a single implementation disagrees, then it is marked as incorrect. When multiple implementations disagree, we cannot draw any conclusion about correct behavior and mark the test case for manual review.
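A sketch of this voting rule (the result records and field names are illustrative):

```js
// results: [{ name, output, exceptionType }], one record per implementation.
function vote(results) {
  const key = (r) => JSON.stringify([r.output, r.exceptionType]);
  const groups = new Map();                 // observed behavior -> implementations
  for (const r of results) {
    if (!groups.has(key(r))) groups.set(key(r), []);
    groups.get(key(r)).push(r.name);
  }
  if (groups.size === 1) return { verdict: 'agree' };
  const parties = [...groups.values()];
  const outliers = parties.filter((g) => g.length === 1);
  if (parties.length === 2 && outliers.length === 1) {
    return { verdict: 'incorrect', implementation: outliers[0][0] };
  }
  return { verdict: 'manual-review' };      // multiple diverge: no conclusion
}
```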
To allow automated generation of structured test inputs for built-in methods, we require a method for maintaining symbolic objects and arrays. We developed new encodings for untyped symbolic objects, i.e., symbolic objects with no pre-specified property names or types (§ 4.2), arrays of mixed types (§ 4.3), and for homogeneously typed arrays (§ 4.4).
Support for symbolic objects is key to the exploration of built-ins because it allows thorough exploration of object- and array-centric built-ins. More subtly, this support allows the DSE engine to consider the esoteric type-checking in built-in methods. The specification includes precise but unintuitive rules on how input values are to be interpreted and when type contract violations should raise an error. To highlight how an object encoding can improve coverage of these edge cases, we now consider Array.prototype.findIndex.
Usually, this method is given an array as its base argument and a predicate. The array is then searched, left to right, until a value satisfying the predicate is found, and the index of that value is returned. If no value satisfies the predicate, then -1 is returned. For example, searching [1, 2, 3] for an even number yields the index 1, the position of the first even number in the array. If we look at the method specification, we see that there is a quirk to this method contract: the method accepts any object which looks like an array, i.e., any object with a length property. Because of this, an array-like object such as {0: 1, 1: 2, 2: 3, length: 3} behaves equivalently to the previous example, but the same object without a length property yields -1, since the missing length means no elements are inspected.
One further quirk is the coercion of the length property to an integer. The specification does not reject non-integer length properties; instead, the length is coerced to a number and then truncated to an integer, so that, for example, a length of 1.9 is treated as a length of 1.
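A sketch of these behaviors (the literal values are illustrative):

```js
const even = (v) => v % 2 === 0;

[1, 2, 3].findIndex(even);                                              // 1

// Any object with a length property is accepted ("array-like"):
Array.prototype.findIndex.call({ 0: 1, 1: 2, 2: 3, length: 3 }, even); // 1

// Without a length property, no elements are inspected:
Array.prototype.findIndex.call({ 0: 1, 1: 2, 2: 3 }, even);            // -1

// Non-integer lengths are coerced and truncated: 1.9 behaves as 1.
Array.prototype.findIndex.call({ 0: 1, 1: 2, length: 1.9 }, even);     // -1
```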
These checks are implemented in the mdn-polyfills version of Array.prototype.findIndex¹ by a guard which ensures the base value is an array or an object, followed by a step which selects the length of the object and ensures it is an integer using type coercion. In this case, if the length property is not an integer, then it is first coerced to a number and subsequently truncated to an integer. Through our encoding of objects, we can synthesize useful test cases for such behavior.

¹https://github.com/msn0/mdn-polyfills/blob/master/src/Array.prototype.findIndex/findIndex.js
4.2. Symbolic Objects
We model symbolic objects by tracking property lookups and updates on objects. For this, we rewrite all property lookups to use a common get interface and all property updates to use a corresponding set interface. In both cases, the first argument is the object operated on, and the second is a string indicating which property is being accessed. For the set interface, a third argument carries the new value of the given property. With this instrumentation, we can keep track of all object operations during execution, updating the symbolic state when appropriate. We instrument arrays similarly, with analogous get and set interfaces for all property lookups and updates. These differ in the typing of property names, where they also accept integer values, since arrays can contain integer as well as string property names.
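A sketch of this rewriting (the helper and state names are illustrative; the actual ExpoSE interfaces may differ):

```js
// o.p       becomes   getField(o, "p")
// o.p = v   becomes   setField(o, "p", v)
// `state` is a hypothetical handle to the engine's symbolic state.

function getField(base, prop) {
  state.recordGet(base, prop);   // track the lookup symbolically
  return base[prop];             // then perform it concretely
}

function setField(base, prop, value) {
  state.recordSet(base, prop, value);
  base[prop] = value;
  return value;                  // assignments evaluate to the assigned value
}
```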
The root of our encoding is the creation of new symbolic values for properties we have not seen before, while returning the value stored in the object state for properties that we have previously set. Our encoding for objects is illustrated in Figure 3, which shows how a symbolic object behaves under various typical operations.
The first step in Figure 3 shows how symbolic objects support fully concrete operations. Here, we record the concrete value supplied so that it is returned on subsequent lookups. When we perform a lookup for a property that we have not encountered before, we introduce a new symbolic value to the program and store it at the appropriate property. The created symbol does not have a fixed type; instead, it uses existing support in the DSE engine to explore the program as if it were any of the supported symbolic types. In the case of ExpoSE, the DSE engine we use in this paper, the supported symbolic types are undefined, null, boolean, number, string and, through our encoding, also objects and arrays. The second operation in Figure 3 illustrates this process, where the new symbol Z is introduced and assigned to the property z.
Next, we want to set a property with a concrete property name but a symbolic value. As with a fully concrete property set, we record the supplied property value in the object state; here, it makes no difference whether the supplied values are concrete or symbolic.
The last matter that we address in this example is how we approach the setting and getting of properties with symbolic property names. Here, we attempt to create new test cases for each of the previously recorded properties of an object, even if they were subsequently deleted. The final operation in the figure illustrates this, where we write a concrete value with a symbolic property name, leading to one new path per recorded property, in each of which that property is replaced with the written value. This final step causes under-approximation in our encoding: we do not enumerate properties that we have not seen previously. We could, in principle, support this through the enumeration of all possible property names, but this would lead to an infeasible number of paths to explore.
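A sketch of this forking behavior (fork and eq are hypothetical engine primitives):

```js
// Execute obj[symProp] = value, where symProp is a symbolic string.
function setFieldSymbolic(obj, symProp, value) {
  for (const seen of obj.recordedProps) {   // e.g. ["x", "z"]
    fork(eq(symProp, seen), () => {         // one new path per recorded property
      obj.set(seen, value);                 // on that path, `seen` is replaced
    });
  }
  // Properties never seen before are not enumerated: under-approximation.
}
```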
There are a number of advanced features which can change the behavior of get and set operations, such as property getters and setters, which can trigger the execution of a function instead of a map lookup. Methods such as Object.defineProperty can also be used to change the enumerability of properties within an object. We concretize the symbolic objects when handling these cases, and so our encoding is under-approximate when modeling these behaviors.
4.3. Mixed Type Arrays
We intercept reads and writes to the array length, which is a reserved property name in arrays. The array length property is always one higher than the largest element index in the array. This point is important because arrays do not need to be contiguous (i.e., there may be gaps between two indices). This design choice has an impact on enumeration: looping up to the array length will include all indices, but using the for…in or Object.keys operators will only include those which have been set, since these operators only include properties which are marked as enumerable. For example, examine the following program:
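A sketch of the elided program (the values are illustrative):

```js
var a = [];
a[0] = 'x';
a[3] = 'y';   // a.length is now 4; indices 1 and 2 are holes
```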
Here, the interpreter yields an array of length 4. If we enumerate using the for…in or Object.keys operators, we would see only the indices 0 and 3 enumerated; however, if we enumerate and print all properties up to the array length, then we would see x, undefined, undefined, y printed.
When a program writes to the array length property, the array is truncated or expanded to the new length. If the value is less than the current array length, any values at indices greater than or equal to the new length will be deleted from the array. If the value is greater than the current array length, then the array will be extended, and the new indices will read as undefined. To illustrate this, see the following program:
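A sketch of the elided program:

```js
var x1 = [1, 2, 3];
x1.length = 100;   // expand: indices 3..99 read as undefined

var x2 = [1, 2, 3];
x2.length = 0;     // truncate: all elements are deleted
```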
In this example, the variable x1 would have a length of 100, with all values beyond the original three being undefined, while x2 will be empty.
We illustrate these changes in behavior in Figure 4. To ensure we accurately model the array length, we create a separate symbolic integer to represent it. This value is initially unbounded and has constraints applied as the program executes. As we fetch property 5, we explore two paths: one where the existing length is large enough to contain the index and one where it is not. In the case where it is not, the value of the property will be undefined; in the other case, it returns a new symbol using the same approach as our symbolic objects. The second step illustrates what happens when an array lookup occurs on an array that is longer than our property index. Here, the second path is infeasible because the array length cannot be less than six. Direct writes to the array length fix the symbolic length: writing a length of zero to the array truncates it, removing all properties, and subsequent property lookups will all return undefined. A write of length 100 expands the array to a fixed length but does not fix any properties. Here, a property lookup creates a fresh symbol because the previous one was erased. The new symbol is given a unique name in the path condition and can interact with the symbol that previously occupied this property.
4.4. Optimized Support for Homogeneously Typed Arrays
Concretization of symbolic property names can cause our analysis to miss errors. Consider a program which reads an array element through a symbolic integer index i and raises an error only for one particular index (see the sketch below). Because we concretize symbolic property names, the initial test case fixes i to its concrete value, say 0, so the lookup resolves to the element at index 0; negating the resulting branch condition then yields an infeasible constraint, and the analysis will not consider any paths where i is not 0. If we set i to the erroneous index by hand, then this error would be found. We therefore provide an encoding for homogeneously typed arrays directly in SMT to explore portions of a program where property name concretization is limiting analysis. Since the encoding is directly in SMT, we no longer need to concretize property names, allowing us to reason about property names symbolically.
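A sketch of the kind of program concretization misses (the values are illustrative):

```js
// The harness would call check([1, 3, 7], i) with i a symbolic integer.
function check(a, i) {
  if (a[i] === 7) {           // with i concretized to 0, only a[0] is compared
    throw new Error('bug');   // reachable only when i === 2
  }
}
```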
Our encoding uses existing SMT solver support to represent arrays. A typed array has two symbolic components, the array data and the array length. The array base is a symbolic mapping from integer property names to symbolic values of the array's type. The symbolic length property represents the current constraints on the array length, which is necessary to test out-of-bounds array element access. A symbolic get can explore two paths: one where the array is shorter than the index, resulting in undefined, and a second where the array includes the index, resulting in a value of the array type. Set operations update the symbolic length to accommodate the new value and then insert it into the base. This is illustrated in Algorithm 3 and Algorithm 4, in which the basic get and set operations map directly to SMT. We downgrade the array when a set is given a value that is not of the array base type.
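An illustrative sketch of get and set on a homogeneously typed array in terms of SMT theory-of-arrays operations (fork, select, store, and smtMax are hypothetical engine helpers):

```js
function typedGet(arr, idx) {          // idx may be symbolic
  if (fork(idx >= arr.symLength)) {    // path 1: out-of-bounds access
    return undefined;
  }
  return select(arr.base, idx);        // path 2: in bounds, a value of the array type
}

function typedSet(arr, idx, value) {
  if (typeof value !== arr.baseType) {
    return downgrade(arr).set(idx, value);        // value breaks homogeneity
  }
  arr.symLength = smtMax(arr.symLength, idx + 1); // grow the symbolic length
  arr.base = store(arr.base, idx, value);         // SMT store into the base
}
```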
The process for downgrading a homogeneously typed array to a mixed-type array is detailed in Algorithm 5. Array downgrading converts a homogeneously typed array into a generic array to allow mixed types. We do this by using the concrete array length to derive the initial mapping for the mixed-type array. We copy the homogeneously typed array’s length into the new array so that we respect existing length constraints.
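A paraphrase of the downgrade step (helper names are illustrative):

```js
function downgrade(typed) {
  const mixed = new SymbolicMixedArray();
  const n = concretize(typed.symLength);   // the concrete length drives the copy
  for (let i = 0; i < n; i++) {
    mixed.set(i, typedGet(typed, i));      // seed the initial property mapping
  }
  mixed.symLength = typed.symLength;       // preserve existing length constraints
  return mixed;
}
```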
RQ1: Is our approach able to cover the logic of built-in functions?
RQ2: Can our approach find any bugs in built-in methods?
RQ3: Does the addition of our test cases improve the coverage of Test262?
We answer these research questions through three experiments on selected functions introduced with the ES6 specification. In the first experiment, we evaluate the effectiveness of our conformance test case generation strategy using the polyfills core-js v3.1.4 and mdn-polyfills v5.17.1. Here, we show that ExpoSE achieves high coverage of many method implementations. For our second experiment, we use our generated conformance test suite and voting mechanism to search for errors in existing implementations of the ES6 standard, finding 17 bugs in a widely depended-upon built-in implementation. Finally, we evaluate the coverage of our test suite against Test262 under the QuickJS interpreter (v2019-10-27). In this study we see that, while Test262 generally covers more branches of the tested methods overall, our test cases explore parts of the built-in implementations which are not covered by Test262.
5.1. Test Case Generation
In our first experiment, we answer RQ1 through an evaluation of two popular ES6 built-in method implementations found on NPM: we extracted our surrogate implementations from core-js and mdn-polyfills. Overall, we collected 96,470 unique new test cases. We show that we achieve high coverage of the built-in implementations during symbolic execution, suggesting that a large portion of each implementation is covered.
Our test harness loads the portions of the library we wish to test and selects a target method. The method is then executed with symbolic arguments for both the this argument and each of the method arguments. We analyze this harness with ExpoSE. Each method is tested in isolation, through a single analysis using ExpoSE with a timeout of one hour on a 64-core machine. After analysis, the generated test cases are combined and duplicates are removed.
| Function | Test Cases | core-js Coverage | mdn-polyfill Coverage |
| --- | --- | --- | --- |
We generated 129,960 new test cases overall, which were reduced to 96,470 after removal of duplicate tests. Table 1 presents the results of our evaluation, providing coverage information from the analysis of the core-js and mdn-polyfills variants where the method was supported by that library. Overall, we found that our prototype is more capable of generating test cases for string methods than for array methods. These results are in line with our expectations, as the string support in ExpoSE is mature. Further improvements in ExpoSE modeling and SMT solvers could improve this support even further. In particular, our encoding currently includes symbolic models for only a few array methods, which may lower overall performance.
5.2. Executing Our Test Cases
We selected QuickJS 2019-10-27, SpiderMonkey 68 (through the standalone interpreter), Node.js v8.12.0, core-js v3.1.4, and mdn-polyfills v5.17.1 for testing, and tested each of the test cases identified in § 5.1. We executed each test case once with each competing implementation and stored the output. Next, we examined the result of each test case for divergence between the tested implementations. If there was any divergence, then we used the voting mechanism outlined above to resolve the failing case. Test cases were each executed with a maximum time of 10 minutes on each interpreter, though no test case hit this limit. Tests which crashed or exceeded the timeout are recorded as failures.
| Implementation | Unique Exceptions | Test Case Failure | Bugs |
| --- | --- | --- | --- |
| mdn-polyfills (Jezierski, 2016) | 34 | 200 | 17 |
| core-js (Pushkarev, 2014) | 63 | 125 | 0 |
| SpiderMonkey (Mozilla Foundation, 1996) | 72 | 66 | 0 |
| Node.js (Dahl, 2009) | 56 | 122 | 0 |
| QuickJS (Bellard, 2017) | 24 | 141 | 0 |
Table 2 presents a summary of test case executions for the 5 built-in implementations. Unique Exceptions gives the number of unique exceptions identified across the executions of all test cases (i.e., where an exception text has not been seen before after test-specific details are removed). Test Case Failure details the total number of test cases where the interpreter failed to give a result due to a crash or timeout. The final column, Bugs, gives the number of bugs found in each implementation.
In addition to finding bugs, we exercised many unique exceptions in the interpreters. The high number of unique exceptions suggests that our test suite explores many interesting corner cases of the implementations. Interestingly, we do not see the same number of unique exceptions across interpreters: some implementations have much more verbose error messages for built-ins than others. While exception messages are not standardized, and so this is not an implementation error, the lack of verbosity could make errors harder to debug.
We experienced some test case failures for each of the implementations tested. We observed zero cases of failure due to test timeouts or interpreter error; instead, all observed failures were due to interpreter memory limits. Most of these errors occur in one method, where many of the test inputs are large values which hit interpreter memory limits. We examined our surrogates to understand why the DSE engine is generating such extreme cases and found that one of our surrogate implementations enforces an upper limit on string size through a boundary condition. This condition drives ExpoSE to generate a series of test cases supplying large arrays or strings as input. The specification does not specify interpreter memory limits, so the different number of failing cases is not an error. In particular, we observed that SpiderMonkey avoids test case failure in these cases by imposing stricter limits on string sizes; at the time of writing, Node.js will execute a call producing such a large string where SpiderMonkey will not. The ECMAScript specification does not constrain string lengths, so long as they are positive integers. In practice, the reason we see these memory errors in QuickJS and Node.js, but not in SpiderMonkey, is that string boundaries are explicit in SpiderMonkey's method implementations, so these errors manifest as exceptions without crashing the interpreter.
Our study has shown that we can detect faults in a real built-in implementation with 35,000 weekly downloads at the time of writing. The ability to detect real bugs shows that a consensus-based approach to test case evaluation can be effective. In addition, our approach generated a large number of unique exceptions in the tested cases and uncovered an obscure difference in string length limits between interpreters, demonstrating that our test cases explore interesting paths through the implementations.
5.3. Test Suite Coverage
To ensure that our new approach generates novel test cases, we now compare the branch coverage of the new test cases to Test262. We show that the addition of our test cases leads to an increase in overall branch coverage in QuickJS, demonstrating that our approach generates novel test cases.
We modified the QuickJS build process to include support for branch coverage output via gcov, a tool which collects coverage information through compile-time instrumentation. For each built-in method, we then executed all of our generated test cases and the relevant portion of the Test262 suite. Once each had finished, we extracted the covered branches, using a manual analysis to identify the appropriate function names in the QuickJS source code.
When evaluating the coverage of a function within a program, we present both shallow and deep metrics for combined coverage increases, and follow calls to a depth of 3 when presenting absolute branch coverage. If we only present shallow coverage metrics (i.e., we do not follow function calls), then we may under-represent coverage improvements as logic for built-ins is often spread across many methods. Conversely, including all reachable functions may make our results less insightful by including large amounts of indirectly related code, such as utility methods, which may also be called by other methods during execution. By presenting our combined coverage improvements at different call depths, the reader can see how branch coverage changes as we follow an implementation deeper into the methods it calls.
In our coverage metrics we only include methods defined in the core QuickJS implementation and do not include library calls.
Table 3 details the total number of branches, the branches covered by our systematically generated conformance tests (ExpoSE), and the branches covered by Test262. We selected a call depth of 3, as following calls further included many utility methods, making results less insightful. Here, tests generated by our approach achieve reasonable branch coverage but do not exceed the coverage of Test262, which is already very high for every method. When we combine the branches covered by the automatically generated conformance tests and Test262, we see an overall coverage improvement over Test262 for every tested function, demonstrating that the generated conformance tests explore new routes through the implementation.
Table 4 shows the results of our coverage study at various call depths. The function names in the table are the internal function names in QuickJS. QuickJS sometimes implements optimized methods for typed arrays, which is why there may be two methods for the same feature. We see branch coverage improvements in many of the methods we test, in some cases a 15% improvement overall. Our results demonstrate that our automatically generated test cases explore further into built-in method behavior than Test262, answering RQ3. In most functions, we see notable coverage increases even at a call depth of 0 (i.e., not including the coverage impact of any called methods). These results highlight that our approach explores untraveled paths through the built-in function implementations themselves, and does not just expand coverage in utility methods. The coverage increases at low call depths show that built-in-specific edge cases are being exercised, as these are expressed near the surface of the call tree.
6. Related Work
We now briefly review related work in the space of dynamic symbolic execution, with a particular emphasis on memory models and handling of symbolic reads and writes to objects and arrays.
Mayhem (Cha et al., 2012) is a dynamic symbolic execution engine for compiled programs that represents a 32-bit address space symbolically to model program memory. In this work, a symbolic memory model improved the effectiveness of DSE by 40%, showing that supporting symbolic memory is crucial. To make their solution feasible, the authors limit the symbolic representation to reads and do not consider writes symbolically. EXE (Cadar et al., 2006) supports a single-object model, where pointers are concretized and only a single address is considered. These approaches are similar to our own treatment of symbolic field names, where ExpoSE concretizes the field name to avoid exploring an unbounded number of inputs.
Kristensen and Møller (2017) use TypeScript type specifications and feedback directed random fuzzing to identify mismatches between type specifications and observed behaviors. Through this approach the authors identify many inconsistencies, motivating the use of dynamic analysis for specification testing.
Marinescu and Cadar (2012) symbolically execute test suites to find bugs. A symbolic execution runs on the existing harnesses used by a program for unit testing, replacing concrete values with symbolic ones in order to take advantage of interesting test conditions. Unlike our approach, only simple error conditions are considered, because the tool cannot deduce the expected output after a change in input.
Palikareva et al. (2016) use DSE to automatically discover differences in behavior between program versions. The authors test versions of the same software, while our approach tests differences between many implementations of the same specification. Because versions of the same software are tested and program specifications are not static, it is difficult to decide whether changes in behavior between two versions are intended. This differs from our approach, where the behavior of compliant implementations is fixed and divergence is an error.
- Anand et al. (2007) Saswat Anand, Corina S. Pasareanu, and Willem Visser. 2007. JPF-SE: A Symbolic Execution Extension to Java PathFinder. In Proc. 13th Int. Conf. Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2007) (LNCS), Vol. 4424. Springer, 134–138.
- Bellard (2017) Fabrice Bellard. 2017. QuickJS. https://bellard.org/quickjs/.
- Bounimova et al. (2013) Ella Bounimova, Patrice Godefroid, and David A. Molnar. 2013. Billions and billions of constraints: whitebox fuzz testing in production. In 35th Int. Conf. Software Engineering (ICSE 2013). 122–131.
- Bucur et al. (2014) Stefan Bucur, Johannes Kinder, and George Candea. 2014. Prototyping Symbolic Execution Engines for Interpreted Languages. In Proc. 19th Int. Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS). ACM, 239–254.
- Cadar et al. (2008) Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. 2008. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs. In Proc. 8th Symp. Operating Systems Design and Implementation (OSDI 2008). USENIX, 209–224.
- Cadar et al. (2006) Cristian Cadar, Vijay Ganesh, Peter M. Pawlowski, David L. Dill, and Dawson R. Engler. 2006. EXE: automatically generating inputs of death. In Proc. ACM SIGSAC Conf. Computer and Communications Security (CCS). ACM, 322–335.
- Cha et al. (2012) Sang Kil Cha, Thanassis Avgerinos, Alexandre Rebert, and David Brumley. 2012. Unleashing Mayhem on Binary Code. In Proceedings of the 2012 IEEE Symposium on Security and Privacy.
- Chipounov et al. (2011) Vitaly Chipounov, Volodymyr Kuznetsov, and George Candea. 2011. S2E: A platform for in-vivo multi-path analysis of software systems. In Proc. 16th. Int. Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 2011). ACM, 265–278.
- Dahl (2009) Ryan Dahl. 2009. Node.js. https://github.com/nodejs/node.
- ECMA International (2017) ECMA International. 2017. Test262. https://github.com/tc39/test262.
- Godefroid et al. (2008) Patrice Godefroid, Michael Y. Levin, and David A. Molnar. 2008. Automated Whitebox Fuzz Testing. In Annu. Network and Distributed System Security Symposium (NDSS). The Internet Society.
- Havelund and Pressburger (2000) Klaus Havelund and Thomas Pressburger. 2000. Model Checking Java Programs using Java PathFinder. Int. J. Softw. Tools Technol. Transfer (STTT) 2, 4 (2000), 366–381.
- Holler et al. (2012) Christian Holler, Kim Herzig, and Andreas Zeller. 2012. Fuzzing with Code Fragments. In Proc. 21st USENIX Security Symposium (USENIX Security). 445–458.
- Jezierski (2016) Michał Jezierski. 2016. mdn-polyfills. https://github.com/msn0/mdn-polyfills.
- Kapus and Cadar (2019) Timotej Kapus and Cristian Cadar. 2019. A segmented memory model for symbolic execution. In Proceedings of the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2019, Tallinn, Estonia, August 26-30, 2019, Marlon Dumas, Dietmar Pfahl, Sven Apel, and Alessandra Russo (Eds.). 774–784. https://doi.org/10.1145/3338906.3338936
- Khurshid et al. (2003) Sarfraz Khurshid, Corina S. Pasareanu, and Willem Visser. 2003. Generalized Symbolic Execution for Model Checking and Testing. In Tools and Algorithms for the Construction and Analysis of Systems, 9th International Conference, TACAS 2003, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2003, Warsaw, Poland, April 7-11, 2003, Proceedings. Springer, 553–568. https://doi.org/10.1007/3-540-36577-X_40
- Kristensen and Møller (2017) Erik Krogh Kristensen and Anders Møller. 2017. Type test scripts for TypeScript testing. Proc. ACM Program. Lang. 1, OOPSLA (2017), 90:1–90:25. https://doi.org/10.1145/3133914
- Marinescu and Cadar (2012) Paul Dan Marinescu and Cristian Cadar. 2012. make test-zesti: A symbolic execution solution for improving regression testing. In 34th Int. Conf. Software Engineering (ICSE). 716–726. https://doi.org/10.1109/ICSE.2012.6227146
- Mozilla Foundation (1996) Mozilla Foundation. 1996. SpiderMonkey. https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey.
- Palikareva et al. (2016) Hristina Palikareva, Tomasz Kuchta, and Cristian Cadar. 2016. Shadow of a doubt: testing for divergences between software versions. In Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, Austin, TX, USA, May 14-22, 2016. ACM. https://doi.org/10.1145/2884781.2884845
- Pushkarev (2014) Denis Pushkarev. 2014. core-js. https://www.npmjs.com/package/core-js.
- Robby et al. (2003) Robby, Matthew B. Dwyer, and John Hatcliff. 2003. Bogor: an extensible and highly-modular software model checking framework. In Proceedings of the 11th ACM SIGSOFT Symposium on Foundations of Software Engineering 2003 held jointly with 9th European Software Engineering Conference, ESEC/FSE 2003, Helsinki, Finland, September 1-5, 2003. ACM, 267–276. https://doi.org/10.1145/940071.940107
- Sridharan et al. (2014) Manu Sridharan, Koushik Sen, and Liang Gong. 2014. Jalangi 2. https://github.com/Samsung/jalangi2.
- Visser et al. (2004) Willem Visser, Corina S. Pasareanu, and Sarfraz Khurshid. 2004. Test input generation with Java PathFinder. In Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2004, Boston, Massachusetts, USA, July 11-14, 2004. ACM, 97–107. https://doi.org/10.1145/1007512.1007526
- Xie et al. (2005) Tao Xie, Darko Marinov, Wolfram Schulte, and David Notkin. 2005. Symstra: A Framework for Generating Object-Oriented Unit Tests Using Symbolic Execution. In Tools and Algorithms for the Construction and Analysis of Systems, 11th International Conference, TACAS 2005, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2005, Edinburgh, UK, April 4-8, 2005, Proceedings. Springer, 365–381. https://doi.org/10.1007/978-3-540-31980-1_24