Adaptive Test Generation Using a Large Language Model

by Max Schafer, et al.

Unit tests play a key role in ensuring the correctness of software. However, manually creating unit tests is a laborious task, motivating the need for automation. This paper presents TestPilot, an adaptive test generation technique that leverages Large Language Models (LLMs). TestPilot uses Codex, an off-the-shelf LLM, to automatically generate unit tests for a given program without requiring additional training or few-shot learning on examples of existing tests. In our approach, Codex is provided with prompts that include the signature and implementation of a function under test, along with usage examples extracted from documentation. If a generated test fails, TestPilot's adaptive component attempts to generate a new test that fixes the problem by re-prompting the model with the failing test and error message. We created an implementation of TestPilot for JavaScript and evaluated it on 25 npm packages with a total of 1,684 API functions to generate tests for. Our results show that the generated tests achieve up to 93.1% [...]. Moreover, on average, 58.5% [...] assertion that exercises functionality from the package under test. Our experiments with excluding parts of the information included in the prompts show that all components contribute towards the generation of effective test suites. Finally, we find that TestPilot does not generate memorized tests: 92.7% [...] (as measured by normalized edit distance), with none of them being exact copies.
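The adaptive loop described in the abstract can be illustrated with a minimal Python sketch. It is not the TestPilot implementation: the prompt wording, the `FunctionInfo` fields, the `complete` and `run_test` callables, and the `max_attempts` limit are all hypothetical stand-ins, since the abstract states only which pieces of information go into the prompt and that a failing test is fed back together with its error message.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FunctionInfo:
    """Information about a function under test (field names are illustrative)."""
    signature: str           # e.g. "add(a, b)"
    body: str                # source code of the implementation
    doc_examples: List[str]  # usage snippets extracted from documentation


def build_prompt(fn: FunctionInfo,
                 failing_test: Optional[str] = None,
                 error: Optional[str] = None) -> str:
    """Assemble a prompt from signature, implementation, and usage examples.

    When a previous attempt failed, the failing test and its error message
    are appended so the model can try to fix the problem.
    """
    parts = ["// Function under test: " + fn.signature, fn.body]
    for example in fn.doc_examples:
        parts.append("// Usage example from documentation:\n" + example)
    if failing_test is not None:
        parts.append("// Previous test that failed:\n" + failing_test)
        parts.append("// Error message: " + (error or ""))
    parts.append("// Write a unit test for the function above:")
    return "\n".join(parts)


def generate_test(fn: FunctionInfo,
                  complete: Callable[[str], str],
                  run_test: Callable[[str], Optional[str]],
                  max_attempts: int = 3) -> Optional[str]:
    """Return a passing test, or None if every attempt fails.

    `complete` stands in for the LLM call (prompt -> test source).
    `run_test` stands in for the test runner (test source -> error
    message, or None when the test passes).
    """
    failing_test: Optional[str] = None
    error: Optional[str] = None
    for _ in range(max_attempts):
        candidate = complete(build_prompt(fn, failing_test, error))
        error = run_test(candidate)
        if error is None:
            return candidate
        failing_test = candidate  # feed the failure into the next prompt
    return None
```

Passing the model and the test runner in as plain callables keeps the sketch independent of any particular LLM API or JavaScript test framework, both of which would need to be supplied in practice.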




