Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing

08/31/2023
by Arghavan Moradi Dakhel, et al.

One of the critical phases in software development is software testing. Testing helps identify potential bugs and reduce maintenance costs. The goal of automated test generation tools is to ease the development of tests by suggesting efficient bug-revealing tests. Recently, researchers have leveraged Large Language Models (LLMs) of code to generate unit tests. While the code coverage of generated tests is usually assessed, the literature has acknowledged that coverage is only weakly correlated with the efficiency of tests in bug detection. To address this limitation, in this paper we introduce MuTAP, which leverages mutation testing to improve the effectiveness of test cases generated by LLMs at revealing bugs. This is achieved by augmenting prompts with surviving mutants, as those mutants highlight the limitations of test cases in detecting bugs. MuTAP is capable of generating effective test cases in the absence of natural language descriptions of the Programs Under Test (PUTs). We employ different LLMs within MuTAP and evaluate their performance on different benchmarks. Our results show that our proposed method is able to detect up to 28% more faulty human-written code snippets. Among these, 17% remained undetected by both the current state-of-the-art fully automated test generation tool (i.e., Pynguin) and zero-shot/few-shot learning approaches on LLMs. Furthermore, MuTAP achieves a Mutation Score (MS) of 93.57% on synthetic buggy code, outperforming all other approaches in our evaluation. Our findings suggest that although LLMs can serve as a useful tool to generate test cases, they require specific post-processing steps to enhance the effectiveness of the generated test cases, which may suffer from syntactic or functional errors and may be ineffective in detecting certain types of bugs and testing corner cases of PUTs.
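For intuition, here is a minimal, self-contained sketch of the mutation-testing step the abstract describes. It is not the authors' implementation: MuTAP drives an LLM and a full mutation tool, while this example hand-rolls a single operator mutant with Python's ast module. The PUT `add_abs`, the Add-to-Sub mutant, and the deliberately weak test suite are illustrative assumptions.

```python
import ast

PUT = '''
def add_abs(a, b):
    return abs(a) + abs(b)
'''

# Deliberately weak test suite: it only exercises the zero case,
# so it cannot distinguish the PUT from the mutant created below.
TESTS = "assert add_abs(0, 0) == 0"

class AddToSub(ast.NodeTransformer):
    """Create one arithmetic mutant by replacing '+' with '-'."""
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

def mutant_survives(source: str, tests: str) -> bool:
    """A mutant survives if the test suite still passes on it."""
    namespace = {}
    exec(source, namespace)   # define the (mutated) function
    try:
        exec(tests, namespace)
        return True           # tests passed on buggy code: mutant survived
    except Exception:
        return False          # tests failed (or crashed): mutant killed

tree = AddToSub().visit(ast.parse(PUT))
ast.fix_missing_locations(tree)
mutant = ast.unparse(tree)

if mutant_survives(mutant, TESTS):
    # MuTAP's key step: a surviving mutant is fed back into the next
    # prompt so the LLM can generate tests that kill it, e.g. one
    # asserting add_abs(-2, 3) == 5.
    print("Surviving mutant to feed back into the prompt:\n" + mutant)
```

As the abstract describes, this feedback loop repeats: surviving mutants augment the prompt until the regenerated tests kill them, which is what drives the reported gains over coverage-guided generation.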

