COCO: Testing Code Generation Systems via Concretized Instructions

08/25/2023
by Ming Yan, et al.

Code generation systems have been developed extensively in recent years to generate source code from natural language instructions. Despite these advances, they still face robustness issues: even slightly different instructions can yield code with significantly different semantics. Robustness is critical for code generation systems because it affects software development, software quality, and trust in the generated code. Existing testing techniques for general text-to-text software can detect some robustness issues, but their effectiveness is limited because they ignore the characteristics of code generation systems. In this work, we propose COCO, a novel technique for testing the robustness of code generation systems. It exploits the usage scenario of code generation systems to make the original programming instruction more concrete by incorporating features known to be contained in the original code. A robust system should maintain code semantics for the concretized instruction, and COCO detects a robustness inconsistency when it does not. We evaluated COCO on eight advanced code generation systems, including commercial tools such as Copilot and ChatGPT, using two widely used datasets. Our results demonstrate the effectiveness of COCO in testing the robustness of code generation systems: it outperforms two techniques adopted from general text-to-text software testing by 466.66% and 104.02%, respectively. Furthermore, concretized instructions generated by COCO can help reduce robustness inconsistencies by at least 18.35% through fine-tuning.
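The testing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The feature extractor, the instruction template, the execution-based semantic check, and every name below (`extract_features`, `concretize`, `behaves_same`, `find_inconsistencies`, `toy_generator`) are assumptions made for illustration, and the code generation system is replaced by a deliberately brittle stub.

```python
import ast
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a code generation system: instruction -> source code.
Generator = Callable[[str], str]


def extract_features(code: str) -> List[str]:
    """Collect a few coarse features present in the code (illustrative only)."""
    nodes = list(ast.walk(ast.parse(code)))
    features = []
    if any(isinstance(n, ast.For) for n in nodes):
        features.append("use a for loop")
    if any(isinstance(n, ast.If) for n in nodes):
        features.append("use an if statement")
    return features


def concretize(instruction: str, feature: str) -> str:
    """Make the instruction more concrete by naming a feature of the original code."""
    return f"{instruction.rstrip('.')}, and {feature}."


def behaves_same(code_a: str, code_b: str, func_name: str,
                 inputs: Sequence[Tuple]) -> bool:
    """Assumed semantic check: both programs agree on every sample input."""
    def load(code: str):
        namespace = {}
        exec(code, namespace)  # never run untrusted generated code outside a sandbox
        return namespace[func_name]
    fa, fb = load(code_a), load(code_b)
    return all(fa(*args) == fb(*args) for args in inputs)


def find_inconsistencies(generate: Generator, instruction: str,
                         func_name: str, inputs: Sequence[Tuple]) -> List[str]:
    """Return the concretized instructions whose generated code changes behavior."""
    original = generate(instruction)
    failures = []
    for feature in extract_features(original):
        concrete = concretize(instruction, feature)
        if not behaves_same(original, generate(concrete), func_name, inputs):
            failures.append(concrete)
    return failures


ORIGINAL = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
DRIFTED = "def total(xs):\n    s = 0\n    for x in xs[1:]:\n        s += x\n    return s\n"


def toy_generator(instruction: str) -> str:
    """Brittle stub: mentioning 'for loop' silently changes the semantics."""
    return DRIFTED if "for loop" in instruction else ORIGINAL


if __name__ == "__main__":
    samples = [([1, 2, 3],), ([],)]
    print(find_inconsistencies(toy_generator, "Sum the numbers in a list.",
                               "total", samples))
```

With the stub above, the concretized instruction "Sum the numbers in a list, and use a for loop." is reported as an inconsistency, because the regenerated code skips the first element. A generator that always returns `ORIGINAL` yields an empty list. Comparing behavior on sample inputs is only one possible proxy for "maintaining code semantics"; the abstract does not state which comparison COCO uses.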

research
01/05/2018

Comment Generation for Source Code: State of the Art, Challenges and Opportunities

Researches have shown that most effort of today's software development i...
research
08/20/2023

A Study on Robustness and Reliability of Large Language Model Code Generation

Recently, the large language models (LLMs) have shown extraordinary abil...
research
11/16/2020

Survey of Methods for Automated Code-Reuse Exploit Generation

This paper provides a survey of methods and tools for automated code-reu...
research
04/15/2023

Self-collaboration Code Generation via ChatGPT

Code generation is widely regarded as a key technique for elevating the ...
research
05/09/2023

The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation

We present The Vault, an open-source, large-scale code-text dataset desi...
research
02/16/2022

Code Search based on Context-aware Code Translation

Code search is a widely used technique by developers during software dev...
research
05/02/2021

Assessing Exception Handling Testing Practices in Open-Source Libraries

Modern programming languages (e.g., Java and C#) provide features to sep...
