Large Language Models of Code Fail at Completing Code with Potential Bugs

06/06/2023
by   Tuan Dinh, et al.

Large language models of code (Code-LLMs) have recently brought tremendous advances to code completion, a fundamental feature of programming assistance and code intelligence. However, most existing work ignores the possible presence of bugs in the code context used for generation, even though bugs are inevitable in software development. We therefore introduce and study the buggy-code completion problem, inspired by the realistic scenario of real-time code suggestion in which the code context contains potential bugs, that is, anti-patterns that can become bugs in the completed program. To study the task systematically, we introduce two datasets: one with synthetic bugs derived from semantics-altering operator changes (buggy-HumanEval) and one with realistic bugs derived from user submissions to coding problems (buggy-FixEval). We find that the presence of potential bugs significantly degrades the generation performance of high-performing Code-LLMs. For instance, the passing rates of CodeGen-2B-mono on test cases of buggy-HumanEval drop by more than 50% given a single potential bug in the context. Finally, we investigate several post-hoc methods for mitigating the adverse effect of potential bugs and find that a large gap in post-mitigation performance remains.
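To make the idea of a "semantics-altering operator change" concrete, here is a minimal sketch of how a single potential bug could be injected into a code prefix. The operator table, the helper name `inject_potential_bug`, and the example prefix are illustrative assumptions. The abstract does not specify the exact procedure used to build buggy-HumanEval.

```python
import re

# Hypothetical operator pairs; the actual set used by the authors
# is not stated in the abstract.
OPPOSITE = {
    "==": "!=", "!=": "==",
    "<=": ">", ">=": "<",
    "<": ">=", ">": "<=",
    "+": "-", "-": "+",
}

# Try longer operators first so that "<=" is not matched as "<".
_PATTERN = re.compile(
    "|".join(re.escape(op) for op in sorted(OPPOSITE, key=len, reverse=True))
)


def inject_potential_bug(prefix: str, index: int = 0) -> str:
    """Flip the index-th operator found in a code prefix.

    This is a naive textual scan: it does not skip string literals,
    comments, or annotations such as "->".
    """
    matches = list(_PATTERN.finditer(prefix))
    if not matches:
        return prefix  # nothing to flip, prefix is returned unchanged
    m = matches[index]
    return prefix[: m.start()] + OPPOSITE[m.group()] + prefix[m.end():]


# Illustrative partial program that a Code-LLM would be asked to complete.
clean_prefix = (
    "def has_close_elements(numbers, threshold):\n"
    "    for i, a in enumerate(numbers):\n"
    "        for j, b in enumerate(numbers):\n"
    "            if i != j:\n"
)

buggy_prefix = inject_potential_bug(clean_prefix)
print(buggy_prefix)
```

In this sketch the condition `i != j` becomes `i == j`. The prefix still parses, but a completion that continues it naively would compare each element with itself, which is the kind of potential bug the task is about.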
