Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues

07/24/2023
by Yue Liu, et al.

In this paper, we systematically study the quality of 4,066 ChatGPT-generated code snippets implemented in two popular programming languages, Java and Python, for 2,033 programming tasks. The goal of this work is threefold. First, we analyze the correctness of ChatGPT on code generation tasks and uncover the factors that influence its effectiveness, including task difficulty, programming language, the time at which tasks were introduced, and program size. Second, we identify and characterize potential issues with the quality of ChatGPT-generated code. Last, we provide insights into how these issues can be mitigated. Experiments highlight that out of 4,066 programs generated by ChatGPT, 2,757 programs are deemed correct, 1,081 programs provide wrong outputs, and 177 programs contain compilation or runtime errors. We further analyze other characteristics of the generated code, such as code style and maintainability, using static analysis tools, and find that 1,933 ChatGPT-generated code snippets suffer from maintainability issues. Subsequently, we investigate ChatGPT's self-debugging ability and its interaction with static analysis tools to fix the errors uncovered in the previous step. Experiments suggest that ChatGPT can partially address these challenges, improving code quality by more than 20%, but there are still limitations and opportunities for improvement. Overall, our study provides valuable insights into the current limitations of ChatGPT and offers a roadmap for future research and development efforts to enhance the code generation capabilities of AI models like ChatGPT.
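To put the reported counts in proportion, the short Python sketch below converts them to percentages of the 4,066 generated programs. It uses only the figures quoted in the abstract. The function name `breakdown` and the category labels are illustrative choices, not terminology from the paper. The three itemised categories do not sum to the total, so the sketch reports the remainder separately without guessing what it contains.

```python
# Counts quoted in the abstract; nothing else is assumed about the data.
TOTAL = 4066
outcomes = {
    "correct": 2757,
    "wrong output": 1081,
    "compilation or runtime error": 177,
}


def breakdown(total, counts):
    """Return (shares, remainder).

    shares maps each category to its percentage of total, rounded to one
    decimal place; remainder is the number of programs that the given
    categories do not account for.
    """
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    remainder = total - sum(counts.values())
    # Hypothetical label: the abstract does not say what these programs are.
    shares["not itemised in the abstract"] = round(100 * remainder / total, 1)
    return shares, remainder


shares, remainder = breakdown(TOTAL, outcomes)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Roughly two thirds of the generated programs are deemed correct and about a quarter produce wrong outputs. The 51 programs left over are not described in the abstract, so their outcome would have to be checked against the full text.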


