No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT

by Zhijie Liu, et al.

Large language models (LLMs) have demonstrated impressive capabilities across various natural language processing (NLP) tasks, such as machine translation, question answering, and summarization. LLMs are also highly valuable in supporting software engineering tasks, particularly code generation. Automatic code generation is the process of producing source code or executable code from given specifications or requirements, with the aim of improving developer productivity. In this study, we perform a systematic empirical assessment of code generation using ChatGPT, a recent and popular LLM. Our evaluation analyzes code snippets generated by ChatGPT, focusing on three critical aspects: correctness, understandability, and security. We also specifically investigate ChatGPT's ability to engage in a multi-round process (i.e., its dialog ability) to facilitate code generation. By examining the generated code and the experimental results, this work provides insights into the performance of ChatGPT on code generation tasks. Overall, our findings uncover potential issues and limitations that arise in ChatGPT-based code generation and lay the groundwork for improving AI- and LLM-based code generation techniques.




ChatGPT vs SBST: A Comparative Assessment of Unit Test Suite Generation

Recent advancements in large language models (LLMs) have demonstrated ex...

Refining ChatGPT-Generated Code: Characterizing and Mitigating Code Quality Issues

In this paper, we systematically study the quality of 4,066 ChatGPT-gene...

Is ChatGPT the Ultimate Programming Assistant – How far is it?

The recent progress in generative AI techniques has significantly influe...

Can Large Language Models assist in Hazard Analysis?

Large Language Models (LLMs), such as GPT-3, have demonstrated remarkabl...

A Framework for Generating Diverse Haskell-IO Exercise Tasks

We present the design of a framework to automatically generate a large r...

ChatGPT for Software Security: Exploring the Strengths and Limitations of ChatGPT in the Security Applications

ChatGPT, as a versatile large language model, has demonstrated remarkabl...
