On the Reliability and Explainability of Automated Code Generation Approaches

02/19/2023
by Yue Liu, et al.

Automatic code generation, the task of generating new code snippets from existing code or comments, has long been of interest. Numerous code generation models have been proposed and evaluated on various benchmark datasets. However, little is known about whether this objective has actually been achieved, or why code generation models transform code sequences effectively. In other words, can we fully trust these automated code generation models? There is consequently a pressing need to understand the inner logic of code generation models and to investigate their replicability, reliability, and explainability. To bridge these research gaps, we conduct a thorough empirical study of five code generation models on four representative code generation datasets to assess the limits and capabilities of automatic code generation approaches. We further employ advanced explainable AI approaches to highlight the tokens that contribute significantly to the generated code. Our experiments show that we successfully replicate state-of-the-art code generation approaches. We find that these approaches suffer from severe data duplication and input insensitivity, subtle issues with significant implications. Our explainability analysis reveals that, across a range of experimental scenarios, code generation models can recognize code grammar and structural information, but cannot capture the key tokens that need to be changed. From these results we distill several lessons and guidelines for future work in this area.
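Two of the issues named above can be made concrete. First, a minimal sketch of a train/test duplication check, in the spirit of the data duplication problem the study reports; the normalization scheme (whitespace and case folding) is an assumption for illustration, not the authors' exact procedure.

```python
# A minimal sketch of a train/test duplication check, assuming a simple
# whitespace-and-case normalization; the paper's exact procedure may differ.
import hashlib

def normalize(code: str) -> str:
    # Collapse whitespace and lowercase so trivially different copies match.
    return " ".join(code.split()).lower()

def fingerprint(code: str) -> str:
    return hashlib.sha256(normalize(code).encode("utf-8")).hexdigest()

def duplication_rate(train: list[str], test: list[str]) -> float:
    # Fraction of test samples whose normalized form also appears in training data.
    train_fps = {fingerprint(s) for s in train}
    dups = sum(fingerprint(s) in train_fps for s in test)
    return dups / len(test) if test else 0.0

train = ["def add(a, b):\n    return a + b", "def sub(a, b):\n    return a - b"]
test = ["def add(a, b): return a + b", "def mul(a, b):\n    return a * b"]
print(f"duplication rate: {duplication_rate(train, test):.0%}")  # -> 50%
```

Second, token-level explainability of the kind described above can be illustrated with occlusion-based attribution: measure how much a scalar model output drops when each input token is removed. The paper does not specify this exact method; `score` below is a hypothetical stand-in for any scalar such as the log-likelihood of the generated code.

```python
# A hedged sketch of occlusion-based token attribution; `score` is a
# hypothetical scalar model output (e.g. log-likelihood of the generation).
def occlusion_attribution(tokens, score):
    # Importance of each token = score drop when that token is removed.
    base = score(tokens)
    return [
        (tok, base - score(tokens[:i] + tokens[i + 1:]))
        for i, tok in enumerate(tokens)
    ]

# Toy usage: a "model" that only rewards the presence of the `return` keyword.
toy_score = lambda toks: float("return" in toks)
for tok, imp in occlusion_attribution(["def", "f", "(", ")", ":", "return", "x"], toy_score):
    print(f"{tok!r}: {imp:+.1f}")  # only 'return' gets a positive attribution
```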


Related research

05/24/2023 · Who Wrote this Code? Watermarking for Code Generation
Large language models for code have recently shown remarkable performance...

07/12/2022 · Are We Building on the Rock? On the Importance of Data Preprocessing for Code Summarization
Code summarization, the task of generating useful comments given the code...

06/02/2023 · Is Model Attention Aligned with Human Attention? An Empirical Study on Large Language Models for Code Generation
Large Language Models (LLMs) have been demonstrated effective for code generation...

11/14/2018 · A Grammar-Based Structural CNN Decoder for Code Generation
Code generation maps a program description to executable source code in...

11/02/2022 · CODEP: Grammatical Seq2Seq Model for General-Purpose Code Generation
General-purpose code generation (GPCG) aims to automatically convert the...

09/06/2023 · Improving Code Generation by Dynamic Temperature Sampling
Recently, Large Language Models (LLMs) have shown impressive results in...

05/15/2023 · Improving ChatGPT Prompt for Code Generation
Automated code generation can be a powerful technique for software devel...
