CodeGen-Test: An Automatic Code Generation Model Integrating Program Test Information

by Maosheng Zhong, et al.

Automatic code generation aims to generate program code from a given natural language description. The current mainstream approach uses neural networks to encode the natural language description, outputs an abstract syntax tree (AST) at the decoder, and then converts the AST into program code. While the generated code largely conforms to the relevant syntax rules, two problems are still ignored. The first is the absence of program testing, an essential step in a complete code implementation; the second is a focus on only the syntactic compliance of the generated code, which ignores the more important functional requirements of the program. The paper proposes the CodeGen-Test model, which adds a program testing step and incorporates program testing information to iteratively generate code that meets the functional requirements of the program, thereby improving the quality of code generation. The paper also proposes a new evaluation metric, test accuracy (Test-Acc), which represents the proportion of generated code that passes the program tests. Unlike previous evaluation metrics, which assess the quality of generated code only in terms of character similarity, Test-Acc assesses it in terms of program function. Moreover, the paper evaluates the CodeGen-Test model on a Python data set, "hearthstone legend". The experimental results show that the proposed method can effectively improve the quality of generated code. Compared with the existing optimal model, the CodeGen-Test model improves the BLEU value by 0.2 0.3
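The abstract defines Test-Acc only in words, as the proportion of generated code that passes the program tests. The sketch below is one plausible reading of that definition, not the paper's implementation. The names compute_test_acc and passes_test, the convention that each test is a callable receiving the namespace produced by the generated code, and the choice to count any exception (including a syntax error) as a failure are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical convention: a test receives the namespace created by running
# the generated code and raises (e.g. AssertionError) if the code is wrong.
TestFn = Callable[[Dict[str, object]], None]


def passes_test(code: str, test: TestFn) -> bool:
    """Execute one generated program, then its test; any exception is a failure."""
    namespace: Dict[str, object] = {}
    try:
        exec(code, namespace)  # defines whatever the generated code defines
        test(namespace)        # raises if a functional requirement is not met
    except Exception:
        return False
    return True


def compute_test_acc(programs: List[str], tests: List[TestFn]) -> float:
    """Fraction of generated programs that pass their paired test."""
    if len(programs) != len(tests):
        raise ValueError("each program needs exactly one test")
    if not programs:
        return 0.0
    passed = sum(passes_test(code, test) for code, test in zip(programs, tests))
    return passed / len(programs)


def check_add(ns: Dict[str, object]) -> None:
    # Functional requirement for the toy task: add(2, 3) must equal 5.
    assert ns["add"](2, 3) == 5


samples = [
    "def add(a, b):\n    return a + b",  # valid syntax, correct behavior
    "def add(a, b):\n    return a - b",  # valid syntax, wrong behavior
    "def add(a, b) return a + b",        # invalid syntax
]
score = compute_test_acc(samples, [check_add] * len(samples))
```

In this toy run only the first sample passes, which illustrates the abstract's point: the second sample is syntactically valid and textually close to the correct answer, so a character-similarity metric would rate it highly, while a test-based metric counts it as a failure. Running untrusted generated code with exec is unsafe outside a sandbox; it is used here only to keep the sketch short.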




