SkCoder: A Sketch-based Approach for Automatic Code Generation

02/13/2023
by Jia Li, et al.
Recently, deep learning techniques have shown great success in automatic code generation. Inspired by code reuse, some researchers have proposed copy-based approaches that copy content from similar code snippets to obtain better performance. In practice, human developers recognize the content in similar code that is relevant to their needs, which can be viewed as a code sketch, and then edit the sketch into the desired code. However, existing copy-based approaches ignore code sketches and tend to repeat the similar code without the necessary modifications, which leads to wrong results. In this paper, we propose a sketch-based code generation approach named SkCoder to mimic developers' code reuse behavior. Given a natural language requirement, SkCoder retrieves a similar code snippet, extracts the relevant parts as a code sketch, and edits the sketch into the desired code. Our motivation is that the extracted sketch provides a well-formed pattern that tells models "how to write". The post-editing then adds requirement-specific details to the sketch and outputs the complete code. We conduct experiments on two public datasets and a new dataset collected in this work. We compare our approach to 20 baselines using 5 widely used metrics. Experimental results show that (1) SkCoder can generate more correct programs and outperforms the state-of-the-art CodeT5-base by 30.30 [...]. (2) Our approach is effective for multiple code generation models and improves them by up to 120.1 [...]. (3) [...] sketches and discuss the importance of sketches. (4) We manually evaluate the generated code and prove the superiority of our SkCoder in three aspects.
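The retrieve, sketch, and edit steps described above can be illustrated with a toy example. This is only a minimal sketch under loud assumptions, not SkCoder's implementation: the function names (`retrieve`, `extract_sketch`, `edit`), the `<pad>` placeholder, the word-overlap retrieval, and the keyword-based masking rule are all hypothetical stand-ins for the learned components the abstract refers to.

```python
import keyword
import re

PLACEHOLDER = "<pad>"  # hypothetical marker for tokens dropped from the sketch


def words(text):
    """Lowercased set of word tokens in a piece of text."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(requirement, corpus):
    """Pick the (description, code) pair whose description overlaps most
    with the requirement (Jaccard similarity over words; an assumption)."""
    req = words(requirement)

    def jaccard(pair):
        desc = words(pair[0])
        return len(req & desc) / max(len(req | desc), 1)

    return max(corpus, key=jaccard)


def extract_sketch(requirement, code):
    """Keep punctuation, Python keywords, and words named in the requirement;
    mask everything else. A learned model would make this decision instead."""
    req = words(requirement)
    sketch = []
    for tok in re.findall(r"\w+|[^\w\s]", code):
        is_word = re.fullmatch(r"\w+", tok) is not None
        keep = (not is_word) or keyword.iskeyword(tok) or tok.lower() in req
        if keep:
            sketch.append(tok)
        elif not sketch or sketch[-1] != PLACEHOLDER:
            sketch.append(PLACEHOLDER)  # collapse runs of masked tokens
    return sketch


def edit(sketch, fills):
    """Stand-in for the editing step: fill placeholders in order with
    requirement-specific details supplied by the caller."""
    it = iter(fills)
    return " ".join(next(it, PLACEHOLDER) if t == PLACEHOLDER else t for t in sketch)


corpus = [
    ("read a file and return its lines", "def read_lines(path): return open(path).readlines()"),
    ("sum the numbers in a list", "def total(xs): return sum(xs)"),
]
requirement = "return the maximum of the numbers in a list"

_, similar_code = retrieve(requirement, corpus)
sketch = extract_sketch(requirement, similar_code)
code = edit(sketch, ["maximum", "xs", "max", "xs"])
print(sketch)
print(code)
```

In this toy run the sketch keeps the `def ... ( ... ) : return ... ( ... )` pattern of the retrieved snippet while masking its names, so the edit step only has to supply the requirement-specific details rather than repeat the similar code unchanged.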
