
CodeEditor: Learning to Edit Source Code with Pre-trained Models

10/31/2022
by   Jia Li, et al.

Developers often perform repetitive code editing activities for various reasons (e.g., code refactoring) during software development. Many deep learning models have been applied to automate code editing by learning from code editing history. Recently, pre-trained code editing models have achieved state-of-the-art (SOTA) results. These models are first pre-trained with pre-training tasks and then fine-tuned on the code editing task. Existing pre-training tasks are mainly code infilling tasks (e.g., masked language modeling), which are derived from the natural language processing field and are not designed for code editing. In this paper, we propose a pre-training task specialized in code editing and present an effective pre-trained code editing model named CodeEditor. Our pre-training task further improves the performance and generalization ability of code editing models. Specifically, we collect real-world code snippets as the ground truth and use a generator to rewrite them into natural but inferior versions. We then pre-train CodeEditor to edit the inferior versions back into the ground truth, so that it learns edit patterns. We conduct experiments on four datasets and evaluate models in three settings. (1) In the fine-tuning setting, we fine-tune the pre-trained CodeEditor on four datasets; CodeEditor outperforms SOTA baselines by 15%. (2) In the few-shot setting, we fine-tune the pre-trained CodeEditor with limited data; CodeEditor substantially outperforms all baselines, even those fine-tuned with all data. (3) In the zero-shot setting, we evaluate the pre-trained CodeEditor without fine-tuning; CodeEditor correctly edits 1,113 programs while SOTA baselines cannot work at all. The results demonstrate the superiority of our pre-training task and show that the pre-trained CodeEditor is more effective in automatic code editing.
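As a rough illustration of this pre-training setup, the Python sketch below constructs (inferior, ground-truth) training pairs. The rule-based corruptions here (identifier renaming and comparison-operator swaps) are hypothetical stand-ins for the paper's learned generator, and all function names are ours, not the paper's:

import random
import re

KEYWORDS = {"def", "return", "if", "else", "for", "while", "in", "not",
            "and", "or", "import", "from", "class", "None", "True", "False"}

def rename_identifier(code: str, rng: random.Random) -> str:
    """Replace one identifier with a generic name, a common 'inferior' edit."""
    names = [n for n in set(re.findall(r"\b[a-zA-Z_]\w*\b", code))
             if n not in KEYWORDS]
    if not names:
        return code
    target = rng.choice(names)
    return re.sub(rf"\b{re.escape(target)}\b", "var0", code)

def swap_operator(code: str, rng: random.Random) -> str:
    """Flip one comparison operator to mimic a small semantic slip."""
    swaps = [("<=", "<"), (">=", ">"), ("==", "!=")]
    rng.shuffle(swaps)
    for old, new in swaps:
        if old in code:
            return code.replace(old, new, 1)
    return code

def make_inferior(code: str, rng: random.Random) -> str:
    """Generator stand-in: rewrite a ground-truth snippet into a natural
    but inferior version by applying one random corruption."""
    corruption = rng.choice([rename_identifier, swap_operator])
    return corruption(code, rng)

def build_pretraining_pairs(snippets, seed=0):
    """Each pair asks the model to edit the inferior code back into the
    original, so it learns edit patterns rather than infilling."""
    rng = random.Random(seed)
    return [(make_inferior(s, rng), s) for s in snippets]

if __name__ == "__main__":
    ground_truth = [
        "def total(prices):\n    s = 0\n    for p in prices:\n"
        "        s += p\n    return s"
    ]
    for inferior, target in build_pretraining_pairs(ground_truth):
        print("INPUT (inferior):\n" + inferior)
        print("TARGET (ground truth):\n" + target)

Each pair would then be fed to a sequence-to-sequence model trained to produce the ground truth from the inferior version; after this pre-training stage, the model is fine-tuned on real code editing data.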


Related research

12/04/2021 · Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding
With the great success of pre-trained models, the pretrain-then-finetune...

12/08/2022 · Editing Models with Task Arithmetic
Changing how pre-trained models behave – e.g., improving their performan...

05/18/2021 · ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale
AI engineering has emerged as a crucial discipline to democratize deep n...

05/05/2022 · Assistive Recipe Editing through Critiquing
There has recently been growing interest in the automatic generation of ...

03/17/2022 · CodeReviewer: Pre-Training for Automating Code Review Activities
Code review is an essential part of the software development lifecycle since...

02/10/2023 · Impact of Code Language Models on Automated Program Repair
Automated program repair (APR) aims to help developers improve software ...

04/13/2022 · Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar
We introduce NSEdit (neural-symbolic edit), a novel Transformer-based co...