CodeEditor: Learning to Edit Source Code with Pre-trained Models

10/31/2022
by Jia Li et al.

Developers often perform repetitive code editing activities for various reasons (e.g., code refactoring) during software development. Many deep learning models have been applied to automate code editing by learning from code editing history. Recently, pre-trained code editing models have achieved state-of-the-art (SOTA) results. Pre-trained models are first pre-trained with pre-training tasks and then fine-tuned with the code editing task. Existing pre-training tasks are mainly code infilling tasks (e.g., masked language modeling), which are derived from the natural language processing field and are not designed for code editing. In this paper, we propose a pre-training task specialized for code editing and present an effective pre-trained code editing model named CodeEditor. Our pre-training task further improves the performance and generalization ability of code editing models. Specifically, we collect real-world code snippets as the ground truth and use a generator to rewrite them into natural but inferior versions. Then, we pre-train our CodeEditor to edit the inferior versions back into the ground truth, thereby learning edit patterns. We conduct experiments on four datasets and evaluate models in three settings. (1) In the fine-tuning setting, we fine-tune the pre-trained CodeEditor with four datasets. CodeEditor outperforms the SOTA baselines by 15%. (2) In the few-shot setting, we fine-tune the pre-trained CodeEditor with limited data. CodeEditor substantially outperforms all baselines, even those fine-tuned with all data. (3) In the zero-shot setting, we evaluate the pre-trained CodeEditor without fine-tuning. CodeEditor correctly edits 1,113 programs, while the SOTA baselines cannot work at all. The results demonstrate the superiority of our pre-training task and show that the pre-trained CodeEditor is more effective in automatic code editing.

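To make the pre-training task concrete, below is a minimal sketch of how such generator-based pre-training pairs could be constructed: a ground-truth snippet is partially rewritten into an inferior version, and the model is then trained to edit it back into the original. The function names (corrupt, make_pretraining_pair), the masking ratio, and the random-choice stand-in for the generator are illustrative assumptions, not the authors' released implementation; the paper describes a learned generator that produces natural-looking rewrites.

```python
import random

def corrupt(ground_truth_tokens, mask_ratio=0.15, generator=None):
    """Rewrite a ground-truth snippet into a 'natural but inferior' version.

    The paper uses a generator model to propose plausible replacements; here a
    random-choice stand-in keeps the sketch self-contained. `mask_ratio` is an
    assumed hyperparameter, not a value taken from the paper.
    """
    toy_vocab = ["i", "j", "tmp", "0", "1", "+", "-", "len", "data"]
    inferior = []
    for tok in ground_truth_tokens:
        if random.random() < mask_ratio:
            # A real generator would condition on context to produce a
            # plausible-but-wrong token; we just sample from a toy vocabulary.
            inferior.append(generator(tok) if generator else random.choice(toy_vocab))
        else:
            inferior.append(tok)
    return inferior

def make_pretraining_pair(ground_truth_tokens):
    """Build one (source, target) pair: edit the inferior code back to the original."""
    return {"source": corrupt(ground_truth_tokens), "target": ground_truth_tokens}

# Example: the model is pre-trained to map the corrupted snippet back to the original.
snippet = "for i in range ( len ( data ) ) : total += data [ i ]".split()
pair = make_pretraining_pair(snippet)
print(pair["source"])
print(pair["target"])
```

Each such pair plays the role of one pre-training example: the corrupted version is the model input and the original snippet is the target, so the model learns to produce edits rather than to fill masked blanks.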

Related research:

- Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding (12/04/2021)
- CCT5: A Code-Change-Oriented Pre-Trained Model (05/18/2023)
- Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing (05/29/2023)
- ModelPS: An Interactive and Collaborative Platform for Editing Pre-trained Models at Scale (05/18/2021)
- Assistive Recipe Editing through Critiquing (05/05/2022)
- Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models (05/22/2023)
- Editing Models with Task Arithmetic (12/08/2022)
