CIRCLE: Continual Repair across Programming Languages

05/22/2022
by Wei Yuan, et al.

Automatic Program Repair (APR) aims to fix buggy source code with less manual debugging effort, and it plays a vital role in improving software reliability and development productivity. Recent APR work has made remarkable progress by applying deep learning (DL), particularly neural machine translation (NMT) techniques. However, we observe that existing DL-based APR models suffer from at least two severe drawbacks: (1) Most of them can only generate patches for a single programming language, so repairing multiple languages requires building and training a separate repair model for each. (2) Most of them are developed in an offline manner, so they cannot adapt when new requirements arrive. To address these problems, we propose a T5-based APR framework equipped with continual learning ability across multiple programming languages, namely ContInual Repair aCross Programming LanguagEs (CIRCLE). Specifically, (1) CIRCLE utilizes a prompting function to narrow the gap between natural language processing (NLP) pre-training tasks and APR. (2) CIRCLE adopts a difficulty-based rehearsal strategy to achieve lifelong learning for APR without access to the full historical data. (3) An elastic regularization method is employed to further strengthen CIRCLE's continual learning ability and protect it against catastrophic forgetting. (4) CIRCLE applies a simple but effective re-repairing method to revise errors in the generated patches that are caused by crossing multiple programming languages. We train CIRCLE on four languages (i.e., C, Java, JavaScript, and Python) and evaluate it on five commonly used benchmarks. The experimental results demonstrate that CIRCLE not only repairs multiple programming languages effectively and efficiently in continual learning settings, but also achieves state-of-the-art performance with a single repair model.
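To make the first three components in the abstract more concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not specify the prompt wording, the difficulty measure, or the form of the elastic regularizer, so everything here is an assumption: the template text in `build_prompt`, the use of a per-example score (for instance, training loss) as "difficulty" in `select_rehearsal`, and the substitution of an EWC-style quadratic penalty for the elastic regularization in `ewc_penalty`. All function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple


def build_prompt(buggy_line: str, context: str) -> str:
    """Wrap a bug in a natural-language template.

    The template wording is a hypothetical example, not taken from the paper.
    """
    return f"Buggy line: {buggy_line} Context: {context} The fixed code is:"


def select_rehearsal(examples: List[Tuple[str, float]], k: int) -> List[str]:
    """Keep the k examples with the highest difficulty score.

    Assumption: 'difficulty' is a scalar per example (e.g. its training loss),
    and the hardest examples are the ones replayed on later tasks.
    """
    ranked = sorted(examples, key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:k]]


def ewc_penalty(
    params: Dict[str, float],
    old_params: Dict[str, float],
    importance: Dict[str, float],
    strength: float,
) -> float:
    """EWC-style penalty: strength * sum_i F_i * (theta_i - theta_old_i)^2.

    Assumption: the 'elastic regularization' resembles elastic weight
    consolidation, where `importance` holds per-parameter weights F_i.
    """
    return strength * sum(
        importance[name] * (params[name] - old_params[name]) ** 2
        for name in params
    )


if __name__ == "__main__":
    print(build_prompt("x = 1", "def f():"))
    print(select_rehearsal([("a", 0.1), ("b", 0.9), ("c", 0.5)], k=2))
    print(ewc_penalty({"w": 1.0}, {"w": 0.0}, {"w": 2.0}, strength=0.5))
```

In a continual setting along these lines, the penalty would be added to the training loss for each new language, while the retained examples would be mixed into the new language's training data.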


