On Learning Meaningful Code Changes via Neural Machine Translation

01/25/2019
by Michele Tufano, et al.

Recent years have seen the rise of Deep Learning (DL) techniques applied to source code. Researchers have exploited DL to automate several development and maintenance tasks, such as writing commit messages, generating comments, and detecting vulnerabilities, among others. One of the long-standing dreams of applying DL to source code is the possibility of automating non-trivial coding activities. While some steps in this direction have been taken (e.g., learning how to fix bugs), there is still a glaring lack of empirical evidence on the types of code changes that can be learned and automatically applied by DL. Our goal is to take this first important step by quantitatively and qualitatively investigating the ability of a Neural Machine Translation (NMT) model to learn how to automatically apply code changes implemented by developers during pull requests. We train and experiment with the NMT model on a set of 236k pairs of code components before and after the implementation of the changes provided in the pull requests. We show that, when applied in a narrow enough context (i.e., small/medium-sized pairs of methods before/after the pull request changes), NMT can automatically replicate the changes implemented by developers during pull requests in up to 36% of the cases. Moreover, our qualitative analysis shows that the model is capable of learning and replicating a wide variety of meaningful code changes, especially refactorings and bug-fixing activities. Our results pave the way for novel research in the area of DL on code, such as the automatic learning and application of refactorings.
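
As a rough illustration of the quantitative side of this evaluation, the sketch below computes the fraction of (before, after) method pairs for which a model's output exactly matches the developer's post-change code. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `tokenize`, `perfect_prediction_rate`, and `toy_model` are hypothetical, whitespace splitting stands in for a real Java lexer, and the toy model is a hard-coded string rewrite rather than a trained NMT model.

```python
from typing import Callable, Iterable, List, Tuple


def tokenize(code: str) -> List[str]:
    """Split code on whitespace (assumption: a stand-in for a real lexer)."""
    return code.split()


def perfect_prediction_rate(
    pairs: Iterable[Tuple[str, str]],
    translate: Callable[[str], str],
) -> float:
    """Fraction of pairs where translate(before) equals after, token by token."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(
        1
        for before, after in pairs
        if tokenize(translate(before)) == tokenize(after)
    )
    return hits / len(pairs)


def toy_model(before: str) -> str:
    """Hypothetical 'model': wraps a bare `return ;` in braces, nothing else."""
    return before.replace("return ;", "{ return ; }")


# Two made-up method fragments before and after a pull request change.
pairs = [
    ("if ( x == null ) return ;", "if ( x == null ) { return ; }"),
    ("int y = a + b ;", "final int y = a + b ;"),
]

rate = perfect_prediction_rate(pairs, toy_model)
print(f"perfect predictions: {rate:.0%}")
```

Here the toy model reproduces the first change but not the second. Exact token-level match is a strict criterion, since a semantically equivalent but textually different output counts as a miss.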


