Automatic Generation of Pull Request Descriptions

09/16/2019
by Zhongxin Liu, et al.

Enabled by the pull-based development model, developers can easily contribute to a project through pull requests (PRs). When creating a PR, developers can add a free-form description of what changes the PR makes and/or why. Such a description helps reviewers and other developers quickly understand the PR without reading its details, and may reduce the chance that the PR is ignored or rejected. However, developers sometimes neglect to write descriptions for PRs. For example, in our collected dataset with over 333K PRs, more than 34% of the PR descriptions are empty. To alleviate this problem, we propose an approach to automatically generate PR descriptions based on the commit messages and the added source code comments in the PRs. We regard this problem as a text summarization problem and solve it using a novel sequence-to-sequence model. Two obstacles motivate the design: software artifacts contain many out-of-vocabulary words, and there is a gap between the training loss function of the sequence-to-sequence model and the evaluation metric ROUGE, which has been shown to correspond to human evaluation. To cope with the former, we integrate the pointer generator; to bridge the latter, we directly optimize for ROUGE using reinforcement learning and a special loss function. We build a dataset with over 41K PRs and evaluate our approach on this dataset through ROUGE and a human evaluation. Our evaluation results show that our approach outperforms two baselines by significant margins.
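To make the evaluation metric concrete, the following is a minimal sketch of ROUGE-N computed as clipped n-gram overlap between a generated description and a reference description. It is an illustration only, not the authors' evaluation code: the function names `ngrams` and `rouge_n`, the whitespace tokenization, and the lowercasing are assumptions, and the example strings are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 for two strings.

    Assumption: simple lowercased whitespace tokenization; standard
    ROUGE tooling may tokenize and stem differently.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped overlap).
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up example: a generated description versus a reference description.
scores = rouge_n(
    "fix null pointer in parser",
    "fix null pointer exception in the parser",
)
print(scores)
```

In this example every candidate unigram appears in the reference, so precision is 1.0, while recall is 5/7 because two reference words are not covered. A score of this kind is a sequence-level quantity, which is why it cannot be used directly as a token-level training loss and is instead optimized through reinforcement learning in the approach described above.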

