Delving into Commit-Issue Correlation to Enhance Commit Message Generation Models

07/31/2023
by Liran Wang, et al.

Commit message generation (CMG) is a challenging task in automated software engineering that aims to generate natural language descriptions of the code changes in a commit. Previous methods all start from the modified code snippets and output commit messages through template-based, retrieval-based, or learning-based models. While these methods can summarize what was modified from the perspective of the code, they struggle to explain why the commit was made. The correlation between commits and issues, which could be a critical factor in generating commit messages that convey this rationale, remains unexplored. In this work, we examine the correlation between commits and issues from the perspectives of both dataset and methodology. We construct the first dataset built on pairing correlated commits and issues. The dataset consists of an unlabeled commit-issue parallel part and a labeled part in which each example is annotated by humans with the rationale information found in the issue. Furthermore, we propose a novel paradigm (Extraction, Grounding, Fine-tuning) that introduces the correlation between commits and issues into the training phase of models. To evaluate its effectiveness, we perform comprehensive experiments with various state-of-the-art CMG models. The results show that models enhanced with this paradigm perform significantly better than the original models.

