From Commit Message Generation to History-Aware Commit Message Completion

08/15/2023
by Aleksandra Eliseeva, et al.

Commit messages are crucial to software development, allowing developers to track changes and collaborate effectively. Despite their utility, most commit messages lack important information since writing high-quality commit messages is tedious and time-consuming. The active research on commit message generation (CMG) has not yet led to wide adoption in practice. We argue that if we could shift the focus from commit message generation to commit message completion and use previous commit history as additional context, we could significantly improve the quality and the personal nature of the resulting commit messages. In this paper, we propose and evaluate both of these novel ideas. Since the existing datasets lack historical data, we collect and share a novel dataset called CommitChronicle, containing 10.7M commits across 20 programming languages. We use this dataset to evaluate the completion setting and the usefulness of the historical context for state-of-the-art CMG models and GPT-3.5-turbo. Our results show that in some contexts, commit message completion shows better results than generation, and that while in general GPT-3.5-turbo performs worse, it shows potential for long and detailed messages. As for the history, the results show that historical information improves the performance of CMG models in the generation task, and the performance of GPT-3.5-turbo in both generation and completion.
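
To make the difference between generation and history-aware completion concrete, the following is a minimal sketch of how the input to a model might be assembled in each setting. It is not the authors' implementation: the `CommitExample` type, the `build_model_input` function, the `[SEP]` separator, the section labels, the `max_history` cutoff, and the choice to take history from the same author are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CommitExample:
    """One commit plus the context available while its message is written.

    All field names here are illustrative assumptions, not a dataset schema.
    """
    diff: str                                          # textual diff of the commit
    history: List[str] = field(default_factory=list)   # earlier messages, oldest first
    prefix: str = ""                                   # text typed so far; empty means generation


def build_model_input(example: CommitExample,
                      max_history: int = 3,
                      sep: str = "\n[SEP]\n") -> str:
    """Join recent history, the diff, and the typed prefix into one string.

    The layout, the separator, and max_history are assumptions of this sketch.
    """
    # Guard against max_history == 0, because history[-0:] would keep everything.
    recent = example.history[-max_history:] if max_history > 0 else []
    parts = []
    if recent:
        parts.append("Previous messages:\n" + "\n".join(recent))
    parts.append("Diff:\n" + example.diff)
    # In the completion setting the model continues from this prefix;
    # in the generation setting the prefix is empty.
    parts.append("Message so far:\n" + example.prefix)
    return sep.join(parts)
```

Under these assumptions, calling `build_model_input(CommitExample(diff=d))` corresponds to plain generation, while passing a non-empty `prefix` and `history` corresponds to history-aware completion. Keeping both settings behind one function makes it easy to compare them on the same commits.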


