A Full-fledged Commit Message Quality Checker Based on Machine Learning

09/09/2023
by David Faragó, et al.

Commit messages (CMs) are an essential part of version control. By providing important context about what has changed and why, they strongly support software maintenance and evolution. But writing good CMs is difficult, and developers often neglect it. So far, no tool suitable for practice automatically assesses how well a CM is written, including its meaning and context. Since this task is challenging, we ask the research question: how well can CM quality, including semantics and context, be measured with machine learning methods? We consider all rules from the most popular CM quality guideline, create datasets for those rules, and train and evaluate state-of-the-art machine learning models to check them. This lets us answer the research question: sufficiently well for practice, with the lowest F1 score, 82.9%, occurring on the most challenging task. We develop a full-fledged open-source framework that checks all these CM quality rules. It is useful for research, e.g., automatic CM generation, but most importantly for software practitioners, who can use it to raise the quality of their CMs and thus the maintainability and evolution speed of their software.
