Boosting Automatic Commit Classification Into Maintenance Activities By Utilizing Source Code Changes

11/14/2017
by Stanislav Levin, et al.

Background: Understanding the maintenance activities performed in a source code repository could help practitioners reduce uncertainty and improve cost-effectiveness by planning ahead and pre-allocating resources for source code maintenance. The research community uses three main classification categories for maintenance activities: Corrective (fault fixing), Perfective (system improvements), and Adaptive (new feature introduction). Previous work in this area has mostly concentrated on evaluating models that classify commits into maintenance activities within the scope of a single software project. Aims: In this work we seek to design a commit classification model capable of providing high accuracy and Kappa across different projects. In addition, we wish to compare the accuracy and Kappa characteristics of classification models that utilize word frequency analysis, source code changes, and a combination thereof. Method: We suggest a novel method for automatically classifying commits into maintenance activities by utilizing source code changes (e.g., statement added, method removed, etc.). The results we report are based on studying 11 popular open source projects from various professional domains, from which we manually classified 1151 commits, over 100 from each of the studied projects. Results: Our models were trained using 85% of the dataset, with the remaining 15% used as the test dataset. The method reaches an accuracy of 76% for the test dataset, an improvement of over 20 percentage points, and a relative boost of 40%. Conclusions: We show that by using source code changes in combination with commit message word frequency analysis we are able to considerably boost classification quality in a project-agnostic manner.
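The abstract reports both accuracy and Kappa, so a small sketch of how the two measures relate may help. The example below is illustrative only: the function names and toy labels are assumptions made here, not the authors' code or data, and Kappa is assumed to mean Cohen's kappa (observed agreement corrected for the agreement expected by chance).

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of commits whose predicted activity matches the manual label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both label sources pick the same class.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels (not taken from the study's dataset).
y_true = ["corrective", "perfective", "adaptive",
          "corrective", "perfective", "adaptive"]
y_pred = ["corrective", "perfective", "adaptive",
          "perfective", "perfective", "corrective"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"kappa    = {cohens_kappa(y_true, y_pred):.3f}")
```

Because Kappa discounts agreement that would occur by chance, it is typically lower than accuracy on the same predictions, which is one reason to report both when class frequencies differ.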
