The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring

01/10/2020
by Maurício Aniche, et al.

Refactoring is the process of changing the internal structure of software to improve its quality without modifying its external behavior. Empirical studies have repeatedly shown that refactoring has a positive impact on the understandability and maintainability of software systems. However, before carrying out refactoring activities, developers need to identify refactoring opportunities. Currently, refactoring opportunity identification relies heavily on developers' expertise and intuition. In this paper, we investigate the effectiveness of machine learning algorithms in predicting software refactorings. More specifically, we train six different machine learning algorithms (i.e., Logistic Regression, Naive Bayes, Support Vector Machine, Decision Trees, Random Forest, and Neural Network) with a dataset comprising over two million refactorings from 11,149 real-world projects from the Apache, F-Droid, and GitHub ecosystems. The resulting models predict 20 different refactorings at the class, method, and variable levels with an accuracy often higher than 90%. Our results show that (i) Random Forests are the best models for predicting software refactoring, (ii) process and ownership metrics seem to play a crucial role in the creation of better models, and (iii) models generalize well in different contexts.
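As a rough illustration of the accuracy figure quoted above, the following minimal sketch computes accuracy for a binary "refactor or not" prediction task. The function name, labels, and predictions are hypothetical and are not taken from the paper's dataset or tooling; the sketch only shows how such a score is typically calculated.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label and prediction sequences must have equal length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical data: 1 = the element was refactored, 0 = it was not.
labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predictions = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]

score = accuracy(labels, predictions)
print(f"accuracy = {score:.0%}")
```

Accuracy alone can be misleading when one class dominates, so a fuller evaluation would usually report additional metrics alongside it.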


