Fault Prediction based on Software Metrics and SonarQube Rules. Machine or Deep Learning?

by Francesco Lomio, et al.

Background. Developers spend more time fixing bugs and refactoring code to improve maintainability than developing new features. Researchers have investigated the impact of code quality on fault-proneness, focusing on code smells and code metrics. Objective. We aim to advance fault-inducing commit prediction based on SonarQube, considering the contribution of each rule and metric. Method. We designed and conducted a case study on 33 Java projects, analyzed with SonarQube and SZZ to identify fault-inducing and fault-fixing commits. Moreover, we investigated the fault-proneness of each SonarQube rule and metric using Machine and Deep Learning models. Results. We analyzed 77,932 commits that contain 40,890 faults and are infected by more than 174 SonarQube rules violated 1.9M times, and for which the 24 software metrics provided by the tool were calculated. Compared to machine learning models, deep learning provided more accurate fault detection and allowed us to accurately identify the fault-prediction power of each SonarQube rule. As a result, fourteen of the 174 violated rules have an importance higher than 1% and account for 30% of the total fault-proneness importance, while the fault-proneness of the remaining 165 rules is negligible. Conclusion. Future work might consider adopting time series analysis and anomaly detection techniques to detect more accurately the rules that impact fault-proneness.
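The thresholding step reported in the Results (rules whose importance exceeds 1%, and the share of total importance they account for) can be illustrated with a minimal sketch. It assumes that a per-feature importance score has already been obtained from some trained model. The function name `summarize_importance`, the feature names, and the scores are hypothetical and are not taken from the study.

```python
def summarize_importance(importances, threshold=0.01):
    """Select features whose normalized importance exceeds `threshold`.

    `importances` maps a feature name (a rule or a metric) to a non-negative
    raw importance score. Returns the selected names, sorted by decreasing
    importance, and the fraction of total importance they account for.
    """
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")

    # Normalize so that all shares sum to 1.
    shares = {name: value / total for name, value in importances.items()}

    # Keep only the features above the threshold, most important first.
    selected = sorted(
        (name for name, share in shares.items() if share > threshold),
        key=lambda name: shares[name],
        reverse=True,
    )
    combined = sum(shares[name] for name in selected)
    return selected, combined


# Hypothetical scores, for illustration only (not data from the paper).
example = {"rule_a": 5.0, "rule_b": 3.0, "rule_c": 0.05, "metric_x": 91.95}
selected, combined = summarize_importance(example)
print(selected, round(combined, 4))
```

With real data, `importances` would hold one entry per SonarQube rule and software metric. The 1% cut-off is the one quoted in the abstract, and any other value can be passed as `threshold`.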




On the Fault Proneness of SonarQube Technical Debt Violations: A comparison of eight Machine Learning Techniques

Background. The popularity of tools for analyzing Technical Debt, and pa...

Some SonarQube Issues have a Significant but Small Effect on Faults and Changes. A large-scale empirical study

Context. Companies commonly invest effort to remove technical issues bel...

Performance and Power Modeling and Prediction Using MuMMI and Ten Machine Learning Methods

In this paper, we use modeling and prediction tool MuMMI (Multiple Metri...

Mutant Density: A Measure of Fault-Sensitive Complexity

Software code complexity is a well-studied property to determine softwar...

Too Trivial To Test? An Inverse View on Defect Prediction to Identify Methods with Low Fault Risk

Background. Test resources are usually limited and therefore it is often...

Poster: Identification of Methods with Low Fault Risk

Test resources are usually limited and therefore it is often not possibl...

DQLAP: Deep Q-Learning Recommender Algorithm with Update Policy for a Real Steam Turbine System

In modern industrial systems, diagnosing faults in time and using the be...
