Graph-Based Machine Learning Improves Just-in-Time Defect Prediction

10/11/2021
by Jonathan Bryan, et al.

The increasing complexity of today's software requires the contribution of thousands of developers. This complex collaboration structure makes developers more likely to introduce defect-prone changes that lead to software faults. Determining when these defect-prone changes are introduced has proven challenging, and using traditional machine learning (ML) methods to make these determinations seems to have reached a plateau. In this work, we build contribution graphs consisting of developers and source files to capture the nuanced complexity of changes required to build software. By leveraging these contribution graphs, our research shows the potential of using graph-based ML to improve Just-In-Time (JIT) defect prediction. We hypothesize that features extracted from the contribution graphs may be better predictors of defect-prone changes than intrinsic features derived from software characteristics. We corroborate our hypothesis using graph-based ML for classifying edges that represent defect-prone changes. This new framing of the JIT defect prediction problem leads to remarkably better results. We test our approach on 14 open-source projects and show that our best model can predict whether or not a code change will lead to a defect with an F1 score as high as 86.25%. This represents an increase of as much as 55.4% over the state-of-the-art in JIT defect prediction. We describe limitations, open challenges, and how this method can be used for operational JIT defect prediction.
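To make the framing above concrete, here is a minimal sketch of a contribution graph and of the F1 metric used to report results. It is illustrative only and is not the authors' implementation. The toy change log, the two degree-based edge features, and every function name are assumptions introduced for this example. The abstract does not specify which graph features or models were used.

```python
from collections import defaultdict

# Hypothetical toy change log: (developer, source_file, is_defect_prone).
# Each tuple is one edge in a bipartite developer/file contribution graph.
CHANGES = [
    ("alice", "core.py", False),
    ("alice", "util.py", False),
    ("bob", "core.py", True),
    ("bob", "net.py", True),
    ("carol", "core.py", False),
]


def build_contribution_graph(changes):
    """Return adjacency sets for both sides of the bipartite graph."""
    dev_to_files = defaultdict(set)
    file_to_devs = defaultdict(set)
    for dev, path, _label in changes:
        dev_to_files[dev].add(path)
        file_to_devs[path].add(dev)
    return dev_to_files, file_to_devs


def edge_features(dev, path, dev_to_files, file_to_devs):
    """Two simple structural features for the edge (dev, path).

    These degree counts are an assumed stand-in for whatever graph
    features a real model would extract.
    """
    return {
        "dev_degree": len(dev_to_files.get(dev, ())),
        "file_degree": len(file_to_devs.get(path, ())),
    }


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


dev_to_files, file_to_devs = build_contribution_graph(CHANGES)
features = [
    edge_features(dev, path, dev_to_files, file_to_devs)
    for dev, path, _label in CHANGES
]
labels = [label for _dev, _path, label in CHANGES]
```

In an edge-classification setup, each row of `features` would be paired with the matching entry of `labels` to train a classifier, and `f1_score` would compare its predictions against held-out labels.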

research
01/25/2022

Leveraging Structural Properties of Source Code Graphs for Just-In-Time Bug Prediction

The most common use of data visualization is to minimize the complexity ...
research
12/25/2019

A Study of the Learnability of Relational Properties (Model Counting Meets Machine Learning)

Relational properties, e.g., the connectivity structure of nodes in a di...
research
02/15/2021

Investigating and Recommending Co-Changed Entities for JavaScript Programs

JavaScript (JS) is one of the most popular programming languages due to ...
research
10/05/2022

IRJIT – An Information Retrieval Technique for Just-in-time Defect Identification

Defect identification at commit check-in time prevents the introduction ...
research
12/18/2020

Neural Network Embeddings for Test Case Prioritization

In modern software engineering, Continuous Integration (CI) has become a...
research
08/08/2022

Learning to Learn to Predict Performance Regressions in Production at Meta

Catching and attributing code change-induced performance regressions in ...
research
08/25/2023

Human-in-the-loop online just-in-time software defect prediction

Online Just-In-Time Software Defect Prediction (O-JIT-SDP) uses an onlin...
