Does chronology matter in JIT defect prediction? A Partial Replication Study

03/05/2021
by Hadi Jahanshahi, et al.

Just-In-Time (JIT) defect prediction models identify fix-inducing (that is, defect-inducing) code changes. These models rest on the assumption that the properties of past code changes are similar to those of future ones. However, as a system evolves, the expertise of its developers and the complexity of the system also change. In this work, we investigate the effect of code change properties on JIT models over time. We also study how training on recent data, as opposed to all available data, affects the performance of JIT models. Further, we analyze how weighted sampling affects the performance of JIT models and their fix-inducing properties. For this purpose, we use datasets from Eclipse JDT, Mozilla, Eclipse Platform, and PostgreSQL. We consider five families of code change properties: size, diffusion, history, experience, and purpose. We use Random Forest to train and test the JIT models, and the Brier Score and the area under the ROC curve (AUC) to measure performance. Our results suggest that the predictive power of JIT models does not change over time. Furthermore, we observe that the chronology of data in JIT defect prediction models can be disregarded by considering all the available data. On the other hand, the importance scores of the families of code change properties are found to oscillate over time. To mitigate the impact of the evolution of code change properties, we recommend a weighted sampling approach that places more emphasis on changes occurring closer to the current time. Moreover, since properties such as "Expertise of the Developer" and "Size" evolve with time, models trained on old data may exhibit different characteristics from those trained on newer data. Hence, practitioners should continually retrain JIT models to include fresh data.
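As a rough illustration of the two performance measures and the recency-weighted sampling idea mentioned in the abstract, the following is a minimal sketch using only the Python standard library. The abstract does not say how the sampling weights are computed, so the exponential decay, the `half_life_days` parameter, and all function names here are assumptions made for illustration and are not the authors' method. The Random Forest model itself is omitted.

```python
import random


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def recency_weights(ages_in_days, half_life_days=180.0):
    """Assumed weighting scheme: exponential decay with a hypothetical half-life.

    A change that is half_life_days old gets half the weight of one made today.
    """
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


def weighted_sample(changes, weights, k, seed=0):
    """Draw k training changes with replacement, favouring higher weights."""
    rng = random.Random(seed)
    return rng.choices(changes, weights=weights, k=k)


if __name__ == "__main__":
    # Toy data: 1 = fix-inducing change, 0 = clean change.
    labels = [1, 0, 1, 0]
    probs = [0.9, 0.1, 0.4, 0.6]
    print("Brier score:", brier_score(labels, probs))
    print("AUC:", roc_auc(labels, probs))

    # Hypothetical change ids and their ages in days.
    change_ids = ["c1", "c2", "c3"]
    weights = recency_weights([0, 180, 360])
    print("Weights:", weights)
    print("Sample:", weighted_sample(change_ids, weights, k=5))
```

A lower Brier Score and a higher AUC both indicate better predictions. In a setup like this, the resampled changes would then be used as the training set in place of the raw history, so that recent changes influence the model more than old ones.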

research
05/26/2019

Improving Change Prediction Models with Code Smell-Related Information

Code smells represent sub-optimal implementation choices applied by deve...
research
09/07/2023

Identifying Defect-Inducing Changes in Visual Code

Defects, or bugs, often form during software development. Identifying th...
research
12/07/2019

Accepted or Abandoned? Predicting the Fate of Code Changes

Many mature Open-Source Software (OSS), as well as commercial, organizat...
research
09/07/2023

Predicting Defective Visual Code Changes in a Multi-Language AAA Video Game Project

Video game development increasingly relies on using visual programming l...
research
06/21/2021

An empirical evaluation of the usefulness of Tree Kernels for Commit-time Defect Detection in large software systems

Defect detection at commit check-in time prevents the introduction of de...
research
08/24/2021

An Empirical Study on Refactoring-Inducing Pull Requests

Background: Pull-based development has shaped the practice of Modern Cod...
research
05/31/2021

Online Bayesian inference for multiple changepoints and risk assessment

The aim of the present study is to detect abrupt trend changes in the me...
