Detecting Concept Drift for the reliability prediction of Software Defects using Instance Interpretation

05/06/2023
by Zeynab Chitsazian, et al.

In Just-In-Time Software Defect Prediction (JIT-SDP), concept drift (CD) can arise from changes in the software development process, in the complexity of the software, or in user behavior, any of which may affect the stability of the JIT-SDP model over time. In addition, class imbalance in JIT-SDP data may threaten the accuracy of CD detection methods when rebalancing is applied; to the best of our knowledge, this issue has not been explored. Methods that check the stability of JIT-SDP models over time using labeled evaluation data have been proposed, but labels for future data may not be available promptly. We aim to develop a reliable JIT-SDP model by detecting CD points directly from changes in the interpretation of unlabeled simplified and resampled data. To evaluate our approach, we first built baseline methods that identify CD points on labeled data by monitoring model performance, using both threshold-dependent and threshold-independent criteria. We then compared the output of the proposed methods with these baselines using well-known performance measures for CD detection methods: accuracy, MDR, MTD, MTFA, and MTR. We also use the Friedman statistical test to assess the effectiveness of our approach. The results show that our proposed methods have higher compatibility with baselines based on threshold-independent criteria when applied to rebalanced data, and with baselines based on threshold-dependent criteria when applied to simple data.
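As a rough illustration of the detection measures named in the abstract, the sketch below scores a list of detected CD points against a list of reference CD points (for example, those produced by a performance-monitoring baseline). It is not the authors' implementation. The function name `drift_detection_metrics`, the `tolerance` window, and the exact formulas are assumptions: MDR is taken as the fraction of reference drifts with no detection inside the window, MTD as the mean delay of matched detections, MTFA as the stream length divided by the number of false alarms, and MTR as (MTFA / MTD) * (1 - MDR), which is a common convention in the drift-detection literature but may differ from the paper's definitions.

```python
from math import inf


def drift_detection_metrics(reference_drifts, detections, stream_length, tolerance):
    """Score detected drift points against reference drift points.

    All definitions here are assumptions made for illustration:
    a detection matches a reference drift t if it is the first unmatched
    detection in [t, t + tolerance), cut off at the next reference drift
    and at the end of the stream. Every other detection is a false alarm.
    Detection indices are assumed to be unique.
    """
    reference_drifts = sorted(reference_drifts)
    detections = sorted(detections)

    delays = []      # detection delay for each matched reference drift
    matched = set()  # detections already credited to a reference drift

    for i, t in enumerate(reference_drifts):
        window_end = min(t + tolerance, stream_length)
        if i + 1 < len(reference_drifts):
            window_end = min(window_end, reference_drifts[i + 1])
        hit = next(
            (d for d in detections if t <= d < window_end and d not in matched),
            None,
        )
        if hit is not None:
            matched.add(hit)
            delays.append(hit - t)

    false_alarms = [d for d in detections if d not in matched]

    # Missed detection rate: share of reference drifts never detected.
    mdr = 1 - len(delays) / len(reference_drifts) if reference_drifts else 0.0
    # Mean time to detection: average delay over detected drifts.
    mtd = sum(delays) / len(delays) if delays else inf
    # Mean time between false alarms (assumed: stream length per false alarm).
    mtfa = stream_length / len(false_alarms) if false_alarms else inf

    # Mean time ratio combines the three measures; higher is better.
    if not delays:
        mtr = 0.0
    elif mtd == 0:
        mtr = inf
    else:
        mtr = (mtfa / mtd) * (1 - mdr)

    return {"MDR": mdr, "MTD": mtd, "MTFA": mtfa, "MTR": mtr}


# Hypothetical example: two reference drifts, three detections.
example = drift_detection_metrics(
    reference_drifts=[100, 300],
    detections=[110, 200, 420],
    stream_length=500,
    tolerance=50,
)
print(example)  # {'MDR': 0.5, 'MTD': 10.0, 'MTFA': 250.0, 'MTR': 12.5}
```

In this made-up example, the detection at 110 matches the drift at 100 with a delay of 10, the drift at 300 is missed, and the detections at 200 and 420 count as false alarms. The choice of `tolerance` strongly affects every measure, so any comparison between detectors should hold it fixed.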

research
05/22/2023

Mitigating ML Model Decay in Continuous Integration with Data Drift Detection: An Empirical Study

Background: Machine Learning (ML) methods are being increasingly used fo...
research
04/04/2015

Concept Drift Detection for Streaming Data

Common statistical prediction models often require and assume stationari...
research
02/02/2021

Drift Estimation with Graphical Models

This paper deals with the issue of concept drift in supervised machine l...
research
03/21/2022

From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors

The dynamicity of real-world systems poses a significant challenge to de...
research
12/12/2020

Concept Drift Monitoring and Diagnostics of Supervised Learning Models via Score Vectors

Supervised learning models are one of the most fundamental classes of mo...
research
01/04/2012

Clustering Dynamic Web Usage Data

Most classification methods are based on the assumption that data confor...
research
01/17/2017

On The Construction of Extreme Learning Machine for Online and Offline One-Class Classification - An Expanded Toolbox

One-Class Classification (OCC) has been prime concern for researchers an...
