Investigating the Use of One-Class Support Vector Machine for Software Defect Prediction

02/24/2022
by   Rebecca Moussa, et al.

Early software defect identification is considered an important step towards software quality assurance. Software defect prediction aims at identifying software components that are likely to cause faults before the software is made available to the end-user. To date, this task has been modeled as a two-class classification problem; however, its nature also allows it to be formulated as a one-class classification task. Preliminary results obtained in prior work show that One-Class Support Vector Machine (OCSVM) can outperform two-class classifiers for defect prediction. If confirmed, these results would overcome the data imbalance problem researchers have long attempted to tackle in this field. In this paper, we further investigate whether learning from one class only is sufficient to produce effective defect prediction models by conducting a thorough large-scale empirical study of 15 real-world software projects, three validation scenarios, eight classifiers, robust evaluation measures and statistical significance tests. The results reveal that OCSVM is more suitable for cross-version and cross-project than for within-project defect prediction, suggesting it performs better with heterogeneous data. While we cannot conclude that OCSVM is the best classifier (Random Forest performs best herein), our results show interesting findings that open up further research avenues for training accurate defect prediction classifiers when defective instances are scarce or unavailable.
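
To make the one-class formulation concrete, here is a minimal sketch of how a defect predictor might be trained on a single class. It is not the authors' pipeline. It assumes scikit-learn's `OneClassSVM`, synthetic Gaussian features standing in for code metrics, and training on non-defective instances only. The hyperparameters (`nu`, `gamma`) and all variable names are illustrative choices, not values from the paper.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

# Synthetic stand-in for module-level code metrics (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
clean_train = rng.normal(loc=0.0, scale=1.0, size=(200, 4))    # non-defective modules only
clean_test = rng.normal(loc=0.0, scale=1.0, size=(50, 4))
defective_test = rng.normal(loc=4.0, scale=1.0, size=(10, 4))  # scarce defective modules

X_test = np.vstack([clean_test, defective_test])
y_test = np.array([0] * 50 + [1] * 10)  # 1 = defective

# Fit the scaler and the one-class model on the non-defective class alone;
# no defective instance is seen during training.
scaler = StandardScaler().fit(clean_train)
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)  # illustrative settings
model.fit(scaler.transform(clean_train))

# predict() returns +1 for inliers and -1 for outliers; treat outliers as "defective".
y_pred = (model.predict(scaler.transform(X_test)) == -1).astype(int)

recall_defective = (y_pred[y_test == 1] == 1).mean()
false_alarm_rate = (y_pred[y_test == 0] == 1).mean()
print(f"recall on defective: {recall_defective:.2f}, false alarms: {false_alarm_rate:.2f}")
```

Because the model never sees a defective example, class imbalance in the training data does not arise. The trade-off is that `nu` upper-bounds the fraction of training points treated as outliers, so it directly influences the false-alarm rate.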

research
06/16/2022

An Empirical Study on the Effectiveness of Data Resampling Approaches for Cross-Project Software Defect Prediction

Cross-project defect prediction (CPDP), where data from different softwa...
research
01/19/2023

Source Code Metrics for Software Defects Prediction

In current research, there are contrasting results about the applicabili...
research
06/09/2023

Robust Twin Parametric Margin Support Vector Machine for Multiclass Classification

In this paper we present a Twin Parametric-Margin Support Vector Machine...
research
08/26/2021

On the use of test smells for prediction of flaky tests

Regression testing is an important phase to deliver software with qualit...
research
03/31/2020

On the Need of Removing Last Releases of Data When Using or Validating Defect Prediction Models

To develop and train defect prediction models, researchers rely on datas...
research
10/22/2022

Generalized Likelihood Ratio Test With One-Class Classifiers

One-class classification (OCC) is the problem of deciding whether an obs...
research
04/13/2021

Feature-Oriented Defect Prediction: Scenarios, Metrics, and Classifiers

Several software defect prediction techniques have been developed over t...
