Transfer-Learning Oriented Class Imbalance Learning for Cross-Project Defect Prediction

01/24/2019
by Haonan Tong, et al.

Cross-project defect prediction (CPDP) aims to predict defects in projects that lack training data by using prediction models trained on historical defect data from other projects. However, because the data distributions differ between projects, building high-quality CPDP models remains a challenge. The class-imbalanced nature of software defect datasets further increases the difficulty. In this paper, we propose TOMOFWTNB, a CPDP approach that combines a transfer-learning oriented minority over-sampling technique (TOMO) with feature weighting transfer naive Bayes (FWTNB), thereby addressing both class imbalance and feature importance. Unlike traditional over-sampling techniques, TOMO not only balances the data but also reduces the distribution difference. FWTNB is then used to further increase the similarity of the two distributions. Experiments are performed on 11 public defect datasets. The experimental results show that (1) TOMO improves the average G-Measure by 23.7% to 41.8% and the average MCC by 54.2% to 77.8%; (2) the feature weighting (FW) strategy improves the average G-Measure by 11% and the average MCC by 29.2%; and (3) TOMOFWTNB improves the average G-Measure by at least 27.8% and the average MCC by at least 71.5% compared with existing state-of-the-art CPDP approaches. We conclude that (1) TOMO is very effective at addressing the class-imbalance problem in the CPDP scenario; (2) our FW strategy is helpful for CPDP; and (3) TOMOFWTNB outperforms previous state-of-the-art CPDP approaches.
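The abstract reports results in terms of G-Measure and MCC without defining either. The sketch below shows how these two metrics might be computed from a binary confusion matrix. It assumes the definition of G-Measure commonly used in defect prediction (the harmonic mean of the probability of detection and one minus the probability of false alarm), which the abstract itself does not confirm, and it uses the standard Matthews correlation coefficient. The function names and the `relative_improvement` helper are hypothetical, and reading the reported percentages as relative improvements is likewise an assumption.

```python
import math


def g_measure(tp, fp, tn, fn):
    """Harmonic mean of pd (recall) and 1 - pf (specificity).

    Assumed definition; the abstract does not spell it out.
    """
    pd = tp / (tp + fn) if (tp + fn) else 0.0   # probability of detection
    pf = fp / (fp + tn) if (fp + tn) else 0.0   # probability of false alarm
    specificity = 1.0 - pf
    denom = pd + specificity
    return 2.0 * pd * specificity / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def relative_improvement(new, old):
    """Percentage improvement of `new` over `old` (hypothetical helper)."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Made-up confusion matrix for illustration only, not data from the paper.
    tp, fp, tn, fn = 40, 20, 80, 10
    print("G-Measure:", round(g_measure(tp, fp, tn, fn), 3))
    print("MCC:", round(mcc(tp, fp, tn, fn), 3))
```

Both metrics draw on all four cells of the confusion matrix, which is why they are often preferred over plain accuracy when defective modules are a small minority.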

research
06/16/2022

An Empirical Study on the Effectiveness of Data Resampling Approaches for Cross-Project Software Defect Prediction

Cross-project defect prediction (CPDP), where data from different softwa...
research
02/08/2020

Understanding the Automated Parameter Optimization on Transfer Learning for CPDP: An Empirical Study

Data-driven defect prediction has become increasingly important in softw...
research
04/13/2021

Feature-Oriented Defect Prediction: Scenarios, Metrics, and Classifiers

Several software defect prediction techniques have been developed over t...
research
01/12/2018

Benchmarking cross-project defect prediction approaches with costs metrics

Defect prediction can be a powerful tool to guide the use of quality ass...
research
05/28/2018

An empirical study of public data quality problems in cross project defect prediction

Background: Two public defect data, including Jureczko and NASA datasets...
research
05/15/2021

Generative Adversarial Network-based Cross-Project Fault Prediction

Background: The early stage of defect prediction in the software develop...
research
09/05/2018

Preserving Order of Data When Validating Defect Prediction Models

[Context] The use of defect prediction models, such as classifiers, can ...
