Does class size matter? An in-depth assessment of the effect of class size in software defect prediction

by Amjed Tahir, et al.

Over the past 20 years, defect prediction studies have generally acknowledged the effect of class size on software prediction performance. To quantify the relationship between object-oriented (OO) metrics and defects, modelling has to take into account the direct, and potentially indirect, effects of class size on defects. However, some studies have shown that size cannot simply be controlled for or ignored when building prediction models. The question therefore remains whether, and when, to control for class size. This study provides a new in-depth examination of the impact of class size on the relationship between OO metrics and software defects or defect-proneness. We assess the impact of class size on the number of defects and on defect-proneness in software systems by employing regression-based mediation (with bootstrapping) and moderation analyses to investigate the direct and indirect effects of class size in count and binary defect prediction. Our results show that the size effect is not significant for every metric. Of the seven OO metrics we investigated, size consistently has a significant mediation impact only on the relationship between Coupling Between Objects (CBO) and defects/defect-proneness, and a potential moderation impact on the relationship between Fan-out and defects/defect-proneness. Based on our results, we make three recommendations. First, we encourage researchers and practitioners to examine the impact of class size for the specific data they have in hand, using the proposed statistical mediation/moderation procedures. Second, we encourage empirical studies to investigate the indirect effect of possible additional variables in their models when relevant. Third, the statistical procedures adopted in this study could be used in other empirical software engineering research to investigate the influence of potential mediators/moderators.
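To make the mediation/moderation idea concrete, the following is a minimal sketch and not the authors' procedure. It assumes plain linear (OLS) regressions in place of the count and binary models the abstract refers to, and the names `ols_coefs`, `indirect_effect`, `bootstrap_indirect_ci` and `interaction_coef` are hypothetical helpers introduced only for illustration. The indirect (mediated) effect of a metric on defects through class size is estimated as the product of the metric-to-size slope and the size-to-defects slope, with a percentile bootstrap interval. Moderation is sketched as the coefficient on the metric-by-size interaction term.

```python
import numpy as np


def ols_coefs(X, y):
    """Least-squares coefficients for y ~ [1, X]; the intercept comes first."""
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def indirect_effect(metric, size, defects):
    """Product-of-coefficients estimate of the effect of metric on defects via size."""
    # Path a: slope of size on the metric.
    a = ols_coefs(metric, size)[1]
    # Path b: slope of defects on size, controlling for the metric.
    b = ols_coefs(np.column_stack([metric, size]), defects)[2]
    return a * b


def bootstrap_indirect_ci(metric, size, defects, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(defects)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        # Resample classes (rows) with replacement.
        idx = rng.integers(0, n, size=n)
        estimates[i] = indirect_effect(metric[idx], size[idx], defects[idx])
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return indirect_effect(metric, size, defects), (lower, upper)


def interaction_coef(metric, size, defects):
    """Moderation sketch: coefficient on the mean-centred metric * size term."""
    m = metric - metric.mean()
    s = size - size.mean()
    return ols_coefs(np.column_stack([m, s, m * s]), defects)[3]
```

Under these assumptions, a bootstrap interval for the indirect effect that excludes zero would be read as evidence that size mediates the metric-defect relationship, and an interaction coefficient clearly different from zero as evidence of moderation. For real count or binary defect data, the OLS fits would need to be replaced with suitable count or logistic regressions.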


Revisiting the size effect in software fault prediction models

BACKGROUND: In object oriented (OO) software systems, class size has bee...

Exploring the relationship between performance metrics and cost saving potential of defect prediction models

Performance metrics are a core component of the evaluation of any machin...

Evaluating prediction systems in software project estimation

Context: Software engineering has a problem in that when we empirically ...

Evaluating software defect prediction performance: an updated benchmarking study

Accurately predicting faulty software units helps practitioners target f...

The impact of using biased performance metrics on software defect prediction research

Context: Software engineering researchers have undertaken many experimen...

Assert Use and Defectiveness in Industrial Code

The use of asserts in code has received increasing attention in the soft...

Power of Mediation Effects Using Bootstrap Resampling

Mediation analyses are a statistical tool for testing the hypothesis abo...
