The Impact of Sampling and Rule Set Size on Generated Fuzzy Inference System Predictive Accuracy: Analysis of a Software Engineering Data Set

02/05/2021
by Stephen G. MacDonell, et al.

Software project management makes extensive use of predictive modeling to estimate product size, defect proneness and development effort. Although uncertainty is acknowledged in these tasks, fuzzy inference systems, which are designed to cope well with uncertainty, have received only limited attention in the software engineering domain. In this study we empirically investigate the impact of two choices on the predictive accuracy of generated fuzzy inference systems when applied to a software engineering data set: the sampling of observations for training and testing, and the size of the rule set generated using fuzzy c-means clustering. Across ten samples we found no consistent pattern of predictive performance for any given rule set size. We did find, however, that a rule set compiled from multiple samples generally resulted in more accurate predictions than single-sample rule sets. More generally, the results provide further evidence of the sensitivity of empirical analysis outcomes to specific model-building decisions.
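
As a rough illustration of the clustering step named in the abstract, the following is a minimal sketch of standard fuzzy c-means in plain Python. It is not the authors' implementation. The function name `fuzzy_c_means`, its parameters, and the toy data are all hypothetical, and the idea that each cluster center would seed one fuzzy rule (so the number of clusters sets the rule set size) is an assumption about how such systems are commonly generated, not something the abstract states.

```python
import random

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which points[i] belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)))
        # Memberships follow from relative distances to the new centers.
        new_u = []
        for p in points:
            dists = [sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5
                     for c in centers]
            if any(dist == 0.0 for dist in dists):
                # Point sits exactly on a center: give it full membership there.
                hits = [1.0 if dist == 0.0 else 0.0 for dist in dists]
                norm = sum(hits)
                row = [h / norm for h in hits]
            else:
                power = 2.0 / (m - 1.0)
                inv = [dist ** -power for dist in dists]
                norm = sum(inv)
                row = [v / norm for v in inv]
            new_u.append(row)
        change = max(abs(a - b)
                     for new_row, old_row in zip(new_u, u)
                     for a, b in zip(new_row, old_row))
        u = new_u
        if change < tol:
            break
    return centers, u

# Invented one-dimensional toy data with two well-separated groups.
toy = [(1.0,), (1.2,), (0.8,), (10.0,), (10.3,), (9.7,)]
centers, memberships = fuzzy_c_means(toy, n_clusters=2)
```

Rerunning the function with different values of `n_clusters`, or on different training samples, is one way to explore the two choices the study examines. The fuzzifier `m=2.0` is a common default, not a value taken from the paper.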

research
02/18/2018

The Dangerous Dogmas of Software Engineering

To legitimize itself as a scientific discipline, the software engineerin...
research
12/24/2019

The Evolution of Empirical Methods in Software Engineering

Empirical methods like experimentation have become a powerful means to d...
research
10/19/2012

Dealing with uncertainty in fuzzy inductive reasoning methodology

The aim of this research is to develop a reasoning under uncertainty str...
research
03/16/2004

An approach to membrane computing under inexactitude

In this paper we introduce a fuzzy version of symport/antiport membrane ...
research
09/26/2018

Arguing Practical Significance in Software Engineering Using Bayesian Data Analysis

This paper provides a case for using Bayesian data analysis (BDA) to mak...
research
12/15/2020

Run, Forest, Run? On Randomization and Reproducibility in Predictive Software Engineering

Machine learning (ML) has been widely used in the literature to automate...
research
08/23/2021

ComSum: Commit Messages Summarization and Meaning Preservation

We present ComSum, a data set of 7 million commit messages for text summ...
