Predicting the Number of Reported Bugs in a Software Repository

by Hadi Jahanshahi, et al.

Predicting bug growth patterns is a complicated, unresolved task that needs considerable attention. Advance knowledge of the likely number of bugs discovered in a software system helps developers allocate sufficient resources in a timely manner. Developers may also use such information to take the actions needed to improve the quality of the system and, in turn, customer satisfaction. In this study, we examine eight different time series forecasting models, including Long Short-Term Memory (LSTM) neural networks, the auto-regressive integrated moving average (ARIMA) model, and the Random Forest Regressor. Further, we assess the impact of exogenous variables, such as software release dates, by incorporating them into the prediction models. We analyze the quality of long-term prediction for each model based on different performance metrics. The assessment is conducted on Mozilla, a large open-source software project. The dataset was originally mined from Bugzilla and contains the number of bugs for the project between Jan 2010 and Dec 2019. Our numerical analysis provides insights into evaluating the trends in a bug repository. We observe that LSTM is effective for long-run predictions, whereas the Random Forest Regressor enriched with exogenous variables performs better for predicting the number of bugs in the short term.
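To make the idea of enriching a forecasting model with an exogenous variable concrete, the following is a minimal sketch, not the authors' pipeline. It turns a bug-count series into lagged feature rows and appends a release flag for the period being predicted. The function names, the lag count, the weekly granularity, the sample numbers, and the choice of MAE and RMSE as metrics are all assumptions made for illustration; the abstract does not specify them.

```python
from math import sqrt


def make_supervised(counts, release_flags, n_lags=3):
    """Turn a count series into (features, target) rows.

    Each feature row holds the previous `n_lags` counts followed by the
    exogenous release flag for the period being predicted.
    """
    if len(counts) != len(release_flags):
        raise ValueError("counts and release_flags must have equal length")
    rows, targets = [], []
    for t in range(n_lags, len(counts)):
        lags = list(counts[t - n_lags:t])
        rows.append(lags + [release_flags[t]])
        targets.append(counts[t])
    return rows, targets


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up weekly bug counts and release indicators (illustrative only).
counts = [120, 135, 128, 160, 142, 150, 171, 149]
release_flags = [0, 0, 0, 1, 0, 0, 1, 0]

N_LAGS = 3
X, y = make_supervised(counts, release_flags, n_lags=N_LAGS)

# Naive baseline: predict that the next count equals the most recent one.
naive = [row[N_LAGS - 1] for row in X]

print("first feature row:", X[0], "target:", y[0])
print("naive MAE:", mae(y, naive))
print("naive RMSE:", rmse(y, naive))
```

Rows built this way could be passed to any tabular regressor, for example scikit-learn's `RandomForestRegressor`, and the naive baseline gives a reference point for judging whether the release flag actually improves short-term accuracy.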



Revisiting reopened bugs in open source software systems

Reopened bugs can degrade the overall quality of a software system since...

The Impact of Feature Selection on Predicting the Number of Bugs

Bug prediction is the process of training a machine learning model on so...

Data-driven Real-time Short-term Prediction of Air Quality: Comparison of ES, ARIMA, and LSTM

Air pollution is a worldwide issue that affects the lives of many people...

What Happens When We Fuzz? Investigating OSS-Fuzz Bug History

BACKGROUND: Software engineers must be vigilant in preventing and correc...

A Bug or a Suggestion? An Automatic Way to Label Issues

More and more users and developers are using Issue Tracking Systems (ITS...

PrAIoritize: Learning to Prioritize Smart Contract Bugs and Vulnerabilities

Smart contract vulnerabilities and bugs have become a key concern for so...

Employing Partial Least Squares Regression with Discriminant Analysis for Bug Prediction

Forecasting defect proneness of source code has long been a major resear...
