Run, Forest, Run? On Randomization and Reproducibility in Predictive Software Engineering

12/15/2020
by Cynthia C. S. Liem, et al.

Machine learning (ML) has been widely used in the literature to automate software engineering tasks. However, ML outcomes may be sensitive to randomization in data sampling mechanisms and learning procedures. To understand whether and how researchers in SE address these threats, we surveyed 45 recent papers related to three predictive tasks: defect prediction (DP), predictive mutation testing (PMT), and code smell detection (CSD). We found that less than 50% of the surveyed papers address the threats related to randomized data sampling (via multiple repetitions); only 8% address the random nature of ML; and parameter values are rarely reported (only 18% of the papers). To assess the severity of these threats, we conducted an empirical study using 26 real-world datasets commonly considered for the three predictive tasks of interest, considering eight common supervised ML classifiers. We show that different data resamplings for 10-fold cross-validation lead to extreme variability in observed performance results. Furthermore, randomized ML methods also show non-negligible variability for different choices of random seeds. More worryingly, performance and variability are inconsistent across different implementations of conceptually the same ML method in different libraries, as also shown through multi-dataset pairwise comparison. To cope with these critical threats, we provide practical guidelines on how to validate, assess, and report the results of predictive methods.
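The resampling effect described above can be illustrated with a minimal sketch. This is not the authors' experimental setup: the synthetic one-feature dataset, the toy threshold classifier, and all function names (`make_data`, `kfold_indices`, `fit_threshold`, `cv_accuracy`, `repeated_cv`) are assumptions introduced purely for illustration. The sketch repeats 10-fold cross-validation under different shuffle seeds and reports the spread of the resulting accuracy estimates. Because the toy classifier is deterministic, any variation comes from the data resampling alone; it does not cover seed sensitivity of randomized learners.

```python
import random
import statistics


def make_data(n=200, seed=0):
    """Synthetic two-class data with one noisy feature (illustrative assumption)."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are balanced
        x = rng.gauss(1.0 if label else 0.0, 1.0)
        data.append((x, label))
    return data


def kfold_indices(n, k, seed):
    """Shuffle indices with the given seed and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_threshold(train):
    """Toy deterministic classifier: threshold halfway between the class means."""
    mean0 = statistics.mean(x for x, y in train if y == 0)
    mean1 = statistics.mean(x for x, y in train if y == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, test):
    """Fraction of test samples where 'x > threshold' matches the label."""
    correct = sum((x > threshold) == bool(y) for x, y in test)
    return correct / len(test)


def cv_accuracy(data, k, seed):
    """Mean accuracy over one k-fold cross-validation run for one shuffle seed."""
    scores = []
    for test_idx in kfold_indices(len(data), k, seed):
        held_out = set(test_idx)
        train = [data[j] for j in range(len(data)) if j not in held_out]
        test = [data[j] for j in test_idx]
        scores.append(accuracy(fit_threshold(train), test))
    return statistics.mean(scores)


def repeated_cv(data, k=10, seeds=range(30)):
    """Repeat k-fold cross-validation once per seed; return one estimate per run."""
    return [cv_accuracy(data, k, s) for s in seeds]


if __name__ == "__main__":
    results = repeated_cv(make_data())
    print(
        f"mean={statistics.mean(results):.3f} "
        f"stdev={statistics.stdev(results):.3f} "
        f"min={min(results):.3f} max={max(results):.3f}"
    )
```

Reporting the mean together with the standard deviation and the range over all repetitions, rather than the score of a single split, is one way to make the resampling sensitivity visible; the seeds used should be reported alongside the results so that each run can be reproduced.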


