Pitfalls in Experiments with DNN4SE: An Analysis of the State of the Practice

05/19/2023
by Sira Vegas, et al.

Software engineering techniques are increasingly relying on deep learning approaches to support many software engineering tasks, from bug triaging to code generation. To assess the efficacy of such techniques, researchers typically perform controlled experiments. Conducting these experiments, however, is particularly challenging given the complexity of the space of variables involved, from specialized and intricate architectures and algorithms to a large number of training hyper-parameters and choices of evolving datasets, all compounded by how rapidly machine learning technology is advancing and by the inherent sources of randomness in the training process. In this work we conduct a mapping study, examining 194 experiments with techniques that rely on deep neural networks, appearing in 55 papers published in premier software engineering venues, to characterize the state of the practice and pinpoint common trends and pitfalls in these experiments. Our study reveals that most of the experiments, including those that have received ACM artifact badges, have fundamental limitations that raise doubts about the reliability of their findings. More specifically, we find: weak analyses to determine that there is a true relationship between independent and dependent variables (87% of the experiments); limited control over the space of DNN relevant variables, which can render a relationship between dependent variables and treatments that may not be causal but rather correlational (100% of the experiments); and lack of specificity about which DNN variables and values are used in the experiments (86% of the experiments) to define the treatments being applied, which makes it unclear whether the techniques designed are the ones being assessed, or how the sources of extraneous variation are controlled. We provide some practical recommendations to address these limitations.
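To make the concern about training randomness and weak analyses concrete, the following is a minimal sketch, not taken from the paper, of one way an experiment could repeat each treatment over several random seeds and then test whether the observed difference could plausibly be due to chance. Everything here is an assumption for illustration: `train_and_score` is a hypothetical stub that simulates a seed-dependent test metric instead of training a real DNN, the technique names and mean scores are made up, and the permutation test is just one possible analysis.

```python
import random


def train_and_score(technique: str, seed: int) -> float:
    """Hypothetical stand-in for training a DNN and returning a test metric.

    A real experiment would train and evaluate a model here. This stub only
    simulates seed-dependent variation around an assumed mean score.
    """
    rng = random.Random(f"{technique}-{seed}")
    assumed_mean = {"baseline": 0.70, "proposed": 0.72}[technique]  # made-up values
    return assumed_mean + rng.gauss(0.0, 0.02)


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of means of a and b."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign scores to the two groups and recompute the gap.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    seeds = range(30)  # the same seeds are used for both treatments
    baseline = [train_and_score("baseline", s) for s in seeds]
    proposed = [train_and_score("proposed", s) for s in seeds]
    print("mean baseline:", round(mean(baseline), 4))
    print("mean proposed:", round(mean(proposed), 4))
    print("permutation p-value:", permutation_p_value(baseline, proposed))
```

Reporting the seeds, the number of runs, and the test used alongside the result is what lets a reader judge whether a difference between treatments reflects the technique or the noise in training.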

research
02/18/2018

The Dangerous Dogmas of Software Engineering

To legitimize itself as a scientific discipline, the software engineerin...
research
08/23/2023

Reflecting on the Use of the Policy-Process-Product Theory in Empirical Software Engineering

The primary theory of software engineering is that an organization's Pol...
research
08/15/2017

Controlled Experiments with Student Participants in Software Engineering: Preliminary Results from a Systematic Mapping Study

[Context] In software engineering research, emphasis is given to sound e...
research
02/18/2020

Sampling in Software Engineering Research: A Critical Review and Guidelines

Representative sampling appears rare in software engineering research. N...
research
06/03/2019

NeuralVis: Visualizing and Interpreting Deep Learning Models

Deep Neural Network(DNN) techniques have been prevalent in software engi...
research
11/05/2020

Comparing the Results of Replications in Software Engineering

Context: It has been argued that software engineering replications are u...
research
06/20/2023

Fingerprinting and Building Large Reproducible Datasets

Obtaining a relevant dataset is central to conducting empirical studies ...
