Test cases as a measurement instrument in experimentation

11/09/2021
by Oscar Dieste, et al.

Background: Test suites are frequently used to quantify relevant software attributes, such as quality or productivity. Problem: We have detected that the same response variable, measured using different test suites, yields different experiment results. Aims: Assess to what extent differences in test case construction influence measurement accuracy and experimental outcomes. Method: Two industry experiments were measured using two different test suites, one generated using an ad-hoc method and the other using equivalence partitioning. The accuracy of the measures was studied using standard procedures, such as ISO 5725, Bland-Altman analysis and Intraclass Correlation Coefficients. Results: The values of the response variables differ by up to ±60%, depending on the test suite (ad-hoc vs. equivalence partitioning) used. Conclusions: The disclosure of datasets and analysis code is insufficient to ensure the reproducibility of SE experiments. Experimenters should disclose all experimental materials needed to perform independent measurement and re-analysis.
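As a rough illustration of one of the agreement procedures named in the abstract, the sketch below computes Bland-Altman bias and 95% limits of agreement for the same response variable measured with two test suites. It is not the authors' analysis code: the function name `bland_altman` and the scores are hypothetical, invented only to show the calculation.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement for paired measures."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)   # mean difference between the two instruments
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical scores (percentage of passing test cases per subject),
# one series per test suite. These values are made up for illustration.
ad_hoc = [80.0, 65.0, 90.0, 40.0, 75.0]
eq_part = [70.0, 60.0, 70.0, 55.0, 60.0]

bias, lower, upper = bland_altman(ad_hoc, eq_part)
print(f"bias={bias:.2f}, limits of agreement=[{lower:.2f}, {upper:.2f}]")
```

Wide limits of agreement relative to the measurement scale would suggest that the two suites cannot be used interchangeably as measurement instruments.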

