On the role of benchmarking data sets and simulations in method comparison studies

08/02/2022
by Sarah Friedrich, et al.

Method comparisons are essential to provide recommendations and guidance for applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, these are often not neutral but favour a novel method. Apart from the choice of design and proper reporting of the findings, such method comparison studies also differ in the data on which they are based. Most manuscripts on statistical methodology rely on simulation studies and provide a single real-world data set as an example to motivate and illustrate the methodology investigated. In the context of supervised learning, in contrast, methods are often evaluated using so-called benchmarking data sets, i.e. real-world data that serve as a gold standard in the community. Simulation studies, on the other hand, are much less common in this context. The aim of this paper is to investigate differences and similarities between these approaches, to discuss their advantages and disadvantages, and ultimately to develop new approaches to the evaluation of methods that pick the best of both worlds. To this end, we borrow ideas from different contexts such as mixed methods research and Clinical Scenario Evaluation.


