Economically rational sample-size choice and irreproducibility

08/23/2019
by   Oliver Braganza, et al.

Several systematic studies have suggested that a large fraction of published research is not reproducible. One probable reason for low reproducibility is insufficient sample size, resulting in low power and low positive predictive value. It has been suggested that the choice of insufficient sample sizes is driven by a combination of scientific competition and 'positive publication bias'. Here we formalize this intuition in a simple model, in which scientists choose economically rational sample sizes, balancing the cost of experimentation against income from publication. Specifically, assuming that a scientist's income derives only from 'positive' findings (positive publication bias) and that individual samples cost a fixed amount allows us to leverage basic statistical formulas into an economic optimality prediction. We find that if effects have i) low base probability, ii) small effect size or iii) low grant income per publication, then the rational (economically optimal) sample size is small. Furthermore, for plausible distributions of these parameters we find a robust emergence of a bimodal distribution of obtained statistical power and low overall reproducibility rates, matching empirical findings. Overall, the model describes a simple mechanism explaining both the prevalence and the persistence of small sample sizes. It suggests economic rationality, or economic pressures, as a principal driver of irreproducibility.
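
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not give the statistical test, the cost structure, or any parameter values. The sketch assumes a two-sided, two-sample z-test with n samples per group, a fixed cost per sample, and income paid only for significant ('positive') results. All function names and numbers below are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the z-test approximation


def power(n, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (assumed test),
    with n samples per group and a standardized effect size."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n / 2) ** 0.5
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def expected_profit(n, effect_size, base_rate, income, cost_per_sample, alpha=0.05):
    """Expected income from a positive result minus the cost of 2*n samples.
    base_rate is the assumed prior probability that the tested effect is real."""
    p_positive = base_rate * power(n, effect_size, alpha) + (1 - base_rate) * alpha
    return income * p_positive - cost_per_sample * 2 * n


def optimal_sample_size(effect_size, base_rate, income, cost_per_sample,
                        alpha=0.05, n_min=2, n_max=1000):
    """Grid search for the per-group sample size that maximizes expected profit."""
    return max(
        range(n_min, n_max + 1),
        key=lambda n: expected_profit(n, effect_size, base_rate,
                                      income, cost_per_sample, alpha),
    )


def positive_predictive_value(n, effect_size, base_rate, alpha=0.05):
    """Share of positive results that reflect a real effect."""
    true_pos = base_rate * power(n, effect_size, alpha)
    false_pos = (1 - base_rate) * alpha
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the direction of the effect.
    for income in (50, 1000):
        n_opt = optimal_sample_size(effect_size=0.5, base_rate=0.5,
                                    income=income, cost_per_sample=1)
        ppv = positive_predictive_value(n_opt, effect_size=0.5, base_rate=0.5)
        print(f"income={income}: optimal n per group={n_opt}, PPV={ppv:.2f}")
```

Under these assumptions, lowering the income per publication or the base probability pushes the profit-maximizing sample size toward the minimum allowed, and the positive predictive value falls with it, which is the qualitative pattern the abstract reports.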


research
10/05/2020

Statistical Reliability of 10 Years of Cyber Security User Studies (Extended Version)

Background. In recent years, cyber security user studies have b...
research
11/16/2017

Converting P-Values in Adaptive Robust Lower Bounds of Posterior Probabilities to increase the reproducible Scientific "Findings"

We put forward a novel calibration of p values, the "Adaptive Robust Low...
research
08/02/2022

Estimating the prevalence of anemia rates among children under five in Peruvian districts with a small sample size

In this paper we attempt to answer the following question: “Is it possib...
research
05/21/2008

Kendall's tau in high-dimensional genomic parsimony

High-dimensional data models, often with low sample size, abound in many...
research
02/25/2021

Computing Accurate Probabilistic Estimates of One-D Entropy from Equiprobable Random Samples

We develop a simple Quantile Spacing (QS) method for accurate probabilis...
research
01/06/2012

The Interaction of Entropy-Based Discretization and Sample Size: An Empirical Study

An empirical investigation of the interaction of sample size and discret...
