Weak Signal Detection via Displacement Interpolation
Detecting weak, systematic signals hidden in a large collection of p-values published in academic journals is instrumental to identifying and understanding publication bias and p-value hacking in the social and economic sciences. Given two probability distributions P (null) and Q (signal), we study the problem of distinguishing weak signals from the null P based on n independent samples. We model weak signals via displacement interpolation between P and Q, where the signal strength vanishes with n. We propose a hypothesis testing procedure based on the Wasserstein distance from optimal transport theory, derive sharp conditions under which detection is possible, and provide an exact characterization of the asymptotic Type I and Type II errors at the detection boundary using empirical processes. Applying our testing procedure to real data sets on published p-values across academic journals, we demonstrate that a rigorous testing procedure can detect weak signals that are otherwise indistinguishable.
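As a rough illustration of the setup described above, the following Python sketch simulates a one-dimensional version of the problem. It is not the paper's procedure: it assumes P = Uniform(0, 1) (as for p-values under a true null), a hypothetical signal Q with quantile function u^2, a squared 2-Wasserstein statistic, and a Monte Carlo threshold in place of the asymptotic theory. In one dimension the displacement interpolation P_t has quantile function (1 - t) F_P^{-1}(u) + t F_Q^{-1}(u), which is what `sample_interpolated` uses. All function names are invented for this example, and the strength t is held fixed here rather than vanishing with n.

```python
import random


def sample_interpolated(n, t, rng):
    """Draw n samples from the displacement interpolation P_t between
    P = Uniform(0, 1) and Q, where Q has quantile function u**2
    (Q is a hypothetical choice made only for this illustration)."""
    samples = []
    for _ in range(n):
        u = rng.random()
        # 1D displacement interpolation: mix the two quantile functions.
        samples.append((1.0 - t) * u + t * u * u)
    return samples


def w2_squared_to_uniform(xs):
    """Exact squared 2-Wasserstein distance between the empirical
    distribution of xs and Uniform(0, 1).

    The empirical quantile function equals the i-th order statistic on
    ((i-1)/n, i/n], so the distance is a sum of integrals of (x - u)**2
    over those intervals, each available in closed form."""
    n = len(xs)
    total = 0.0
    for i, x in enumerate(sorted(xs)):
        a, b = i / n, (i + 1) / n
        total += ((x - a) ** 3 - (x - b) ** 3) / 3.0
    return total


def null_threshold(n, level, reps, rng):
    """Monte Carlo estimate of the (1 - level) quantile of the statistic
    under the null P = Uniform(0, 1)."""
    stats = sorted(
        w2_squared_to_uniform([rng.random() for _ in range(n)])
        for _ in range(reps)
    )
    return stats[min(reps - 1, int((1.0 - level) * reps))]


rng = random.Random(0)
n = 2000
threshold = null_threshold(n, level=0.05, reps=200, rng=rng)
stat_signal = w2_squared_to_uniform(sample_interpolated(n, t=0.5, rng=rng))
reject = stat_signal > threshold
print(f"threshold={threshold:.6f}  statistic={stat_signal:.6f}  reject={reject}")
```

Shrinking `t` as `n` grows in this sketch gives a feel for how the statistic approaches the null threshold, though the precise detection boundary is the subject of the paper's analysis and is not reproduced here.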