Reallocating and Resampling: A Comparison for Inference
Simulation-based inference plays a major role in modern statistics, and often employs either reallocating (as in a randomization test) or resampling (as in bootstrapping). Reallocating mimics random allocation to treatment groups, while resampling mimics random sampling from a larger population. Does it matter whether the simulation method matches the data collection method? And do the results differ for testing versus estimation? Here we answer these questions in a simple setting by exploring the distribution of a sample difference in means under a basic two-group design and four different scenarios: true random allocation, true random sampling, reallocating, and resampling. For testing a sharp null hypothesis, reallocating is superior in small samples, but reallocating and resampling are asymptotically equivalent. For estimation, resampling is generally superior, unless the effect is truly additive. These results hold regardless of whether the data were collected by random sampling or random allocation.
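To make the two simulation methods concrete, here is a minimal Python sketch of each applied to a difference in means. It is an illustration only and not the procedure used in the paper: the data values, the function names (`reallocate`, `resample`, `simulate`), the number of repetitions, and the choice to resample within each group separately are all assumptions made for this example.

```python
import random
from statistics import mean


def diff_in_means(a, b):
    return mean(a) - mean(b)


def reallocate(group_a, group_b, rng):
    """Mimic random allocation: shuffle the pooled values and
    split them into groups of the original sizes (no replacement)."""
    pooled = list(group_a) + list(group_b)
    rng.shuffle(pooled)
    n_a = len(group_a)
    return pooled[:n_a], pooled[n_a:]


def resample(group_a, group_b, rng):
    """Mimic random sampling: draw with replacement within each group.
    (Within-group resampling is an assumption of this sketch.)"""
    new_a = rng.choices(group_a, k=len(group_a))
    new_b = rng.choices(group_b, k=len(group_b))
    return new_a, new_b


def simulate(group_a, group_b, method, reps=2000, seed=0):
    """Return `reps` simulated differences in means under `method`."""
    rng = random.Random(seed)
    return [diff_in_means(*method(group_a, group_b, rng)) for _ in range(reps)]


# Made-up data for illustration only.
treatment = [12.1, 9.8, 11.4, 13.0, 10.7]
control = [9.5, 8.9, 10.2, 9.1, 10.0]
observed = diff_in_means(treatment, control)

# Testing: reallocating simulates the sharp null of no effect for any unit.
null_diffs = simulate(treatment, control, reallocate)
p_value = sum(abs(d) >= abs(observed) for d in null_diffs) / len(null_diffs)

# Estimation: resampling gives a percentile interval for the difference.
boot_diffs = sorted(simulate(treatment, control, resample))
ci = (boot_diffs[int(0.025 * len(boot_diffs))],
      boot_diffs[int(0.975 * len(boot_diffs)) - 1])
```

Reallocating keeps the pooled set of values fixed and only changes group labels, so its simulated differences center on zero. Within-group resampling changes which values appear, so its simulated differences center on the observed difference. This is why the sketch uses the first for a p-value and the second for an interval.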