Robust Mean Estimation with the Bayesian Median of Means
The sample mean is often used to aggregate different unbiased estimates of a parameter, producing a final estimate that is unbiased but possibly high-variance. This paper introduces the Bayesian median of means, an aggregation rule that roughly interpolates between the sample mean and the sample median, yielding estimates with much smaller variance at the expense of some bias. Although the procedure is non-parametric, its squared bias is asymptotically negligible relative to its variance, as is the case for maximum likelihood estimators. The Bayesian median of means is shown to be consistent, and concentration bounds are derived for the estimator's bias and L_1 error, along with a fast non-randomized approximating algorithm. Both the exact and the approximate procedures match the performance of the sample mean in low-variance settings and perform much better in high-variance scenarios. Empirical performance is examined on real and simulated data, and in applications such as importance sampling, cross-validation and bagging.
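The abstract does not spell out the construction, so the following is only a minimal sketch of one way to build an aggregation rule that interpolates between the sample mean and the sample median. It assumes the rule averages the estimates under random symmetric Dirichlet weights and then takes the median of those weighted means. The function names, the `alpha`, `num_draws` and `seed` parameters, and their defaults are hypothetical choices for illustration, not taken from the paper.

```python
import random
import statistics


def dirichlet_weights(n, alpha, rng):
    """Draw one symmetric Dirichlet(alpha, ..., alpha) weight vector of length n."""
    # Independent Gamma(alpha, 1) draws, normalized to sum to one, follow a
    # symmetric Dirichlet distribution. Redraw in the rare case that every
    # draw underflows to zero (possible only for very small alpha).
    while True:
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
        total = sum(gammas)
        if total > 0.0:
            return [g / total for g in gammas]


def median_of_weighted_means(xs, alpha=1.0, num_draws=1000, seed=0):
    """Median of randomly weighted means of xs (illustrative sketch only).

    Assumption: weights are symmetric Dirichlet(alpha). Large alpha pushes
    every weight toward 1/n, so the result approaches the sample mean; small
    alpha concentrates each weight vector on a single observation, so the
    result approaches the sample median.
    """
    if not xs:
        raise ValueError("xs must contain at least one estimate")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(xs)
    weighted_means = []
    for _ in range(num_draws):
        weights = dirichlet_weights(n, alpha, rng)
        weighted_means.append(sum(w * x for w, x in zip(weights, xs)))
    return statistics.median(weighted_means)


if __name__ == "__main__":
    # Five estimates of the same quantity, one of them an extreme outlier.
    estimates = [1.0, 2.0, 3.0, 4.0, 100.0]
    print("sample mean:  ", statistics.mean(estimates))
    print("sample median:", statistics.median(estimates))
    print("sketch:       ", median_of_weighted_means(estimates, alpha=1.0))
```

Under these assumptions, `alpha` acts as the knob that trades bias for variance: raising it moves the output toward the sample mean, lowering it moves the output toward the sample median.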