Finite-sample Guarantees for Winsorized Importance Sampling
Importance sampling is a widely used technique for estimating properties of a distribution. The resulting estimator is always unbiased, but it can incur very large variance. This paper investigates trading off some bias for reduced variance by winsorizing the importance sampling estimator. The threshold level at which to winsorize is determined by a concrete version of the Balancing Principle, also known as Lepski's Method, which may be of independent interest. The procedure adaptively chooses a threshold level from a pre-defined set by roughly balancing the bias and variance of the estimator when winsorized at different levels. As a consequence, it provides a principled way to perform winsorization, with finite-sample optimality guarantees. The empirical performance of the winsorized estimator is examined in various examples, both real and synthetic. The estimator outperforms the usual importance sampling estimator in high-variance settings, and remains competitive when the variance of the importance sampling weights is low.
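To make the idea concrete, the following is a minimal sketch of winsorizing importance sampling terms and picking a threshold with a generic Lepski-type balancing rule. It is an illustration under stated assumptions, not the paper's exact procedure: the function name `winsorized_is`, the constant `c`, and the specific acceptance rule (accept the smallest threshold whose estimate lies within `c` combined standard errors of every estimate at a larger threshold) are all hypothetical choices made here.

```python
import numpy as np

def winsorized_is(terms, thresholds, c=1.0):
    """Winsorized importance sampling with a Lepski-type threshold choice.

    terms      : array of w_i * f(x_i), the usual importance sampling terms
    thresholds : candidate winsorization levels M > 0 (the pre-defined set)
    c          : hypothetical tuning constant for the balancing rule

    Returns (estimate, chosen_threshold). This is an illustrative rule,
    not necessarily the one analysed in the paper.
    """
    terms = np.asarray(terms, dtype=float)
    n = terms.size
    thresholds = np.sort(np.asarray(thresholds, dtype=float))

    # Row j holds the terms clipped to [-M_j, M_j].
    clipped = np.clip(terms[None, :], -thresholds[:, None], thresholds[:, None])
    means = clipped.mean(axis=1)                      # winsorized estimates
    ses = clipped.std(axis=1, ddof=1) / np.sqrt(n)    # their standard errors

    # Small thresholds have low variance but more bias. Accept the smallest
    # threshold whose estimate is statistically indistinguishable from the
    # estimates at all larger (less biased) thresholds.
    for j in range(len(thresholds)):
        gaps = np.abs(means[j] - means[j:])
        if np.all(gaps <= c * (ses[j] + ses[j:])):
            return means[j], thresholds[j]
    return means[-1], thresholds[-1]

# Usage: estimate E_p[X^2] for p = N(0, 1) using a narrower proposal
# q = N(0, 0.6^2), which yields heavy-tailed importance weights.
rng = np.random.default_rng(0)
sigma_q = 0.6
x = rng.normal(0.0, sigma_q, size=5000)
log_w = -0.5 * x**2 + 0.5 * (x / sigma_q) ** 2 + np.log(sigma_q)
is_terms = np.exp(log_w) * x**2
estimate, level = winsorized_is(is_terms, thresholds=[1, 5, 25, 125, 625])
```

The largest candidate threshold always passes the check against itself, so the loop always returns. Larger values of `c` favour smaller thresholds, that is, more aggressive winsorization.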