Coupled Bootstrap Test Error Estimation for Poisson Variables
Test error estimation is a fundamental problem in statistics and machine learning. Correctly assessing the future performance of an algorithm is an essential task, especially with the development of complex predictive algorithms that require data-driven parameter tuning. We propose a new coupled bootstrap estimator for the test error of algorithms with a Poisson response, a fundamental model for count data with applications such as signal processing, density estimation, and queueing theory. The idea behind our estimator is to generate two carefully designed new random vectors from the original data, where one acts as a training sample and the other as a test set. The estimator is unbiased for an intuitive parameter: the out-of-sample error of a Poisson random vector whose mean has been shrunken by a small factor. Moreover, in a limiting regime, the coupled bootstrap estimator recovers an exactly unbiased estimator for test error. Our framework is applicable to loss functions of the Bregman divergence family, and our analysis and examples focus on two important cases: Poisson likelihood deviance and squared loss. Through a bias-variance decomposition, we analyze the effect of the number of bootstrap samples and the added noise due to the two auxiliary variables. We then apply our method to different scenarios with both simulated and real data.
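The abstract does not say how the two auxiliary vectors are constructed. As a rough illustration only, the sketch below assumes binomial thinning, a standard way to split a Poisson vector into two independent Poisson vectors: if Y ~ Poisson(mu) and W | Y ~ Binomial(Y, eps), then W ~ Poisson(eps * mu) and Y - W ~ Poisson((1 - eps) * mu), independently. The names `poisson_thin`, `thinned_split_error`, `fit`, `eps`, and `n_boot` are hypothetical, and the averaged squared error is a raw split error, not necessarily the estimator proposed in the paper.

```python
import numpy as np

def poisson_thin(y, eps, rng):
    """Split counts y into two independent Poisson vectors by binomial thinning."""
    y = np.asarray(y, dtype=np.int64)
    w = rng.binomial(y, eps)   # w | y ~ Binomial(y, eps), so w ~ Poisson(eps * mu)
    return y - w, w            # y - w ~ Poisson((1 - eps) * mu), independent of w

def thinned_split_error(y, fit, eps=0.1, n_boot=200, seed=0):
    """Average squared error of fit(train) against a rescaled test vector.

    Assumption: `fit` maps a count vector to a vector of predicted means.
    No variance correction is applied to the rescaled test vector.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_boot):
        y_train, w = poisson_thin(y, eps, rng)
        # Rescale so that E[y_test] = (1 - eps) * mu = E[y_train].
        y_test = w * (1.0 - eps) / eps
        errors.append(np.sum((y_test - fit(y_train)) ** 2))
    return float(np.mean(errors))

# Usage: evaluate a trivial predictor that returns the training mean everywhere.
y_obs = np.array([3, 0, 7, 12, 5])
mean_fit = lambda t: np.full(t.shape, t.mean())
split_err = thinned_split_error(y_obs, mean_fit, eps=0.2, n_boot=50)
```

Because the rescaled test vector is noisier than a fresh Poisson draw with the same mean, this raw average overstates the out-of-sample squared error; an estimator that is unbiased for the shrunken-mean target would need to account for that extra variance. Averaging over more bootstrap draws reduces only the Monte Carlo noise from the auxiliary variables, not this offset.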