A Rademacher Complexity Based Method fo rControlling Power and Confidence Level in Adaptive Statistical Analysis

10/04/2019
by   Lorenzo De Stefani, et al.
0

While standard statistical inference techniques and machine learning generalization bounds assume that tests are run on data selected independently of the hypotheses, practical data analysis and machine learning are usually iterative and adaptive processes where the same holdout data is often used for testing a sequence of hypotheses (or models), which may each depend on the outcome of the previous tests on the same data. In this work, we present RadaBound a rigorous, efficient and practical procedure for controlling the generalization error when using a holdout sample for multiple adaptive testing. Our solution is based on a new application of the Rademacher Complexity generalization bounds, adapted to dependent tests. We demonstrate the statistical power and practicality of our method through extensive simulations and comparisons to alternative approaches.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset