The Limits of Post-Selection Generalization

06/15/2018
by Kobbi Nissim, et al.

While statistics and machine learning offer numerous methods for ensuring generalization, these methods often fail in the presence of adaptivity, the common practice in which the choice of analysis depends on previous interactions with the same dataset. A recent line of work has introduced powerful, general-purpose algorithms that ensure post hoc generalization (also called robust or post-selection generalization), which says that, given the output of the algorithm, it is hard to find any statistic for which the data differs significantly from the population it came from. In this work we show several limitations on the power of algorithms satisfying post hoc generalization. First, we show a tight lower bound on the error of any algorithm that satisfies post hoc generalization and answers adaptively chosen statistical queries, showing a strong barrier to progress in post-selection data analysis. Second, we show that post hoc generalization is not closed under composition, despite many examples of such algorithms exhibiting strong composition properties.
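The adaptivity problem described above can be made concrete with a small simulation. The sketch below is not taken from the paper; it is a standard illustrative construction, and every name in it (`run_demo`, `sample`, `accuracy`, the parameter values) is an assumption chosen for the example. An analyst first asks non-adaptive statistical queries (the empirical correlation of each feature with the label), then builds a final query from the answers. Features and labels are drawn independently, so the final query's true population value is exactly 0.5, yet its value on the reused dataset is typically far higher.

```python
import random


def sign(v):
    """Return +1 for non-negative values and -1 otherwise."""
    return 1 if v >= 0 else -1


def sample(rng, n, d):
    """Draw n points: d uniform +/-1 features and an independent uniform +/-1 label."""
    xs = [[rng.choice((-1, 1)) for _ in range(d)] for _ in range(n)]
    ys = [rng.choice((-1, 1)) for _ in range(n)]
    return xs, ys


def accuracy(weights, xs, ys):
    """Fraction of points where sign(<weights, x>) equals the label."""
    hits = 0
    for x, y in zip(xs, ys):
        score = sum(w * xi for w, xi in zip(weights, x))
        hits += sign(score) == y
    return hits / len(ys)


def run_demo(seed=0, n=200, d=400, n_fresh=2000):
    """Hypothetical demo: answer queries exactly, then reuse the data adaptively."""
    rng = random.Random(seed)
    xs, ys = sample(rng, n, d)

    # Round 1: d non-adaptive statistical queries q_j(x, y) = x_j * y,
    # each answered with its exact empirical mean (no noise, no protection).
    answers = [sum(x[j] * y for x, y in zip(xs, ys)) / n for j in range(d)]

    # Round 2: the analyst's final query depends on the round-1 answers.
    weights = [sign(a) for a in answers]
    empirical = accuracy(weights, xs, ys)

    # Estimate the population value on fresh data; the true value is 0.5
    # because the labels are independent of the features.
    fresh_xs, fresh_ys = sample(rng, n_fresh, d)
    population = accuracy(weights, fresh_xs, fresh_ys)
    return empirical, population


if __name__ == "__main__":
    emp, pop = run_demo()
    print(f"accuracy on reused data: {emp:.3f}")
    print(f"accuracy on fresh data:  {pop:.3f}")
```

The mechanism here answers every query with its exact empirical mean, which is the simplest example of an answering strategy that offers no post hoc generalization guarantee: the analyst can use the answers to find a statistic whose empirical value is far from its population value. The algorithms the abstract refers to are designed to rule this out.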


