Valid Post-selection Inference in Assumption-lean Linear Regression
Constructing valid statistical inference for estimators based on data-driven selection has received considerable attention in recent years. Berk et al. (2013) is possibly the first work to provide valid inference for Gaussian homoscedastic linear regression with fixed covariates under arbitrary covariate/variable selection. This setting is unrealistic, and Bachoc et al. (2016) extend it by relaxing the distributional assumptions. A major drawback of both works is that constructing the valid confidence regions is computationally intensive. In this paper, we first prove that post-selection inference is equivalent to simultaneous inference and then construct valid post-selection confidence regions that are computationally simple. Our construction is based on deterministic inequalities and applies to independent as well as dependent random variables, without requiring correct distributional assumptions. Finally, we compare the volume of our confidence regions with that of the existing ones and show that, under non-stochastic covariates, our regions are much smaller.
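The equivalence between post-selection and simultaneous inference mentioned above can be sketched as follows. The notation is an assumption made for illustration and does not appear in the abstract: $\mathcal{M}$ is a finite collection of candidate models, $\theta_M$ is the regression target for model $M$, $\hat{R}_M$ is a confidence region for $\theta_M$, and $\hat{M}$ is any data-dependent selection taking values in $\mathcal{M}$.

```latex
% Sketch under assumed notation (not the paper's exact statement).
% Coverage after an arbitrary selection \hat{M} is guaranteed
% if and only if coverage holds simultaneously over all models:
\[
  \inf_{\hat{M}} \,
  \mathbb{P}\bigl(\theta_{\hat{M}} \in \hat{R}_{\hat{M}}\bigr)
  \;=\;
  \mathbb{P}\Bigl(\,\bigcap_{M \in \mathcal{M}}
      \bigl\{\theta_M \in \hat{R}_M\bigr\}\Bigr).
\]
% ">=": on the simultaneous event, every model is covered,
%       so the selected model \hat{M} is covered as well.
% "<=": choose \hat{M} adversarially as a model M with
%       \theta_M \notin \hat{R}_M whenever such a model exists;
%       then \hat{M} is covered only on the simultaneous event.
```

Read this way, a family of regions with simultaneous coverage at least $1-\alpha$ is valid after any selection rule, and no family can be valid for every selection rule without having simultaneous coverage.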