More powerful post-selection inference, with application to the Lasso

01/27/2018
by Keli Liu, et al.

Investigators often use the data to generate interesting hypotheses and then perform inference for the generated hypotheses. P-values and confidence intervals must account for this exploratory data analysis. A fruitful method for doing so is to condition any inferences on the components of the data used to generate the hypotheses, thus preventing information in those components from being used again. Some currently popular methods "over-condition", leading to wide intervals. We show how to perform the minimal conditioning in a computationally tractable way. In high dimensions, even this minimal conditioning can lead to intervals that are too wide to be useful, suggesting that up to now the cost of hypothesis generation has been underestimated. We show how to generate hypotheses in a strategic manner that sharply reduces the cost of data exploration and results in useful confidence intervals. Our discussion focuses on the problem of post-selection inference after fitting a lasso regression model, but we also outline an extension of the approach to a much more general setting.
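
The idea of conditioning on the part of the data that generated the hypothesis can be illustrated with a toy example. The sketch below is not the paper's lasso procedure; it assumes a single observation x drawn from N(mu, 1) that is reported only when it exceeds a cutoff, and compares a naive one-sided p-value for mu = 0 with one computed conditional on that selection event. The function names and the numeric values are hypothetical, chosen only for illustration.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def naive_p_value(x):
    """One-sided p-value for H0: mu = 0 that ignores how x was selected."""
    return normal_sf(x)


def selective_p_value(x, threshold):
    """One-sided p-value for H0: mu = 0, conditional on the event x > threshold.

    Under H0, x given selection follows a standard normal truncated to
    (threshold, infinity), so the tail probability is a ratio of two
    normal tail probabilities.
    """
    if x <= threshold:
        raise ValueError("x was not selected, so no conditional p-value is defined")
    return normal_sf(x) / normal_sf(threshold)


# Hypothetical numbers: the effect is reported only if x exceeds 2.0.
x_obs, cutoff = 2.2, 2.0
p_naive = naive_p_value(x_obs)
p_selective = selective_p_value(x_obs, cutoff)
print(f"naive p-value:     {p_naive:.4f}")
print(f"selective p-value: {p_selective:.4f}")
```

In this toy setting the conditional p-value is always at least as large as the naive one, which is the price paid for having used the data to decide what to test. Conditioning on more information than the selection event itself would raise that price further.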

research
03/05/2018

Expected length of post-model-selection confidence intervals conditional on polyhedral constraints

Valid inference after model selection is currently a very active area of...
research
12/29/2021

Exact Post-selection Inference For Tracking S&P500

The problem that is solved in this paper is known as index tracking. The...
research
10/14/2019

More Powerful Selective Kernel Tests for Feature Selection

Refining one's hypotheses in the light of data is a commonplace scientif...
research
02/23/2014

Exact Post Model Selection Inference for Marginal Screening

We develop a framework for post model selection inference, via marginal ...
research
06/24/2023

Post-Selection Inference for the Cox Model with Interval-Censored Data

We develop a post-selection inference method for the Cox proportional ha...
research
05/21/2023

A parametric distribution for exact post-selection inference with data carving

Post-selection inference (PoSI) is a statistical technique for obtaining...
research
04/23/2018

A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results

Inference is the process of using facts we know to learn about facts we ...
