Selective inference after variable selection via multiscale bootstrap

05/25/2019
by Yoshikazu Terada, et al.

A general resampling approach is considered for the selective inference problem after variable selection in regression analysis. Even after variable selection, it is important to know whether the selected variables are actually useful, which is done by reporting p-values and confidence intervals for the regression coefficients. In the classical approach, significance levels for the selected variables are usually computed by t-tests, but these are subject to selection bias. To adjust for this bias in post-selection inference, most existing studies of selective inference consider a specific variable selection algorithm, such as the Lasso, for which the selection event can be represented explicitly as a simple region in the space of the response variable. As a result, the existing approaches cannot handle more complicated algorithms such as MCP (minimax concave penalty). Moreover, most existing approaches take the event that a specific model is selected as the selection event. This selection event is too restrictive and may reduce statistical power, because whether the hypothesis for a specific variable is tested depends only on whether that variable is selected. In this study, we consider a more appropriate selection event, namely that the variable is selected, and propose a new bootstrap method to compute an approximately unbiased selective p-value for the selected variable. Our method is applicable to a wide class of variable selection algorithms. In addition, the computational cost of our method is of the same order as that of the classical bootstrap method. Through numerical experiments, we show the usefulness of our selective inference approach.
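The selection bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the multiscale bootstrap method of the paper. It assumes a deliberately simple, hypothetical selection rule (pick the predictor with the largest absolute z-statistic), a known noise standard deviation of 1, and a response that is pure noise, so that every null hypothesis is true. The function names and parameter values are illustrative only. Under these assumptions, a naive test on a pre-specified variable should reject at about the nominal level, while the same naive test applied to the selected variable rejects far more often.

```python
import math

import numpy as np


def naive_z_pvalue(x, y):
    """Two-sided p-value for H0: no effect of x, assuming noise sd = 1 (assumption)."""
    z = x @ y / np.linalg.norm(x)
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_rejection_rates(n=50, p=10, n_rep=2000, alpha=0.05, seed=0):
    """Compare naive rejection rates for a fixed variable and for the selected one."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))  # fixed design, reused in every replication
    col_norms = np.linalg.norm(X, axis=0)
    reject_fixed = 0
    reject_selected = 0
    for _ in range(n_rep):
        y = rng.standard_normal(n)  # global null: y is unrelated to X
        z = X.T @ y / col_norms
        # Hypothetical selection rule: keep the variable with the largest |z|.
        selected = int(np.argmax(np.abs(z)))
        reject_fixed += naive_z_pvalue(X[:, 0], y) < alpha
        reject_selected += naive_z_pvalue(X[:, selected], y) < alpha
    return reject_fixed / n_rep, reject_selected / n_rep


fixed_rate, selected_rate = simulate_rejection_rates()
print(f"naive rejection rate, pre-specified variable: {fixed_rate:.3f}")
print(f"naive rejection rate, selected variable:      {selected_rate:.3f}")
```

The gap between the two rates is the bias that a selective p-value is meant to remove, by conditioning on the selection event rather than ignoring it.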


