Post model-fitting exploration via a "Next-Door" analysis

06/04/2018
by Leying Guan, et al.

We propose a simple method for evaluating the model chosen by an adaptive regression procedure, our main focus being the lasso. The procedure deletes each chosen predictor in turn and refits the lasso, yielding a set of models that are "close" to the chosen one, which we call the "base model". If deleting a predictor leads to a significant deterioration in predictive power, that predictor is called indispensable; otherwise, the nearby model is called acceptable and can serve as a good alternative to the base model. This provides both an assessment of the predictive contribution of each variable and a set of alternative models that may be used in place of the chosen model. In this paper, we focus on the cross-validation (CV) setting: a model's predictive power is measured by its CV error, and the base model is tuned by cross-validation. We propose a method for comparing the error rate of the base model with those of the nearby models, and a p-value for testing whether a predictor is dispensable. We also propose a new quantity, the model score, which works similarly to the p-value for controlling the type I error. Our proposal is closely related to the LOCO (leave-one-covariate-out) methods of [Rinaldo 2016 Bootstrapping] and, to a lesser extent, to stability selection [Meinshausen 2010 stability]. We call this procedure "Next-Door analysis" because it examines models close to the base model. It can be applied to Gaussian regression data, generalized linear models, and other supervised learning problems with ℓ_1 penalization, and it could also be applied to best subset and stepwise regression procedures. We have implemented it in the R language as a library to accompany the well-known glmnet library.
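To make the procedure concrete, here is a minimal sketch of the core deletion-and-refit comparison using glmnet. The simulated data, the shared CV folds, and the use of glmnet's exclude argument are illustrative assumptions; this is not the authors' accompanying package, which additionally computes the p-values and model scores described above.

```r
# Minimal sketch of a Next-Door-style comparison (illustrative only).
library(glmnet)

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] - 2 * x[, 2] + rnorm(n)

# Fix the CV folds so base and nearby models are compared on the same splits.
foldid <- sample(rep(1:10, length.out = n))

# Base model: lasso tuned by cross-validation.
base_cv  <- cv.glmnet(x, y, foldid = foldid)
base_err <- min(base_cv$cvm)
chosen   <- which(as.vector(coef(base_cv, s = "lambda.min"))[-1] != 0)

# "Next-door" models: drop one chosen predictor at a time and refit the lasso.
increase <- sapply(chosen, function(j) {
  cv_j <- cv.glmnet(x, y, exclude = j, foldid = foldid)  # refit without predictor j
  min(cv_j$cvm) - base_err                               # change in CV error
})
names(increase) <- paste0("V", chosen)

# Large increases flag candidate "indispensable" predictors; small or negative
# increases point to acceptable alternative models.
round(increase, 4)
```

In this sketch a raw difference in CV error is reported for each deletion; the paper's formal comparison of error rates, and the resulting p-values and model scores, go beyond this simple contrast.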


