Fast and robust model selection based on ranks

05/14/2019
by   Wojciech Rejchel, et al.

We consider the problem of identifying important predictors in large databases, where the relationship between the response variable and the explanatory variables is specified by a general single index model with an unknown link function and an unknown distribution of the error term. We use a natural, robust, and efficient approach: replace the values of the response variable with their ranks, then identify important predictors with the well-known LASSO. The resulting RankLasso coincides with the previously proposed distribution-based LASSO, for which the connection to the rank approach had not been recognized. We refine the consistency results for RankLasso given in earlier papers and extend the scope of the method by proposing thresholded and adaptive versions. We present theoretical results showing that, as with the regular LASSO, the proposed modifications are model selection consistent under much weaker assumptions than RankLasso itself. These results are illustrated by an extensive simulation study, which shows that the proposed procedures are indeed much more efficient than the vanilla version of RankLasso and that they can properly identify relevant predictors even when the error terms come from the Cauchy distribution. The simulation study also shows that, for model selection, RankLasso performs substantially better than LADLasso, a well-established methodology for robust model selection.
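The rank-then-LASSO idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the rank scaling (rank divided by n, minus 0.5), the column centering, the coordinate-descent solver, the penalty value `lam=0.02`, the simulated data, and every function name are choices made here for the example. The optional `threshold` helper mimics a thresholded variant in the simplest possible way.

```python
import math
import random


def centered_ranks(y):
    """Replace responses by ranks scaled as rank/n - 0.5 (assumed scaling, no ties)."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    ranks = [0.0] * n
    for pos, i in enumerate(order):
        ranks[i] = (pos + 1) / n - 0.5
    return ranks


def soft_threshold(z, t):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, r, lam, n_iter=100):
    """Minimize (1/(2n)) * ||r - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(r)  # residual r - X beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j excluded).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def rank_lasso(X, y, lam):
    """Sketch of RankLasso: LASSO fitted to centered ranks of the response."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    return lasso_cd(Xc, centered_ranks(y), lam)


def threshold(beta, tau):
    """Hypothetical thresholding step: zero out coefficients with |b| <= tau."""
    return [b if abs(b) > tau else 0.0 for b in beta]


# Simulated single index model with Cauchy errors (illustrative data only).
random.seed(0)
n, p = 300, 10
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
noise = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]
y = [2.0 * X[i][0] - 2.0 * X[i][1] + noise[i] for i in range(n)]

beta_hat = rank_lasso(X, y, lam=0.02)
selected = [j for j, b in enumerate(threshold(beta_hat, 0.01)) if b != 0.0]
```

Because the fit uses only the ranks of `y`, any strictly increasing transformation of the response (for example cubing it) leaves `beta_hat` unchanged, which is why the unknown link function does not need to be estimated for model selection.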

research
11/24/2020

Identifying important predictors in large data bases – multiple testing and model selection

This is a chapter of the forthcoming Handbook of Multiple Testing. We co...
research
01/30/2008

On the Distribution of the Adaptive LASSO Estimator

We study the distribution of the adaptive LASSO estimator (Zou (2006)) i...
research
04/11/2020

Robust adaptive variable selection in ultra-high dimensional regression models based on the density power divergence loss

We consider the problem of simultaneous model selection and the estimati...
research
06/01/2016

Model selection consistency from the perspective of generalization ability and VC theory with an application to Lasso

Model selection is difficult to analyse yet theoretically and empiricall...
research
06/10/2019

Selection consistency of Lasso-based procedures for misspecified high-dimensional binary model and random regressors

We consider selection of random predictors for high-dimensional regressi...
research
09/18/2019

Evaluating Effects of Tuition Fees: Lasso for the Case of Germany

We study the effect of the introduction of university tuition fees on th...
research
06/14/2023

The generalized hyperbolic family and automatic model selection through the multiple-choice LASSO

We revisit the generalized hyperbolic (GH) distribution and its nested m...
