Accuracy and stability of solar variable selection comparison under complicated dependence structures

07/30/2020
by   Ning Xu, et al.

In this paper we focus on the variable-selection performance of solar on empirical data with complicated dependence structures and, hence, severe multicollinearity and grouping-effect issues. We choose the prostate cancer data and the Sydney house price data and apply two lasso solvers, elastic net and solar to them (code can be found at <https://github.com/isaac2math/>). The results show that (i) lasso is affected by the grouping effect and randomly drops highly correlated variables, producing unreliable and uninterpretable results; (ii) elastic net is more robust to the grouping effect; however, it completely loses variable-selection sparsity when the dependence structure of the data is complicated; (iii) solar demonstrates superior robustness to complicated dependence structures and the grouping effect, returning variable-selection results with better stability and sparsity. Such stability and sparsity also make solar a reliable variable pre-estimation filter for linear dependence structure estimation (linear probabilistic graph learning). The linear probabilistic graph estimated on the variables selected by solar returns an intuitive, sparse and stable dependence structure.
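
The grouping effect described in (i) and (ii) can be illustrated with a minimal sketch. It is not the authors' code and does not implement solar; it is a plain coordinate-descent solver written for illustration, run on a made-up toy dataset in which two columns are exact duplicates and a third is irrelevant. The function names (`soft_threshold`, `coordinate_descent`, `selected`), the penalty values and the data are all assumptions chosen for the example.

```python
# Toy illustration only (hypothetical names and data, not the solar algorithm).
# Coordinate descent for the elastic net objective
#   (1/(2n)) * ||y - Xb||^2 + l1 * ||b||_1 + (l2/2) * ||b||^2
# Setting l2 = 0 reduces it to the lasso.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def coordinate_descent(columns, y, l1, l2, sweeps=1000):
    """Cycle through the coefficients in column order, updating one at a time."""
    n = len(y)
    p = len(columns)
    coef = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Residual with feature j's own contribution left out.
            partial = [
                y[i] - sum(coef[k] * columns[k][i] for k in range(p) if k != j)
                for i in range(n)
            ]
            rho = sum(columns[j][i] * partial[i] for i in range(n)) / n
            scale = sum(v * v for v in columns[j]) / n
            coef[j] = soft_threshold(rho, l1) / (scale + l2)
    return coef


def selected(coef, tol=1e-6):
    """Indices of coefficients that are nonzero up to a small tolerance."""
    return [j for j, b in enumerate(coef) if abs(b) > tol]


# Made-up, centred toy data: x2 duplicates x1, x3 is orthogonal to both.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = list(x1)
x3 = [1.0, -2.0, 0.0, 2.0, -1.0]
columns = [x1, x2, x3]
y = [a + b for a, b in zip(x1, x2)]  # both duplicates truly matter

lasso_coef = coordinate_descent(columns, y, l1=0.1, l2=0.0)
enet_coef = coordinate_descent(columns, y, l1=0.1, l2=0.1)

print("lasso selects:", selected(lasso_coef), lasso_coef)
print("elastic net selects:", selected(enet_coef), enet_coef)
```

In this toy setting the lasso solution is not unique, so the solver keeps whichever duplicate it visits first and zeroes the other; reordering the columns would flip the choice, which is the kind of arbitrariness point (i) describes. The ridge term makes the elastic net objective strictly convex, so the two duplicates receive equal weight and both stay selected. The sketch says nothing about how either method behaves on the prostate cancer or Sydney house price data.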

