solar
Subsample-ordered least-angle regression, an algorithm that performs quick, sparse, stable and accurate variable selection even under complicated dependence structures, harsh irrepresentable conditions and high multicollinearity.
We propose a new least-angle regression algorithm for variable selection in high-dimensional data, called subsample-ordered least-angle regression (solar). Solar relies on the average L_0 solution path computed across subsamples and largely alleviates several known high-dimensional issues with least-angle regression. Using examples based on directed acyclic graphs, we illustrate the advantages of solar in comparison to least-angle regression, forward regression and variable screening. Simulations demonstrate that, with a similar computation load, solar yields substantial improvements over two lasso solvers (least-angle regression for lasso and coordinate-descent) in terms of the sparsity (37-64% reduction in the average number of selected variables), stability and accuracy of variable selection. Simulations also demonstrate that solar enhances the robustness of variable selection to different settings of the irrepresentable condition and to variations in the dependence structures assumed in regression analysis. We provide a Python package solarpy for the algorithm.
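The idea of averaging a solution path across subsamples can be illustrated with a minimal NumPy sketch. It is not the solarpy interface, which is not shown here. The function names `lars_entry_order` and `average_entry_scores`, the subsample fraction, the number of subsamples and the scoring rule (the variable entering at step k of p scores (p - k + 1)/p) are all assumptions made for illustration. The sketch runs plain least-angle regression on each subsample, records the order in which variables enter, and averages the resulting scores, so that variables entering early and consistently rank highest.

```python
import numpy as np


def lars_entry_order(X, y):
    """Order in which plain least-angle regression activates the columns of X."""
    n, p = X.shape
    X = X - X.mean(axis=0)
    X = X / np.linalg.norm(X, axis=0)      # unit-norm, centred columns
    y = y - y.mean()
    steps = min(p, n - 1)
    mu = np.zeros(n)                       # current fitted values
    active = [int(np.argmax(np.abs(X.T @ y)))]
    while len(active) < steps:
        c = X.T @ (y - mu)                 # correlations with the residual
        C = np.abs(c[active]).max()        # common |correlation| of the active set
        XA = X[:, active] * np.sign(c[active])
        g = np.linalg.solve(XA.T @ XA, np.ones(len(active)))
        AA = 1.0 / np.sqrt(g.sum())
        u = XA @ (AA * g)                  # equiangular direction
        a = X.T @ u
        best_gamma, best_j = np.inf, -1
        for j in range(p):
            if j in active:
                continue
            # smallest positive step at which column j ties with the active set
            for num, den in ((C - c[j], AA - a[j]), (C + c[j], AA + a[j])):
                if den > 1e-12:
                    gamma = num / den
                    if 1e-12 < gamma < best_gamma:
                        best_gamma, best_j = gamma, j
        if best_j < 0:
            break
        mu = mu + best_gamma * u
        active.append(best_j)
    return active


def average_entry_scores(X, y, n_subsamples=20, frac=0.8, seed=0):
    """Average an entry-order score over random subsamples (hypothetical scoring rule)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = int(frac * n)
    total = np.zeros(p)
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=m, replace=False)
        q = np.zeros(p)
        for step, j in enumerate(lars_entry_order(X[idx], y[idx])):
            q[j] = (p - step) / p          # first entrant scores 1, later ones less
        total += q
    return total / n_subsamples


# Synthetic example: only the first three of twenty variables are informative.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
y = 4 * X[:, 0] - 4 * X[:, 1] + 4 * X[:, 2] + rng.standard_normal(200)
scores = average_entry_scores(X, y)
ranking = np.argsort(-scores)              # variables sorted by average score
```

A selection rule would then keep the variables whose average score exceeds some threshold. How that threshold is chosen is left open in this sketch.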