COMBSS: Best Subset Selection via Continuous Optimization

05/05/2022
by   Sarat Moka, et al.
0

We consider the problem of best subset selection in linear regression, where the goal is to find for every model size k, that subset of k features that best fit the response. This is particularly challenging when the total available number of features is very large compared to the number of data samples. We propose COMBSS, a novel continuous optimization based method that identifies a solution path, a small set of models of varying size, that consists of candidates for the best subset in linear regression. COMBSS turns out to be very fast, making subset selection possible when the number of features is well in excess of thousands. Simulation results are presented to highlight the performance of COMBSS in comparison to existing popular methods such as Forward Stepwise, the Lasso and Mixed-Integer Optimization. Because of the outstanding overall performance, framing the best subset selection challenge as a continuous optimization problem opens new research directions for feature extraction for a large variety of regression models.

READ FULL TEXT

page 7

page 13

research
04/21/2018

Best subset selection in linear regression via bi-objective mixed integer linear programming

We study the problem of choosing the best subset of p features in linear...
research
01/09/2016

On Computationally Tractable Selection of Experiments in Measurement-Constrained Regression Models

We derive computationally tractable methods to select a small subset of ...
research
12/05/2011

On best subset regression

In this paper we discuss the variable selection method from ℓ0-norm cons...
research
01/27/2017

Subset Selection for Multiple Linear Regression via Optimization

Subset selection in multiple linear regression is to choose a subset of ...
research
07/27/2017

Extended Comparisons of Best Subset Selection, Forward Stepwise Selection, and the Lasso

In exciting new work, Bertsimas et al. (2016) showed that the classical ...
research
06/11/2020

Probabilistic Best Subset Selection by Gradient-Based Optimization

In high-dimensional statistics, variable selection is an optimization pr...
research
06/11/2020

Probabilistic Best Subset Selection via Gradient-Based Optimization

In high-dimensional statistics, variable selection is an optimization pr...

Please sign up or login with your details

Forgot password? Click here to reset