Sparse Hierarchical Regression with Polynomials

09/28/2017
by   Dimitris Bertsimas, et al.

We present a novel method for exact hierarchical sparse polynomial regression. Our regressor is the degree-r polynomial that depends on at most k inputs and comprises at most ℓ monomial terms, chosen to minimize the sum of squared prediction errors. This hierarchical sparse specification aligns well with modern big-data settings, where many inputs are irrelevant for prediction and the functional complexity of the regressor must be controlled so as to avoid overfitting. We present a two-step approach to this hierarchical sparse regression problem. First, we discard irrelevant inputs using an extremely fast input-ranking heuristic. Second, we exploit modern cutting-plane methods for integer optimization to solve the resulting reduced hierarchical (k, ℓ)-sparse problem exactly. The ability of our method to identify all k relevant inputs and all ℓ monomial terms is shown empirically to undergo a phase transition. Crucially, the same transition governs its ability to reject all irrelevant features and monomials as well. In the regime where our method is statistically powerful, its computational cost is, interestingly, on par with that of Lasso-based heuristics. The presented work fills a void: the lack of powerful, disciplined nonlinear sparse regression methods in high-dimensional settings. Our method is shown empirically to scale to regression problems with n ≈ 10,000 observations and input dimension p ≈ 1,000.
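The two-step idea in the abstract can be illustrated on a toy problem. The sketch below is a hypothetical, simplified stand-in: the correlation-based ranking score and the brute-force subset search over monomials are assumptions for illustration, not the paper's actual heuristic or its cutting-plane integer-optimization solver.

```python
import itertools
import numpy as np

# Toy data: n samples, p inputs; the truth is a degree-2 polynomial that
# uses k=2 inputs (indices 0 and 3) through l=3 monomials.
rng = np.random.default_rng(0)
n, p = 300, 10
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] ** 2 + 3.0 * X[:, 3] + 1.5 * X[:, 0] * X[:, 3] \
    + 0.1 * rng.standard_normal(n)

# Step 1 (assumed ranking heuristic): score each input by the larger of its
# absolute correlation with y via x_j or via x_j^2, then keep the top few.
def score(j):
    return max(abs(np.corrcoef(y, X[:, j])[0, 1]),
               abs(np.corrcoef(y, X[:, j] ** 2)[0, 1]))

kept = sorted(sorted(range(p), key=score, reverse=True)[:4])

# Step 2 (stand-in for the exact integer-optimization step): enumerate all
# monomials of degree <= 2 over the kept inputs and brute-force the best
# l=3-subset by ordinary least squares.
monomials = [(j,) for j in kept] + \
            [m for m in itertools.combinations_with_replacement(kept, 2)]
cols = {m: np.prod(X[:, list(m)], axis=1) for m in monomials}

best_sse, best_subset = np.inf, None
for subset in itertools.combinations(monomials, 3):
    A = np.column_stack([cols[m] for m in subset])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = float(np.sum((y - A @ coef) ** 2))
    if sse < best_sse:
        best_sse, best_subset = sse, subset

# With this seed the search recovers the true monomials x0^2, x0*x3, x3,
# encoded as index tuples (0, 0), (0, 3), (3,).
print(sorted(best_subset))
```

Brute-force enumeration is only feasible because the ranking step first shrinks the candidate input set; the paper's contribution is solving the reduced (k, ℓ)-sparse problem exactly at much larger scale via cutting planes.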


