Feature Selection for Ridge Regression with Provable Guarantees

06/17/2015
by Saurabh Paul, et al.

We introduce single-set spectral sparsification as a deterministic, sampling-based feature selection technique for regularized least squares classification, the classification analogue of ridge regression. The method is unsupervised and gives worst-case guarantees on the generalization power of the classification function obtained after feature selection, relative to the classification function obtained using all features. We also introduce leverage-score sampling as an unsupervised, randomized feature selection method for ridge regression. For ridge regression in the fixed design setting, we provide risk bounds for both single-set spectral sparsification and leverage-score sampling, and show that the risk in the sampled space is comparable to the risk in the full-feature space. To support our theory, we perform experiments on synthetic and real-world datasets, the latter being a subset of the TechTC-300 datasets. Experimental results indicate that the proposed methods perform better than existing feature selection methods.
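The randomized method named in the abstract can be illustrated with a minimal sketch. The abstract does not state the sampling probabilities or the rescaling, so everything below is an assumption: it uses the common textbook form, in which each feature's probability is its normalized leverage score (the squared row norm of the matrix of right singular vectors), features are drawn with replacement, and each drawn column is rescaled by 1/sqrt(r * p). The function names `leverage_scores`, `sample_features`, and `ridge_fit` are hypothetical and are not taken from the paper.

```python
import numpy as np

def leverage_scores(X):
    """Normalized leverage scores of the columns (features) of an n x d matrix.

    Assumed definition: squared row norms of V, where X = U S V^T is the
    thin SVD restricted to the numerical rank, divided by that rank.
    """
    X = np.asarray(X, dtype=float)
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Numerical rank, using the usual machine-precision tolerance.
    tol = s.max() * max(X.shape) * np.finfo(float).eps
    rank = int(np.sum(s > tol))
    V = Vt[:rank].T                       # d x rank, orthonormal columns
    return np.sum(V ** 2, axis=1) / rank  # nonnegative, sums to 1

def sample_features(X, r, rng):
    """Draw r features with replacement and rescale the chosen columns."""
    p = leverage_scores(X)
    idx = rng.choice(X.shape[1], size=r, replace=True, p=p)
    scale = 1.0 / np.sqrt(r * p[idx])     # assumed rescaling
    return np.asarray(X, dtype=float)[:, idx] * scale, idx

def ridge_fit(X, y, lam):
    """Ridge regression weights via the dual form X^T (X X^T + lam I)^{-1} y."""
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha

# Illustrative usage on synthetic data with many more features than samples.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 500))
y = rng.standard_normal(30)
X_sampled, chosen = sample_features(X, r=100, rng=rng)
w_full = ridge_fit(X, y, lam=1.0)
w_sampled = ridge_fit(X_sampled, y, lam=1.0)
```

The dual form is used because it only requires solving an n x n system, which is the cheaper choice when the number of features far exceeds the number of samples. The deterministic single-set spectral sparsification method is not sketched here, since the abstract gives no details of its selection rule.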

