Feature Selection using e-values

06/11/2022
by   Subhabrata Majumdar, et al.

In the context of supervised parametric models, we introduce the concept of e-values. An e-value is a scalar quantity that represents the proximity of the sampling distribution of parameter estimates in a model trained on a subset of features to that of the model trained on all features (i.e., the full model). Under general conditions, a rank ordering of e-values separates models that contain all essential features from those that do not. The e-values are applicable to a wide range of parametric models. We use data depths and a fast resampling-based algorithm to implement a feature selection procedure using e-values, and we provide consistency results for it. For a p-dimensional feature space, this procedure requires fitting only the full model and evaluating p+1 models, as opposed to the traditional requirement of fitting and evaluating 2^p models. Through experiments across several model settings and on synthetic and real datasets, we establish the e-values method as a promising general alternative to existing model-specific methods of feature selection.
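
To make the "fit the full model once, evaluate p+1 models" idea concrete, here is a minimal Python sketch for linear regression. It is an illustration under stated assumptions, not the authors' implementation. Mahalanobis depth stands in for the data depth. An inflated Gaussian approximation around the least-squares estimate stands in for the fast resampling scheme. The names `mahalanobis_depth`, `select_features`, and the inflation factor `tau` are hypothetical. The selection rule, which keeps a feature when dropping it pushes the e-value below that of the full model, is an assumed reading of the rank-ordering property described above.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Mahalanobis depth of each row of `points` relative to the cloud `reference`."""
    center = reference.mean(axis=0)
    precision = np.linalg.pinv(np.cov(reference, rowvar=False))
    diff = points - center
    d2 = np.einsum("ij,jk,ik->i", diff, precision, diff)
    return 1.0 / (1.0 + d2)


def select_features(X, y, n_draws=1000, tau=None, seed=0):
    """Score the full model and the p drop-one models after a single full fit."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    tau = np.log(n) if tau is None else tau  # assumed tuning choice

    # Fit the full model once (ordinary least squares).
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2 = resid @ resid / (n - p)

    # Assumption: approximate the sampling distribution of the estimates by a
    # Gaussian whose spread is inflated by tau. No model is ever refitted.
    cov = tau**2 * sigma2 * np.linalg.inv(X.T @ X)
    reference = rng.multivariate_normal(beta_hat, cov, size=n_draws)
    draws = rng.multivariate_normal(beta_hat, cov, size=n_draws)

    # e-value of the full model: mean depth of its draws in the reference cloud.
    e_full = mahalanobis_depth(draws, reference).mean()

    # e-value of each drop-one model: zero out coefficient j, then re-score.
    e_drop = np.empty(p)
    for j in range(p):
        reduced = draws.copy()
        reduced[:, j] = 0.0
        e_drop[j] = mahalanobis_depth(reduced, reference).mean()

    # Keep feature j if removing it lowers the e-value below the full model's.
    selected = np.flatnonzero(e_drop < e_full)
    return selected, e_full, e_drop


# Synthetic check: features 0, 1 and 4 carry signal, the rest are noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 6))
beta = np.array([2.0, -1.5, 0.0, 0.0, 1.0, 0.0])
y = X @ beta + rng.normal(size=400)
selected, e_full, e_drop = select_features(X, y)
print("selected features:", selected)
```

Only p+1 depth evaluations are needed here: one for the full model and one per dropped feature. The inflation factor matters in this sketch: without it, zeroing a pure-noise coefficient moves the draws by about one standard error, which makes the comparison against the full model close to a coin flip.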


research
05/05/2020

Feature Selection Methods for Uplift Modeling

Uplift modeling is a predictive modeling technique that estimates the us...
research
07/09/2020

Probabilistic Value Selection for Space Efficient Model

An alternative to current mainstream preprocessing methods is proposed: ...
research
06/17/2015

Feature Selection for Ridge Regression with Provable Guarantees

We introduce single-set spectral sparsification as a deterministic sampl...
research
01/12/2020

On Feature Interactions Identified by Shapley Values of Binary Classification Games

For feature selection and related problems, we introduce the notion of c...
research
09/15/2019

Target-Focused Feature Selection Using a Bayesian Approach

In many real-world scenarios where data is high dimensional, test time a...
research
02/22/2021

Shapley values for feature selection: The good, the bad, and the axioms

The Shapley value has become popular in the Explainable AI (XAI) literat...
research
07/25/2021

Identifying the fragment structure of the organic compounds by deeply learning the original NMR data

We preprocess the raw NMR spectrum and extract key characteristic featur...
