I General Formulas
For concreteness, consider a series of $n$ independent observations of some dependent variable $y$, i.e. $\mathbf{y} = (y^{(1)}, \ldots, y^{(n)})$, that we want to explain using a set of $p$ potential features, $x_1, \ldots, x_p$. Furthermore, assume that to each of these potential features, $x_j$, we associate a model parameter $\theta_j$ and an indicator variable $s_j$, with $s_j = 1$ if the feature is included in the model and $s_j = -1$ otherwise. Let $S$ denote an index set specifying the positions for which $s_j = 1$. The cardinality of $S$, i.e. the number of indicator variables with $s_j = 1$, is denoted $|S|$. Also, we define a vector $\boldsymbol{\theta}_S$ of length $|S|$ that contains the model parameters corresponding to the active features, $j \in S$. With these definitions, we define the log-likelihood for the observed data as a function of the active features as $\ell(\boldsymbol{\theta}_S) \equiv \ln P(\mathbf{y}|\boldsymbol{\theta}_S, S)$. Throughout, we assume that the log-likelihood is twice differentiable.
In addition to the log-likelihood, we need to specify prior distributions for the parameters. Here, we will work with factorized priors of the form
\[ P(\boldsymbol{\theta}_S | S) = \prod_{j \in S} \frac{e^{-\lambda f(\theta_j)}}{Z_f(\lambda)}, \]
where $\lambda$ is a parameter that controls the strength of the regularization, $f(\theta)$ is a twice differentiable convex function minimized at the point $\theta = 0$ with $f(0) = 0$, and $Z_f(\lambda) = \int d\theta\, e^{-\lambda f(\theta)}$ is the normalization constant. As the function $f(\theta)$ is convex, we are assuming that the second derivative evaluated at $\theta = 0$, denoted $f''(0)$, is positive. Many commonly used priors take this form, including Gaussian and hyperbolic priors. Plugging these expressions into (2) yields an expression for the evidence when only the subset of features, $S$, is included:
\[ P(\mathbf{y}|S) = \int d\boldsymbol{\theta}_S\, e^{\ell(\boldsymbol{\theta}_S)} \prod_{j \in S} \frac{e^{-\lambda f(\theta_j)}}{Z_f(\lambda)}. \]
By definition, this integral is dominated by the second (prior) term for strongly regularizing priors. Since the log-likelihood is extensive in the number of data points, $n$, this generally requires that the regularization strength be much larger than the number of data points, $\lambda \gg n$. For such strongly regularizing priors we can perform a saddle-point approximation for the evidence. This yields, up to an irrelevant constant,
\[ \ln P(\mathbf{y}|S) \approx \frac{1}{2}\, \mathbf{g}_S^T \big( \tilde{\lambda} I - M_S \big)^{-1} \mathbf{g}_S - \frac{1}{2} \ln\det\!\big( I - M_S/\tilde{\lambda} \big), \]
where $\tilde{\lambda} \equiv \lambda f''(0)$ is a “renormalized” regularization strength,
\[ g_j = \frac{\partial \ell}{\partial \theta_j} \]
is the gradient of the log-likelihood, and
\[ M_{jk} = \frac{\partial^2 \ell}{\partial \theta_j\, \partial \theta_k} \]
is the Hessian of the log-likelihood, with all derivatives evaluated at $\boldsymbol{\theta}_S = 0$. We emphasize that the large parameter in our saddle-point approximation is the regularization strength $\lambda$. This differs from previous statistical physics approaches to Bayesian feature selection that commonly assume $n \to \infty$ and hence perform a saddle point in $n$ Kinney and Atwal (2014); Balasubramanian (1997); Nemenman and Bialek (2002). The fundamental reason for this difference is that we work in the strongly-regularizing regime where the prior is assumed to be important even in the infinite data limit.
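For completeness, the Gaussian integral behind this saddle point can be sketched explicitly. Writing $\mathbf{g}_S$ and $M_S$ for the gradient and Hessian of the log-likelihood at $\boldsymbol{\theta}_S = 0$ and $\tilde{\lambda} = \lambda f''(0)$, and expanding the exponent of the evidence integral through quadratic order (a sketch, valid when $\tilde{\lambda}$ is the largest scale in the problem):

```latex
\begin{aligned}
P(\mathbf{y}|S)
 &\approx \frac{e^{\ell(0)}}{Z_f(\lambda)^{|S|}}
   \int d\boldsymbol{\theta}_S\,
   \exp\!\Big( \mathbf{g}_S^T\boldsymbol{\theta}_S
   - \tfrac{1}{2}\,\boldsymbol{\theta}_S^T\big(\tilde{\lambda} I - M_S\big)\boldsymbol{\theta}_S \Big) \\
 &= \frac{e^{\ell(0)}\,(2\pi)^{|S|/2}}{Z_f(\lambda)^{|S|}\,\sqrt{\det\!\big(\tilde{\lambda} I - M_S\big)}}\,
   \exp\!\Big( \tfrac{1}{2}\,\mathbf{g}_S^T\big(\tilde{\lambda} I - M_S\big)^{-1}\mathbf{g}_S \Big).
\end{aligned}
```

Since $Z_f(\lambda) \approx \sqrt{2\pi/\tilde{\lambda}}$ for strongly regularizing priors, the $|S|$-dependent prefactors combine into $\det(I - M_S/\tilde{\lambda})^{-1/2}$, which is why only $S$-dependent, non-constant terms survive.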
Since for strongly-regularizing priors $\tilde{\lambda}$ is the largest scale in the problem, we can expand the log in (5) in a power series in $\tilde{\lambda}^{-1}$ to order $\tilde{\lambda}^{-2}$ to get:
\[ \ln P(\mathbf{y}|S) \approx \mathrm{const} + \frac{1}{2\tilde{\lambda}} \sum_{j \in S} \big( g_j^2 + M_{jj} \big) + \frac{1}{4\tilde{\lambda}^2} \sum_{j,k \in S} \big( 2\, g_j M_{jk}\, g_k + M_{jk}^2 \big). \]
One of the most striking aspects of this expression is that it is independent of the detailed form of the prior function, $f(\theta)$. All that is required is that $f(\theta)$ is a twice differentiable convex function with a global minimum at $\theta = 0$. Different choices of prior simply “renormalize” the effective regularization strength $\tilde{\lambda} = \lambda f''(0)$.
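As a quick numerical sanity check of this expansion (a sketch with illustrative numbers, not an example from the paper), one can compare the exact saddle-point form of the evidence with its $O(\tilde{\lambda}^{-2})$ expansion for a toy two-feature problem with a quadratic log-likelihood and a Gaussian prior (so $\tilde{\lambda} = \lambda$):

```python
import math

# Toy check (illustrative numbers): quadratic log-likelihood
# l(theta) = g . theta + (1/2) theta^T M theta with p = 2 features.
g = [1.0, 0.5]
M = [[-2.0, 0.3], [0.3, -1.0]]
lam = 1.0e4

# Exact: (1/2) g^T (lam I - M)^{-1} g - (1/2) ln det(I - M/lam),
# done with explicit 2x2 linear algebra for A = lam * I - M.
a11, a12 = lam - M[0][0], -M[0][1]
a21, a22 = -M[1][0], lam - M[1][1]
det_a = a11 * a22 - a12 * a21
inv_g = [(a22 * g[0] - a12 * g[1]) / det_a,
         (-a21 * g[0] + a11 * g[1]) / det_a]
quad = 0.5 * (g[0] * inv_g[0] + g[1] * inv_g[1])
logdet = -0.5 * (math.log(det_a) - 2.0 * math.log(lam))  # -(1/2) ln det(I - M/lam)
exact = quad + logdet

# Expansion: sum_j (g_j^2 + M_jj)/(2 lam)
#          + sum_{j,k} (2 g_j M_jk g_k + M_jk^2)/(4 lam^2)
approx = sum((g[j] ** 2 + M[j][j]) / (2.0 * lam) for j in range(2)) \
       + sum((2.0 * g[j] * M[j][k] * g[k] + M[j][k] ** 2) / (4.0 * lam ** 2)
             for j in range(2) for k in range(2))
```

For $\lambda$ much larger than the entries of $\mathbf{g}$ and $M$, the two values agree to the expected $O(\tilde{\lambda}^{-3})$ accuracy.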
Rewriting this expression in terms of the spin variables, $s_j = \pm 1$ (with $j \in S$ corresponding to $s_j = 1$), we see that the evidence takes the Ising form with local fields:
\[ h_j = \frac{g_j^2 + M_{jj}}{4\tilde{\lambda}} + \frac{1}{8\tilde{\lambda}^2} \sum_{k} \big( 2\, g_j M_{jk}\, g_k + M_{jk}^2 \big) \]
and couplings given by:
\[ J_{jk} = \frac{1}{8\tilde{\lambda}^2} \big( 2\, g_j M_{jk}\, g_k + M_{jk}^2 \big). \]
Notice that the couplings, which are proportional to the small parameter $\tilde{\lambda}^{-2}$, are weak. According to Bayes rule, $P(\mathbf{s}|\mathbf{y}) \propto P(\mathbf{y}|\mathbf{s})\, P(\mathbf{s})$, so the posterior probability of a set of features is described by an Ising model of the form:
\[ P(\mathbf{s}|\mathbf{y}) = \frac{1}{Z} \exp\!\Big( \sum_j h_j s_j + \sum_{j<k} J_{jk}\, s_j s_k \Big), \]
where $Z$ is the partition function that normalizes the probability distribution. We term this the Bayesian Ising Approximation (BIA) Fisher and Mehta (2014). Finally, for future use, it is useful to define a scale $\lambda^*$ below which the Ising approximation breaks down. This scale can be computed by requiring that the power series expansion used to derive (8) converge Fisher and Mehta (2014).
We demonstrated the utility of the BIA for feature selection in the specific context of linear regression in a recent paper, and we can import that machinery here Fisher and Mehta (2014). It is useful to explicitly indicate the dependence of the various expressions on the regularization strength, writing $h_j(\lambda)$ and $J_{jk}(\lambda)$. We want to compute the marginal posterior probability that a feature, $x_j$, is relevant:
\[ P(s_j = 1 | \mathbf{y}) = \frac{1 + \langle s_j \rangle}{2} \equiv \frac{1 + m_j(\lambda)}{2}, \]
where we have defined the magnetizations $m_j(\lambda) \equiv \langle s_j \rangle$. While there are many techniques for calculating the magnetizations of an Ising model, we focus on the mean field approximation, which leads to a self-consistent equation Opper and Winther (2001):
\[ m_j(\lambda) = \tanh\!\Big( h_j(\lambda) + \sum_{k \neq j} J_{jk}(\lambda)\, m_k(\lambda) \Big). \]
Our expressions depend on a free parameter ($\lambda$) that determines the strength of the prior distribution. As it is usually difficult, in practice, to choose a specific value of $\lambda$ ahead of time, it is often helpful to compute the feature selection path, i.e. to compute $P(s_j = 1 | \mathbf{y})$ over a wide range of $\lambda$'s. Indeed, computing the variable selection path is a common practice when applying other feature selection techniques such as LASSO regression Tibshirani (1996). To obtain the mean field variable selection path as a function of $\lambda$, we notice that $m_j(\lambda) \to 0$ as $\lambda \to \infty$ and so define the recursive formula:
\[ m_j(\lambda_{t+1}) = \tanh\!\Big( h_j(\lambda_{t+1}) + \sum_{k \neq j} J_{jk}(\lambda_{t+1})\, m_k(\lambda_t) \Big), \qquad \lambda_{t+1} = (1 - \epsilon)\, \lambda_t, \]
with a small step size $\epsilon$. We have set $\epsilon$ to the same small value in all of the examples presented below.
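The recursion above is straightforward to implement. The following sketch (an illustration, not code from the paper) uses the $O(\tilde{\lambda}^{-2})$ field and coupling expressions derived above with $\tilde{\lambda} = \lambda$; the toy gradient, Hessian, and all numerical settings are hypothetical:

```python
import math

def bia_path(g, M, lam_max, n_steps=200, eps=0.05, n_sweeps=50):
    """Mean-field BIA feature-selection path (illustrative sketch).

    g : gradient of the log-likelihood at theta = 0 (length p)
    M : Hessian of the log-likelihood at theta = 0 (p x p nested lists)

    Starting from lam_max, where all magnetizations vanish, lambda is
    lowered geometrically; each mean-field solve is warm-started from
    the solution at the previous lambda."""
    p = len(g)
    m = [0.0] * p                      # magnetizations at lambda -> infinity
    lam = lam_max
    path = []
    for _ in range(n_steps):
        lam *= (1.0 - eps)             # small multiplicative step down in lambda
        # Fields and couplings from the O(1/lambda^2) expansion (lam_tilde = lam)
        h = [(g[j] ** 2 + M[j][j]) / (4.0 * lam)
             + sum(2.0 * g[j] * M[j][k] * g[k] + M[j][k] ** 2 for k in range(p))
               / (8.0 * lam ** 2)
             for j in range(p)]
        J = [[(2.0 * g[j] * M[j][k] * g[k] + M[j][k] ** 2) / (8.0 * lam ** 2)
              for k in range(p)] for j in range(p)]
        for _ in range(n_sweeps):      # iterate the self-consistent equations
            m = [math.tanh(h[j] + sum(J[j][k] * m[k] for k in range(p) if k != j))
                 for j in range(p)]
        path.append((lam, [(1.0 + mj) / 2.0 for mj in m]))  # P(s_j = 1 | y)
    return path
```

In practice the path is only trusted for $\lambda$ above the crossover scale $\lambda^*$ where the power series converges; stopping the loop there would be the natural refinement.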
Logistic regression is a commonly used statistical method for modeling categorical data Bishop et al. (2006). To simplify notation, it is also useful to define an extra feature variable, $x_0$, which is always equal to $1$, and a $(p+1)$-dimensional vector of features, $\mathbf{x} = (x_0, x_1, \ldots, x_p)^T$, with corresponding parameters, $\boldsymbol{\theta} = (\theta_0, \theta_1, \ldots, \theta_p)^T$. In terms of these vectors, the likelihood function for logistic regression takes the compact form
\[ P(y | \mathbf{x}, \boldsymbol{\theta}) = \frac{e^{y\, \boldsymbol{\theta}^T \mathbf{x}}}{1 + e^{\boldsymbol{\theta}^T \mathbf{x}}}, \qquad y \in \{0, 1\}, \]
where $\boldsymbol{\theta}^T$ is the transpose of $\boldsymbol{\theta}$. If we have $n$ independent observations of the data (labeled by an index $i$), the log-likelihood can be written as:
\[ \ell(\boldsymbol{\theta}) = \sum_{i=1}^{n} \Big[ y^{(i)}\, \boldsymbol{\theta}^T \mathbf{x}^{(i)} - \ln\!\big( 1 + e^{\boldsymbol{\theta}^T \mathbf{x}^{(i)}} \big) \Big]. \]
We supplement this likelihood function with an $L_2$ norm on the parameters of the form:
\[ \lambda f(\theta_j) = \frac{\lambda}{2}\, \theta_j^2, \qquad j = 1, \ldots, p, \]
with $f''(0) = 1$ so that $\tilde{\lambda} = \lambda$, and where the intercept $\theta_0$ is chosen to match the observed probability that $y = 1$ in the data,
\[ \frac{1}{1 + e^{-\theta_0}} = \bar{y} \equiv \frac{1}{n} \sum_{i} y^{(i)}, \qquad \theta_0 = \ln\frac{\bar{y}}{1 - \bar{y}}. \]
Using these expressions, we can calculate the gradient and the Hessian of the log-likelihood:
\[ g_j = \sum_{i} x_j^{(i)} \big( y^{(i)} - \bar{y} \big), \qquad M_{jk} = -\,\bar{y}\,(1 - \bar{y}) \sum_{i} x_j^{(i)} x_k^{(i)}. \]
Notice that the gradient, $g_j$, is proportional to the correlation between the $x_j^{(i)}$ and the $y^{(i)}$. Furthermore, except for a multiplicative constant reflecting the variance of $y$, $M_{jk}$ is just the correlation between $x_j$ and $x_k$. Thus, as in linear regression Fisher and Mehta (2014), the coefficients of the Ising model are related to the correlations between variables and/or the data. In fact, it is easy to show this is the case for all Generalized Linear Models (see Appendix).
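These closed-form expressions are easy to verify numerically. The sketch below (hypothetical toy data, not the notMNIST example) checks the gradient and Hessian formulas at $\boldsymbol{\theta} = (\theta_0, 0)$ against finite differences of the logistic log-likelihood:

```python
import math

def log_likelihood(theta, X, y):
    # Logistic log-likelihood: sum_i [ y_i*(theta.x_i) - ln(1 + exp(theta.x_i)) ]
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(t * x for t, x in zip(theta, xi))
        total += yi * z - math.log(1.0 + math.exp(z))
    return total

# Toy data (hypothetical): x_0 = 1 is the constant feature, x_1 a real feature.
X = [[1.0, 0.5], [1.0, -1.2], [1.0, 0.3], [1.0, 2.0], [1.0, -0.7]]
y = [1, 0, 1, 1, 0]
ybar = sum(y) / len(y)
theta0 = math.log(ybar / (1.0 - ybar))   # intercept matching P(y = 1) = ybar

# Closed-form gradient and Hessian entry at theta = (theta0, 0)
g1 = sum(xi[1] * (yi - ybar) for xi, yi in zip(X, y))
M11 = -ybar * (1.0 - ybar) * sum(xi[1] ** 2 for xi in X)

def fd_grad(j, d=1e-6):
    # Central finite difference of the log-likelihood along theta_j
    tp = [theta0, 0.0]; tm = [theta0, 0.0]
    tp[j] += d; tm[j] -= d
    return (log_likelihood(tp, X, y) - log_likelihood(tm, X, y)) / (2.0 * d)
```

Note that the finite-difference gradient with respect to $\theta_0$ vanishes, as it must when $\theta_0$ is matched to $\bar{y}$.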
To illustrate our approach, we used the BIA for a logistic regression-based classifier designed to classify Bs and Ds in the notMNIST dataset. The notMNIST dataset consists of diverse images of the letters A-J composed from publicly available computer fonts. Each letter is represented as a 28 by 28 grayscale image whose pixel intensities vary between 0 and 255. Figure 1 shows three randomly chosen examples of the letters B and D from the notMNIST dataset. The BIA was performed using 500 randomly chosen examples of each letter. Notice that the number of examples (500 for each letter) is comparable to the number of pixels (784) in an image, suggesting that strongly regularizing the problem is appropriate.
Using the expressions above, we calculated the couplings for the Ising model describing our logistic-regression-based classifier and computed the feature selection path as a function of $\lambda$ using the mean field approximation. As in linear regression, we estimated the crossover scale $\lambda^*$ from $n$, the number of samples, $p$, the number of potential features, and $\bar{x}$, the root mean squared correlation between pixels (see Figure 2A). Figure 2B shows the posterior probability of all 784 pixels at a fixed value of $\lambda$ along this path. To better visualize this, we have labeled the pixels with the highest posterior probabilities in red in the feature selection path in Fig. 2A and in the sample images shown in Figure 2C. The results agree well with our basic intuition about which pixels are important for distinguishing the letters B and D.
In conclusion, we have presented a general framework for Bayesian feature selection in a large class of statistical models. In contrast to previous Bayesian approaches that assume that the effect of the prior vanishes inversely with the amount of data, we have used strongly-regularizing priors that have large effects on the posterior distribution even in the infinite data limit. We have shown that in the strongly-regularized limit, Bayesian feature selection takes the universal form of an Ising model. Thus, the marginal posterior probability that each feature is relevant can be efficiently computed using a mean-field approximation. Furthermore, for Generalized Linear Models we have shown that the coefficients of the Ising model can be calculated directly from correlations between the data and features.
Surprisingly, aside from some mild regularity conditions, our approach is independent of the choice of prior or likelihood. This suggests that it may be possible to obtain general results about strongly-regularizing priors. It will be interesting to further explore Bayesian inference in this new limit. Our approach also gives a practical algorithm for quickly performing feature selection in many commonly employed statistical and machine learning approaches. The methods outlined here are especially well suited to modern data sets where the number of potential features can vastly exceed the number of independent samples. We envision using the Bayesian feature selection algorithm outlined here as part of a two-stage procedure: one can use the BIA to rapidly screen irrelevant variables and reduce the complexity of the dataset before applying a more comprehensive cross-validation procedure. More generally, it will be interesting to further develop statistical physics based approaches for the analysis of complex data.
Appendix A: Formulas for Generalized Linear Models
The gradient, $g_j$, and Hessian, $M_{jk}$, of the log-likelihood have particularly simple forms for Generalized Linear Models (GLMs), which extend the exponential family of distributions. In the exponential family, we can write the distribution in the form
\[ P(\mathbf{y} | \boldsymbol{\eta}) = h(\mathbf{y}) \exp\!\big( \boldsymbol{\eta}^T \mathbf{T}(\mathbf{y}) - A(\boldsymbol{\eta}) \big), \]
where $\mathbf{T}(\mathbf{y})$ is a vector of sufficient statistics, and the $\eta_a$ are called natural parameters. Notice that for these distributions, we have that
\[ \frac{\partial A}{\partial \eta_a} = \langle T_a \rangle, \qquad \frac{\partial^2 A}{\partial \eta_a\, \partial \eta_b} = \mathrm{Cov}(T_a, T_b), \]
where Cov denotes the covariance (connected correlation function).
In a GLM, we restrict ourselves to distributions over scalar quantities where $T(y) = y$ and say that $\eta = \boldsymbol{\theta}^T \mathbf{x}$. Then, we can write the likelihood as
\[ P(y | \mathbf{x}, \boldsymbol{\theta}) = h(y) \exp\!\big( y\, \boldsymbol{\theta}^T \mathbf{x} - A(\boldsymbol{\theta}^T \mathbf{x}) \big). \]
If we have $n$ independent data points $(y^{(i)}, \mathbf{x}^{(i)})$ with $i = 1, \ldots, n$, then we can write the log-likelihood for such a distribution as
\[ \ell(\boldsymbol{\theta}) = \sum_{i=1}^{n} \big[ y^{(i)}\, \boldsymbol{\theta}^T \mathbf{x}^{(i)} - A(\boldsymbol{\theta}^T \mathbf{x}^{(i)}) \big] + \mathrm{const}. \]
Using the expressions above for the exponential family and (6), we have that
\[ g_j = \sum_{i} x_j^{(i)} \big( y^{(i)} - \langle y \rangle_{\theta_0} \big), \]
where $\langle y \rangle_{\theta_0} = A'(\theta_0)$ is the expectation value of $y^{(i)}$ for the choice of parameter $\theta_0$. If we choose $\theta_0$ to reproduce the empirical probability $\bar{y} = n^{-1} \sum_i y^{(i)}$ we get:
\[ g_j = \sum_{i} x_j^{(i)} \big( y^{(i)} - \bar{y} \big). \]
Moreover, the entries of the Hessian are given by:
\[ M_{jk} = -A''(\theta_0) \sum_{i} x_j^{(i)} x_k^{(i)} = -\mathrm{Var}_{\theta_0}(y) \sum_{i} x_j^{(i)} x_k^{(i)}. \]
If we consider standardized variables, $\sum_i x_j^{(i)} = 0$ and $n^{-1} \sum_i (x_j^{(i)})^2 = 1$, then we can write:
\[ g_j = n\, \widehat{\mathrm{Cov}}(x_j, y), \qquad M_{jk} = -n\, \mathrm{Var}_{\theta_0}(y)\, \hat{\rho}_{jk}, \]
where $\widehat{\mathrm{Cov}}(x_j, y)$ is the empirical covariance of $x_j$ and $y$, and $\hat{\rho}_{jk}$ is the empirical correlation of $x_j$ and $x_k$.
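The same formulas can be checked on another member of the family. The sketch below (hypothetical toy data) treats a Poisson GLM, where $A(\eta) = e^{\eta}$, and verifies numerically that the gradient and Hessian at $\theta_j = 0$ for $j \geq 1$, with $\theta_0 = \ln \bar{y}$, reduce to the stated correlation forms:

```python
import math

def poisson_loglik(theta, X, y):
    # Poisson GLM log-likelihood (dropping the theta-independent ln h(y) term):
    # sum_i [ y_i * eta_i - exp(eta_i) ],  with A(eta) = exp(eta)
    total = 0.0
    for xi, yi in zip(X, y):
        eta = sum(t * x for t, x in zip(theta, xi))
        total += yi * eta - math.exp(eta)
    return total

# Hypothetical toy data; x_0 = 1 is the constant feature.
X = [[1.0, 0.4], [1.0, -0.9], [1.0, 1.3], [1.0, 0.1]]
y = [2, 0, 3, 1]
ybar = sum(y) / len(y)
theta0 = math.log(ybar)        # A'(theta0) = exp(theta0) = ybar

# GLM predictions at theta = (theta0, 0):
# g_j = sum_i x_j (y_i - ybar),  M_jk = -A''(theta0) sum_i x_j x_k
g1 = sum(xi[1] * (yi - ybar) for xi, yi in zip(X, y))
M11 = -ybar * sum(xi[1] ** 2 for xi in X)

def fd(j, d=1e-6):
    # Central finite difference of the log-likelihood along theta_j
    tp = [theta0, 0.0]; tm = [theta0, 0.0]
    tp[j] += d; tm[j] -= d
    return (poisson_loglik(tp, X, y) - poisson_loglik(tm, X, y)) / (2.0 * d)
```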
- Bishop et al. (2006) C. M. Bishop, Pattern Recognition and Machine Learning (Springer, New York, 2006).
- Mackay (2003) D. J. C. MacKay, Information Theory, Inference and Learning Algorithms (Cambridge University Press, Cambridge, UK, 2003).
- Tibshirani (1996) R. Tibshirani, Journal of the Royal Statistical Society, Series B (Methodological) 58, 267 (1996).
- Zou and Hastie (2005) H. Zou and T. Hastie, Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67, 301 (2005).
- O’Hagan et al. (2004) A. O’Hagan and J. Forster, Kendall’s Advanced Theory of Statistics, Vol. 2B: Bayesian Inference (Arnold, London, 2004).
- MacKay (1992) D. J. C. MacKay, Neural Computation 4, 415 (1992).
- Fisher and Mehta (2014) C. K. Fisher and P. Mehta, arXiv preprint arXiv:1407.8187 (2014).
- Kinney and Atwal (2014) J. B. Kinney and G. S. Atwal, Neural Computation 26, 637 (2014).
- Balasubramanian (1997) V. Balasubramanian, Neural Computation 9, 349 (1997).
- Nemenman and Bialek (2002) I. Nemenman and W. Bialek, Physical Review E 65, 026137 (2002).
- Opper and Winther (2001) M. Opper and O. Winther, in Advanced Mean Field Methods: Theory and Practice (MIT Press, Cambridge, MA, 2001).