
Sequential Attention for Feature Selection

Feature selection is the problem of selecting a subset of features for a machine learning model that maximizes model quality subject to a resource budget constraint. For neural networks, prior methods, including those based on ℓ_1 regularization, attention, and stochastic gates, typically select all of the features in one evaluation round, ignoring the residual value of the features during selection (i.e., the marginal contribution of a feature conditioned on the previously selected features). We propose a feature selection algorithm called Sequential Attention that achieves state-of-the-art empirical results for neural networks. This algorithm is based on an efficient implementation of greedy forward selection and uses attention weights at each step as a proxy for marginal feature importance. We provide theoretical insights into our Sequential Attention algorithm for linear regression models by showing that an adaptation to this setting is equivalent to the classical Orthogonal Matching Pursuit algorithm [PRK1993], and thus inherits all of its provable guarantees. Lastly, our theoretical and empirical analyses provide new explanations for the effectiveness of attention and its connections to overparameterization, which might be of independent interest.





1 Introduction

Feature selection is a classic problem in machine learning and statistics where one is asked to find a subset of features from a larger set of features, such that the prediction quality of the model trained using the subset of features is approximately as good as training on all features. Finding such a feature subset is desirable for a number of reasons: improving model interpretability, reducing inference latency, decreasing model size, and regularizing models by removing redundant or noisy features to reduce overfitting and improve generalization. We direct the reader to LCWMTTL2017 for a comprehensive survey on the role of feature selection in machine learning.

The widespread success of deep learning has prompted an intense study of feature selection algorithms for neural networks, especially in the supervised setting. While many methods have been proposed, we concentrate on a particular line of work centered around using attention for feature selection. The attention mechanism in machine learning roughly refers to applying a trainable softmax mask to the input of a layer. This allows the model to “focus on” important parts of the input during training. Attention has recently led to major breakthroughs in computer vision, natural language processing, and several other areas of machine learning (VSPUJGKP2017). For feature selection, the works of WZSZ2014; GGH2019; SDLP2020; WC2020; LLY2021 all present new approaches for feature attribution, ranking, and selection that are inspired by attention.
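At its core, the masking operation described above is just an elementwise product between the input and a softmax over trainable logits. A minimal NumPy sketch of such a feature mask (an illustration only, not the implementation used in any of the cited works):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_mask(x, theta):
    """Apply a trainable softmax mask to an input vector.

    x:     (d,) input features
    theta: (d,) trainable attention logits
    Returns the masked input x * softmax(theta).
    """
    return x * softmax(theta)

# With uniform logits, the mask scales every feature by 1/d.
x = np.array([1.0, 2.0, 3.0, 4.0])
theta = np.zeros(4)
print(attention_mask(x, theta))  # each entry scaled by 0.25
```

During training, gradients flow into `theta`, so the mask learns to concentrate on the parts of the input that reduce the loss.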

One problem with naively using attention for feature selection is that it can ignore the residual values of features, i.e., the marginal contribution a feature has on the loss conditioned on the presence of the previously-selected features. This can lead to several problems such as selecting redundant features or ignoring features that are uninformative in isolation but valuable in the presence of others.

This work introduces the Sequential Attention algorithm for supervised feature selection. Our algorithm addresses the shortcomings above by using attention-based selection adaptively over multiple rounds. Sequential Attention simplifies earlier attention-based feature selection algorithms by directly training one global feature mask instead of aggregating many instance-wise feature masks that are the outputs of different subnetworks. This technique reduces the overhead of our algorithm, removes the need to tune unnecessary hyperparameters, works directly with any model architecture, facilitates a streaming implementation, and gives higher-quality estimates for the marginal gains in prediction quality for the unselected features. Empirically, Sequential Attention achieves state-of-the-art feature selection results for neural networks on standard benchmarks.

Figure 1: Sequential Attention applied to a model. At each step, the selected features are used as direct inputs to the model, and each unselected feature is downscaled by its corresponding softmax attention weight, computed from the vector of learned attention weights.

Sequential Attention.

Our starting point for Sequential Attention is the well-known greedy forward selection algorithm, which iteratively selects the feature with the largest marginal improvement in model loss when added to the set of currently selected features (see, e.g., DK2011; EKDN2018). Greedy forward selection is known to select high-quality features, but selecting k out of n features requires training O(nk) models (one per candidate feature per round) and is therefore impractical for many modern machine learning problems. To reduce this cost, one natural idea is to train only k models, where the model trained in each step approximates the marginal gains of all unselected features. These approximate marginal gains are used as the criteria for greedy selection. Said another way, we relax the greedy algorithm to fractionally consider all feature candidates simultaneously rather than computing their exact marginal gains one by one with separate model trainings. We implement this idea by introducing a new set of trainable variables that represent feature importance, or attention weights. Then, in each step, we select the feature with maximum importance score and add it to the selected set. To ensure that the score-augmented models in each step (1) have differentiable architectures and (2) are encouraged to hone in on the best unselected feature, we take the softmax of the importance weights and multiply each input feature value by its corresponding softmax value, as illustrated in Figure 1.

Formally, given a dataset represented as a matrix X with n rows of examples and d feature columns, suppose we want to select k features. Let f be a differentiable model, e.g., a neural network, that outputs the predictions f(X). Let y be the labels, L(y, f(X)) be the loss between the model’s predictions and the labels, and ⊙ be the Hadamard product. Sequential Attention outputs a set S of k feature indices, and is presented below in Algorithm 1.

1:function SequentialAttention(dataset X, labels y, model f, loss L, size k)
2:     Initialize S ← ∅
3:     for t = 1 to k do
4:         Let θ* ← argmin_θ L(y, f(X ⊙ s(θ))), where s(θ)_i = softmax(θ)_i for i ∉ S and s(θ)_i = 1 for i ∈ S
5:         Set i* ← unselected feature with largest attention weight, i.e., i* ← argmax_{i ∉ S} θ*_i
6:         Update S ← S ∪ {i*}
7:     return S
Algorithm 1 Sequential Attention for feature selection.
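To make Algorithm 1 concrete, here is a self-contained sketch for the special case of a linear model trained by plain gradient descent (our illustrative implementation under these simplifying assumptions, not the paper's code; the same scheme applies to neural networks with standard optimizers):

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    e = np.exp(z - z.max())
    return e / e.sum()

def sequential_attention(X, y, k, steps=2000, lr=0.5):
    """Sequential Attention for a linear model trained by gradient descent.

    Each round jointly trains linear weights `beta` and attention logits
    `theta`; the mask is softmax(theta) on unselected features and 1 on
    selected ones. The unselected feature with the largest logit is then
    added to the selected set S.
    """
    n, d = X.shape
    S = []
    for _ in range(k):
        U = [i for i in range(d) if i not in S]    # unselected features
        theta = np.zeros(len(U))
        beta = np.zeros(d)
        for _ in range(steps):
            m = np.ones(d)
            s = softmax(theta)
            m[U] = s                               # mask: softmax on U, 1 on S
            w = m * beta
            g_w = (2.0 / n) * (X.T @ (X @ w - y))  # dLoss/dw
            g_beta = g_w * m
            g_mU = (g_w * beta)[U]                 # dLoss/dmask on U
            g_theta = s * (g_mU - np.dot(g_mU, s)) # softmax Jacobian, chain rule
            beta -= lr * g_beta
            theta -= lr * g_theta
        S.append(U[int(np.argmax(theta))])
    return S
```

For example, on data where y depends only on features 0 and 1, two rounds select exactly those two features.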

1.1 Our contributions

Theoretical guarantees.

We give provable guarantees for Sequential Attention for least squares linear regression by analyzing a variant of the algorithm called regularized linear Sequential Attention.

This variant (1) uses Hadamard product overparameterization directly between the attention weights and feature values, without normalizing the attention weights via the softmax, and (2) adds ℓ2 regularization to the objective, hence the “linear” and “regularized” qualifiers. Note that ℓ2 regularization, or weight decay, is common practice when using gradient-based optimizers (tibshirani2021equivalences). We give theoretical and empirical evidence that replacing the softmax by different overparameterization schemes leads to similar results (Section 4.2) while offering a more tractable analysis. In particular, our main result shows that regularized linear Sequential Attention has the same provable guarantees as the celebrated Orthogonal Matching Pursuit (OMP) algorithm of PRK1993 for sparse linear regression, without making any assumptions on the design matrix or response vector.

For linear regression, regularized linear Sequential Attention is equivalent to OMP.

We prove this equivalence using a novel two-step argument. First, we show that regularized linear Sequential Attention is equivalent to a greedy version of LASSO (Tib1996), which LC2014 call Sequential LASSO. However, prior to our work, Sequential LASSO was only analyzed in a restricted “sparse signal plus noise” setting, offering limited insight into its success in practice. Second, we prove that Sequential LASSO is equivalent to OMP in the fully general setting for linear regression by analyzing the geometry of the associated polyhedra. This ultimately allows us to transfer the guarantees of OMP to Sequential Attention.

For linear regression, Sequential LASSO (LC2014) is equivalent to OMP.

We present the full argument for our results in Section 3. This analysis takes significant steps towards explaining the success of attention in feature selection and the various theoretical phenomena at play.

Towards understanding attention.

An important property of OMP is that it provably approximates the marginal gains of features: DK2011 showed that for any subset of features, the gradient of the least squares loss at its minimizer approximates the marginal gains up to a factor that depends on the sparse condition numbers of the design matrix. This suggests that Sequential Attention could also approximate some notion of the marginal gains for more sophisticated models when selecting the next-best feature. We observe this phenomenon empirically in our marginal gain experiments in Section 4.2. These results also help refine the widely-assumed conjecture that attention weights correlate with feature importance by specifying an exact measure of “importance” at play. Since countless definitions of feature importance are used in practice, it is important to determine which one best explains how the attention mechanism works.

Connections to overparameterization.

In our analysis of regularized linear Sequential Attention for linear regression, we do not use the presence of the softmax in the attention mechanism—rather, the crucial ingredient in our analysis is the Hadamard product parameterization of the learned weights. We conjecture that the empirical success of attention-based feature selection is primarily due to this explicit overparameterization. Indeed, our experiments in Section 4.2 support this claim by showing that if we substitute the softmax in Sequential Attention with a number of different (normalized) overparameterized expressions, we achieve nearly identical performance. This line of reasoning is also supported by the recent work of YHWL2021, who claim that attention largely owes its success to the “smoother and stable [loss] landscapes” that Hadamard product overparameterization induces.

1.2 Related work

We discuss recent advances in supervised feature selection for deep neural networks (DNNs), as these works are the most relevant for our empirical results.

The group LASSO has been used in deep learning to achieve structured sparsity by pruning neurons, and even filters or channels in convolutional neural networks (LL2016; WWWCL2016; LKDSG2017). For feature selection in particular, the LASSO and group LASSO were applied to neural networks in ZHW2015; LCW2016; SCHU2017; LRAT2021.

The LASSO is the most widely-used method for relaxing the sparsity constraint in feature selection, but several recent works have proposed new relaxations based on stochastic gates (SSB2017; LWK2018; ABZ2019; TP2020; YLNK2020). This approach introduces (learnable) Bernoulli random variables for each feature during model training and minimizes the expected loss over realizations of the 0-1 variables (accepting or rejecting features).

There are several other recently suggested approaches to feature selection for DNNs. RMM2015 use the magnitudes of weights in the first hidden layer to select features. LFLN2018 proposed the DeepPINK architecture, extending the idea of knockoffs (BC2015) to neural networks. Here, each feature is paired with a “knockoff” version that competes with the original feature; if the knockoff wins, the feature is removed. BHK2019 introduced the CancelOut DNN layer, which suppresses irrelevant features via independent per-feature activation functions, i.e., sigmoids, that act as (soft) bitmasks.

In contrast to the aforementioned works, combinatorial optimization is rich with sequential algorithms that are applied in machine learning (zadeh2017scalable; fahrbach2019submodular; fahrbach2019non; chen2021feature; HSL2022; Bil2022). In fact, most influential feature selection algorithms from this literature are sequential, e.g., greedy forward and backward selection (YS2018; DJGEC2022), Orthogonal Matching Pursuit (PRK1993), and information-theoretic methods (Fle2004; bennasar2015feature). However, these methods are often not tailored to neural networks, and suffer in quality, efficiency, or both.

Finally, we study global feature selection, i.e., selecting the same subset of important features across all training examples, but there are also many important works that consider local (or instance-wise) feature selection. This problem is often categorized as model interpretability, and is also known as computing feature attribution or saliency maps. Instance-wise feature selection has been explored using a variety of techniques, including gradients (STKVW2017; STY2017; SF2019), attention (AP2021; YHWL2021), mutual information (CSWJ2018), and Shapley values from cooperative game theory.
2 Preliminaries

Before discussing our theoretical guarantees for Sequential Attention in Section 3, we present several known results about feature selection for linear regression, also called sparse linear regression. Recall that in the least squares linear regression problem, we have the objective

    min_β ‖Aβ − b‖₂².     (2)

We work in the most challenging setting for obtaining relative error guarantees for this objective by making no distributional assumptions on A, i.e., we seek a k-sparse β̃ such that

    ‖Aβ̃ − b‖₂² ≤ C · min_{‖β‖₀ ≤ k} ‖Aβ − b‖₂²     (3)

for some factor C ≥ 1, where A is not assumed to follow any particular input distribution. This is far more applicable in practice than, e.g., assuming the entries of A are i.i.d. Gaussian. In large-scale applications, the number of examples often greatly exceeds the number of features, resulting in an optimal loss that is nonzero. Therefore, we focus on the overdetermined regime and refer to (PSZ2022) for an excellent discussion on the long history of this problem.


Let A denote the design matrix with unit columns and let b denote the label vector, also assumed to be a unit vector. (These assumptions are without loss of generality by scaling.) For a subset S of features, let A_S denote the matrix consisting of the columns of A indexed by S. For singleton sets {i}, we simply write A_i for A_{{i}}. Let P_S = A_S A_S⁺ denote the projection matrix onto the column span of A_S, where A_S⁺ denotes the pseudoinverse of A_S. Let P_S^⊥ = I − P_S denote the projection matrix onto the orthogonal complement of the column span of A_S.
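In code, these projections are only a few lines (a sketch for intuition; `np.linalg.pinv` plays the role of the pseudoinverse):

```python
import numpy as np

def projector(A, S):
    """Orthogonal projector onto the column span of the columns of A in S."""
    A_S = A[:, sorted(S)]
    return A_S @ np.linalg.pinv(A_S)

def projector_perp(A, S):
    """Projector onto the orthogonal complement of that column span."""
    return np.eye(A.shape[0]) - projector(A, S)

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
P = projector(A, {0, 2})
assert np.allclose(P @ P, P)              # projectors are idempotent
assert np.allclose(P, P.T)                # and symmetric
assert np.allclose(P @ A[:, 0], A[:, 0])  # columns in S are fixed points
```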

Feature selection algorithms for linear regression.

Perhaps the most natural algorithm for sparse linear regression is greedy forward selection, which was shown to have guarantees of the form of (3) in the breakthrough works of DK2011; EKDN2018, with a dependence on the sparse condition numbers of A, i.e., the spectrum of A restricted to subsets of its columns. Greedy forward selection can be expensive in practice, but these works also prove analogous guarantees for the more efficient Orthogonal Matching Pursuit algorithm, which we present formally in Algorithm 2.

1:function OMP(design matrix A, response b, size constraint k)
2:     Initialize S ← ∅
3:     for t = 1 to k do
4:         Set r ← P_S^⊥ b
5:         Let i* ← argmax_{i ∉ S} |⟨A_i, r⟩|, i.e., the column with maximum correlation with the residual
6:         Update S ← S ∪ {i*}
7:     return S
Algorithm 2 Orthogonal Matching Pursuit (PRK1993).
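A compact NumPy sketch of OMP (our illustration; `np.linalg.lstsq` computes the projection residual in step 4):

```python
import numpy as np

def omp(A, b, k):
    """Orthogonal Matching Pursuit: in each round, pick the column most
    correlated with the current residual, then recompute the residual of
    projecting b onto the span of all selected columns."""
    S = []
    r = b.copy()
    for _ in range(k):
        scores = np.abs(A.T @ r)
        scores[S] = -np.inf            # never re-select a chosen column
        S.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
        r = b - A[:, S] @ coef
    return S

# Noiseless sparse recovery: b is a combination of columns 3 and 7.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
A /= np.linalg.norm(A, axis=0)         # unit columns, as in the text
b = 3 * A[:, 3] + 2 * A[:, 7]
print(sorted(omp(A, b, 2)))            # [3, 7]
```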

The LASSO algorithm (Tib1996) is another popular feature selection method, which simply adds ℓ1 regularization to the objective in Equation (2). Theoretical guarantees for the LASSO are known in the underdetermined regime (DE2003; CT2006), but it is an open problem whether the LASSO has guarantees of the form of Equation (3). Sequential LASSO is a related algorithm that uses the LASSO to select features one by one. LC2014 analyzed this algorithm in a specific parameter regime, but until our work, no relative error guarantees were known in full generality (e.g., in the overdetermined regime). We present the Sequential LASSO in Algorithm 3.

1:function SequentialLASSO(design matrix A, response b, size constraint k)
2:     Initialize S ← ∅
3:     for t = 1 to k do
4:         Let β(λ) denote the optimal solution to the LASSO with ℓ1 regularization of strength λ applied only to the coordinates outside of S
5:         Set λ* ← largest λ with nonzero LASSO coefficients on the coordinates outside of S
6:         Let T ← support of β(λ*) outside of S
7:         Select any non-empty T' ⊆ T, which exists by Lemma 3.2
8:         Update S ← S ∪ T'
9:     return S
Algorithm 3 Sequential LASSO (LC2014).

Note that Sequential LASSO as stated requires a search for the optimal regularization strength λ in each step. In practice, however, λ can simply be set to a large enough value to obtain similar results, since beyond a critical value of λ, the feature ranking according to LASSO coefficients does not change (EHJT2004).
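Following that practical observation, here is a simplified sketch of Sequential LASSO using a fixed λ and an ISTA (proximal gradient) solver; it selects the unselected coordinate with the largest coefficient magnitude each round, rather than tracking the critical λ of Algorithm 3. All names and defaults are ours:

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * |.| (elementwise).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sequential_lasso(A, b, k, lam=0.1, steps=3000):
    """Select one feature per round by solving a LASSO whose l1 penalty
    is applied only to the not-yet-selected coordinates."""
    n, d = A.shape
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    S = []
    for _ in range(k):
        penalized = np.ones(d, dtype=bool)
        penalized[S] = False               # selected coordinates: no penalty
        beta = np.zeros(d)
        for _ in range(steps):
            z = beta - A.T @ (A @ beta - b) / L   # gradient step
            beta = np.where(penalized, soft_threshold(z, lam / L), z)
        cand = [i for i in range(d) if i not in S]
        S.append(max(cand, key=lambda i: abs(beta[i])))
    return S
```

On well-conditioned noiseless instances, this sketch selects the same support as OMP, consistent with the equivalence shown in Section 3.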

3 Equivalence for least squares: OMP and Sequential Attention

In this section, we show that the following algorithms are equivalent for least squares linear regression: regularized linear Sequential Attention, Sequential LASSO, and Orthogonal Matching Pursuit.

3.1 Regularized linear Sequential Attention and Sequential LASSO

We start by formalizing our modified version of Sequential Attention, for which we obtain provable guarantees.

[Regularized linear Sequential Attention] Let S be the set of currently selected features. We define the regularized linear Sequential Attention objective by removing the softmax normalization in Algorithm 1 and introducing ℓ2 regularization on the importance weights and the model parameters restricted to the unselected coordinates. That is, we consider the objective


where ⊙ denotes the Hadamard product and the ℓ2 regularization is restricted to the indices outside of S.

By a simple argument due to Hof2017 (namely, that the minimum of (u² + v²)/2 over factorizations u·v = β equals |β|), the objective function in (5) is equivalent to


It follows that attention (or more generally, overparameterization by trainable weights) can be seen as a way to implement ℓ1 regularization for least squares linear regression, i.e., the LASSO (Tib1996). This connection between overparameterization and ℓ1 regularization has also been observed in several other recent works (VKR2019; ZYH2022; tibshirani2021equivalences).
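The identity underlying this equivalence is min{(u² + v²)/2 : u·v = β} = |β|, attained at |u| = |v| = √|β|, so an ℓ2 penalty on the two factors of a Hadamard product acts as an ℓ1 penalty on their product. A quick numerical sanity check of the identity:

```python
import numpy as np

def min_l2_factorization(beta):
    """Brute-force min over u > 0 of (u^2 + (beta/u)^2) / 2, the smallest
    l2 cost of writing beta as a product u * v with u * v = beta."""
    u = np.linspace(0.01, 10.0, 100000)
    return ((u ** 2 + (beta / u) ** 2) / 2.0).min()

# The minimum matches |beta|, attained at u = v = sqrt(|beta|).
for beta in [0.25, 1.0, 2.0, 7.3]:
    assert abs(min_l2_factorization(beta) - beta) < 1e-3
```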

By this transformation and reasoning, regularized linear Sequential Attention can be seen as iteratively using the LASSO with ℓ1 regularization applied only to the unselected features—which is precisely the Sequential LASSO algorithm of (LC2014). If we instead use the softmax normalization as in (1), then this only changes the choice of regularizer, as shown in Lemma 3.1 (proof in Section A.3).

Let be the function defined by , for . Denote its range and preimage by and , respectively. Moreover, define the functions and by

Then, the following two optimization problems with respect to are equivalent:


We present contour plots of the induced penalty in Figure 2. These plots suggest that it is a concave regularizer in this regime, which would thus approximate a sparsity-inducing regularizer and yield sparse solutions (ZZ2012), as ℓ1 regularization does (Tib1996).

Figure 2: Contour plot of for at different zoom-levels of .

3.2 Sequential LASSO and OMP

This connection between Sequential Attention and Sequential LASSO gives us a new perspective about how Sequential Attention works. The only known guarantee for Sequential LASSO, to the best of our knowledge, is a statistical recovery result when the input is a sparse linear combination with Gaussian noise in the ultra high-dimensional setting (LC2014). This does not, however, fully explain why Sequential Attention is such an effective feature selection algorithm.

To bridge our main results, we prove a novel equivalence between Sequential LASSO and OMP. For the remainder of this section, let .

Let A be a design matrix with unit vector columns, and let b denote the response, also a unit vector. The Sequential LASSO algorithm maintains a set S of features such that, at each feature selection step, it selects a feature i maximizing |⟨A_i, P_S^⊥ b⟩|, where A_S is the matrix formed by the columns of A indexed by S, and P_S^⊥ is the projection matrix onto the orthogonal complement of the span of A_S.

Note that this is extremely close to saying that Sequential LASSO and OMP select the exact same set of features. The only difference appears when multiple features attain the maximum correlation with the residual. In this case, it is possible that Sequential LASSO chooses the next feature from a set of features that is strictly smaller than the set of features from which OMP chooses, so the “tie-breaking” can differ between the two algorithms. In practice, this rarely happens; for instance, if the maximizer is unique at each step, which is the case with probability 1 if random continuous noise is added to the data, then Sequential LASSO and OMP will select the exact same set of features.

It was shown in (LC2014) that Sequential LASSO is equivalent to OMP in the statistical recovery regime, i.e., when for some true sparse weight vector and i.i.d. Gaussian noise , under an ultra high-dimensional regime where the dimension  is exponential in the number of examples . We prove this equivalence in the fully general setting.

The argument below shows that Sequential LASSO and OMP are equivalent, thus establishing that regularized linear Sequential Attention and Sequential LASSO offer the same approximation guarantees as OMP.

Geometry of Sequential LASSO.

We first study the geometry of optimal solutions to Equation (4). Let be the set of currently selected features. Following work on the LASSO in (TT2011), we rewrite (4) as the following constrained optimization problem:

subject to

It can then be shown that the dual problem is equivalent to finding the projection, i.e., closest point in Euclidean distance, of onto the polyhedral section , where

and denotes the orthogonal complement of . See Appendix A.1 for the full details. The primal and dual variables are related by

Selection of features in Sequential LASSO.

Next, we analyze how Sequential LASSO selects its features. Let be the optimal solution for features restricted in . Then subtracting from both sides of (9) gives


Note that if is at least , then the projection of onto is just , so by (10),

meaning that is zero outside of . We now show that for slightly smaller than , the residual is in the span of features that maximize the correlation with .

[Projection residuals of the sequential LASSO] Let denote the projection of onto . There exists such that for all the residual lies on , for

We defer the proof of Lemma 3.2 to Appendix A.2.

By Lemma 3.2 and (10), the optimal when selecting the next feature has the following properties:

  1. if , then is equal to the -th value in the previous solution ; and

  2. if , then can be nonzero only if .

It follows that Sequential LASSO selects a feature that maximizes the correlation with the residual, just as OMP does. Thus, we have shown an equivalence between Sequential LASSO and OMP without any additional assumptions.

4 Experiments

4.1 Feature selection for neural networks

Small-scale experiments.

We investigate the performance of Sequential Attention through extensive experiments on standard feature selection benchmarks for neural networks. In these experiments, we consider six datasets used in the experiments of (LRAT2021; ABZ2019), and select features using a one-layer neural network with hidden width 67 and ReLU activation (just as in these previous works). For additional points of comparison, we implement the attention-based feature selection algorithm of (LLY2021) and the Group LASSO, which has been considered in many works that aim to sparsify neural networks, as discussed in Section 1.2. We also implement natural adaptations of Sequential LASSO and OMP for neural networks and evaluate their performance.

In Figure 3, we see that Sequential Attention is competitive with or outperforms all of these feature selection algorithms on this benchmark suite. For each algorithm, we report the mean of the prediction accuracies averaged over 5 feature selection trials. We provide additional details about the experimental setup in Section B.2, including specifications for each dataset in Table 1 and the raw mean prediction accuracies with standard deviations in Table 2. We also visualize the selected features on MNIST in Figure 6 (Appendix B.1).
Figure 3: Feature selection results on small-scale datasets. Here, SA = Sequential Attention, LLY = (LLY2021), GL = Group LASSO, SL = Sequential LASSO, and OMP = OMP.
Large-scale experiments.

To demonstrate the scalability of our algorithm, we perform large-scale feature selection experiments on the Criteo click dataset, which consists of 39 features and over three billion examples for predicting click-through rates (DiemertMeynet2017). Our results in Figure 4 show that Sequential Attention outperforms these other methods when at least 15 features are selected. In particular, these plots highlight the fact that Sequential Attention excels at finding valuable features once a few features are already in the model, and that it has substantially less variance than LASSO-based feature selection algorithms. See Appendix B.3 for further discussion.

Figure 4: AUC and log loss when selecting features for Criteo dataset.

4.2 Adaptivity, overparameterization, and connections to marginal gains


One of the key messages of this work is that adaptivity is critical for high-quality feature selection. In Figure 5 and in Appendix B.4, we empirically verify this claim by studying the quality of Sequential Attention as the number of features it selects in each iteration increases.

Figure 5: Sequential Attention with varying levels of adaptivity. We select 64 features for each model, but take a batch of features in each round, for increasing batch sizes. We plot model accuracy as a function of the batch size.
Hadamard product parameterization.

In Section 1.1, we argue that Sequential Attention has provable guarantees for least squares linear regression by showing that a version that removes the softmax and adds ℓ2 regularization results in an algorithm that is equivalent to OMP. Thus, there is a gap between the implementation of Sequential Attention in Algorithm 1 and our theoretical analysis. We empirically bridge this gap by showing that regularized linear Sequential Attention yields results that are almost indistinguishable from those of the original version. In Figure 12 (Section B.5), we compare the following Hadamard product overparameterization schemes:

  • softmax: as described in Section 1

  • : for , which captures the provable variant discussed in Section 1.1

  • : for

  • normalized: for

  • normalized: for

Further, for each of the benchmark datasets, all of these variants outperform LassoNet and the other baselines considered in (LRAT2021). See Appendix B.5 for more details.

Correlation with marginal gains.

We also investigate the relationship between attention weights and marginal gains, which measure the marginal contribution of a feature with respect to a currently selected set of features. That is, if L(S) is the loss achieved by a set of features S, then the marginal gain of a feature i with respect to S is L(S) − L(S ∪ {i}). The marginal gains provide good feature scores in practice and come with strong theoretical guarantees (DK2011; EKDN2018), but they are expensive to compute for large-scale datasets. In Appendix B.6, we empirically evaluate the correlation between the learned attention weights in Sequential Attention and the true marginal gains, showing that these scores are highly correlated.
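For least squares with an orthonormal design, this connection is exact: the marginal gain of feature i given S equals the squared correlation of column i with the projection residual. A small numerical check of that special case (our illustration; general designs match only up to condition-number factors, per DK2011):

```python
import numpy as np

def ls_loss(A, b, S):
    """Least squares loss of b regressed on the columns of A indexed by S."""
    if not S:
        return float(b @ b)
    coef, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
    return float(np.sum((b - A[:, S] @ coef) ** 2))

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((30, 8)))  # orthonormal columns
b = rng.standard_normal(30)
S = [0, 1]
resid = b - Q[:, S] @ (Q[:, S].T @ b)   # projection residual
for i in range(2, 8):
    gain = ls_loss(Q, b, S) - ls_loss(Q, b, S + [i])
    assert abs(gain - float(Q[:, i] @ resid) ** 2) < 1e-8
```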

5 Conclusion

This work introduces Sequential Attention, an adaptive attention-based feature selection algorithm designed in part for DNNs. Empirically, Sequential Attention improves significantly upon previous methods on widely-used benchmarks. Theoretically, we show that a relaxed variant of Sequential Attention is equivalent to the Sequential LASSO algorithm (LC2014). In turn, we prove a novel connection between Sequential LASSO and Orthogonal Matching Pursuit, thus transferring its provable guarantees to Sequential Attention and shedding light on our empirical results. Our analysis also provides new insight into the role of attention for feature selection via adaptivity, overparameterization, and connections to marginal gains.


Appendix A Missing proofs from Section 3

a.1 Lagrangian dual of Sequential LASSO

We will first show that the Lagrangian dual of (8) is equivalent to the following problem:

subject to

We will then use the Pythagorean theorem to replace by .

We first consider the Lagrangian dual problem:


Note that the primal problem is strictly feasible and convex, so strong duality holds (see, e.g., Section 5.2.3 of BV2004). Considering just the terms involving the variable in (12), we have that

which is minimized at as varies over . On the other hand, consider just the terms involving the variable in (12), that is,


Note that if is nonzero on any coordinate in , then (13) can be made arbitrarily negative by setting to be zero and appropriately. Similarly, if , then (13) can also be made to be arbitrarily negative. On the other hand, if and , then (13) is minimized at . This gives the dual in Equation (11).

We now show that by the Pythagorean theorem, we can project in (11) rather than . In (11), recall that is constrained to be in . Then, by the Pythagorean theorem, we have

since is orthogonal to , and both and are in . The first term in the above does not depend on and thus we may discard it. Our problem therefore reduces to projecting onto , rather than .

a.2 Proof of Lemma 3.2

Proof of Lemma 3.2.

Our approach is to reduce the projection of onto the polytope defined by to a projection onto an affine space.

We first argue that it suffices to project onto the faces of specified by . For , feature indices , and signs , we define the faces

of . Let , for to be chosen sufficiently small. Then clearly


In fact, lies on the intersection of faces for an appropriate choice of signs and . WLOG, we assume that these faces are just for . Note also that for any ,

(by the Cauchy–Schwarz inequality)

For all , for small enough, this is larger than .

Thus, for small enough, is closer to the faces for than any other face. Therefore, we set .

Now, by the complementary slackness of the KKT conditions for the projection of onto , for each face of we either have that lies on the face or that the projection does not change if we remove the face. For , note that by the above calculation, the projection cannot lie on , so is simply the projection onto

By reversing the dual problem reasoning from before, the residual of the projection onto  must lie on the column span of . ∎

a.3 Parameterization patterns and regularization

Proof of Lemma 3.1.

The optimization problem on the left-hand side of Equation (7) with respect to  is equivalent to


If we define

then the LHS of (7) and (14) are equivalent to . Re-parameterizing the minimization problem in the definition of (by setting ), we obtain . ∎

Appendix B Additional experiments

b.1 Visualization of selected features on MNIST

In Figure 6, we provide a visualization of the features selected by Sequential Attention as well as other baseline algorithms, to provide intuition on the nature of the features selected by each algorithm. Similar visualizations on MNIST can be found in works such as GGH2019; WC2020; LRAT2021; LLY2021. Notably, Sequential Attention selects highly diverse pixels owing to its sequential feature selection process. Sequential LASSO selects pixels extremely similar to those of Sequential Attention, as suggested by our theoretical analysis in Section 3. Curiously, OMP does not provide a good feature subset, demonstrating that OMP does not generalize well from least squares regression and generalized linear models to deep neural networks.

Figure 6: Visualizations of the pixels selected by the feature selection algorithms on MNIST.

b.2 Further details on small-scale experiments

We start by presenting details about each of the datasets used for neural network feature selection in ABZ2019; LRAT2021.

Dataset # Examples # Features # Classes Type
Mice 1,080 77 8 Biology
MNIST 60,000 784 10 Image
MNIST-Fashion 60,000 784 10 Image
ISOLET 7,797 617 26 Speech
COIL-20 1,440 400 20 Image
Activity 5,744 561 6 Sensor
Table 1: Benchmark datasets in ABZ2019; LRAT2021.

In Figure 3, the error bars are generated as the standard deviation over running the algorithm five times with different random seeds. The values used to generate the plot are provided in Table 2.

Mice Protein 0.993 (0.008) 0.981 (0.005) 0.985 (0.005) 0.984 (0.008) 0.994 (0.008)
MNIST 0.956 (0.002) 0.944 (0.001) 0.937 (0.003) 0.959 (0.001) 0.912 (0.004)
MNIST-Fashion 0.854 (0.003) 0.843 (0.005) 0.834 (0.004) 0.854 (0.003) 0.829 (0.008)
ISOLET 0.920 (0.006) 0.866 (0.012) 0.906 (0.006) 0.920 (0.003) 0.727 (0.026)
COIL-20 0.997 (0.001) 0.994 (0.002) 0.997 (0.004) 0.988 (0.005) 0.967 (0.014)
Activity 0.931 (0.004) 0.897 (0.025) 0.933 (0.002) 0.931 (0.003) 0.905 (0.013)
Table 2: Feature selection experimental results on small-scale datasets (see Figure 3 for a key).
Dataset Fisher HSIC-Lasso PFA LassoNet Sequential Attention
Mice Protein 0.944 0.958 0.939 0.958 0.993
MNIST 0.813 0.870 0.873 0.873 0.956
MNIST-Fashion 0.671 0.785 0.793 0.800 0.854
ISOLET 0.793 0.877 0.863 0.885 0.920
COIL-20 0.986 0.972 0.975 0.991 0.997
Activity 0.769 0.829 0.779 0.849 0.931
Table 3: Feature selection experimental results on small-scale datasets from LRAT2021.

b.2.1 Model accuracy on all features

To adjust for differences between the values reported in LRAT2021 and ours due to factors such as the implementation framework, we list in Figure 7 the accuracies obtained by training a model on all features.

Dataset All Features (LRAT2021) All Features (this paper)
Mice Protein 0.990 0.963
MNIST 0.928 0.953
MNIST-Fashion 0.833 0.869
ISOLET 0.953 0.961
COIL-20 0.996 0.986
Activity 0.853 0.954
Figure 7: Accuracy of models trained on all features.

b.2.2 The Generalization of OMP to Neural Networks

From the statement of Algorithm 2, it may not be immediately clear how OMP generalizes from a linear regression model to neural networks. To see this, first observe that OMP naturally extends to generalized linear models (GLMs) via the gradient of the link function, as shown in EKDN2018. To extend this further to neural networks, we view the neural network as a GLM for any fixing of the hidden layer weights, and use the gradient of this GLM with respect to the inputs as the feature importance scores.
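To make this concrete, the following is a minimal NumPy sketch of this GLM view for logistic regression. This is our own illustration, not the paper's implementation: `omp_glm` is a hypothetical helper, and the crude gradient-descent fit stands in for fully training the restricted model on the selected features.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def omp_glm_scores(X, y, selected, steps=500, lr=0.1):
    """Score candidate features by the input gradient of the logistic loss,
    conditioned on the already-selected features (the GLM view above).

    The gradient steps below are a crude stand-in for fully training the
    restricted model on the selected columns.
    """
    n, d = X.shape
    w = np.zeros(d)
    S = list(selected)
    if S:
        for _ in range(steps):
            p = sigmoid(X[:, S] @ w[S])
            w[S] -= lr * X[:, S].T @ (p - y) / n
    # For a GLM, the gradient of the loss with respect to the inputs is
    # driven by the residual y - g(Xw), so |X^T residual| ranks candidates.
    residual = y - sigmoid(X @ w)
    scores = np.abs(X.T @ residual)
    scores[S] = -np.inf  # never re-select a feature
    return scores

def omp_glm(X, y, k):
    """Greedy OMP-style forward selection of k features."""
    selected = []
    for _ in range(k):
        selected.append(int(np.argmax(omp_glm_scores(X, y, selected))))
    return selected
```

For a deep network, the same recipe applies with the fitted GLM replaced by the network with its hidden-layer weights frozen, scoring candidates by the loss gradient with respect to the inputs.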

b.3 Large-Scale Experiments

In this section, we give additional details and discussion on our Criteo large dataset results. In Figure 4, the error bars are generated as the standard deviation over running the algorithm three times with different random seeds. The values used to generate the plot are provided in Figures 8 and 9.

We first note that this dataset is so large that making multiple passes through it is expensive. We therefore modify all of the algorithms, both Sequential Attention and the baselines, to make only one pass through the data by using separate fractions of the data for the different “steps” of each algorithm. Hence, we select all features while “training” only one model.
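The batch routing for this one-pass scheme can be sketched as follows; this is a minimal illustration with a hypothetical helper, since the exact split is not specified above.

```python
def one_pass_phases(num_batches, num_steps):
    """Assign each batch index of a single pass over the data to a
    selection step, so that each step trains on a disjoint, contiguous
    fraction of the data. Any leftover batches go to the final step.
    """
    per_step = num_batches // num_steps
    return [min(i // per_step, num_steps - 1) for i in range(num_batches)]
```

For example, with 10 batches and 5 selection steps, each step sees 2 consecutive batches; a training loop would commit one step's features whenever the phase index changes, so the whole selection process costs a single pass.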

5 0.67232 0.63950 0.68342 0.50161 0.60278 0.67710 0.58300
(0.00015) (0.00076) (0.00585) (0.00227) (0.04473) (0.00873) (0.06360)
10 0.70167 0.69402 0.71942 0.64262 0.62263 0.70964 0.68103
(0.00060) (0.00052) (0.00059) (0.00187) (0.06097) (0.00385) (0.00137)
15 0.72659 0.72014 0.72392 0.65977 0.66203 0.72264 0.69762
(0.00036) (0.00067) (0.00027) (0.00125) (0.04319) (0.00213) (0.00654)
20 0.72997 0.72232 0.72624 0.72085 0.70252 0.72668 0.71395
(0.00066) (0.00103) (0.00330) (0.00106) (0.01985) (0.00307) (0.00467)
25 0.73281 0.72339 0.73072 0.73253 0.71764 0.73084 0.72057
(0.00030) (0.00042) (0.00193) (0.00091) (0.00987) (0.00070) (0.00444)
30 0.73420 0.72622 0.73425 0.73390 0.72267 0.72988 0.72487
(0.00046) (0.00049) (0.00081) (0.00026) (0.00663) (0.00434) (0.00223)
35 0.73495 0.73225 0.73058 0.73512 0.73029 0.73361 0.73078
(0.00040) (0.00024) (0.00350) (0.00058) (0.00509) (0.00037) (0.00102)
Figure 8: AUC of Criteo large experiments. SA is Sequential Attention, GL is generalized LASSO, and SL is Sequential LASSO. The values in the header for the LASSO methods are the regularization strengths used for each method.
5 0.14123 0.14323 0.14036 0.14519 0.14375 0.14073 0.14415
(0.00005) (0.00010) (0.00046) (0.00000) (0.00163) (0.00061) (0.00146)
10 0.13883 0.13965 0.13747 0.14339 0.14263 0.13826 0.14082
(0.00009) (0.00008) (0.00015) (0.00019) (0.00304) (0.00032) (0.00011)
15 0.13671 0.13745 0.13693 0.14227 0.14166 0.13713 0.13947
(0.00007) (0.00008) (0.00005) (0.00021) (0.00322) (0.00021) (0.00050)
20 0.13633 0.13726 0.13693 0.13718 0.13891 0.13672 0.13806
(0.00008) (0.00010) (0.00057) (0.00004) (0.00187) (0.00035) (0.00048)
25 0.13613 0.13718 0.13648 0.13604 0.13760 0.13628 0.13756
(0.00013) (0.00009) (0.00051) (0.00004) (0.00099) (0.00010) (0.00043)
30 0.13596 0.13685 0.13593 0.13594 0.13751 0.13670 0.13697
(0.00001) (0.00004) (0.00015) (0.00005) (0.00095) (0.00080) (0.00015)
35 0.13585 0.13617 0.13666 0.13580 0.13661 0.13603 0.13635
(0.00002) (0.00006) (0.00073) (0.00012) (0.00096) (0.00010) (0.00005)
Figure 9: Log loss of Criteo large experiments. SA is Sequential Attention, GL is generalized LASSO, and SL is Sequential LASSO. The values in the header for the LASSO methods are the regularization strengths used for each method.

b.4 The role of adaptivity

We show in this section the effect of varying adaptivity on the quality of the features selected by Sequential Attention. In the following experiments, we select 64 features on six datasets, selecting a fixed number of features at a time over a fixed number of training epochs. That is, we investigate the following question: for a fixed budget of training epochs, what is the best way to allocate them over the rounds of the feature selection process? For most datasets, we find that feature selection quality decreases as we select more features at once. An exception is the Mice Protein dataset, which curiously exhibits the opposite trend, perhaps indicating that its features are less redundant than those of the other datasets. Our results are summarized in Figures 5 and 10. We also illustrate the effect of adaptivity for Sequential Attention on MNIST in Figure 11: the selected pixels “clump together” as the number of features selected per round increases, indicating a greater degree of redundancy.
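The epoch-allocation scheme described above can be sketched as follows. All names here are hypothetical, and `score_fn` stands in for one round of Sequential Attention training that returns a score per feature.

```python
def adaptive_selection(num_features, per_round, epoch_budget, score_fn):
    """Split a fixed epoch budget evenly across num_features / per_round
    rounds; each round trains for its share of the budget, then commits
    `per_round` features at once. score_fn(selected, epochs) must return
    one importance score per feature index."""
    rounds = num_features // per_round
    epochs_per_round = max(1, epoch_budget // rounds)
    selected = []
    for _ in range(rounds):
        scores = score_fn(selected, epochs_per_round)
        candidates = sorted(
            (i for i in range(len(scores)) if i not in selected),
            key=lambda i: -scores[i],
        )
        selected.extend(candidates[:per_round])
    return selected
```

Selecting more features per round means fewer rounds, so each feature is chosen with less conditioning on earlier choices; the fully adaptive extreme commits one feature per round.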

Overall, our empirical results in this section suggest that adaptivity greatly enhances the quality of features selected by Sequential Attention, and more broadly, in feature selection algorithms.

Mice Protein 0.990 0.990 0.989 0.989 0.991 0.992 0.990
(0.006) (0.008) (0.006) (0.006) (0.005) (0.006) (0.007)
MNIST 0.963 0.961 0.956 0.950 0.940 0.936 0.932
(0.001) (0.001) (0.001) (0.003) (0.007) (0.001) (0.004)
MNIST-Fashion 0.860 0.856 0.852 0.852 0.847 0.849 0.843
(0.002) (0.002) (0.003) (0.004) (0.002) (0.002) (0.003)
ISOLET 0.934 0.930 0.927 0.919 0.893 0.845 0.782
(0.005) (0.003) (0.005) (0.004) (0.022) (0.021) (0.022)
COIL-20 0.998 0.997 0.999 0.998 0.995 0.972 0.988
(0.002) (0.005) (0.001) (0.003) (0.005) (0.012) (0.009)
Activity 0.938 0.934 0.928 0.930 0.915 0.898 0.913
(0.008) (0.007) (0.010) (0.008) (0.004) (0.010) (0.010)
Figure 10: Sequential Attention with varying levels of adaptivity. We select 64 features for each model, selecting an increasing number of features in each round. We show model accuracy as a function of the number of features selected per round.
Figure 11: Sequential Attention with varying levels of adaptivity on the MNIST dataset. We select 64 features for each model, selecting an increasing number of features in each round.

b.5 Variations on Hadamard product parameterization

We evaluate several variations of the Hadamard product parameterization pattern described in Section 4.2. In Figure 12, we provide the numerical values of the accuracies achieved.
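The variants compared here can be sketched as follows. The exact normalization forms are our reading of Section 4.2 and the key in Figure 13, not a verbatim reproduction of the paper's code.

```python
import numpy as np

def attention_weights(logits, variant="softmax"):
    """Hadamard-product mask variants (names follow Figure 13's key):
      softmax: exp(z) / sum(exp(z))
      l1 / l2: raw parameters used directly, with an l1 or l2 penalty
               added to the loss instead of an explicit normalization
      l1n / l2n: parameters divided by their l1 or l2 norm
    """
    z = np.asarray(logits, dtype=float)
    if variant == "softmax":
        e = np.exp(z - z.max())  # shift for numerical stability
        return e / e.sum()
    if variant == "l1n":
        return z / np.abs(z).sum()
    if variant == "l2n":
        return z / np.sqrt((z ** 2).sum())
    return z  # "l1"/"l2": raw mask; the penalty lives in the loss
```

In all variants the resulting weight vector multiplies the input features elementwise (a Hadamard product) before the first layer of the model.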

Dataset Softmax ℓ1 ℓ2 ℓ1 Norm. ℓ2 Norm.
Mice Protein 0.990 (0.006) 0.993 (0.010) 0.993 (0.010) 0.994 (0.006) 0.988 (0.008)
MNIST 0.958 (0.002) 0.957 (0.001) 0.958 (0.002) 0.958 (0.001) 0.957 (0.001)
MNIST-Fashion 0.850 (0.002) 0.843 (0.004) 0.850 (0.003) 0.853 (0.001) 0.852 (0.002)
ISOLET 0.920 (0.003) 0.894 (0.014) 0.908 (0.009) 0.921 (0.003) 0.921 (0.003)
COIL-20 0.997 (0.004) 0.997 (0.004) 0.995 (0.006) 0.996 (0.005) 0.996 (0.004)
Activity 0.922 (0.005) 0.906 (0.015) 0.908 (0.012) 0.933 (0.010) 0.935 (0.007)
Figure 12: Accuracies achieved by Sequential Attention with various Hadamard product parameterization variations.
Figure 13: Accuracies achieved by Sequential Attention with various Hadamard product parameterization variations. Here, SM = softmax, L1 = ℓ1, L2 = ℓ2, L1N = ℓ1 normalized, L2N = ℓ2 normalized.

b.6 Approximation of marginal gains

In Figure 14, we present experimental results showing the correlation between the true marginal gains and our computed sequential attention weights. In this experiment, we first compute the top features selected by Sequential Attention for . We then compute the marginal gains as well as the attention weights according to Sequential Attention conditioned on these features, and compare the two sets of scores. The marginal gains are computed by explicitly training a model for each candidate feature that could be added to the preselected features. In the first and second rows of Figure 14, we see that the top 50 pixels according to the marginal gains and the attention weights are visually similar: both avoid previously selected regions and find new areas that have become important. In the third row, we quantify the similarity via the Spearman correlation between the two feature rankings. The correlations degrade as we select more features, but this is expected: once the most important features have been removed, the marginal gains of the remaining features become similar.
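The rank statistic used in the third row is the standard Spearman correlation, which can be computed as follows (a minimal sketch assuming no ties among the scores; with ties, average ranks should be used instead).

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation: the Pearson correlation of the ranks.
    Here it compares the marginal-gain scores with the attention weights
    over the candidate features. The double argsort converts scores to
    ranks, which is valid when there are no ties."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))
```

A value near 1 indicates that the attention weights order the candidate features almost identically to the explicitly computed marginal gains.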

Figure 14: Marginal gain experiments. The first and second rows show the top 50 features according to marginal gain and Sequential Attention, respectively. The third row shows the Spearman correlation between the two sets of scores.