Sobolev Independence Criterion

10/31/2019
by Youssef Mroueh et al.

We propose the Sobolev Independence Criterion (SIC), an interpretable dependency measure between a high-dimensional random variable X and a response variable Y. SIC decomposes into a sum of feature importance scores and hence can be used for nonlinear feature selection. SIC can be seen as a gradient-regularized Integral Probability Metric (IPM) between the joint distribution of the two random variables and the product of their marginals. We use sparsity-inducing gradient penalties to promote input sparsity of the critic of the IPM. In the kernel version, we show that SIC can be cast as a convex optimization problem by introducing auxiliary variables that play an important role in feature selection, as they are normalized feature importance scores. We then present a neural version of SIC in which the critic is parameterized as a homogeneous neural network, improving both its representation power and its interpretability. We validate SIC for feature selection in synthetic and real-world experiments, and show that it enables reliable and interpretable discoveries when used in conjunction with the holdout randomization test and knockoffs to control the False Discovery Rate. Code is available at http://github.com/ibm/sic.
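The abstract mentions that SIC decomposes into normalized feature importance scores derived from gradients of the critic. The sketch below is not the authors' implementation (see the linked repository for that); it only illustrates, under simplifying assumptions, how such scores can be computed for a black-box critic: the importance of feature j is the expected squared partial derivative of the critic with respect to x_j, normalized to sum to one. The function `grad_importance` and the toy critic are hypothetical, and central finite differences stand in for automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_importance(critic, X, eps=1e-4):
    """Hypothetical sketch: eta_j = E[(df/dx_j)^2] / sum_k E[(df/dx_k)^2],
    with the partial derivatives estimated by central finite differences."""
    n, d = X.shape
    sq = np.zeros(d)
    for j in range(d):
        e = np.zeros(d)
        e[j] = eps
        # Finite-difference estimate of df/dx_j at every sample.
        g = (critic(X + e) - critic(X - e)) / (2 * eps)
        sq[j] = np.mean(g ** 2)
    return sq / sq.sum()

# Toy critic: depends strongly on feature 0, weakly on feature 1,
# and not at all on features 2-4.
critic = lambda X: np.sin(X[:, 0]) + 0.1 * X[:, 1]

X = rng.normal(size=(2000, 5))
eta = grad_importance(critic, X)  # importance mass concentrates on feature 0
```

In the paper's setting the critic is trained to maximize the IPM objective between the joint and the product of marginals, with a sparsity-inducing gradient penalty; here a fixed toy critic is used only to show the scoring step.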


Related research

- 08/05/2022: Feature Selection for Machine Learning Algorithms that Bounds False Positive Rate
  The problem of selecting a handful of truly relevant variables in superv...
- 05/11/2021: Comparing interpretability and explainability for feature selection
  A common approach for feature selection is to examine the variable impor...
- 10/20/2021: PPFS: Predictive Permutation Feature Selection
  We propose Predictive Permutation Feature Selection (PPFS), a novel wrap...
- 11/10/2014: N^3LARS: Minimum Redundancy Maximum Relevance Feature Selection for Large and High-dimensional Data
  We propose a feature selection method that finds non-redundant features ...
- 06/18/2012: Copula-based Kernel Dependency Measures
  The paper presents a new copula based method for measuring dependence be...
- 11/01/2018: The Holdout Randomization Test: Principled and Easy Black Box Feature Selection
  We consider the problem of feature selection using black box predictive ...
- 07/11/2023: Learning Active Subspaces and Discovering Important Features with Gaussian Radial Basis Functions Neural Networks
  Providing a model that achieves a strong predictive performance and at t...
