Information-theoretic Feature Selection via Tensor Decomposition and Submodularity

10/30/2020
by Magda Amiridi, et al.

Feature selection by maximizing the high-order mutual information between the selected feature vector and a target variable is the gold standard for selecting the subset of relevant features that maximizes the performance of prediction models. However, such an approach typically requires knowledge of the multivariate probability distribution of all features and the target, and involves a challenging combinatorial optimization problem. Recent work has shown that any joint Probability Mass Function (PMF) can be represented as a naive Bayes model via Canonical Polyadic (tensor-rank) Decomposition. In this paper, we introduce a low-rank tensor model of the joint PMF of all variables, together with indirect targeting, as a way of mitigating complexity and maximizing classification performance for a given number of features. Low-rank modeling of the joint PMF circumvents the curse of dimensionality by learning principal components of the joint distribution. By indirectly aiming to predict the latent variable of the naive Bayes model instead of the original target variable, the feature selection problem can be formulated as maximization of a monotone submodular function subject to a cardinality constraint, which can be tackled with a greedy algorithm that comes with performance guarantees. Numerical experiments on several standard datasets suggest that the proposed approach compares favorably with the state of the art for this important problem.
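To make the mechanics concrete, below is a minimal Python sketch (not the authors' implementation) of the two ingredients the abstract describes: a rank-R naive Bayes model of the joint PMF, which is equivalent to a rank-R CPD of the joint PMF tensor with the prior p(h) as the loading vector and the conditionals p(x_j | h) as factor matrices, and greedy maximization of the objective f(S) = I(X_S; H) subject to |S| <= k. Because the features are conditionally independent given the latent H, H(X_S | H) decomposes into a modular sum of per-feature terms while H(X_S) is submodular, so f is monotone submodular and the greedy algorithm inherits the classic (1 - 1/e) approximation guarantee. The model dimensions, the synthetic Dirichlet-drawn parameters, and the brute-force enumeration of p(x_S) are illustrative assumptions for small alphabets, not the paper's actual method of learning or evaluating the model.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic rank-R naive Bayes model (a rank-R CPD of the PMF tensor):
# R latent states, N discrete features, each over an alphabet of size K.
R, N, K = 3, 6, 2
prior = rng.dirichlet(np.ones(R))               # p(h), shape (R,)
cond = rng.dirichlet(np.ones(K), size=(N, R))   # cond[j, h] = p(x_j | h), shape (N, R, K)

def entropy(p):
    """Shannon entropy (bits) of a probability vector."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def joint_entropy(S):
    """H(X_S), by enumerating p(x_S) = sum_h p(h) * prod_{j in S} p(x_j | h)."""
    if not S:
        return 0.0
    probs = [float(np.sum(prior * np.prod([cond[j, :, xj] for j, xj in zip(S, x)], axis=0)))
             for x in itertools.product(range(K), repeat=len(S))]
    return entropy(np.array(probs))

def objective(S):
    """f(S) = I(X_S; H) = H(X_S) - sum_{j in S} H(X_j | H).
    Submodular minus modular, hence submodular; monotone since MI cannot
    decrease when variables are added."""
    cond_ent = sum(prior[h] * entropy(cond[j, h]) for j in S for h in range(R))
    return joint_entropy(S) - cond_ent

def greedy_select(k):
    """Greedy maximization of the monotone submodular objective, |S| <= k."""
    S, remaining = [], set(range(N))
    for _ in range(k):
        best = max(remaining, key=lambda j: objective(S + [j]))
        S.append(best)
        remaining.discard(best)
    return S

print("selected features:", greedy_select(3))
```

The brute-force enumeration in joint_entropy is exponential in |S| and only serves to keep the sketch self-contained; it illustrates why the low-rank structure matters, since p(x_S) factors through only R latent components regardless of how many features are involved.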
