
Feature Selection via Mutual Information: New Theoretical Insights

by Mario Beraha, et al.

Mutual information has been successfully adopted in filter feature-selection methods to assess both the relevance of a subset of features for predicting the target variable and its redundancy with respect to other variables. However, existing algorithms are mostly heuristic and offer no guarantees on the proposed solution. In this paper, we provide novel theoretical results showing that conditional mutual information naturally arises when bounding the ideal regression/classification errors achieved by different subsets of features. Leveraging these insights, we propose a novel stopping condition for backward and forward greedy methods that ensures the ideal prediction error using the selected feature subset remains bounded by a user-specified threshold. We provide numerical simulations to support our theoretical claims and compare with common heuristic methods.
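To make the idea concrete, here is a minimal sketch of forward greedy selection with a user-specified stopping threshold, in the spirit of the approach the abstract describes. It is not the paper's exact algorithm: conditional mutual information is approximated crudely by re-estimating mutual information against the residual of a linear fit on the already-selected features, using scikit-learn's k-NN-based MI estimator; the threshold `delta` and the toy data are purely illustrative.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression
from sklearn.linear_model import LinearRegression

def forward_select(X, y, delta=0.05, random_state=0):
    """Greedily add features until the best remaining MI score
    drops below the user-specified threshold `delta`."""
    selected, remaining = [], list(range(X.shape[1]))
    residual = y.copy()
    while remaining:
        # Estimate MI between each remaining feature and the residual.
        scores = mutual_info_regression(
            X[:, remaining], residual, random_state=random_state)
        best = int(np.argmax(scores))
        if scores[best] < delta:  # stopping condition
            break
        selected.append(remaining.pop(best))
        # Crude proxy for conditioning: remove the linear effect of
        # the selected features from the target.
        fit = LinearRegression().fit(X[:, selected], y)
        residual = y - fit.predict(X[:, selected])
    return selected

# Toy data: y depends only on the first two of six features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6))
y = 2.0 * X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=500)
sel = forward_select(X, y)
print(sel)
```

On this toy problem the loop picks the two informative features and then stops, since every remaining candidate carries (approximately) no information about the residual.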


Simple stopping criteria for information theoretic feature selection

Mutual Information-Based Unsupervised Feature Transformation for Heterogeneous Feature Subset Selection

An Adaptive Neighborhood Partition Full Conditional Mutual Information Maximization Method for Feature Selection

Active Feature Selection for the Mutual Information Criterion

A theoretical framework for evaluating forward feature selection methods based on mutual information

Greedy Search Algorithms for Unsupervised Variable Selection: A Comparative Study
