On the Effect of Suboptimal Estimation of Mutual Information in Feature Selection and Classification

04/30/2018
by   Kiran Karra, et al.

This paper introduces a new property of estimators of the strength of statistical association, which helps characterize how well an estimator will perform in scenarios where dependencies between continuous and discrete random variables need to be rank ordered. The new property, termed the estimator response curve, is easily computable and provides a marginal-distribution-agnostic way to assess an estimator's performance. It overcomes notable drawbacks of current assessment metrics, including statistical power, bias, and consistency. We use the estimator response curve to test various measures of the strength of association that satisfy the data processing inequality (DPI), and show that the CIM estimator compares favorably to the kNN, vME, AP, and H_MI estimators of mutual information. The estimators identified as suboptimal according to the estimator response curve perform worse than the better-performing estimators when tested with real-world data from four different areas of science, with varying dimensionalities and sizes.
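As a rough illustration of the data processing inequality (DPI) that the abstract refers to, the sketch below computes exact mutual information for a toy Markov chain X -> Y -> Z of binary variables. The chain, the flip probabilities, and the helper names `mutual_information` and `flip_channel` are assumptions made for this example only; they are not taken from the paper, and the example does not implement the estimator response curve or any of the estimators named above.

```python
import math
from itertools import product


def mutual_information(joint):
    """Exact mutual information (in bits) from a joint pmf {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(
        p * math.log2(p / (pa[a] * pb[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


def flip_channel(eps):
    """Binary symmetric channel: P(out | in) with flip probability eps."""
    return {
        (i, o): (eps if i != o else 1.0 - eps)
        for i, o in product((0, 1), repeat=2)
    }


# Hypothetical toy Markov chain X -> Y -> Z (an assumption, not from the paper).
p_x = {0: 0.5, 1: 0.5}
xy = flip_channel(0.1)  # Y is a noisy copy of X
yz = flip_channel(0.1)  # Z is a noisy copy of Y

# Joint pmf of (X, Y), and of (X, Z) obtained by summing over Y.
joint_xy = {(x, y): p_x[x] * xy[(x, y)] for x, y in product((0, 1), repeat=2)}
joint_xz = {
    (x, z): sum(p_x[x] * xy[(x, y)] * yz[(y, z)] for y in (0, 1))
    for x, z in product((0, 1), repeat=2)
}

mi_xy = mutual_information(joint_xy)
mi_xz = mutual_information(joint_xz)

# The DPI requires I(X;Y) >= I(X;Z) for this chain.
print(f"I(X;Y) = {mi_xy:.4f} bits, I(X;Z) = {mi_xz:.4f} bits")
```

Because Z depends on X only through Y, the DPI guarantees that I(X;Y) is at least I(X;Z). A measure of association used to rank order dependencies would be expected to preserve that ordering, which is the kind of behavior the abstract says it examines for estimators working from finite samples.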


