Measuring Dependence with Matrix-based Entropy Functional

01/25/2021
by Shujian Yu, et al.

Measuring the dependence of data plays a central role in statistics and machine learning. In this work, we summarize and generalize the main idea behind existing information-theoretic dependence measures into a higher-level perspective via Shearer's inequality. Based on this generalization, we propose two measures, the matrix-based normalized total correlation (T_α^*) and the matrix-based normalized dual total correlation (D_α^*), to quantify the dependence of multiple variables in arbitrary dimensional spaces, without explicit estimation of the underlying data distributions. We show that our measures are differentiable and statistically more powerful than prevalent ones. We also demonstrate their utility, advantages, and implications in four different machine learning problems: gene regulatory network inference, robust machine learning under covariate shift and non-Gaussian noise, subspace outlier detection, and understanding the learning dynamics of convolutional neural networks (CNNs). Code for our dependence measures is available at: https://bit.ly/AAAI-dependence
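As a concrete starting point, below is a minimal Python sketch of the matrix-based Rényi's α-order entropy functional and an (unnormalized) total correlation built from it. It assumes trace-normalized RBF Gram matrices with a fixed kernel width and uses their Hadamard product for the joint entropy; it omits the specific normalization that defines T_α^*, and the names gram_rbf, renyi_entropy, and total_correlation, as well as the bandwidth choice, are illustrative assumptions rather than the authors' reference implementation (linked above).

import numpy as np

def gram_rbf(x, sigma=1.0):
    # Trace-normalized RBF Gram matrix A (x has shape n_samples x n_features).
    sq = np.sum(x ** 2, axis=1, keepdims=True)
    dist2 = sq + sq.T - 2.0 * x @ x.T
    k = np.exp(-dist2 / (2.0 * sigma ** 2))
    return k / np.trace(k)

def renyi_entropy(a, alpha=1.01):
    # Matrix-based Rényi's alpha-order entropy: log2(sum_i lambda_i(A)^alpha) / (1 - alpha).
    lam = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # guard against tiny negative eigenvalues
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def joint_entropy(mats, alpha=1.01):
    # Joint entropy from the Hadamard (element-wise) product of normalized Gram matrices.
    prod = mats[0].copy()
    for a in mats[1:]:
        prod = prod * a
    return renyi_entropy(prod / np.trace(prod), alpha)

def total_correlation(variables, alpha=1.01, sigma=1.0):
    # Unnormalized total correlation: sum_i S_alpha(A_i) - S_alpha(A_1 ∘ ... ∘ A_L);
    # larger values indicate stronger dependence among the variables.
    mats = [gram_rbf(v, sigma) for v in variables]
    return sum(renyi_entropy(a, alpha) for a in mats) - joint_entropy(mats, alpha)

# Toy usage: a strongly dependent pair versus an independent pair.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 1))
y = x + 0.1 * rng.standard_normal((200, 1))
z = rng.standard_normal((200, 1))
print(total_correlation([x, y]), total_correlation([x, z]))

In this toy run, the dependent pair (x, y) should yield a noticeably larger value than the independent pair (x, z), since the measure vanishes only when the variables carry no shared information under the chosen kernel.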

Related research

05/05/2020 · Measuring the Discrepancy between Conditional Distributions: Methods, Properties and Applications
We propose a simple yet powerful test statistic to quantify the discrepa...

01/30/2023 · Copula-based dependence measures for arbitrary data
In this article, we define extensions of copula-based dependence measure...

08/16/2022 · Measuring Statistical Dependencies via Maximum Norm and Characteristic Functions
In this paper, we focus on the problem of statistical dependence estimat...

06/27/2017 · Unsupervised Learning via Total Correlation Explanation
Learning by children and animals occurs effortlessly and largely without...

09/27/2018 · Statistical dependence: Beyond Pearson's ρ
Pearson's ρ is the most used measure of statistical dependence. It gives...

11/30/2022 · Robust and Fast Measure of Information via Low-rank Representation
The matrix-based Rényi's entropy allows us to directly quantify informat...

06/21/2016 · An artificial neural network to find correlation patterns in an arbitrary number of variables
Methods to find correlation among variables are of interest to many disc...
