A geometric view on Pearson's correlation coefficient and a generalization of it to non-linear dependencies

04/21/2018
by Priyantha Wijayatunga, et al.

Measuring the strength of statistical dependence between two random variables is a common problem in many domains. Pearson's correlation coefficient ρ is an accurate measure of linear dependence. We show that ρ is a normalized, Euclidean-type distance between the joint probability distribution of the two random variables and the distribution obtained by assuming their independence while keeping their marginal distributions. The normalizing constant is the geometric mean of two maximal distances, each between the joint distribution under full linear dependence, with the respective marginal distribution preserved, and the distribution under independence. The use of ρ is restricted to linear dependence because it is based on Euclidean-type distances, which are generally not metrics, and because the full dependence it considers is linear. We therefore argue that if a suitable distance metric is used and all possible maximal dependences are considered, the resulting measure can capture any non-linear dependence. This, however, requires defining all the full dependences. The Hellinger distance, which is a metric, can be used as the distance between probability distributions to obtain a generalization of ρ for the discrete case.
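
The contrast drawn in the abstract can be illustrated with a minimal Python sketch for the discrete case. It is not the paper's method: the function names (`marginals`, `pearson_rho`, `hellinger_to_independence`) and the example tables are hypothetical. The sketch only computes ρ from the gap between the joint distribution and the product of its marginals, using the standard identity Cov(X, Y) = Σ x·y·(p(x, y) − p(x)p(y)), together with the unnormalized Hellinger distance between those same two distributions. The normalization by maximal distances that the abstract describes is not reproduced here, since the abstract does not spell it out.

```python
import math

def marginals(joint):
    """Row and column sums of a joint probability table joint[i][j] = P(X=xs[i], Y=ys[j])."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return px, py

def pearson_rho(xs, ys, joint):
    """Pearson's rho for a discrete joint distribution.

    The covariance is written as a weighted sum of the differences between
    the joint table and the independence table px[i] * py[j].
    """
    px, py = marginals(joint)
    mean_x = sum(x * p for x, p in zip(xs, px))
    mean_y = sum(y * p for y, p in zip(ys, py))
    var_x = sum((x - mean_x) ** 2 * p for x, p in zip(xs, px))
    var_y = sum((y - mean_y) ** 2 * p for y, p in zip(ys, py))
    cov = sum(
        x * y * (joint[i][j] - px[i] * py[j])
        for i, x in enumerate(xs)
        for j, y in enumerate(ys)
    )
    return cov / math.sqrt(var_x * var_y)

def hellinger_to_independence(joint):
    """Hellinger distance between the joint table and the product of its marginals.

    This is the raw distance only; it is NOT the normalized coefficient
    proposed in the paper.
    """
    px, py = marginals(joint)
    total = sum(
        (math.sqrt(joint[i][j]) - math.sqrt(px[i] * py[j])) ** 2
        for i in range(len(px))
        for j in range(len(py))
    )
    return math.sqrt(total / 2.0)

# Hypothetical example: X uniform on {-1, 0, 1} and Y = X**2,
# a full but non-linear dependence.
xs = [-1, 0, 1]
ys = [0, 1]
quadratic = [
    [0.0, 1 / 3],  # X = -1 -> Y = 1
    [1 / 3, 0.0],  # X =  0 -> Y = 0
    [0.0, 1 / 3],  # X =  1 -> Y = 1
]
rho = pearson_rho(xs, ys, quadratic)
dist = hellinger_to_independence(quadratic)
```

In this example ρ vanishes up to floating-point error even though Y is a deterministic function of X, while the Hellinger distance to the independence table is strictly positive. This is the kind of dependence that a metric-based generalization is meant to capture.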

research
04/29/2013

The Randomized Dependence Coefficient

We introduce the Randomized Dependence Coefficient (RDC), a measure of n...
research
06/11/2020

Modeling high-dimensional dependence among astronomical data

Fixing the relationship among a set of experimental quantities is a fund...
research
01/11/2019

On the Importance of Asymmetry and Monotonicity Constraints in Maximal Correlation Analysis

The maximal correlation coefficient is a well-established generalization...
research
08/15/2019

Pearson Distance is not a Distance

The Pearson distance between a pair of random variables X,Y with correla...
research
03/15/2020

Wasserstein Distance to Independence Models

An independence model for discrete random variables is a Segre-Veronese ...
research
02/16/2023

Towards a universal representation of statistical dependence

Dependence is undoubtedly a central concept in statistics. Though, it pr...
research
03/20/2017

Copula Index for Detecting Dependence and Monotonicity between Stochastic Signals

This paper introduces a nonparametric copula-based approach for detectin...
