Measuring Association on Topological Spaces Using Kernels and Geometric Graphs

10/05/2020
by   Nabarun Deb, et al.

In this paper we propose and study a class of simple, nonparametric, yet interpretable measures of association between two random variables X and Y taking values in general topological spaces. These nonparametric measures – defined using the theory of reproducing kernel Hilbert spaces – capture the strength of dependence between X and Y and have the property that they are 0 if and only if the variables are independent and 1 if and only if one variable is a measurable function of the other. Further, these population measures can be consistently estimated using the general framework of graph functionals, which includes k-nearest neighbor graphs and minimum spanning trees. Moreover, a sub-class of these estimators is also shown to adapt to the intrinsic dimensionality of the underlying distribution. Some of these empirical measures can also be computed in near-linear time. Under the hypothesis of independence between X and Y, these empirical measures (properly normalized) have a standard normal limiting distribution. Thus, these measures can also be readily used to test the hypothesis of mutual independence between X and Y. In fact, as far as we are aware, these are the only procedures that possess all the above-mentioned desirable properties. Furthermore, when restricting to Euclidean spaces, we can make these sample measures of association finite-sample distribution-free, under the hypothesis of independence, by using multivariate ranks defined via the theory of optimal transport. The recent correlation coefficients proposed in Dette et al. (2013), Chatterjee (2019), and Azadkia and Chatterjee (2019) can be seen as special cases of this general class of measures.
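To make the "0 under independence, 1 under functional dependence" behavior concrete, here is a minimal Python sketch of the rank correlation from Chatterjee (2019), which the abstract names as a special case of the general class. It is not the kernel or graph-based estimator of the paper itself. It uses the standard no-ties formula ξ_n = 1 − 3 Σ|r_{i+1} − r_i| / (n² − 1), where r_i is the rank of the Y value attached to the i-th smallest X value. The function names `chatterjee_xi` and `independence_z` are illustrative, and the sketch assumes real-valued data with no ties.

```python
import math
import random


def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n (assumes no ties in x or y)."""
    n = len(x)
    # Sort the pairs by x and read off the y values in that order.
    order = sorted(range(n), key=lambda i: x[i])
    y_sorted = [y[i] for i in order]
    # ranks[i] = rank (1..n) of y_sorted[i] among all y values.
    rank_pos = sorted(range(n), key=lambda i: y_sorted[i])
    ranks = [0] * n
    for r, i in enumerate(rank_pos, start=1):
        ranks[i] = r
    # Sum of absolute rank differences between x-neighbors.
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


def independence_z(xi, n):
    """Normalized statistic for an independence test.

    For this special case with continuous Y, sqrt(n) * xi_n is
    asymptotically N(0, 2/5) under independence, so dividing by
    sqrt(2/5) gives an approximately standard normal statistic.
    """
    return math.sqrt(n) * xi / math.sqrt(2.0 / 5.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, illustrative data only
    n = 2000
    x = [rng.random() for _ in range(n)]
    y_func = [math.sin(10.0 * v) for v in x]    # Y is a function of X
    y_indep = [rng.random() for _ in range(n)]  # Y independent of X
    print("functional:", chatterjee_xi(x, y_func))
    print("independent:", chatterjee_xi(x, y_indep),
          "z =", independence_z(chatterjee_xi(x, y_indep), n))
```

Note that the coefficient is close to 1 for the non-monotone relationship `sin(10 x)`, which a classical correlation coefficient would largely miss, and that sorting dominates the cost, in line with the near-linear running time mentioned above.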

research
12/29/2020

Kernel Partial Correlation Coefficient – a Measure of Conditional Dependence

In this paper we propose and study a class of simple, nonparametric, yet...
research
01/10/2022

Rearranged dependence measures

Most of the popular dependence measures for two random variables X and Y...
research
11/09/2018

Ball: An R package for detecting distribution difference and association in metric spaces

The rapid development of modern technology facilitates the appearance of...
research
04/25/2022

Generalized Cramér's coefficient via f-divergence for contingency tables

This study proposes measures describing the strength of association betw...
research
09/22/2022

Azadkia-Chatterjee's correlation coefficient adapts to manifold data

In their seminal work, Azadkia and Chatterjee (2021) initiated graph-bas...
research
07/06/2023

Geometric Mean Type of Proportional Reduction in Variation Measure for Two-Way Contingency Tables

In a two-way contingency table analysis with explanatory and response va...
research
05/05/2021

Transport Dependency: Optimal Transport Based Dependency Measures

Finding meaningful ways to determine the dependency between two random v...
