Kernel Partial Correlation Coefficient – a Measure of Conditional Dependence

12/29/2020
by Zhen Huang, et al.

In this paper we propose and study a class of simple, nonparametric, yet interpretable measures of conditional dependence between two random variables Y and Z given a third variable X, all taking values in general topological spaces. The population version of any of these measures captures the strength of conditional dependence: it equals 0 if and only if Y and Z are conditionally independent given X, and 1 if and only if Y is a measurable function of Z and X. Thus, our measure – which we call the kernel partial correlation (KPC) coefficient – can be thought of as a nonparametric generalization of the partial correlation coefficient, which possesses the above properties when (X,Y,Z) is jointly normal. We describe two consistent methods of estimating KPC. Our first method utilizes the general framework of geometric graphs, including K-nearest neighbor graphs and minimum spanning trees. A sub-class of these estimators can be computed in near-linear time and converges at a rate that automatically adapts to the intrinsic dimension(s) of the underlying distribution(s). Our second method involves direct estimation of conditional mean embeddings using cross-covariance operators in reproducing kernel Hilbert spaces. Using these empirical measures, we develop forward stepwise (high-dimensional) nonlinear variable selection algorithms. We show that our algorithm, using the graph-based estimator, yields a provably consistent model-free variable selection procedure, even in the high-dimensional regime where the number of covariates grows exponentially with the sample size, under suitable sparsity assumptions. Extensive simulations and real-data examples illustrate the superior performance of our methods compared to existing procedures. The recent conditional dependence measure proposed by Azadkia and Chatterjee (2019) can be viewed as a special case of our general framework.
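For orientation, the classical partial correlation coefficient that KPC is said to generalize can be sketched as follows. This is not the KPC estimator from the paper. It is only the textbook linear version, computed by regressing Y and Z on X by least squares and correlating the residuals. The function name `partial_correlation` and the simulated data are assumptions made for illustration.

```python
import numpy as np


def partial_correlation(y, z, x):
    """Classical (linear) partial correlation of y and z given x.

    Illustrative helper, not taken from the paper: regress y and z on x
    (with an intercept) by least squares, then correlate the residuals.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    design = np.column_stack([np.ones(len(y)), x])
    # Residuals of y and z after removing their linear dependence on x.
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    rz = z - design @ np.linalg.lstsq(design, z, rcond=None)[0]
    return float(ry @ rz / np.sqrt((ry @ ry) * (rz @ rz)))


# Simulated jointly normal data (an assumption for illustration):
# Y and Z both depend on X but are conditionally independent given X.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = x + rng.normal(size=n)
z = x + rng.normal(size=n)

rho_marginal = float(np.corrcoef(y, z)[0, 1])  # clearly positive
rho_partial = partial_correlation(y, z, x)     # close to zero
```

In this simulation the marginal correlation between Y and Z is substantial because both depend on X, while the partial correlation given X is near zero. The linear version detects only linear conditional dependence, which is the limitation a nonparametric measure such as KPC is meant to address.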


research
10/05/2020

Measuring Association on Topological Spaces Using Kernels and Geometric Graphs

In this paper we propose and study a class of simple, nonparametric, yet...
research
10/27/2019

A simple measure of conditional dependence

We propose a coefficient of conditional dependence between two random va...
research
08/13/2012

Nonparametric sparsity and regularization

In this work we are interested in the problems of supervised learning an...
research
08/11/2023

Quantifying and estimating dependence via sensitivity of conditional distributions

Recently established, directed dependence measures for pairs (X,Y) of ra...
research
09/22/2022

Azadkia-Chatterjee's correlation coefficient adapts to manifold data

In their seminal work, Azadkia and Chatterjee (2021) initiated graph-bas...
research
08/15/2021

On Azadkia-Chatterjee's conditional dependence coefficient

In recent work, Azadkia and Chatterjee laid out an ingenious approach to...
research
10/20/2022

Vine copula based knockoff generation for high-dimensional controlled variable selection

Vine copulas are a flexible tool for high-dimensional dependence modelin...
