Distance-based and RKHS-based Dependence Metrics in High Dimension

02/08/2019
by   Changbo Zhu, et al.

In this paper, we study distance covariance, Hilbert-Schmidt covariance (also known as the Hilbert-Schmidt independence criterion [Gretton et al. (2008)]), and related independence tests in the high-dimensional setting. We show that the sample distance/Hilbert-Schmidt covariance between two random vectors can be approximated by the sum of squared componentwise sample cross-covariances, up to an asymptotically constant factor. This indicates that tests based on distance/Hilbert-Schmidt covariance can capture only linear dependence in high dimension. As a consequence, the distance correlation based t-test for independence developed by Szekely and Rizzo (2013) is shown to have trivial limiting power when the two random vectors are nonlinearly dependent but componentwise uncorrelated. This surprising phenomenon, which appears to be reported here for the first time, is further confirmed in our simulation study. As a remedy, we propose tests based on an aggregation of marginal sample distance/Hilbert-Schmidt covariances and show in simulations that they have superior power relative to their joint counterparts. We further extend the distance correlation based t-test to tests based on Hilbert-Schmidt covariance and on marginal distance/Hilbert-Schmidt covariance. A novel unified approach is developed to analyze the studentized sample distance/Hilbert-Schmidt covariance, as well as the studentized sample marginal distance covariance, under both the null and alternative hypotheses. Our theoretical and simulation results shed light on the limitations of distance/Hilbert-Schmidt covariance when used jointly in the high-dimensional setting, and suggest the aggregation of marginal distance/Hilbert-Schmidt covariances as a useful alternative.
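To make the joint-versus-marginal distinction concrete, here is a minimal NumPy sketch. It is not the authors' code. It assumes the U-centered (unbiased) estimator of squared distance covariance and reads "aggregation of marginal sample distance covariances" as a plain sum over all component pairs; the function names (`pairwise_dist`, `u_center`, `dcov2`, `marginal_dcov2`) and the toy data are hypothetical, and the studentization used for the t-test is omitted.

```python
import numpy as np


def pairwise_dist(z):
    """Euclidean distance matrix between the rows of z (n samples)."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]  # treat a 1-D array as n samples of a scalar
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def u_center(d):
    """U-center a symmetric distance matrix (requires n > 3 downstream)."""
    n = d.shape[0]
    row = d.sum(axis=1, keepdims=True) / (n - 2)
    col = d.sum(axis=0, keepdims=True) / (n - 2)
    tot = d.sum() / ((n - 1) * (n - 2))
    out = d - row - col + tot
    np.fill_diagonal(out, 0.0)  # diagonal is zero by definition
    return out


def dcov2(x, y):
    """Unbiased sample squared distance covariance, using x and y jointly."""
    a = u_center(pairwise_dist(x))
    b = u_center(pairwise_dist(y))
    n = a.shape[0]
    return (a * b).sum() / (n * (n - 3))


def marginal_dcov2(x, y):
    """Assumed aggregation: sum of dcov2 over all component pairs (i, j)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return sum(
        dcov2(x[:, i], y[:, j])
        for i in range(x.shape[1])
        for j in range(y.shape[1])
    )


if __name__ == "__main__":
    # Toy data (an assumption): y is a nonlinear function of x, yet every
    # pair of components is uncorrelated because E[x_i * x_j**2] = 0.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((100, 20))
    y = x ** 2
    print("joint dcov2:   ", dcov2(x, y))
    print("marginal dcov2:", marginal_dcov2(x, y))
```

The marginal version costs p × q scalar distance covariance evaluations instead of one joint evaluation, which is the price paid for looking at each component pair separately.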

research
09/30/2019

A New Framework for Distance and Kernel-based Metrics in High Dimensions

The paper presents new metrics to quantify and test for (i) the equality...
research
06/14/2021

Generalized kernel distance covariance in high dimensions: non-null CLTs and power universality

Distance covariance is a popular dependence measure for two random vecto...
research
11/25/2017

Distance Metrics for Measuring Joint Dependence with Application to Causal Inference

Many statistical applications require the quantification of joint depend...
research
08/03/2023

Robust Independence Tests with Finite Sample Guarantees for Synchronous Stochastic Linear Systems

The paper introduces robust independence tests with non-asymptotically g...
research
06/25/2018

Distance covariance for discretized stochastic processes

Given an iid sequence of pairs of stochastic processes on the unit inter...
research
05/24/2023

Interpretation and visualization of distance covariance through additive decomposition of correlations formula

Distance covariance is a widely used statistical methodology for testing...
research
06/05/2019

Estimating Feature-Label Dependence Using Gini Distance Statistics

Identifying statistical dependence between the features and the label is...
