Generalized kernel distance covariance in high dimensions: non-null CLTs and power universality

06/14/2021
by Qiyang Han, et al.
Distance covariance is a popular dependence measure for two random vectors X and Y of possibly different dimensions and types. Recent years have seen concentrated efforts in the literature to understand the distributional properties of the sample distance covariance in a high-dimensional setting, with an exclusive emphasis on the null case in which X and Y are independent. This paper derives the first non-null central limit theorem for the sample distance covariance and for the more general sample (Hilbert-Schmidt) kernel distance covariance in high dimensions, primarily in the Gaussian case. The new non-null central limit theorem yields an asymptotically exact first-order power formula for the widely used generalized kernel distance correlation test of independence between X and Y. In particular, the power formula unveils an interesting universality phenomenon: the power of the generalized kernel distance correlation test is completely determined by n·dcor^2(X,Y)/√2 in the high-dimensional limit, regardless of a wide range of choices of the kernels and bandwidth parameters. Furthermore, the resulting separation rate is shown to be optimal in a minimax sense. The key step in the proof of the non-null central limit theorem is a precise expansion of the mean and variance of the sample distance covariance in high dimensions, which shows, among other things, that the non-null Gaussian approximation of the sample distance covariance involves a rather subtle interplay between the dimension-to-sample ratio and the dependence between X and Y.
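
To make the quantity n·dcor^2(X,Y)/√2 concrete, here is a minimal Python sketch that computes a plug-in version from data. It assumes the Euclidean distance and the unbiased U-centered estimator of Székely and Rizzo (2014); the abstract does not say which estimator, kernel or bandwidth the paper uses, so treat these as assumptions. The function names and the toy data-generating model are hypothetical, and the abstract's quantity is defined with the population dcor^2, for which the sample value is only a stand-in.

```python
import math
import random


def distance_matrix(points):
    """Pairwise Euclidean distances between the rows of `points`."""
    return [[math.dist(p, q) for q in points] for p in points]


def u_center(d):
    """U-centering of a distance matrix (Szekely and Rizzo, 2014); needs n >= 4."""
    n = len(d)
    row_sums = [sum(row) for row in d]  # equal to column sums, since d is symmetric
    total = sum(row_sums)
    centered = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # the diagonal stays at zero by definition
                centered[i][j] = (
                    d[i][j]
                    - row_sums[i] / (n - 2)
                    - row_sums[j] / (n - 2)
                    + total / ((n - 1) * (n - 2))
                )
    return centered


def dcov_sq(a, b):
    """Unbiased sample squared distance covariance from two U-centered matrices."""
    n = len(a)
    s = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 3))


def dcor_sq(x, y):
    """Sample squared distance correlation; may be slightly negative under independence."""
    a = u_center(distance_matrix(x))
    b = u_center(distance_matrix(y))
    denom = math.sqrt(dcov_sq(a, a) * dcov_sq(b, b))
    return dcov_sq(a, b) / denom if denom > 0 else 0.0


# Hypothetical toy data: Y = rho * X + sqrt(1 - rho^2) * noise, coordinate-wise.
rng = random.Random(0)
n, p, rho = 60, 20, 0.5
x = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [[rho * xi + math.sqrt(1 - rho**2) * rng.gauss(0, 1) for xi in row] for row in x]

signal = n * dcor_sq(x, y) / math.sqrt(2)  # plug-in analogue of n*dcor^2(X,Y)/sqrt(2)
print(f"n * dcor^2 / sqrt(2) (plug-in): {signal:.3f}")
```

The U-centered estimator is used here rather than the simpler double-centered V-statistic because the latter is known to be biased, and the bias grows with dimension. Swapping `math.dist` for a kernel-induced distance would give a kernel variant of the same computation.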
