The Chi-Square Test of Distance Correlation

12/27/2019
by Cencheng Shen, et al.

Distance correlation has gained much recent attention in the statistics and machine learning communities: the sample statistic is straightforward to compute, works for any metric or kernel choice, and asymptotically equals 0 if and only if the variables are independent. One major bottleneck is the testing process: the null distribution of distance correlation depends on the metric choice and the marginal distributions, and cannot be readily estimated. To compute a p-value, the standard approach is to estimate the null distribution via permutation, which generally takes O(rn^2) time for n samples and r permutations and is too costly for big data applications. In this paper, we propose a chi-square distribution to approximate the null distribution of the unbiased distance correlation. We prove that the chi-square distribution either equals or closely approximates the null distribution, and always dominates it in the upper tail. The resulting distance correlation chi-square test requires neither permutation nor parameter estimation, is simple and fast to implement, works with any metric of strong negative type or any characteristic kernel, is valid and universally consistent for independence testing, and achieves finite-sample testing power similar to that of the standard permutation test. When testing one-dimensional data using Euclidean distance, the unbiased distance correlation test runs in O(n log n) time, making it comparable in speed to the Pearson correlation t-test. The results are supported and demonstrated via simulations.
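To make the test concrete, here is a minimal Python sketch for one-dimensional data with Euclidean distance. The abstract does not give the exact form of the approximation, so the sketch rests on an assumption: n times the unbiased distance correlation is treated as approximately chi-square with one degree of freedom, shifted by -1, which gives p = P(chi2_1 > n * dcor + 1). The function names (u_center, unbiased_dcor, dcor_chi2_test) are illustrative and not from the paper. The code uses the plain O(n^2) computation rather than the O(n log n) algorithm mentioned above.

```python
import math
import random


def u_center(d):
    """U-center a symmetric distance matrix (diagonal set to zero)."""
    n = len(d)
    row = [sum(r) for r in d]          # row sums equal column sums (symmetric)
    total = sum(row)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = (d[i][j]
                             - row[i] / (n - 2)
                             - row[j] / (n - 2)
                             + total / ((n - 1) * (n - 2)))
    return out


def inner(a, b):
    """Unbiased inner product of two U-centered matrices."""
    n = len(a)
    s = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n) if i != j)
    return s / (n * (n - 3))


def unbiased_dcor(x, y):
    """Unbiased distance correlation of two 1-D samples (needs n >= 4)."""
    if len(x) != len(y) or len(x) < 4:
        raise ValueError("x and y must have equal length of at least 4")
    a = u_center([[abs(s - t) for t in x] for s in x])
    b = u_center([[abs(s - t) for t in y] for s in y])
    denom = math.sqrt(inner(a, a) * inner(b, b))
    return inner(a, b) / denom if denom > 0 else 0.0


def dcor_chi2_test(x, y):
    """Return (statistic, p-value) under the ASSUMED approximation
    n * dcor ~ chi2_1 - 1, i.e. p = P(chi2_1 > n * dcor + 1)."""
    n = len(x)
    stat = unbiased_dcor(x, y)
    t = n * stat + 1.0
    # Survival function of chi-square with 1 df: P(Z^2 > t) = erfc(sqrt(t / 2)).
    p = math.erfc(math.sqrt(t / 2.0)) if t > 0 else 1.0
    return stat, p


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0, 1) for _ in range(100)]
    y_dep = [v * v + 0.1 * rng.gauss(0, 1) for v in x]   # nonlinear dependence
    y_ind = [rng.gauss(0, 1) for _ in range(100)]        # independent of x
    print("dependent:  ", dcor_chi2_test(x, y_dep))
    print("independent:", dcor_chi2_test(x, y_ind))
```

No permutation loop appears anywhere: the p-value comes from a single evaluation of the statistic, in contrast to the O(rn^2) permutation approach.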


