From Distance Correlation to Multiscale Generalized Correlation

10/26/2017
by Cencheng Shen, et al.

Understanding and developing a correlation measure that can detect general dependencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery in the big data age. In Shen et al. (2017), we proposed the Multiscale Generalized Correlation (MGC) as a novel correlation measure, which worked well empirically and supported a number of real-data discoveries. However, a wide gap remains on the theoretical side, e.g., the population statistic, the convergence from sample to population, and how well the algorithmic Sample MGC performs. To better understand its underlying mechanism, in this paper we formalize the population versions of local distance correlations, of MGC, and of the optimal local scale between the underlying random variables, by utilizing characteristic functions and incorporating the nearest-neighbor machinery. The population version enables a seamless connection with, and a significant improvement to, the algorithmic Sample MGC, both theoretically and in practice, which in turn allows a number of desirable asymptotic and finite-sample properties of MGC to be proved and explored. The advantages of MGC are further illustrated via a comprehensive set of simulations with linear, nonlinear, univariate, multivariate, and noisy dependencies, where it loses almost no power against monotone dependencies while achieving superior performance against general dependencies.
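For readers unfamiliar with the starting point named in the title, the following is a minimal sketch of the classical (biased, V-statistic) sample distance correlation, which is the global statistic that local distance correlations and MGC build on. It is not the paper's Sample MGC algorithm: it uses all pairwise distances rather than restricting to nearest neighbors, and the function names (`pairwise_distances`, `double_center`, `distance_correlation`) are illustrative choices, not taken from the paper.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distance matrix for n samples (1-D input is treated as n x 1)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def distance_correlation(x, y):
    """Biased sample distance correlation, a value in [0, 1]."""
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))
    dcov2 = (a * b).mean()      # squared sample distance covariance
    dvar_x = (a * a).mean()     # squared sample distance variance of x
    dvar_y = (b * b).mean()     # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0              # convention when either sample is constant
    # max() guards against tiny negative values from floating-point rounding
    return float(np.sqrt(max(dcov2, 0.0) / denom))

# Usage: a symmetric quadratic relationship has near-zero Pearson correlation,
# yet a clearly positive distance correlation.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2
pearson = np.corrcoef(x, y)[0, 1]
dcor = distance_correlation(x, y)
```

The quadratic example illustrates the kind of non-monotone dependency that motivates going beyond linear correlation; how much a local, nearest-neighbor version improves on this global statistic is the subject of the paper itself.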

research
10/28/2019

Asymptotic Distributions of High-Dimensional Nonparametric Inference with Distance Correlation

Understanding the nonlinear association between a pair of potentially hi...
research
07/07/2021

Distance correlation for long-range dependent time series

We apply the concept of distance correlation for testing independence of...
research
10/12/2017

Inference for partial correlation when data are missing not at random

We introduce uncertainty regions to perform inference on partial correla...
research
05/03/2022

On the asymptotic distribution of the symmetrized Chatterjee's correlation coefficient

Chatterjee (2021) introduced an asymmetric correlation measure that has ...
research
12/09/2020

Searching for genetic interactions in complex disease by using distance correlation

Understanding epistasis (genetic interaction) may shed some light on the...
research
11/25/2018

Generalized R^2 Measures for a Mixture of Bivariate Linear Dependences

Motivated by the pressing needs for capturing complex but interpretable...
research
12/27/2019

The Chi-Square Test of Distance Correlation

Distance correlation has gained much recent attention in the statistics ...