Exploring and measuring non-linear correlations: Copulas, Lightspeed Transportation and Clustering

by Gautier Marti, et al.

We propose a methodology to explore and measure the pairwise correlations that exist between variables in a dataset. The methodology leverages copulas for encoding dependence between two variables, state-of-the-art optimal transport for providing a relevant geometry to the copulas, and clustering for summarizing the main dependence patterns found between the variables. Some of the cluster centers can be used to parameterize a novel dependence coefficient that can target or forget specific dependence patterns. Finally, we illustrate and benchmark the methodology on several datasets. Code and numerical experiments are available online for reproducible research.
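The three ingredients named in the abstract (copulas, optimal transport, clustering) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the empirical copula is approximated by a histogram of normalized ranks, the transport cost is computed with a plain entropy-regularized (Sinkhorn) iteration as a stand-in for whichever solver the paper uses, and the helper names `empirical_copula`, `ground_cost` and `sinkhorn_cost`, the bin count, the regularization strength and the toy data are all hypothetical choices.

```python
import numpy as np
from scipy.stats import rankdata
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

BINS = 8  # assumed grid resolution for the discretized copula


def empirical_copula(x, y, bins=BINS):
    """Histogram of normalized ranks on the unit square, summing to 1."""
    n = len(x)
    u = rankdata(x) / n  # rank transform removes the marginal distributions
    v = rankdata(y) / n
    hist, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 1], [0, 1]])
    return hist / hist.sum()


def ground_cost(bins=BINS):
    """Squared Euclidean distance between the centers of the grid cells."""
    centers = (np.arange(bins) + 0.5) / bins
    gx, gy = np.meshgrid(centers, centers, indexing="ij")
    pts = np.column_stack([gx.ravel(), gy.ravel()])
    diff = pts[:, None, :] - pts[None, :, :]
    return (diff ** 2).sum(axis=-1)


def sinkhorn_cost(a, b, cost, reg=0.05, n_iter=500):
    """Transport cost of the entropy-regularized plan (Sinkhorn iterations)."""
    a = a.ravel() + 1e-12  # small smoothing avoids division by zero
    b = b.ravel() + 1e-12
    a, b = a / a.sum(), b / b.sum()
    kernel = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum())


# Toy data (hypothetical): four pairs of variables with different dependence.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
pairs = {
    "increasing": (x, x + 0.1 * rng.normal(size=500)),
    "monotone_nonlinear": (x, np.exp(x)),  # same ranks as a linear relation
    "decreasing": (x, -x + 0.1 * rng.normal(size=500)),
    "independent": (x, rng.normal(size=500)),
}

copulas = [empirical_copula(a, b) for a, b in pairs.values()]
cost = ground_cost()

# Pairwise transport costs between the discretized copulas.
n = len(copulas)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = sinkhorn_cost(copulas[i], copulas[j], cost)

# Hierarchical clustering on the resulting dissimilarity matrix.
tree = linkage(squareform(dist), method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
for name, label in zip(pairs, labels):
    print(name, label)
```

Because the copula depends only on ranks, the linear and the exponential relation end up close to each other in this geometry, while the decreasing relation sits far from both. Averaging the copulas within a cluster would give one possible notion of a cluster center, although the paper may define centers differently.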




Optimal Transport vs. Fisher-Rao distance between Copulas for Clustering Multivariate Time Series

We present a methodology for clustering N objects which are described by...

Optimal Copula Transport for Clustering Multivariate Time Series

This paper presents a new methodology for clustering multivariate time s...

The Randomized Dependence Coefficient

We introduce the Randomized Dependence Coefficient (RDC), a measure of n...

Dissimilarity functions for rank-based hierarchical clustering of continuous variables

We present a theoretical framework for a (copula-based) notion of dissim...

Dependence Measure for non-additive model

We proposed a new statistical dependency measure called Copula Dependenc...

Robust Dependence Measure using RKHS based Uncertainty Moments and Optimal Transport

Reliable measurement of dependence between variables is essential in man...

Towards a universal representation of statistical dependence

Dependence is undoubtedly a central concept in statistics. Though, it pr...
