A new correlation coefficient between categorical, ordinal and interval variables with Pearson characteristics

11/28/2018
by M. Baak, et al.

A prescription is presented for a new and practical correlation coefficient, ϕ_K, based on several refinements to Pearson's hypothesis test of independence of two variables. The combined features of ϕ_K give it an advantage over existing coefficients. First, it works consistently between categorical, ordinal and interval variables. Second, it captures non-linear dependency. Third, it reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. These are useful features when studying the correlation between variables of mixed types. Particular emphasis is placed on the proper evaluation of the statistical significance of correlations and on the interpretation of variable relationships in a contingency table, in particular for low-statistics samples and significant dependencies. Three practical applications are discussed. The presented algorithms are easy to use and available through a public Python library.
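
For orientation, the starting point named in the abstract, Pearson's test of independence on a contingency table, can be sketched in a few lines of plain Python. This is a minimal illustration only: it computes the standard Pearson χ² statistic and, as an example of an existing coefficient built on it, Cramér's V. It is not ϕ_K and does not reproduce the refinements the paper describes. The function names `chi2_statistic` and `cramers_v` are hypothetical, chosen for this sketch, and are not taken from the paper's library.

```python
from typing import Sequence


def chi2_statistic(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-square statistic for an r x c contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        return 0.0
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the hypothesis that rows and columns are independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(table: Sequence[Sequence[float]]) -> float:
    """Cramér's V, an existing chi-square-based coefficient (NOT phi_K)."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    if n == 0 or k == 0:
        return 0.0
    return (chi2_statistic(table) / (n * k)) ** 0.5


# Rows proportional to each other: no dependency between the two variables.
independent = [[10, 20], [20, 40]]
# All counts on the diagonal: one variable fully determines the other.
dependent = [[50, 0], [0, 50]]

print(chi2_statistic(independent), cramers_v(independent))  # 0.0 0.0
print(chi2_statistic(dependent), cramers_v(dependent))      # 100.0 1.0
```

Because the statistic only needs a table of counts, the same calculation applies to categorical and ordinal variables, and to interval variables once they are binned, which is presumably why a contingency-table test is a natural basis for a coefficient covering mixed variable types.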

research
09/23/2019

A new coefficient of correlation

Is it possible to define a coefficient of correlation which is (a) as si...
research
04/02/2023

Multivariate probability distribution for categorical and ordinal random variables

We propose a multivariate probability distribution for categorical and o...
research
01/12/2023

Non-linear correlation analysis in financial markets using hierarchical clustering

Distance correlation coefficient (DCC) can be used to identify new assoc...
research
09/26/2018

A new Gini correlation between quantitative and qualitative variables

We propose a new Gini correlation to measure dependence between a catego...
research
09/13/2021

The correlation coefficient between citation metrics and winning a Nobel or Abel Prize

Computing such correlation coefficient would be straightforward had we h...
research
10/20/2022

Iteratively Reweighted Least Squares Method for Estimating Polyserial and Polychoric Correlation Coefficients

An iteratively reweighted least squares (IRLS) method is proposed for es...
