latentcor: An R Package for estimating latent correlations from mixed data types

08/20/2021
by   Mingze Huang, et al.

We present `latentcor`, an R package for correlation estimation from data with mixed variable types. Mixed variable types, including continuous, binary, ordinal, zero-inflated, or truncated data, are routinely collected in many areas of science. Accurate estimation of correlations among such variables is often the first critical step in statistical analysis workflows. Pearson correlation, the default choice, is not well suited to mixed data types because its underlying normality assumption is violated. Semi-parametric latent Gaussian copula models, on the other hand, provide a unifying way to estimate correlations between mixed data types. The R package `latentcor` implements a comprehensive list of these models, enabling the estimation of correlations between any of continuous/binary/ternary/zero-inflated (truncated) variable types. The underlying implementation uses a fast multi-linear interpolation scheme with an efficient choice of interpolation grid points, giving the package a small memory footprint without compromising estimation accuracy. This makes latent correlation estimation readily available for modern high-throughput data analysis.
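
To make the latent Gaussian copula idea concrete, here is a minimal Python sketch of the simplest case only: two continuous variables. It relies on the standard identity for Gaussian copulas that the latent correlation equals sin(π/2 · τ), where τ is Kendall's tau. This is not the `latentcor` API or its implementation; the function names (`kendall_tau`, `latent_corr_continuous`, `pearson`) and the simulated data are hypothetical, chosen for illustration. Pairs involving binary, ternary, or truncated variables need different bridge functions, which this sketch does not cover.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a by direct O(n^2) pair comparison (adequate for small n)."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign_x = (x[i] > x[j]) - (x[i] < x[j])
            sign_y = (y[i] > y[j]) - (y[i] < y[j])
            total += sign_x * sign_y  # +1 concordant, -1 discordant, 0 tie
    return 2.0 * total / (n * (n - 1))


def latent_corr_continuous(x, y):
    """Rank-based latent correlation for a continuous/continuous pair."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))


def pearson(x, y):
    """Plain Pearson correlation, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Simulate latent bivariate normal data with correlation rho (assumed value).
rng = random.Random(0)
rho, n = 0.7, 800
z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
z2 = [rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for a in z1]

# Observed variables are unknown monotone transforms of the latent ones.
x1 = [math.exp(2.0 * a) for a in z1]
x2 = [b ** 3 for b in z2]

print("Pearson on observed data:   ", pearson(x1, x2))
print("Latent estimate on observed:", latent_corr_continuous(x1, x2))
print("Latent estimate on latent:  ", latent_corr_continuous(z1, z2))
```

Because Kendall's tau depends only on ranks, the latent estimate is unchanged by the monotone transforms applied to `z1` and `z2`, whereas Pearson correlation on the transformed data generally is not.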
