Latent Association Mining in Binary Data

11/28/2017
by Kelly Bodwin, et al.

We consider the problem of identifying groups of mutually associated variables in moderate- or high-dimensional data. In many cases, ordinary Pearson correlation provides useful information about the linear relationship between variables. For binary data, however, ordinary correlation may lose power and may lack interpretability. In this paper, we develop and investigate a new method called Latent Association Mining in Binary Data (LAMB). The LAMB method is built on the assumption that the binary observations represent a random thresholding of a latent continuous variable that may have a complex correlation structure. We consider a new measure of association, latent correlation, that is designed to assess association in the underlying continuous variable without bias due to the mediating effects of the thresholding procedure. The full LAMB procedure uses iterative hypothesis testing to identify groups of latently correlated variables. LAMB is shown to improve power over existing methods in simulated settings, to be computationally efficient for large datasets, and to uncover new, meaningful results in common types of real data.
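The motivation for a latent measure of association can be illustrated with a small simulation. The sketch below is not the LAMB procedure and does not use the paper's latent correlation estimator. It only assumes a simplified version of the setting described above: a bivariate Gaussian latent variable with a known correlation, cut at fixed thresholds rather than random ones. The function names, the thresholds, and the value `rho=0.6` are illustrative choices, not taken from the paper. The simulation shows that the Pearson correlation of the binary observations is pulled toward zero relative to the correlation of the latent variables.

```python
import numpy as np


def simulate_thresholded(rho, thresholds, n=100_000, seed=0):
    """Draw a bivariate Gaussian latent sample and threshold it to binary.

    Assumption for illustration only: fixed thresholds and unit variances.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    latent = rng.multivariate_normal(np.zeros(2), cov, size=n)
    # Each column is set to 1 where the latent value exceeds its threshold.
    binary = (latent > np.asarray(thresholds)).astype(float)
    return latent, binary


def pearson(x, y):
    """Ordinary Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# Unequal thresholds make one binary variable much sparser than the other.
latent, binary = simulate_thresholded(rho=0.6, thresholds=(0.0, 1.5))

latent_corr = pearson(latent[:, 0], latent[:, 1])
binary_corr = pearson(binary[:, 0], binary[:, 1])

print(f"latent Pearson correlation: {latent_corr:.3f}")
print(f"binary Pearson correlation: {binary_corr:.3f}")
```

In this toy setting the binary correlation depends on where the thresholds fall as well as on the latent correlation, which is the kind of thresholding effect a latent measure is meant to remove.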

research
09/28/2018

A Unified Approach to Construct Correlation Coefficient Between Random Variables

Measuring the correlation (association) between two random variables is ...
research
07/13/2018

Improved Methods for Making Inferences About Multiple Skipped Correlations

A skipped correlation has the advantage of dealing with outliers in a ma...
research
02/28/2022

Asymptotic Normality of Gini Correlation in High Dimension with Applications to the K-sample Problem

The categorical Gini correlation proposed by Dang et al. is a dependence...
research
08/31/2022

Two-stage Hypothesis Tests for Variable Interactions with FDR Control

In many scenarios such as genome-wide association studies where dependen...
research
12/24/2019

Power Comparisons in 2x2 Contingency Tables: Odds Ratio versus Pearson Correlation versus Canonical Correlation

It is an important inferential problem to test no association between tw...
research
08/20/2021

latentcor: An R Package for estimating latent correlations from mixed data types

We present `latentcor`, an R package for correlation estimation from dat...
research
09/27/2018

Auto-Encoding Knockoff Generator for FDR Controlled Variable Selection

A new statistical procedure (Model-X candes2018) has provided a way to i...
