Entropic Latent Variable Discovery

07/26/2018
by Murat Kocaoglu, et al.

We consider the problem of discovering the simplest latent variable that makes two observed discrete variables conditionally independent. This problem has appeared in the literature as probabilistic latent semantic analysis (pLSA) and has connections to non-negative matrix factorization. When the simplicity of the latent variable is measured by its cardinality, we show that a solution to this latent variable discovery problem can be used to distinguish direct causal relations from spurious correlations for almost all joint distributions on simple causal graphs with two observed variables. Conjecturing that a similar identifiability result holds when simplicity is measured by Shannon entropy, we study a loss function that trades off the entropy of the latent variable against the conditional mutual information of the observed variables given the latent variable. We then propose a latent variable discovery algorithm, LatentSearch, and show that its stationary points are the stationary points of our loss function. We show experimentally that LatentSearch can indeed be used to distinguish direct causal relations from spurious correlations.
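
As a rough illustration of the trade-off described above, the following sketch evaluates a loss of the form I(X;Y|Z) + beta * H(Z) on a joint distribution q(x, y, z) stored as a NumPy array. The weighted-sum form, the weight `beta`, and the function names are assumptions made for this example; the abstract does not state the exact loss, and this is not the LatentSearch algorithm itself.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array (any shape)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # treat 0 * log 0 as 0
    return float(-np.sum(p * np.log2(p)))

def conditional_mutual_information(q):
    """I(X;Y|Z) in bits for a joint array indexed as q[x, y, z]."""
    q = np.asarray(q, dtype=float)
    p_xz = q.sum(axis=1)       # marginal over y
    p_yz = q.sum(axis=0)       # marginal over x
    p_z = q.sum(axis=(0, 1))   # marginal over x and y
    # Identity: I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)
    return entropy(p_xz) + entropy(p_yz) - entropy(q) - entropy(p_z)

def tradeoff_loss(q, beta):
    """Assumed loss: I(X;Y|Z) + beta * H(Z); beta is a hypothetical weight."""
    q = np.asarray(q, dtype=float)
    return conditional_mutual_information(q) + beta * entropy(q.sum(axis=(0, 1)))

# Case 1: Z is a fair coin and X = Y = Z, so Z fully explains the dependence.
q_copy = np.zeros((2, 2, 2))
q_copy[0, 0, 0] = 0.5
q_copy[1, 1, 1] = 0.5

# Case 2: Z is constant (one state), so X = Y remain dependent given Z.
q_const = np.zeros((2, 2, 1))
q_const[0, 0, 0] = 0.5
q_const[1, 1, 0] = 0.5

loss_copy = tradeoff_loss(q_copy, beta=0.5)
loss_const = tradeoff_loss(q_const, beta=0.5)
```

In the first case the latent variable removes all conditional dependence but costs one bit of entropy; in the second it costs nothing but leaves one bit of conditional mutual information. Which of the two the loss prefers depends on the assumed weight `beta`.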
