Margins of discrete Bayesian networks
Bayesian network models with latent variables are widely used in statistics and machine learning. In this paper we provide a complete algebraic characterization of Bayesian network models with latent variables when the observed variables are discrete and no assumption is made about the state-space of the latent variables. We show that such a model is algebraically equivalent to the so-called nested Markov model, meaning that the two are the same up to inequality constraints on the joint probabilities. In particular, the two models have the same dimension. The nested Markov model is therefore the best possible description of the latent variable model that avoids consideration of inequalities, which are extremely complicated in general. A consequence of this is that the constraint-finding algorithm of Tian and Pearl (UAI 2002, pp. 519-527) is complete for finding equality constraints. Latent variable models suffer from difficulties of unidentifiable parameters and non-regular asymptotics; in contrast, the nested Markov model is fully identifiable, represents a curved exponential family of known dimension, and can easily be fitted using an explicit parameterization.
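To make the idea of an equality constraint on the observed margin concrete, the sketch below numerically checks one well-known example, usually called the Verma constraint. The graph is an assumption chosen for illustration and is not taken from the abstract: observed binary variables A -> B -> C -> D, with a latent variable U that is a parent of both B and D. In such a model the quantity sum_b p(b | a) p(d | a, b, c) should not depend on a, even though this is not an ordinary conditional independence. All names in the code (random_dist, verma_kernel, N_U) are hypothetical, and the number of latent states is arbitrary.

```python
import itertools
import random

random.seed(0)

def random_dist(n):
    """Return a random, strictly positive probability vector of length n."""
    weights = [random.random() + 0.1 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

N_U = 4        # number of latent states (arbitrary, for illustration only)
BIN = (0, 1)   # state space of each observed variable

# Assumed illustrative model: A -> B -> C -> D, latent U -> B and U -> D.
p_u = random_dist(N_U)                                            # p(u)
p_a = random_dist(2)                                              # p(a)
p_b = {(a, u): random_dist(2) for a in BIN for u in range(N_U)}   # p(b | a, u)
p_c = {b: random_dist(2) for b in BIN}                            # p(c | b)
p_d = {(c, u): random_dist(2) for c in BIN for u in range(N_U)}   # p(d | c, u)

# Observed margin p(a, b, c, d), obtained by summing out the latent U.
joint = {
    (a, b, c, d): sum(
        p_u[u] * p_a[a] * p_b[a, u][b] * p_c[b][c] * p_d[c, u][d]
        for u in range(N_U)
    )
    for a, b, c, d in itertools.product(BIN, repeat=4)
}

def verma_kernel(a, c, d):
    """Compute sum_b p(b | a) * p(d | a, b, c) from the observed margin only."""
    p_a_marg = sum(
        joint[a, b2, c2, d2] for b2, c2, d2 in itertools.product(BIN, repeat=3)
    )
    total = 0.0
    for b in BIN:
        p_ab = sum(joint[a, b, c2, d2] for c2 in BIN for d2 in BIN)
        p_abc = sum(joint[a, b, c, d2] for d2 in BIN)
        total += (p_ab / p_a_marg) * (joint[a, b, c, d] / p_abc)
    return total

# If the constraint holds, the kernel takes the same value for a = 0 and a = 1.
max_gap = max(
    abs(verma_kernel(0, c, d) - verma_kernel(1, c, d))
    for c in BIN
    for d in BIN
)
print(f"largest dependence on a: {max_gap:.2e}")
```

The gap should vanish up to floating-point error for any choice of N_U, which fits the abstract's setting where no assumption is made about the latent state-space. This only illustrates a single equality constraint on one assumed graph; it says nothing about the inequality constraints that the abstract sets aside.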