Amortized learning of neural causal representations

08/21/2020 ∙ by Nan Rosemary Ke, et al.

Causal models can compactly and efficiently encode the data-generating process under all interventions and hence may generalize better under changes in distribution. These models are often represented as Bayesian networks, and learning them scales poorly with the number of variables. Moreover, these approaches cannot leverage previously learned knowledge to help with learning new causal models. To tackle these challenges, we present a novel algorithm called causal relational networks (CRN) for learning causal models using neural networks. CRNs represent causal models using continuous representations and hence could scale much better with the number of variables. These models also take in previously learned information to facilitate learning of new causal models. Finally, we propose a decoding-based metric to evaluate causal models with continuous representations. We test our method on synthetic data, achieving high accuracy and quick adaptation to previously unseen causal models.

Comments

Robert R Tucci

Your definition of Bayesian networks is too limited. Bayesian networks can have continuous nodes and also deterministic nodes. Such nodes have been used by B net practitioners since the beginning of B nets. In the continuous node case, one assigns to the node a transition matrix that is a probability density instead of a discrete probability distribution. Andrew Gelman (Columbia Univ.) has been using continuous nodes in his B nets his entire career. As for deterministic nodes, if the node outputs y and the input is x, then the transition probability matrix for the node is \delta(y, f(x)), where \delta is either the Kronecker or the Dirac delta function, and f(\cdot) is a function of x. A delta function is a perfectly legal probability distribution.

So the distinctions you are making are fallacious. Neural nets are Bayesian networks too! They are a very narrow class of B nets in which all of the nodes are deterministic.
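
The deterministic-node construction described in the comment above can be sketched for the discrete (Kronecker delta) case. This is only an illustration under stated assumptions: the helper `deterministic_cpt`, the example function `step`, and the chosen value sets are hypothetical and do not come from the paper or the comment.

```python
def deterministic_cpt(f, x_values, y_values):
    """Build P(y | x) for a discrete node that computes y = f(x).

    Each entry is a Kronecker delta: 1.0 if y == f(x), otherwise 0.0.
    Assumes f(x) lies in y_values for every x in x_values.
    """
    return {
        x: {y: 1.0 if y == f(x) else 0.0 for y in y_values}
        for x in x_values
    }


# Hypothetical deterministic node: thresholds an integer input at zero.
def step(x):
    return 1 if x > 0 else 0


cpt = deterministic_cpt(step, x_values=[-1, 0, 1, 2], y_values=[0, 1])

# Each conditional distribution P(. | x) sums to 1, so it is a valid
# probability distribution even though the node is deterministic.
for x, row in cpt.items():
    assert sum(row.values()) == 1.0
```

Each row of `cpt` puts all of its probability mass on the single value `f(x)`, which is the sense in which a deterministic node is a special case of a probabilistic one.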
