Hard and Soft EM in Bayesian Network Learning from Incomplete Data

12/09/2020
by   Andrea Ruggieri, et al.

Incomplete data are a common feature in many domains, from clinical trials to industrial applications. Bayesian networks (BNs) are often used in these domains because of their graphical and causal interpretations. BN parameter learning from incomplete data is usually implemented with the Expectation-Maximisation algorithm (EM), which computes the relevant sufficient statistics ("soft EM") using belief propagation. Similarly, the Structural Expectation-Maximisation algorithm (Structural EM) learns the network structure of the BN from those sufficient statistics using algorithms designed for complete data. However, practical implementations of parameter and structure learning often impute missing data ("hard EM") to compute sufficient statistics instead of using belief propagation, for both ease of implementation and computational speed. In this paper, we investigate the question: what is the impact of using imputation instead of belief propagation on the quality of the resulting BNs? From a simulation study using synthetic data and reference BNs, we find that it is possible to recommend one approach over the other in several scenarios based on the characteristics of the data. We then use this information to build a simple decision tree to guide practitioners in choosing the EM algorithm best suited to their problem.
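The distinction between soft and hard EM can be illustrated with a minimal sketch. It assumes a toy two-node network X → Y with binary variables, where X is sometimes missing and Y is always observed. The data set, the starting values, and the function names (`posterior_x1`, `em`) are hypothetical and are not taken from the paper. In a network this small, the posterior that belief propagation would return reduces to Bayes' rule, which is what the sketch uses.

```python
def posterior_x1(y, p_x, p_y):
    """P(X=1 | Y=y) under the current parameters (Bayes' rule)."""
    def like(x):
        # P(Y=y | X=x), where p_y[x] stores P(Y=1 | X=x)
        return p_y[x] if y == 1 else 1 - p_y[x]
    num = p_x * like(1)
    return num / (num + (1 - p_x) * like(0))


def em(data, hard, iters=50, p_x=0.5, p_y=(0.3, 0.7)):
    """Toy EM for X -> Y; rows are (x, y) and x may be None (missing)."""
    p_y = list(p_y)
    for _ in range(iters):
        # E-step: accumulate the sufficient statistics n[x][y]
        n = [[0.0, 0.0], [0.0, 0.0]]
        for x, y in data:
            if x is None:
                w1 = posterior_x1(y, p_x, p_y)
                if hard:
                    # hard EM: impute the single most probable value
                    w1 = 1.0 if w1 >= 0.5 else 0.0
                # soft EM keeps w1 fractional (expected counts)
                n[1][y] += w1
                n[0][y] += 1 - w1
            else:
                n[x][y] += 1.0
        # M-step: maximum-likelihood estimates from the (expected) counts
        total = sum(n[0]) + sum(n[1])
        p_x = sum(n[1]) / total
        p_y = [n[x][1] / sum(n[x]) for x in (0, 1)]
    return p_x, p_y


# Hypothetical data: 8 complete rows and 4 rows with X missing
data = (
    [(1, 1)] * 3 + [(1, 0)] + [(0, 0)] * 3 + [(0, 1)]
    + [(None, 1)] * 2 + [(None, 0)] * 2
)

soft_px, soft_py = em(data, hard=False)
hard_px, hard_py = em(data, hard=True)
print("soft EM  P(Y=1|X=1):", round(soft_py[1], 4))
print("hard EM  P(Y=1|X=1):", round(hard_py[1], 4))
```

On this toy data set, soft EM settles at P(Y=1 | X=1) = 0.75, the estimate from the complete rows alone, while hard EM settles at 5/6 because every imputed value reinforces the currently most probable configuration. This shows one way the two approaches can diverge; it says nothing about which performs better in the scenarios the paper studies.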


research
01/30/2013

The Bayesian Structural EM Algorithm

In recent years there has been a flurry of works on learning Bayesian ne...
research
02/07/2018

Efficient Learning of Bounded-Treewidth Bayesian Networks from Complete and Incomplete Data Sets

Learning a Bayesian networks with bounded treewidth is important for red...
research
04/29/2020

Identifiability and Consistency of Bayesian Network Structure Learning from Incomplete Data

Bayesian network (BN) structure learning from complete data has been ext...
research
04/07/2012

The threshold EM algorithm for parameter learning in bayesian network with incomplete data

Bayesian networks (BN) are used in a big range of applications but they ...
research
02/06/2013

Learning Bayesian Networks from Incomplete Databases

Bayesian approaches to learn the graphical structure of Bayesian Belief ...
research
06/09/2021

EMFlow: Data Imputation in Latent Space via EM and Deep Flow Models

High dimensional incomplete data can be found in a wide range of systems...
research
01/23/2013

Discovering the Hidden Structure of Complex Dynamic Systems

Dynamic Bayesian networks provide a compact and natural representation f...
