MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

05/27/2022
by Erdun Gao, et al.

State-of-the-art causal discovery methods usually assume that the observational data are complete. However, missing data are pervasive in many practical settings, such as clinical trials, economics, and biology. A straightforward remedy is to first impute the data with an off-the-shelf imputation method and then apply an existing causal discovery method. Such a two-step approach can be suboptimal, however, because the imputation algorithm is unaware of the causal discovery step. In this paper, we develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations. Focusing mainly on the assumptions of ignorable missingness and identifiable additive noise models (ANMs), MissDAG maximizes the expected likelihood of the visible part of the observations under the expectation-maximization (EM) framework. In the E-step, when the posterior distributions of parameters cannot be computed in closed form, Monte Carlo EM is used to approximate the likelihood. In the M-step, MissDAG uses the density transformation to model the noise distributions with simpler, specific formulations by virtue of the ANMs, and applies a likelihood-based causal discovery algorithm with a directed acyclic graph prior as an inductive bias. We demonstrate the flexibility of MissDAG in incorporating various causal discovery algorithms, and its efficacy, through extensive simulations and real-data experiments.
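
To make the EM idea in the abstract concrete, here is a minimal sketch of the simplest setting, a jointly Gaussian model with ignorable missingness, where the E-step is available in closed form. It is not the authors' implementation. The function name `em_gaussian_missing` and the parameters `n_iter` and `ridge` are assumptions made for illustration, and missing entries are assumed to be encoded as NaN. The M-step here only re-estimates a mean and covariance; in a MissDAG-style method the expected sufficient statistics would instead be passed to a likelihood-based DAG learner.

```python
import numpy as np


def em_gaussian_missing(X, n_iter=50, ridge=1e-6):
    """EM estimate of the mean and covariance of Gaussian data with NaNs.

    Illustrative only: assumes ignorable missingness and that every
    column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    miss = np.isnan(X)

    # Initialise from the observed entries of each column.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0) + ridge)

    for _ in range(n_iter):
        sum_x = np.zeros(d)
        sum_xx = np.zeros((d, d))

        # E-step: expected sufficient statistics given the observed part.
        for i in range(n):
            m = miss[i]
            o = ~m
            x_hat = X[i].copy()
            cov_corr = np.zeros((d, d))
            if m.any():
                if o.any():
                    s_oo = sigma[np.ix_(o, o)]
                    s_mo = sigma[np.ix_(m, o)]
                    # k = Sigma_mo @ inv(Sigma_oo), via a linear solve.
                    k = np.linalg.solve(s_oo, s_mo.T).T
                    # Conditional mean of the missing entries.
                    x_hat[m] = mu[m] + k @ (X[i, o] - mu[o])
                    # Conditional covariance of the missing entries.
                    cov_corr[np.ix_(m, m)] = sigma[np.ix_(m, m)] - k @ s_mo.T
                else:
                    # Nothing observed: fall back to the current marginal.
                    x_hat[m] = mu[m]
                    cov_corr = sigma.copy()
            sum_x += x_hat
            sum_xx += np.outer(x_hat, x_hat) + cov_corr

        # M-step: closed-form update of the Gaussian parameters.
        mu = sum_x / n
        sigma = sum_xx / n - np.outer(mu, mu)
        # Symmetrise and add a small ridge for numerical stability.
        sigma = (sigma + sigma.T) / 2 + ridge * np.eye(d)

    return mu, sigma
```

Calling `em_gaussian_missing(X)` on an array with NaN entries returns the estimated mean vector and covariance matrix. Unlike impute-then-estimate, the E-step keeps the conditional covariance of the missing entries rather than treating imputed values as if they had been observed, which is the kind of information a two-step pipeline discards.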

research
01/15/2020

Causal Discovery from Incomplete Data: A Deep Learning Approach

As systems are getting more autonomous with the development of artificia...
research
09/04/2019

Likelihood-Free Overcomplete ICA and Applications in Causal Discovery

Causal discovery witnessed significant progress over the past decades. I...
research
06/09/2020

Causal Discovery from Incomplete Data using An Encoder and Reinforcement Learning

Discovering causal structure among a set of variables is a fundamental p...
research
02/06/2018

An Imputation-Consistency Algorithm for High-Dimensional Missing Data Problems and Beyond

Missing data are frequently encountered in high-dimensional problems, bu...
research
01/09/2022

Causal Discovery from Sparse Time-Series Data Using Echo State Network

Causal discovery between collections of time-series data can help diagno...
research
05/17/2023

Causal Discovery with Missing Data in a Multicentric Clinical Study

Causal inference for testing clinical hypotheses from observational data...
research
06/17/2020

Analytical Probability Distributions and EM-Learning for Deep Generative Networks

Deep Generative Networks (DGNs) with probabilistic modeling of their out...
