Causal discovery in the presence of missing data

07/11/2018
by Ruibo Tu, et al.

Missing data are ubiquitous in many domains, such as healthcare. Depending on how the values are missing, the (conditional) independence relations in the observed data may differ from those in the complete data generated by the underlying causal process. As a consequence, simply applying existing causal discovery methods to the observed data may lead to wrong conclusions. It is therefore essential to extend existing causal discovery approaches so that they find the true underlying causal structure from such incomplete data. In this paper, we address this problem for data that are missing under different mechanisms, including missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). With missingness mechanisms represented by missingness graphs (m-graphs), we analyze the conditions under which additional correction is needed to derive conditional independence/dependence relations in the complete data. Based on this analysis, we propose missing value PC (MVPC), which combines additional corrections with a traditional causal discovery algorithm, namely PC. We show theoretically that MVPC gives asymptotically correct results even on data that are MAR or MNAR. Experimental results illustrate that the proposed algorithm recovers the correct conditional independence relations for values that are MCAR, MAR, and, in rather general cases, MNAR, on both synthetic data and a real-life healthcare application.
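To make the first claim concrete, here is a minimal sketch of how deleting incomplete rows can change an independence relation. It is an illustrative toy simulation, not the paper's MVPC algorithm or one of its experiments. The variables `x`, `y`, `w`, the function name `simulate`, and the missingness rule are all assumptions chosen for illustration: `x` and `y` are independent causes of `w`, and `y` is recorded only when `w` is at most zero. Because the missingness of `y` depends only on the fully observed `w`, this is a MAR setting.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Return corr(x, y) in the complete data and after deleting rows with missing y.

    Hypothetical toy model (an assumption, not taken from the paper):
      x, y ~ independent standard normals
      w = x + y + 0.5 * noise      (w is a common effect of x and y)
      y is missing whenever w > 0  (missingness depends on the observed w)
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    w = x + y + 0.5 * rng.standard_normal(n)

    # Missingness indicator for y: True where y is recorded.
    y_observed = w <= 0

    # Correlation in the complete data, where x and y are independent.
    corr_complete = np.corrcoef(x, y)[0, 1]

    # Correlation after dropping every row in which y is missing.
    corr_deleted = np.corrcoef(x[y_observed], y[y_observed])[0, 1]
    return corr_complete, corr_deleted


corr_complete, corr_deleted = simulate()
print(f"complete data:        corr(x, y) = {corr_complete:+.3f}")
print(f"rows with y observed: corr(x, y) = {corr_deleted:+.3f}")
```

In this sketch the complete-data correlation is close to zero, while the correlation among the retained rows is clearly negative, because keeping only rows with `w <= 0` selects on a common effect of `x` and `y`. An independence test run on the retained rows would therefore tend to report a dependence that is absent from the complete data, which is the kind of situation the abstract describes as needing correction.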

