Directed Graphical Models and Causal Discovery for Zero-Inflated Data

by Shiqing Yu, et al.

Modern RNA sequencing technologies provide gene expression measurements from single cells that promise refined insights into regulatory relationships among genes. Directed graphical models are well-suited to exploring such (cause-effect) relationships. However, statistical analyses of single-cell data are complicated by the fact that the data often show zero-inflated expression patterns. To address this challenge, we propose directed graphical models that are based on Hurdle conditional distributions parametrized in terms of polynomials in parent variables and their 0/1 indicators of being zero or nonzero. While directed graphs for Gaussian models are only identifiable up to an equivalence class in general, we show that, under a natural and weak assumption, the exact directed acyclic graph of our zero-inflated models can be identified. We propose methods for graph recovery, apply our model to real single-cell RNA-seq data on T helper cells, and present simulation experiments that validate the identifiability results and graph estimation methods in practice.
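
To make the idea of a Hurdle conditional distribution concrete, the following is a minimal simulation sketch for a two-node graph X1 -> X2. It is not the parametrization used in the paper: the function names, the logistic link for the zero/nonzero part, the Gaussian nonzero part, and all coefficient values are assumptions chosen for illustration. Both parts depend on a degree-one polynomial in the parent's value and its 0/1 nonzero indicator.

```python
import numpy as np


def sample_hurdle_node(parent, rng, a=(-0.5, 1.0, 0.8), b=(1.0, 0.5, 0.6), sigma=1.0):
    """Draw a child variable from a toy hurdle conditional given one parent.

    All coefficients are hypothetical; they are not taken from the paper.
    """
    ind = (parent != 0).astype(float)  # 0/1 indicator: parent is nonzero
    # Zero/nonzero part: logistic in a degree-1 polynomial of (indicator, value).
    logit = a[0] + a[1] * ind + a[2] * parent
    p_nonzero = 1.0 / (1.0 + np.exp(-logit))
    nonzero = rng.random(parent.shape) < p_nonzero
    # Nonzero part: Gaussian whose mean uses the same polynomial form.
    mean = b[0] + b[1] * ind + b[2] * parent
    values = rng.normal(mean, sigma)
    return np.where(nonzero, values, 0.0)


def simulate(n, seed=0):
    """Simulate n samples from the toy two-node DAG X1 -> X2."""
    rng = np.random.default_rng(seed)
    # Root node: exact zero with probability 0.4, otherwise Gaussian.
    root_nonzero = rng.random(n) < 0.6
    x1 = np.where(root_nonzero, rng.normal(1.0, 1.0, n), 0.0)
    x2 = sample_hurdle_node(x1, rng)
    return x1, x2


if __name__ == "__main__":
    x1, x2 = simulate(5000)
    print("share of zeros in X1:", np.mean(x1 == 0))
    print("share of zeros in X2:", np.mean(x2 == 0))
```

In this sketch the child is more often nonzero when the parent is nonzero, so the zero pattern itself carries information about the parent, in addition to the dependence among the nonzero values.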

