Nonparametric causal structure learning in high dimensions

by Shubhadeep Chakraborty et al.

The PC and FCI algorithms are popular constraint-based methods for learning the structure of directed acyclic graphs (DAGs) in the absence and presence, respectively, of latent and selection variables. These algorithms (and their order-independent variants, PC-stable and FCI-stable) have been shown to be consistent for learning sparse high-dimensional DAGs based on partial correlations. However, inferring conditional independences from partial correlations is valid only if the data are jointly Gaussian or generated from a linear structural equation model, an assumption that may be violated in many applications. To broaden the scope of high-dimensional causal structure learning, we propose nonparametric variants of the PC-stable and FCI-stable algorithms that employ the conditional distance covariance (CdCov) to test for conditional independence relationships. As the key theoretical contribution, we prove that the high-dimensional consistency of the PC-stable and FCI-stable algorithms carries over to general distributions over DAGs when we implement CdCov-based nonparametric tests for conditional independence. Numerical studies demonstrate that our proposed algorithms perform nearly as well as PC-stable and FCI-stable for Gaussian distributions, and offer advantages in non-Gaussian graphical models.
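To illustrate why a distance-covariance-style test broadens the scope beyond partial correlations, the sketch below implements the (unconditional) squared sample distance covariance of Székely, Rizzo, and Bakirov with a permutation test. This is a simplified stand-in, not the paper's CdCov statistic, which additionally handles a conditioning set; the point shown is that a quadratic dependence invisible to Pearson correlation is detected by distance covariance. All variable names and the toy data are illustrative assumptions.

```python
import numpy as np

def dcov_sq(x, y):
    """Squared sample distance covariance (V-statistic form).

    Double-centers the pairwise Euclidean distance matrices of x and y
    and averages their elementwise product; zero iff X and Y are
    independent in the population limit.
    """
    x = np.asarray(x, float).reshape(len(x), -1)
    y = np.asarray(y, float).reshape(len(y), -1)
    a = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    b = np.linalg.norm(y[:, None, :] - y[None, :, :], axis=-1)
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    return (A * B).mean()

def perm_pvalue(x, y, n_perm=199, seed=0):
    """Permutation p-value for H0: X independent of Y."""
    rng = np.random.default_rng(seed)
    stat = dcov_sq(x, y)
    exceed = sum(dcov_sq(x, y[rng.permutation(len(y))]) >= stat
                 for _ in range(n_perm))
    return (1 + exceed) / (1 + n_perm)

rng = np.random.default_rng(42)
x = rng.standard_normal(300)
y_dep = x**2 + 0.1 * rng.standard_normal(300)  # nonlinear dependence
y_ind = rng.standard_normal(300)               # independent noise

# Pearson correlation is near zero for the quadratic relation,
# while the distance-covariance permutation test flags it.
r_dep = np.corrcoef(x, y_dep)[0, 1]
p_dep = perm_pvalue(x, y_dep)
p_ind = perm_pvalue(x, y_ind)
```

In a PC-stable or FCI-stable skeleton search, such a test would replace the partial-correlation test as the conditional-independence oracle deciding which edges to delete; the paper's contribution is proving that high-dimensional consistency survives this substitution when the CdCov-based test is used.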



On the Logic of Causal Models

This paper explores the role of Directed Acyclic Graphs (DAGs) as a repr...

The Dual PC Algorithm for Structure Learning

While learning the graphical structure of Bayesian networks from observa...

Learning Directed Acyclic Graphs with Penalized Neighbourhood Regression

We study a family of regularized score-based estimators for learning the...

A Simple Unified Approach to Testing High-Dimensional Conditional Independences for Categorical and Ordinal Data

Conditional independence (CI) tests underlie many approaches to model te...

A Scalable Conditional Independence Test for Nonlinear, Non-Gaussian Data

Many relations of scientific interest are nonlinear, and even in linear ...

Sparse Additive Functional and Kernel CCA

Canonical Correlation Analysis (CCA) is a classical tool for finding cor...

A Distribution-Free Independence Test for High Dimension Data

Test of independence is of fundamental importance in modern data analysi...