Distributed Learning of Generalized Linear Causal Networks

01/23/2022
by Qiaoling Ye, et al.

We consider the task of learning causal structures from data stored on multiple machines, and propose a novel structure learning method, distributed annealing on regularized likelihood score (DARLS), to solve this problem. We model causal structures by a directed acyclic graph parameterized with generalized linear models, so that our method is applicable to various types of data. To obtain a high-scoring causal graph, DARLS simulates an annealing process to search over the space of topological sorts, where the optimal graphical structure compatible with a sort is found by a distributed optimization method. This distributed optimization relies on multiple rounds of communication between local and central machines to estimate the optimal structure. We establish its convergence to a global optimizer of the overall score, which is computed on all data across local machines. To the best of our knowledge, DARLS is the first distributed method for learning causal graphs with such theoretical guarantees. In extensive simulation studies, DARLS performs competitively against existing methods on distributed data, and achieves structure learning accuracy and test-data likelihood comparable to those of competing methods applied to data pooled across all local machines. In a real-world application to modeling protein-DNA binding networks with distributed ChIP-Sequencing data, DARLS also exhibits higher predictive power than other methods, demonstrating its advantage in estimating causal networks from distributed data.
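
The search over topological sorts described above can be illustrated with a minimal sketch of simulated annealing over node orderings. This is not the DARLS implementation: the function names, the adjacent-swap proposal, the geometric cooling schedule, and the toy score (an inversion count standing in for the distributed regularized likelihood score) are all assumptions made for illustration.

```python
import math
import random


def inversions(order):
    """Toy score (assumption): number of out-of-order pairs; lower is better.

    In the setting of the abstract, this would be replaced by the regularized
    likelihood score of the best graph compatible with the ordering.
    """
    n = len(order)
    return sum(1 for a in range(n) for b in range(a + 1, n) if order[a] > order[b])


def anneal_over_orders(score, n_nodes, n_iter=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over permutations of range(n_nodes), minimizing score."""
    rng = random.Random(seed)
    order = list(range(n_nodes))
    rng.shuffle(order)
    current = score(order)
    best_order, best = order[:], current
    temp = t0
    for _ in range(n_iter):
        # Propose swapping two adjacent nodes in the ordering.
        i = rng.randrange(n_nodes - 1)
        cand = order[:]
        cand[i], cand[i + 1] = cand[i + 1], cand[i]
        cand_score = score(cand)
        delta = cand_score - current
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            order, current = cand, cand_score
            if current < best:
                best_order, best = order[:], current
        temp *= cooling  # geometric cooling schedule (assumption)
    return best_order, best


best_order, best = anneal_over_orders(inversions, n_nodes=6)
print(best_order, best)
```

Passing the score as a callable keeps the annealing loop separate from how an ordering is evaluated, so a score computed through rounds of communication between local and central machines could in principle be plugged in without changing the search.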


research
04/28/2019

Optimizing regularized Cholesky score for order-based learning of Bayesian networks

Bayesian networks are a class of popular graphical models that encode ca...
research
07/22/2021

Efficient Neural Causal Discovery without Acyclicity Constraints

Learning the structure of a causal graphical model using both observatio...
research
01/17/2020

Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates

Distributed statistical inference has recently attracted immense attenti...
research
11/03/2021

Multi-task Learning of Order-Consistent Causal Graphs

We consider the problem of discovering K related Gaussian directed acycl...
research
08/19/2021

Structure Learning for Directed Trees

Knowing the causal structure of a system is of fundamental interest in m...
research
11/30/2022

Directed Acyclic Graph Structure Learning from Dynamic Graphs

Estimating the structure of directed acyclic graphs (DAGs) of features (...
research
07/19/2022

ReBoot: Distributed statistical learning via refitting Bootstrap samples

In this paper, we study a one-shot distributed learning algorithm via re...
