Communication-Avoiding Optimization Methods for Massive-Scale Graphical Model Structure Learning

10/30/2017 ∙ by Penporn Koanantakool, et al.

Undirected graphical models compactly represent the structure of large, high-dimensional data sets, which is especially important in interpreting complex scientific data. Some data sets run to multiple terabytes, a scale at which current methods are intractable in both memory and running time. We introduce HP-CONCORD, a highly scalable optimization algorithm that estimates a sparse inverse covariance matrix within a regularized pseudolikelihood framework. Our parallel proximal gradient method runs across a multi-node cluster and achieves parallel scalability using a novel communication-avoiding linear algebra algorithm. We demonstrate scalability on problems with 1.28 million dimensions (over 800 billion parameters) and show that HP-CONCORD can outperform a previous method on a single node and scale to 1K nodes (24K cores). We use HP-CONCORD to estimate the underlying conditional dependency structure of the brain from fMRI data and use the result to automatically identify functional regions. The results show good agreement with a state-of-the-art clustering from the neuroscience literature.
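To make the idea of a proximal gradient method for sparse inverse covariance estimation concrete, here is a minimal single-node sketch in NumPy. It is not the HP-CONCORD implementation and contains none of its communication-avoiding or distributed machinery. The abstract does not state the objective, so the sketch assumes a CONCORD-style pseudolikelihood of the form -Σ_i log ω_ii + ½ tr(Ω S Ω) + λ‖Ω_offdiag‖₁, where S is the sample covariance. The function names (`concord_sketch`, `smooth_part`, and so on), the backtracking rule, and the default parameters are all illustrative choices rather than anything taken from the paper.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def smooth_part(omega, s):
    """Assumed smooth term: -sum_i log(omega_ii) + 0.5 * tr(omega s omega)."""
    return -np.sum(np.log(np.diag(omega))) + 0.5 * np.trace(omega @ s @ omega)


def smooth_grad(omega, s):
    """Gradient of the assumed smooth term for symmetric omega and s."""
    return -np.diag(1.0 / np.diag(omega)) + 0.5 * (s @ omega + omega @ s)


def prox_offdiag(z, t):
    """Soft-threshold off-diagonal entries only; the diagonal is unpenalized."""
    out = soft_threshold(z, t)
    np.fill_diagonal(out, np.diag(z))
    return out


def concord_sketch(x, lam, n_iter=200, tol=1e-6):
    """Proximal gradient with backtracking on the assumed objective.

    x: (n, p) data matrix. lam: L1 penalty weight (hypothetical parameter).
    Returns a (p, p) estimate of the sparse inverse covariance matrix.
    """
    n, p = x.shape
    xc = x - x.mean(axis=0)
    s = xc.T @ xc / n                      # sample covariance
    omega = np.eye(p)                      # start from the identity
    for _ in range(n_iter):
        g = smooth_grad(omega, s)
        f_old = smooth_part(omega, s)
        step = 1.0
        while True:
            cand = prox_offdiag(omega - step * g, step * lam)
            d = cand - omega
            if np.all(np.diag(cand) > 0):  # log terms need a positive diagonal
                bound = f_old + np.sum(g * d) + np.sum(d * d) / (2.0 * step)
                if smooth_part(cand, s) <= bound:
                    break                  # sufficient decrease reached
            step *= 0.5                    # otherwise shrink the step
        omega = cand
        if np.max(np.abs(d)) < tol:
            break
    return omega
```

Each iteration costs a few dense p × p matrix products, which is the part that becomes infeasible on one machine at the problem sizes quoted above and that a distributed method has to parallelize. A call such as `concord_sketch(x, lam=0.1)` on a small `(n, p)` array returns a symmetric estimate whose off-diagonal zeros indicate pairs of variables treated as conditionally independent.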





