Fast Parallel Bayesian Network Structure Learning

12/08/2022
by   Jiantong Jiang, et al.

Bayesian networks (BNs) are widely used graphical models in machine learning for representing knowledge under uncertainty. The mainstream BN structure learning methods require performing a large number of conditional independence (CI) tests. The learning process is very time-consuming, especially for high-dimensional problems, which hinders the adoption of BNs in more applications. Existing works attempt to accelerate the learning process with parallelism, but face issues including load imbalance, costly atomic operations and dominant parallel overhead. In this paper, we propose a fast solution named Fast-BNS on multi-core CPUs to enhance the efficiency of BN structure learning. Fast-BNS is powered by a series of efficiency optimizations including (i) designing a dynamic work pool to monitor the processing of edges and to better schedule the workloads among threads, (ii) grouping the CI tests of the edges with the same endpoints to reduce the number of unnecessary CI tests, (iii) using a cache-friendly data storage to improve the memory efficiency, and (iv) generating the conditioning sets on-the-fly to avoid extra memory consumption. A comprehensive experimental study shows that the sequential version of Fast-BNS is up to 50 times faster than its counterpart, and the parallel version of Fast-BNS achieves 4.8 to 24.5 times speedup over the state-of-the-art multi-threaded solution. Moreover, Fast-BNS scales well with both network size and sample size. The Fast-BNS source code is freely available at https://github.com/jjiantong/FastBN.
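
To make optimization (iv) more concrete, the following is a minimal sketch of one way conditioning sets could be generated on the fly rather than stored. It is an illustration under stated assumptions, not the Fast-BNS implementation: the function name `kth_conditioning_set`, its parameters, and the choice of lexicographic ordering are all hypothetical. The idea is that the k-th subset of a given size can be computed directly from its index, so no list of subsets has to be held in memory.

```python
from math import comb


def kth_conditioning_set(candidates, size, k):
    """Return the k-th (0-based, lexicographic) subset of `candidates` with `size` elements.

    Hypothetical helper for illustration only; it is not taken from Fast-BNS.
    """
    n = len(candidates)
    if not 0 <= k < comb(n, size):
        raise IndexError("k is out of range for the requested subset size")

    chosen = []
    start = 0          # position of the next candidate to consider
    remaining = size   # how many elements still have to be picked
    while remaining > 0:
        # Number of subsets that include candidates[start] as the next element.
        count = comb(n - start - 1, remaining - 1)
        if k < count:
            chosen.append(candidates[start])
            remaining -= 1
        else:
            # Skip every subset that would have used candidates[start] here.
            k -= count
        start += 1
    return tuple(chosen)


# Example: conditioning sets of size 2 drawn from four neighbouring variables.
neighbours = ["A", "B", "C", "D"]
print(kth_conditioning_set(neighbours, 2, 3))  # ('B', 'C')
```

Because each subset depends only on its index, a worker could in principle be handed a range of indices and derive its conditioning sets independently, with memory use that does not grow with the number of subsets.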

