BlueFog: Make Decentralized Algorithms Practical for Optimization and Deep Learning

11/08/2021
by Bicheng Ying, et al.

A decentralized algorithm is a form of computation that achieves a global goal through local dynamics relying on low-cost communication between directly connected agents. On large-scale optimization tasks involving distributed datasets, decentralized algorithms have shown strong, and sometimes superior, performance over distributed algorithms with a central node. Recently, developing decentralized algorithms for deep learning has attracted great attention. They are considered low-communication-overhead alternatives to approaches based on a parameter server or the Ring-Allreduce protocol. However, the lack of an easy-to-use and efficient software package has kept most decentralized algorithms merely on paper. To fill the gap, we introduce BlueFog, a Python library for straightforward, high-performance implementations of diverse decentralized algorithms. Based on a unified abstraction of various communication operations, BlueFog offers intuitive interfaces to implement a spectrum of decentralized algorithms, from those using a static, undirected graph for synchronous operations to those using dynamic and directed graphs for asynchronous operations. BlueFog also adopts several system-level acceleration techniques to further optimize performance on deep learning tasks. On mainstream DNN training tasks, BlueFog reaches a much higher throughput and achieves an overall 1.2× to 1.8× speedup over Horovod, a state-of-the-art distributed deep learning package based on Ring-Allreduce. BlueFog is open source at https://github.com/Bluefog-Lib/bluefog.
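To illustrate the core idea of "a global goal through local dynamics with neighbor-only communication," here is a minimal single-process sketch of decentralized (gossip) averaging over a static undirected ring graph. This is a toy simulation, not BlueFog's actual API; the ring topology and uniform mixing weights are illustrative assumptions.

```python
# Toy sketch of decentralized averaging over a static ring graph.
# Each agent talks only to its two direct neighbors, yet every agent
# converges to the global average -- no central node is involved.
# This simulates the dynamics in one process; it does NOT use BlueFog.

def ring_neighbors(i, n):
    """Left and right neighbors of agent i on a ring of n agents."""
    return [(i - 1) % n, (i + 1) % n]

def gossip_average(values, steps):
    """Repeatedly replace each agent's value with the uniform average
    of its own value and its two neighbors' values."""
    n = len(values)
    x = list(values)
    for _ in range(steps):
        x = [
            (x[i] + sum(x[j] for j in ring_neighbors(i, n))) / 3.0
            for i in range(n)
        ]
    return x

# Five agents start with different local values; repeated neighbor
# averaging drives all of them toward the global mean (3.0 here).
result = gossip_average([1.0, 2.0, 3.0, 4.0, 5.0], steps=100)
```

The mixing step corresponds to multiplying by a symmetric doubly stochastic matrix, so the iterates converge geometrically to the average of the initial values; synchronous decentralized optimizers interleave such mixing steps with local gradient updates.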

