Decentralized Learning Made Easy with DecentralizePy

04/17/2023
by Akash Dhasade, et al.

Decentralized learning (DL) has gained prominence for its potential benefits in scalability, privacy, and fault tolerance. A DL system consists of many nodes that coordinate without a central server and exchange millions of parameters during the inherently iterative process of machine learning (ML) training. In addition, these nodes are connected in complex and potentially dynamic topologies. Assessing the intricate dynamics of such networks is not an easy task. In the literature, researchers often resort to simulated environments that do not scale and fail to capture crucial practical behaviors, including those associated with parallelism, data transfer, network delays, and wall-clock time. In this paper, we propose DecentralizePy, a distributed framework for decentralized ML that allows the emulation of large-scale learning networks in arbitrary topologies. We demonstrate the capabilities of DecentralizePy by deploying techniques such as sparsification and secure aggregation on top of several topologies, including dynamic networks with more than one thousand nodes.
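
To make the idea of nodes coordinating without a central server concrete, the following is a minimal sketch of iterative neighbor averaging on a fixed ring topology. It is not the DecentralizePy API: the function names (`ring_neighbors`, `averaging_round`, `run`), the ring topology, and the one-dimensional "model" held by each node are all assumptions chosen for illustration, and the sketch runs in a single process rather than across machines.

```python
# Illustrative sketch only: this is NOT the DecentralizePy API.
# Each node holds a parameter vector and repeatedly averages it with its
# neighbors in a fixed ring topology (no central server involved).


def ring_neighbors(node, num_nodes):
    """Return the two neighbors of `node` in a ring of `num_nodes` nodes."""
    return [(node - 1) % num_nodes, (node + 1) % num_nodes]


def averaging_round(params, neighbors_of):
    """One synchronous round: every node averages its vector with its neighbors'."""
    new_params = []
    for node, own in enumerate(params):
        # Collect this node's vector together with the vectors it "receives".
        group = [own] + [params[n] for n in neighbors_of(node)]
        # Element-wise mean across the group.
        averaged = [sum(values) / len(group) for values in zip(*group)]
        new_params.append(averaged)
    return new_params


def run(num_nodes=8, rounds=50):
    """Run several averaging rounds and return every node's final vector."""
    # Hypothetical initial state: node i starts with the 1-D "model" [i].
    params = [[float(i)] for i in range(num_nodes)]
    for _ in range(rounds):
        params = averaging_round(params, lambda n: ring_neighbors(n, num_nodes))
    return params


if __name__ == "__main__":
    for node, vector in enumerate(run()):
        print(node, vector)
```

In this sketch every node's value drifts toward the global mean of the initial values, even though each node only ever reads from its two neighbors. A real training loop would interleave a local gradient step between rounds, and the behaviors the abstract highlights (parallelism, data transfer, network delays, wall-clock time) are exactly what a single-process simulation like this one leaves out.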

research
02/27/2023

Architecting Peer-to-Peer Serverless Distributed Machine Learning Training for Improved Fault Tolerance

Distributed Machine Learning refers to the practice of training a model ...
research
09/11/2023

Practical Homomorphic Aggregation for Byzantine ML

Due to the large-scale availability of data, machine learning (ML) algor...
research
07/29/2023

The effect of network topologies on fully decentralized learning: a preliminary investigation

In a decentralized machine learning system, data is typically partitione...
research
03/02/2022

UAV-Aided Decentralized Learning over Mesh Networks

Decentralized learning empowers wireless network devices to collaborativ...
research
04/06/2022

DeFTA: A Plug-and-Play Decentralized Replacement for FedAvg

Federated learning (FL) is identified as a crucial enabler for large-sca...
research
02/14/2019

Distributed Processes and Scalability in Sub-networks of Large-Scale Networks

Performance of standard processes over large distributed networks typica...
research
03/12/2020

Customized Video QoE Estimation with Algorithm-Agnostic Transfer Learning

The development of QoE models by means of Machine Learning (ML) is chall...
