Fundamental Limits of Distributed Data Shuffling

06/29/2018
by   Kai Wan, et al.
0

Data shuffling of training data among different computing nodes (workers) has been identified as a core element to improve the statistical performance of modern large scale machine learning algorithms. Data shuffling is often considered one of the most significant bottlenecks in such systems due to the heavy communication load. Under a master-worker architecture (where a master has access to the entire dataset and only communications between the master and workers is allowed) coding has been recently proved to considerably reduce the communication load. In this work, we consider a different communication paradigm referred to as distributed data shuffling, where workers, connected by a shared link, are allowed to communicate with one another while no communication between the master and workers is allowed. Under the constraint of uncoded cache placement, we first propose a general coded distributed data shuffling scheme, which achieves the optimal communication load within a factor two. Then, we propose an improved scheme achieving the exact optimality for either large memory size or at most four workers in the system.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/11/2018

On the Fundamental Limits of Coded Data Shuffling for Distributed Learning Systems

We consider the data shuffling problem in a distributed learning system,...
research
02/06/2019

CodedReduce: A Fast and Robust Framework for Gradient Aggregation in Distributed Learning

We focus on the commonly used synchronous Gradient Descent paradigm for ...
research
05/16/2021

LocalNewton: Reducing Communication Bottleneck for Distributed Learning

To address the communication bottleneck problem in distributed optimizat...
research
02/26/2019

On Maintaining Linear Convergence of Distributed Learning and Optimization under Limited Communication

In parallel and distributed machine learning multiple nodes or processor...
research
09/04/2022

Communication Efficient Distributed Learning over Wireless Channels

Vertical distributed learning exploits the local features collected by m...
research
04/12/2023

Vers: fully distributed Coded Computing System with Distributed Encoding

Coded computing has proved to be useful in distributed computing. We hav...
research
09/29/2020

A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning

We design a low complexity decentralized learning algorithm to train a r...

Please sign up or login with your details

Forgot password? Click here to reset