CAMR: Coded Aggregated MapReduce

Many big data algorithms executed on MapReduce-like systems have a shuffle phase that often dominates the overall job execution time. Recent work has demonstrated schemes in which the communication load of the shuffle phase can be traded off against the computation load of the map phase. In this work, we focus on a class of distributed algorithms, broadly used in deep learning, where intermediate computations of the same task can be combined. Even though prior techniques reduce the communication load significantly, they require a number of jobs that grows exponentially in the system parameters. This limitation is critical and may diminish the load gains as the algorithm scales. We propose a new scheme that achieves the same load as the state of the art while ensuring that both the number of jobs and the number of subfiles into which the data set must be split remain small.
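The property that "intermediate computations of the same task can be combined" can be illustrated with a minimal sketch. This is not the CAMR scheme itself. It only shows why combining map outputs shrinks what a node has to send in the shuffle phase. The function names, the toy data, and the choice of summation as the combining operation are all assumptions made for illustration.

```python
# Illustrative sketch only: hypothetical names, toy data, and summation
# as the assumed combining operation. Not the scheme proposed in the paper.

def map_value(q, subfile):
    """Intermediate value of output function q on a single subfile."""
    return sum(x ** (q + 1) for x in subfile)

def shuffle_without_aggregation(q, local_subfiles):
    """Send one intermediate value per locally stored subfile."""
    return [map_value(q, s) for s in local_subfiles]

def shuffle_with_aggregation(q, local_subfiles):
    """Combine the local intermediate values and send a single value."""
    return [sum(map_value(q, s) for s in local_subfiles)]

def reduce_values(received):
    """The reducer only needs the total, so both message forms suffice."""
    return sum(received)

subfiles = [[1, 2], [3, 4], [5, 6]]  # toy data set split into 3 subfiles
q = 1                                # output function: sum of squares

plain = shuffle_without_aggregation(q, subfiles)
combined = shuffle_with_aggregation(q, subfiles)

# Both message forms lead to the same reduced result.
assert reduce_values(plain) == reduce_values(combined)
print(len(plain), len(combined))  # number of values transmitted in each case
```

In this toy setting the aggregated message carries one value instead of one per subfile, while the reducer output is unchanged. The coded schemes discussed in the abstract build on this kind of compressibility, with the details left to the full text.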


