Optimal Communication-Computation Trade-Off in Heterogeneous Gradient Coding

03/02/2021
by Tayyebeh Jahani-Nezhad, et al.

Gradient coding allows a master node to recover the aggregate of the partial gradients, computed by worker nodes over their local data sets, with minimum communication cost and in the presence of stragglers. In this paper, for gradient coding with linear encoding, we characterize the optimum communication cost for heterogeneous distributed systems with arbitrary data placement, with s ∈ ℕ stragglers and a ∈ ℕ adversarial nodes. In particular, we show that the optimum communication cost, normalized by the size of the gradient vectors, is equal to (r-s-2a)^{-1}, where r ∈ ℕ is the minimum number of times that a data partition is replicated. In other words, the communication cost is determined by the data partition with the minimum replication, irrespective of the structure of the placement. The proposed achievable scheme also allows us to target the computation of a polynomial function of the aggregated gradient matrix. It further allows us to borrow some ideas from approximation computing and propose an approximate gradient coding scheme for cases where the replication in the data placement is smaller than what is needed to meet the restriction imposed on the communication cost, or where the number of stragglers turns out to be larger than the value presumed in the system design.
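
As a rough illustration of the stated result, the sketch below evaluates the normalized communication cost (r-s-2a)^{-1} for a given data placement. It is not the paper's coding scheme. The function names, the dictionary layout of the placement, and the example worker labels are all hypothetical, and the sketch assumes every data partition is stored on at least one worker.

```python
from collections import Counter
from fractions import Fraction


def min_replication(placement):
    """Return r, the smallest number of workers holding any one partition.

    `placement` is a hypothetical layout: worker id -> set of partition ids.
    Partitions stored on no worker are not visible here (assumption).
    """
    counts = Counter(p for parts in placement.values() for p in set(parts))
    if not counts:
        raise ValueError("placement holds no data partitions")
    return min(counts.values())


def normalized_comm_cost(placement, s, a):
    """Evaluate (r - s - 2a)^-1 for s stragglers and a adversarial nodes."""
    r = min_replication(placement)
    margin = r - s - 2 * a
    if margin <= 0:
        # The formula only applies when r > s + 2a; the abstract points to
        # an approximate scheme when replication is too small.
        raise ValueError(f"need r > s + 2a, got r={r}, s={s}, a={a}")
    return Fraction(1, margin)


# Heterogeneous example: workers hold different numbers of partitions.
placement = {
    "w1": {0, 1, 2},
    "w2": {0, 1, 3},
    "w3": {0, 2, 3},
    "w4": {1, 2, 3},
    "w5": {0, 1},
}

r = min_replication(placement)                    # partitions 2 and 3 limit r
cost = normalized_comm_cost(placement, s=1, a=0)  # one straggler, no adversary
print(r, cost)
```

In this example partitions 0 and 1 are each stored on four workers, but partitions 2 and 3 are stored on only three, so r = 3 and the cost is set by the least-replicated partitions regardless of how the rest of the placement is arranged.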


02/19/2021

On Gradient Coding with Partial Recovery

We consider a generalization of the recently proposed gradient coding fr...
07/01/2020

Distributed Linearly Separable Computation

This paper formulates a distributed computation problem, where a master ...
05/22/2019

LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning

Gradient-based distributed learning in Parameter Server (PS) computing a...
10/27/2017

Near-Optimal Straggler Mitigation for Distributed Gradient Methods

Modern learning algorithms use gradient descent updates to train inferen...
06/08/2020

Adaptive Gradient Coding

This paper focuses on mitigating the impact of stragglers in distributed...
01/17/2022

Universal Coded Distributed Computing For MapReduce Frameworks

Coded distributed computing (CDC) can trade extra computing power to red...
05/06/2021

Coded Gradient Aggregation: A Tradeoff Between Communication Costs at Edge Nodes and at Helper Nodes

The increasing amount of data generated at the edge/client nodes and the...