Age-Based Coded Computation for Bias Reduction in Distributed Learning

06/02/2020
by Emre Ozfatura, et al.

Coded computation can be used to speed up distributed learning in the presence of straggling workers. Partial recovery of the gradient vector can further reduce the computation time at each iteration; however, it can produce biased estimators, which may slow down convergence or even cause divergence. Estimator bias is particularly prevalent when the straggling behavior is correlated over time, so that the gradient estimators are dominated by a few fast servers. To mitigate this bias, we design a timely dynamic encoding framework for partial recovery that includes an ordering operator that changes the codewords and the computation orders at the workers over time. To regulate the recovery frequencies, we adopt an age metric in the design of the dynamic encoding scheme. We show through numerical results that the proposed dynamic encoding strategy increases the timeliness of the recovered computations, which in turn reduces the bias in model updates and accelerates convergence compared to conventional static partial recovery schemes.
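
The role of the age metric can be illustrated with a small sketch. This is not the authors' encoding scheme: it leaves out the coding entirely and models only the computation order at each worker. It assumes that the age of a task is the number of iterations since it was last recovered, and that each worker completes a fixed number of tasks per iteration (a crude stand-in for straggling that is correlated over time). The names `update_ages`, `order_by_age`, and `simulate` are hypothetical.

```python
def update_ages(ages, recovered):
    """Reset the age of each recovered task to 0; increment all other ages by 1."""
    return [0 if i in recovered else a + 1 for i, a in enumerate(ages)]


def order_by_age(assigned, ages):
    """Put the oldest task first; break ties by task index for determinism."""
    return sorted(assigned, key=lambda t: (-ages[t], t))


def simulate(assignments, completed_per_worker, iterations, dynamic):
    """Count how often each task is recovered under a static or age-based order.

    assignments[k] lists the tasks given to worker k (assumed toy model).
    completed_per_worker[k] is how many of them worker k finishes per iteration.
    """
    num_tasks = 1 + max(t for assigned in assignments for t in assigned)
    ages = [0] * num_tasks
    counts = [0] * num_tasks
    for _ in range(iterations):
        recovered = set()
        for assigned, done in zip(assignments, completed_per_worker):
            # Static: always the same order. Dynamic: reorder by current age.
            order = order_by_age(assigned, ages) if dynamic else list(assigned)
            recovered.update(order[:done])
        for t in recovered:
            counts[t] += 1
        ages = update_ages(ages, recovered)
    return counts, ages


# Toy setup: 4 tasks, 2 workers, each finishing only 1 of its 2 tasks per iteration.
assignments = [[0, 1], [2, 3]]
completed = [1, 1]
static_counts, static_ages = simulate(assignments, completed, 10, dynamic=False)
dynamic_counts, dynamic_ages = simulate(assignments, completed, 10, dynamic=True)
```

In this toy configuration the static order recovers only tasks 0 and 2, so the ages of tasks 1 and 3 grow with every iteration, while the age-based order alternates between the two tasks at each worker and recovers all four equally often.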


research
07/04/2020

Coded Distributed Computing with Partial Recovery

Coded computation techniques provide robustness against straggling worke...
research
11/22/2018

Distributed Gradient Descent with Coded Partial Gradient Computations

Coded computation techniques provide robustness against straggling serve...
research
05/24/2018

Polynomially Coded Regression: Optimal Straggler Mitigation via Data Encoding

We consider the problem of training a least-squares regression model on ...
research
11/03/2020

Gradient Coding with Dynamic Clustering for Straggler Mitigation

In distributed synchronous gradient descent (GD) the main performance bo...
research
04/26/2023

Coded matrix computation with gradient coding

Polynomial based approaches, such as the Mat-Dot and entangled polynomia...
research
04/11/2019

Timely-Throughput Optimal Coded Computing over Cloud Networks

In modern distributed computing systems, unpredictable and unreliable in...
research
09/24/2021

A Unified Treatment of Partial Stragglers and Sparse Matrices in Coded Matrix Computation

The overall execution time of distributed matrix computations is often d...
