Coded Distributed Computing: Performance Limits and Code Designs

06/24/2019
by Mohammad Vahid Jamali, et al.

We consider the problem of coded distributed computing, where a large linear computational job, such as a matrix multiplication, is divided into k smaller tasks, encoded using an (n,k) linear code, and performed over n distributed nodes. The goal is to reduce the average execution time of the computational job. We establish a connection between the problem of characterizing the average execution time of a coded distributed computing system and the problem of analyzing the error probability of codes of length n used over erasure channels. Accordingly, we present closed-form expressions for the execution time using binary random linear codes and for the best execution time any linear-coded distributed computing system can achieve. We also show that there exist good binary linear codes that asymptotically attain the best performance any linear code, not necessarily binary, can achieve. Finally, we investigate the performance of coded distributed computing systems using polar and Reed-Muller (RM) codes, which offer low-complexity decoding and superior performance, respectively, and both of which have explicit constructions. The proposed framework enables efficient designs of distributed computing systems by drawing on the rich literature in channel coding theory.
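To make the setup concrete, here is a minimal Python sketch of the pipeline the abstract describes: a matrix-vector product is split into k row blocks, encoded with an (n,k) generator matrix, and recovered from whichever nodes finish. This is an illustration under stated assumptions, not the paper's construction. The function names `encode_tasks` and `decode_results`, the small 0/1 generator matrix `G`, and least-squares decoding over the reals are all choices made for this example.

```python
import numpy as np

def encode_tasks(A, G):
    """Split A row-wise into k blocks and form n coded blocks, one per node.

    G is an n x k generator matrix (hypothetical example code).
    The number of rows of A must be divisible by k.
    """
    n, k = G.shape
    blocks = np.split(A, k, axis=0)
    # Coded block i is the linear combination sum_j G[i, j] * blocks[j].
    return [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)]

def decode_results(results, G, finished):
    """Recover A @ x from the results of the nodes listed in `finished`.

    results[i] holds (coded block i) @ x. Nodes not in `finished` are
    treated as stragglers, which play the role of erasures.
    """
    k = G.shape[1]
    G_s = G[finished, :].astype(float)
    if np.linalg.matrix_rank(G_s) < k:
        raise ValueError("finished nodes do not form a decodable set")
    Y = np.stack([results[i] for i in finished])
    # Solve G_s @ B = Y for the k uncoded block results B (k x rows_per_block).
    B, *_ = np.linalg.lstsq(G_s, Y, rcond=None)
    return B.reshape(-1)

# Example: k = 3 tasks spread over n = 5 nodes (systematic code, two parity rows).
G = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])
A = np.arange(24, dtype=float).reshape(6, 4)
x = np.array([1.0, -2.0, 0.5, 3.0])

coded = encode_tasks(A, G)
results = {i: coded[i] @ x for i in range(len(coded))}

# Nodes 1 and 2 straggle; the job completes from nodes 0, 3 and 4.
y = decode_results(results, G, finished=[0, 3, 4])
print(np.allclose(y, A @ x))
```

In this sketch the job finishes as soon as the set of completed nodes is decodable, that is, as soon as the corresponding rows of `G` have rank k. A straggling node behaves like an erased code symbol, which is the intuition behind relating execution time to decoding over an erasure channel.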

