Coded Computing via Binary Linear Codes: Designs and Performance Limits

03/02/2021
by Mahdi Soleymani, et al.

We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into k smaller tasks, encoded using an (n,k) linear code, and performed over n distributed nodes. The goal is to reduce the average execution time of the computational job. We provide a connection between the problem of characterizing the average execution time of a coded distributed computing system and the problem of analyzing the error probability of codes of length n used over erasure channels. Accordingly, we present closed-form expressions for the execution time using binary random linear codes and for the best execution time any linear-coded distributed computing system can achieve. It is also shown that there exist good binary linear codes that not only attain (asymptotically) the best performance that any linear code (not necessarily binary) can achieve but are also numerically stable against the inevitable rounding errors in practice. We then develop a low-complexity algorithm for decoding Reed-Muller (RM) codes over erasure channels. Our decoder involves only additions and subtractions and enables coded computation over real-valued data. Extensive numerical analysis of the fundamental results, as well as of RM- and polar-coded computing schemes, demonstrates that RM-coded computation achieves close-to-optimal performance while having low-complexity decoding and an explicit construction. The framework proposed in this paper enables efficient designs of distributed computing systems by drawing on the rich literature on channel coding theory.
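The setup described in the abstract can be illustrated with a minimal sketch, under several assumptions that are not from the paper: a toy (3,2) single-parity-check binary generator matrix `G`, a matrix-vector product as the linear job, and a generic Gauss-Jordan solver standing in for the decoder (it is not the authors' RM erasure decoder, which uses only additions and subtractions). All function and variable names here are hypothetical. A straggling node is treated as an erasure, and the job completes once the generator rows of the finished nodes have rank k.

```python
from fractions import Fraction

def encode(blocks, G):
    """Node i receives sum_j G[i][j] * blocks[j]; G is 0/1, so encoding only adds."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    coded = []
    for g_row in G:
        acc = [[0] * cols for _ in range(rows)]
        for j, bit in enumerate(g_row):
            if bit:
                acc = [[a + b for a, b in zip(ra, rb)]
                       for ra, rb in zip(acc, blocks[j])]
        coded.append(acc)
    return coded

def matvec(M, x):
    """Plain matrix-vector product, the task each node performs."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def decode(G, results, k):
    """Recover the k uncoded outputs from the finished nodes.

    results maps node index -> output vector. Returns None when the
    finished nodes' generator rows do not have rank k (not yet decodable).
    Generic Gauss-Jordan over the rationals; an illustrative stand-in only.
    """
    rows = [[Fraction(g) for g in G[i]] + [Fraction(v) for v in results[i]]
            for i in sorted(results)]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[j][k:] for j in range(k)]

# Toy instance: k = 2 row blocks of A, n = 3 nodes (assumed example values).
G = [[1, 0], [0, 1], [1, 1]]
A_blocks = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
x = [1, 1]

coded = encode(A_blocks, G)
outputs = {i: matvec(coded[i], x) for i in range(len(G))}

# Node 0 straggles (an erasure); nodes 1 and 2 are enough to decode.
finished = {i: outputs[i] for i in (1, 2)}
recovered = decode(G, finished, k=2)
expected = [matvec(block, x) for block in A_blocks]
```

In this sketch `recovered` matches `expected` even though node 0 never reports, while a single finished node returns `None` because one generator row cannot have rank 2. This mirrors the erasure-channel view in the abstract: the job's completion time is the first moment at which the set of finished nodes forms a decodable erasure pattern.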

research
06/24/2019

Coded Distributed Computing: Performance Limits and Code Designs

We consider the problem of coded distributed computing where a large lin...
research
09/08/2021

Computational Polarization: An Information-theoretic Method for Resilient Computing

We introduce an error resilient distributed computing method based on an...
research
02/07/2021

Coded Computing with Noise

Distributed computation is a framework used to break down a complex comp...
research
01/25/2022

Polar Coded Computing: The Role of the Scaling Exponent

We consider the problem of coded distributed computing using polar codes...
research
09/01/2023

Randomized Polar Codes for Anytime Distributed Machine Learning

We present a novel distributed computing framework that is robust to slo...
research
05/22/2020

Autonomous Task Dropping Mechanism to Achieve Robustness in Heterogeneous Computing Systems

Robustness of a distributed computing system is defined as the ability t...
research
02/08/2018

Leveraging Coding Techniques for Speeding up Distributed Computing

Large scale clusters leveraging distributed computing frameworks such as...
