Exploitation of Stragglers in Coded Computation

06/26/2018
by Shahrzad Kiani, et al.
In cloud computing systems, slow processing nodes, often referred to as "stragglers", can significantly extend the computation time. Recent results have shown that error correction coding can be used to reduce the effect of stragglers. In this work we introduce a scheme that, in addition to using error correction to distribute mixed jobs across nodes, is also able to exploit the work completed by all nodes, including stragglers. We first consider vector-matrix multiplication and apply maximum distance separable (MDS) codes to small blocks of sub-matrices. The worker nodes process these blocks sequentially and transmit each per-block partial result to the master as soon as it is completed. Sub-blocking makes the completion process more continuous, which allows us to exploit the work of a much broader spectrum of processors and reduces computation time. We then apply this technique to matrix-matrix multiplication using product codes. In this case, we show that the order in which sub-tasks are computed is a new degree of design freedom that can be exploited to reduce computation time further. We propose a novel approach to analyzing the finishing time, which differs from the typical order-statistics analysis. Simulation results show that the expected computation time decreases by a factor of at least two compared to previous methods.
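The benefit of sub-blocking can be illustrated with a small timing simulation. The sketch below is not the authors' analysis or code; it rests on several assumptions that do not come from the abstract: each of n workers processes p equal-sized coded sub-blocks at a constant per-block time drawn from a shifted exponential distribution, the conventional MDS scheme finishes when any k workers have completed their entire task, and the sub-blocked scheme finishes as soon as any k*p sub-block results have reached the master. All function and parameter names are hypothetical.

```python
import random


def conventional_finish_time(per_block_times, k, p):
    """Finish time when a worker's result counts only after all p sub-blocks.

    Assumption: the master needs complete results from any k workers.
    """
    full_task_times = sorted(p * t for t in per_block_times)
    return full_task_times[k - 1]


def sub_blocked_finish_time(per_block_times, k, p):
    """Finish time when every completed sub-block is sent to the master.

    Assumption: any k*p sub-block results suffice for decoding, so partial
    work from slow workers also counts.
    """
    completion_times = sorted(
        j * t for t in per_block_times for j in range(1, p + 1)
    )
    return completion_times[k * p - 1]


def simulate(n=10, k=5, p=20, trials=2000, seed=0):
    """Average both finish times over random worker speeds (assumed model)."""
    rng = random.Random(seed)
    conventional_total = 0.0
    sub_blocked_total = 0.0
    for _ in range(trials):
        # Assumed shifted-exponential per-block time for each worker.
        times = [1.0 + rng.expovariate(1.0) for _ in range(n)]
        conventional_total += conventional_finish_time(times, k, p)
        sub_blocked_total += sub_blocked_finish_time(times, k, p)
    return conventional_total / trials, sub_blocked_total / trials


if __name__ == "__main__":
    conventional, sub_blocked = simulate()
    print(f"conventional: {conventional:.2f}  sub-blocked: {sub_blocked:.2f}")
```

Under this model the sub-blocked finish time can never exceed the conventional one: by the time k workers have finished all p of their sub-blocks, at least k*p sub-block results are already available. How large the gap is depends on the assumed speed distribution and on n, k, and p, so the simulation should not be read as reproducing the factor reported in the abstract.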

research
06/26/2018

Hierarchical Coded Computation

Coded computation is a method to mitigate "stragglers" in distributed co...
research
07/20/2019

Hierarchical Coded Matrix Multiplication

Slow working nodes, known as stragglers, can greatly reduce the speed of...
research
07/20/2019

Cuboid Partitioning for Hierarchical Coded Matrix Multiplication

Coded matrix multiplication is a technique to enable straggler-resistant...
research
11/05/2020

Straggler Mitigation through Unequal Error Protection for Distributed Matrix Multiplication

Large-scale machine learning and data mining methods routinely distribut...
research
12/08/2019

Improved Algoritms in Parallel Evaluation of Large Cryptographic S-Box

Nowadays computational complexity of fast walsh hadamard transform and n...
research
12/07/2020

Gradient-based Automatic Look-Up Table Generator for Atmospheric Radiative Transfer Models

Atmospheric correction of Earth Observation data is one of the most crit...
research
04/25/2019

Array BP-XOR Codes for Parallel Matrix Multiplication using Hierarchical Computing

This study presents a novel coded computation technique for parallel mat...