Array BP-XOR Codes for Parallel Matrix Multiplication using Hierarchical Computing

04/25/2019
by Suayb S. Arslan, et al.

This study presents a novel coded computation technique for parallel matrix-matrix product computation on hierarchical compute architectures that outperforms well-known previous strategies in terms of total end-to-end execution time. The proposed method uses array codes and achieves this performance by distributing the encoding operation over the cluster (slave) nodes at the expense of increased master-slave communication. The matrix multiplication is performed using MDS array Belief Propagation (BP)-decodable codes based on pure XOR operations. The proposed scheme is shown to be configurable and suited to modern hierarchical compute architectures equipped with multiple nodes, each having multiple, independent, and less capable processing units. In addition, to address a scaling number of stragglers, asymptotic versions of the code are used and a latency analysis is conducted. We demonstrate that the proposed scheme achieves order-optimal computation in both the sub-linear and the linear regimes in the size of the computed product from an end-to-end delay perspective, while keeping communication requirements at a level that today's high-speed network link infrastructures can handle.
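To make the idea of XOR-based, BP-decodable straggler protection concrete, the following is a toy sketch only. It is not the array BP-XOR construction of the paper: it uses a single XOR parity over two row blocks, and it assumes integer matrix entries and that workers XOR their already-computed sub-products, which is one reading of "distributing the encoding operation over the cluster nodes". All function names (`matmul`, `xor_blocks`, `peel_decode`) are hypothetical.

```python
from functools import reduce


def matmul(a, b):
    """Plain integer matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def xor_blocks(*blocks):
    """Element-wise XOR of one or more equally sized integer matrices."""
    return [[reduce(lambda p, q: p ^ q, cells) for cells in zip(*rows)]
            for rows in zip(*blocks)]


def peel_decode(received, num_blocks):
    """Peeling (BP-style) decoder over XOR.

    received: list of (index_set, payload) pairs, where payload is the XOR
    of the source blocks named in index_set. Any payload with exactly one
    unknown block is resolved, and the search repeats until nothing changes.
    Returns a dict {block_index: block}, or None if decoding stalls.
    """
    known = {}
    progress = True
    while progress and len(known) < num_blocks:
        progress = False
        for idx, payload in received:
            unknown = [i for i in idx if i not in known]
            if len(unknown) == 1:
                parts = [known[i] for i in idx if i in known]
                known[unknown[0]] = xor_blocks(payload, *parts)
                progress = True
    return known if len(known) == num_blocks else None


# Toy setup (assumed for illustration): A is split into two row blocks.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
B = [[1, 0], [2, 1]]

C0 = matmul(A[:2], B)        # worker 0: first row block times B
C1 = matmul(A[2:], B)        # worker 1: second row block times B
parity = xor_blocks(C0, C1)  # worker 2: XOR of the two sub-products

# Suppose worker 0 straggles: the master only hears from workers 1 and 2.
received = [({1}, C1), ({0, 1}, parity)]
decoded = peel_decode(received, num_blocks=2)

recovered = decoded[0] + decoded[1]  # stack the row blocks back together
expected = matmul(A, B)
```

In this sketch any two of the three worker outputs suffice to rebuild the product, and decoding uses only XORs. The price is that the parity worker must obtain both sub-products, which mirrors the trade-off of extra communication for cheaper encoding and decoding described in the abstract.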

research
06/26/2018

Hierarchical Coded Computation

Coded computation is a method to mitigate "stragglers" in distributed co...
research
01/21/2020

Serverless Straggler Mitigation using Local Error-Correcting Codes

Inexpensive cloud services, such as serverless computing, are often vuln...
research
05/13/2023

Fully Private Grouped Matrix Multiplication with Colluding Workers

In this paper, we present a novel variation of the coded matrix multipli...
research
11/27/2018

A Unified Coded Deep Neural Network Training Strategy Based on Generalized PolyDot Codes for Matrix Multiplication

This paper has two contributions. First, we propose a novel coded matrix...
research
01/21/2019

Polar Coded Distributed Matrix Multiplication

We propose a polar coding mechanism for distributed matrix multiplicatio...
research
06/26/2018

Exploitation of Stragglers in Coded Computation

In cloud computing systems slow processing nodes, often referred to as "...
research
10/10/2022

Fault-Tolerant Strassen-Like Matrix Multiplication

In this study, we propose a simple method for fault-tolerant Strassen-li...
