Numerically Stable Polynomially Coded Computing

by Mohammad Fahim, et al.

We study the numerical stability of polynomial-based encoding methods, which have emerged as a powerful class of techniques for providing straggler and fault tolerance in coded computing. Our contributions are as follows: 1) We construct new codes for matrix multiplication that achieve the same fault/straggler tolerance as the previously constructed MatDot codes. Unlike previous codes, which use polynomials expanded in a monomial basis, our codes use a basis of orthogonal polynomials. 2) We show that the condition number of every m × m sub-matrix of an m × n, n ≥ m, Chebyshev-Vandermonde matrix, evaluated on the n-point Chebyshev grid, grows as O(n^{2(n-m)}) for n > m. An implication of this result is that, when Chebyshev-Vandermonde matrices are used for coded computing with a fixed number of redundant nodes s = n - m, the condition number grows at most polynomially in the number of nodes n. 3) By specializing our orthogonal-polynomial-based constructions to Chebyshev polynomials, and using our condition number bound for Chebyshev-Vandermonde matrices, we construct new numerically stable techniques for coded matrix multiplication. We empirically demonstrate that our constructions have significantly lower numerical errors than previous approaches, which involve inversion of Vandermonde matrices. We generalize our constructions to explore the trade-off between computation/communication and fault tolerance. 4) We propose a numerically stable specialization of Lagrange coded computing. Motivated by our condition number bound, our approach involves a choice of evaluation points and a suitable decoding procedure that inverts an appropriate Chebyshev-Vandermonde matrix. Our approach is demonstrated empirically to have lower numerical errors than standard methods.
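To make the condition number claim in contribution 2 concrete, the following is a minimal NumPy sketch, not the authors' code. It builds an m × n Chebyshev-Vandermonde matrix on the n-point Chebyshev grid (taken here, as an assumption, to be the roots of T_n), keeps m of the n columns to mimic s = n - m stragglers, and compares the condition number of the resulting m × m sub-matrix with that of a monomial-basis Vandermonde matrix on the same nodes. The function names and the choice of which columns to drop are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def chebyshev_grid(n):
    """n-point Chebyshev grid: the n roots of T_n on (-1, 1) (assumed convention)."""
    k = np.arange(1, n + 1)
    return np.cos((2 * k - 1) * np.pi / (2 * n))


def chebyshev_vandermonde(m, x):
    """m x len(x) matrix whose (i, j) entry is T_i(x_j)."""
    return cheb.chebvander(x, m - 1).T


def monomial_vandermonde(m, x):
    """m x len(x) matrix whose (i, j) entry is x_j ** i."""
    return np.vander(x, m, increasing=True).T


def submatrix_condition_numbers(n, m, kept):
    """Condition numbers of the m x m sub-matrices formed by the `kept` columns.

    `kept` lists the m node indices that responded; the remaining n - m
    nodes play the role of stragglers. Returns (chebyshev, monomial).
    """
    x = chebyshev_grid(n)
    kept = np.asarray(kept)
    assert kept.size == m, "need exactly m surviving nodes to decode"
    v_cheb = chebyshev_vandermonde(m, x)[:, kept]
    v_mono = monomial_vandermonde(m, x)[:, kept]
    return np.linalg.cond(v_cheb), np.linalg.cond(v_mono)


if __name__ == "__main__":
    n, s = 30, 2            # illustrative sizes, not taken from the paper
    m = n - s
    kept = np.arange(m)     # hypothetical pattern: the last s nodes straggle
    c_cheb, c_mono = submatrix_condition_numbers(n, m, kept)
    print(f"Chebyshev basis: cond = {c_cheb:.3e}")
    print(f"monomial basis:  cond = {c_mono:.3e}")
```

Varying `kept` and `s` in this sketch gives a rough feel for how the conditioning of the decoding matrix depends on the straggler pattern and on the number of redundant nodes; it does not reproduce the paper's experiments.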


Folded Polynomial Codes for Coded Distributed AA^⊤-Type Matrix Multiplication

In this paper, due to the important value in practical applications, we ...

Numerically stable coded matrix computations via circulant and rotation matrix embeddings

Several recent works have used coding-theoretic ideas for mitigating the...

Random Khatri-Rao-Product Codes for Numerically-Stable Distributed Matrix Multiplication

We propose a class of codes called random Khatri-Rao-Product (RKRP) code...

Numerically Stable Binary Coded Computations

This paper addresses the gradient coding and coded matrix multiplication...

A Unified Coded Deep Neural Network Training Strategy Based on Generalized PolyDot Codes for Matrix Multiplication

This paper has two contributions. First, we propose a novel coded matrix...

Collaborative Decoding of Polynomial Codes for Distributed Computation

We show that polynomial codes (and some related codes) used for distribu...

Coded Computing with Noise

Distributed computation is a framework used to break down a complex comp...
