Cuboid Partitioning for Hierarchical Coded Matrix Multiplication
Coded matrix multiplication is a technique to enable straggler-resistant multiplication of large matrices in distributed computing systems. In this paper, we first present a conceptual framework to represent the division of work amongst processors in coded matrix multiplication as a cuboid partitioning problem. This framework allows us to unify existing methods and motivates new techniques. Building on this framework, we apply the idea of hierarchical coding (Ferdinand & Draper, 2018) to coded matrix multiplication. The hierarchical scheme we develop exploits the work completed by all processors, fast and slow alike, rather than discarding the output of the slow ones, even when stragglers complete much less work than the fastest workers. On Amazon EC2, we achieve a 37% improvement in finishing time compared to non-hierarchical schemes.
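To make the cuboid view concrete, the following is a minimal, uncoded sketch, not the scheme developed in the paper. It assumes the product of an Nx-by-Nz matrix A and an Nz-by-Ny matrix B is viewed as an Nx-by-Nz-by-Ny cuboid of scalar multiply-adds, and that each worker task is one sub-cuboid. The function names `cuboid_tasks` and `multiply_by_cuboids` and the split counts `px`, `pz`, `py` are hypothetical and chosen for illustration only; no coding or straggler handling is included.

```python
import numpy as np

def cuboid_tasks(nx, nz, ny, px, pz, py):
    """Split the nx-by-nz-by-ny computation cuboid into px*pz*py sub-cuboids.

    Each task is a triple of index arrays (x, z, y). Hypothetical helper.
    """
    xs = np.array_split(np.arange(nx), px)
    zs = np.array_split(np.arange(nz), pz)
    ys = np.array_split(np.arange(ny), py)
    return [(x, z, y) for x in xs for z in zs for y in ys]

def multiply_by_cuboids(A, B, px, pz, py):
    """Compute A @ B by accumulating the result of every sub-cuboid task."""
    nx, nz = A.shape
    nz_b, ny = B.shape
    assert nz == nz_b, "inner dimensions must match"
    C = np.zeros((nx, ny))
    for x, z, y in cuboid_tasks(nx, nz, ny, px, pz, py):
        # One sub-cuboid: a block of A times a block of B.
        # Partial products that share (x, y) but differ in z are summed.
        C[np.ix_(x, y)] += A[np.ix_(x, z)] @ B[np.ix_(z, y)]
    return C

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
B = rng.standard_normal((5, 4))
C = multiply_by_cuboids(A, B, px=2, pz=3, py=2)
print(np.allclose(C, A @ B))
```

In this sketch, splitting only along x and y yields tasks whose outputs are disjoint blocks of the product, while splitting along z yields partial sums that must be added together. A coded scheme would add redundant tasks so that the product can be recovered without waiting for every worker; that step is deliberately left out here.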