Optimal Load Allocation for Coded Distributed Computation in Heterogeneous Clusters

04/20/2019
by   DaeJin Kim, et al.

Coding has recently emerged as a useful technique for mitigating the effect of stragglers in distributed computing. However, coding in this context has mainly been explored under the assumption of homogeneous workers, although real-world computing clusters are often composed of heterogeneous workers with different computing capabilities. Uniform load allocation that ignores this heterogeneity can cause a significant loss in latency. In this paper, we propose the optimal load allocation for coded distributed computing with heterogeneous workers. Specifically, we focus on the scenario in which some workers share the same computing capability and can therefore be treated as a group for analysis. We rely on a lower bound on the expected latency and obtain the optimal load allocation by showing that our proposed allocation achieves the minimum of this lower bound for a sufficiently large number of workers. In numerical simulations under the group heterogeneity assumption, our load allocation reduces the expected latency by orders of magnitude compared with the existing load allocation scheme.
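
To make the latency cost of uniform allocation concrete, here is a minimal Monte Carlo sketch. It is not the optimal allocation from the paper: it only compares a uniform split with a simple rate-proportional heuristic. Everything specific in it is an assumption made for illustration: the runtime model (each worker takes `(load / rate) * (1 + X)` time units with `X` exponentially distributed), the two groups of ten workers, the code parameters `k` and `n`, fractional loads, and the helper names `latency` and `mean_latency`.

```python
import random


def latency(loads, rates, k, rng):
    """Time until the finished workers have returned at least k coded rows."""
    # Assumed runtime model: worker i needs (load_i / rate_i) * (1 + X), X ~ Exp(1).
    finish = sorted(
        ((load / rate) * (1.0 + rng.expovariate(1.0)), load)
        for load, rate in zip(loads, rates)
    )
    done = 0.0
    for t, load in finish:
        done += load
        # Small tolerance guards against floating-point rounding in the loads.
        if done >= k - 1e-9:
            return t
    raise ValueError("total load is smaller than k")


def mean_latency(loads, rates, k, trials=2000, seed=0):
    """Average latency over independent trials with a fixed seed."""
    rng = random.Random(seed)
    return sum(latency(loads, rates, k, rng) for _ in range(trials)) / trials


# Hypothetical cluster: one group of 10 fast workers, one group of 10 slow workers.
rates = [10.0] * 10 + [1.0] * 10
k, n = 1000, 2000  # any k of the n coded rows suffice to decode (assumed MDS code)

uniform = [n / len(rates)] * len(rates)
proportional = [n * r / sum(rates) for r in rates]  # heuristic, not the paper's optimum

if __name__ == "__main__":
    print("uniform     :", mean_latency(uniform, rates, k))
    print("proportional:", mean_latency(proportional, rates, k))
```

Under these assumptions the uniform split has to wait for every fast worker to finish a full-size share, while the proportional split gives all workers the same runtime distribution, so its average latency comes out lower. The gap depends entirely on the assumed rates and model.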

research
01/27/2019

Heterogeneity-aware Gradient Coding for Straggler Tolerance

Gradient descent algorithms are widely used in machine learning. In orde...
research
11/22/2017

Combating Computational Heterogeneity in Large-Scale Distributed Computing via Work Exchange

Owing to data-intensive large-scale applications, distributed computatio...
research
09/23/2021

Coded Computation across Shared Heterogeneous Workers with Communication Delay

Distributed computing enables large-scale computation tasks to be proces...
research
04/16/2019

Heterogeneous Coded Computation across Heterogeneous Workers

Coded distributed computing framework enables large-scale machine learni...
research
04/27/2022

Stream Iterative Distributed Coded Computing for Learning Applications in Heterogeneous Systems

To improve the utility of learning applications and render machine learn...
research
01/16/2019

Coded Matrix Multiplication on a Group-Based Model

Coded distributed computing has been considered as a promising technique...
research
04/11/2019

Timely-Throughput Optimal Coded Computing over Cloud Networks

In modern distributed computing systems, unpredictable and unreliable in...
