Data Replication for Reducing Computing Time in Distributed Systems with Stragglers

12/06/2019
by   Amir Behrouzi-Far, et al.

In distributed computing systems with stragglers, various forms of redundancy can improve the average delay performance. We study the optimal replication of data in systems where the job execution time is a stochastically decreasing and convex random variable. We show that in such systems, the optimum assignment policy is the balanced replication of disjoint batches of data. Furthermore, for Exponential and Shifted-Exponential service times, we derive the optimum redundancy levels for minimizing both the expected value and the variance of the job completion time. Our analysis shows that the optimum redundancy level may not be the same for the two metrics; thus, there is a trade-off between reducing the expected value of the completion time and reducing its variance.
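
The trade-off between the two metrics can be illustrated with a small numerical sketch. The model below is an assumption made for illustration and is not taken from the paper: N workers hold N units of data split into B disjoint batches, each batch is replicated on N/B workers, and each worker's time on a batch is a shift proportional to the batch size (`delta` per data unit) plus an independent Exponential(`mu`) term. The job finishes when the fastest replica of every batch has finished. The function names and parameters are hypothetical.

```python
def harmonic(n, power=1):
    """Generalized harmonic number: sum of 1/k**power for k = 1..n."""
    return sum(1.0 / k**power for k in range(1, n + 1))


def completion_stats(n_workers, n_batches, mu=1.0, delta=0.0):
    """Mean and variance of the job completion time under the assumed model.

    Assumed model (not from the paper): per-worker batch time equals
    delta * batch_size + Exp(mu), with batch_size = n_workers / n_batches.
    """
    if n_workers % n_batches != 0:
        raise ValueError("balanced replication needs n_batches to divide n_workers")
    replicas = n_workers // n_batches      # workers per batch
    batch_size = n_workers / n_batches     # data units per batch
    rate = replicas * mu                   # min of `replicas` Exp(mu) is Exp(rate)
    # The max of B i.i.d. Exp(rate) variables has mean H_B / rate and
    # variance (sum of 1/k^2 for k = 1..B) / rate^2.
    mean = delta * batch_size + harmonic(n_batches) / rate
    var = harmonic(n_batches, 2) / rate**2
    return mean, var


def best_levels(n_workers, mu=1.0, delta=0.0):
    """Return (batches minimizing the mean, batches minimizing the variance, all stats)."""
    candidates = [b for b in range(1, n_workers + 1) if n_workers % b == 0]
    stats = {b: completion_stats(n_workers, b, mu, delta) for b in candidates}
    best_mean = min(stats, key=lambda b: stats[b][0])
    best_var = min(stats, key=lambda b: stats[b][1])
    return best_mean, best_var, stats


if __name__ == "__main__":
    b_mean, b_var, stats = best_levels(12, mu=1.0, delta=1.0)
    for b, (m, v) in stats.items():
        print(f"batches={b:2d}  mean={m:.3f}  variance={v:.4f}")
    print("best for mean:", b_mean, " best for variance:", b_var)
```

Under these assumptions, with 12 workers, `mu = 1` and `delta = 1`, the expected completion time is smallest with 6 batches (2 replicas each), while the variance is smallest with a single batch replicated on all 12 workers. With `delta = 0` (pure Exponential service), both metrics favor full replication in this sketch.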
