Improving the performance of heterogeneous data centers through redundancy

03/03/2020
by E Anton, et al.

We analyze the performance of redundancy in a multi-type job and multi-type server system. We assume the job dispatcher is unaware of the servers' capacities, and we set out to study under which circumstances redundancy improves performance. With redundancy, an arriving job dispatches redundant copies to all its compatible servers and departs as soon as one of its copies completes service. As a benchmark, we take the non-redundant system in which an arriving job is routed to only one randomly selected compatible server. Service times are generally distributed and all copies of a job are identical, i.e., have the same service requirement. In our first main result, we characterize the necessary and sufficient stability condition of the redundancy system. This condition coincides with that of a system where each job type dispatches copies only to its least-loaded servers, and those copies need to be fully served. In our second result, we compare the stability region of the system under redundancy with that of the system without redundancy. We show that if the servers' capacities are sufficiently heterogeneous, the stability region under redundancy can be much larger than that without redundancy. We apply the general solution to particular classes of systems, including redundancy-d and nested models, to derive simple conditions on the degree of heterogeneity required for redundancy to improve stability. As such, our result is the first to show that redundancy can improve the stability, and hence the performance, of a system when copies are non-i.i.d.
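The non-redundant benchmark described above can be illustrated with a small calculation. The following is a minimal sketch, not the paper's method: it assumes Poisson arrivals, a unit mean service requirement, and that "randomly selected" means uniformly at random among compatible servers. The function names and the two-server example are hypothetical. Under these assumptions each server behaves as an independent queue, so the benchmark is stable when every server's load stays below 1. The sketch does not attempt the redundancy stability condition itself.

```python
def benchmark_server_loads(arrival_rate, type_probs, capacities):
    """Per-server load in the non-redundant benchmark.

    arrival_rate: total job arrival rate (assumed Poisson).
    type_probs:   dict mapping a tuple of compatible server ids to the
                  probability that an arriving job has that type.
    capacities:   dict mapping server id to its capacity (work per unit time).
    Assumes unit mean service requirement and uniform random routing.
    """
    work_rate = {s: 0.0 for s in capacities}
    for servers, prob in type_probs.items():
        # Each compatible server receives an equal share of this type's jobs.
        share = arrival_rate * prob / len(servers)
        for s in servers:
            work_rate[s] += share
    return {s: work_rate[s] / capacities[s] for s in capacities}


def benchmark_is_stable(arrival_rate, type_probs, capacities):
    """True if every server's load is strictly below 1."""
    loads = benchmark_server_loads(arrival_rate, type_probs, capacities)
    return all(rho < 1 for rho in loads.values())


# Hypothetical example: server 2 is four times faster than server 1.
# Half of the jobs can only use server 2; the other half can use both.
capacities = {1: 1.0, 2: 4.0}
type_probs = {(2,): 0.5, (1, 2): 0.5}

print(benchmark_server_loads(2.0, type_probs, capacities))
print(benchmark_is_stable(2.0, type_probs, capacities))
print(benchmark_is_stable(4.0, type_probs, capacities))
```

In this hypothetical instance the slow server becomes the bottleneck: at arrival rate 4 its load reaches 1 while the fast server is still far from saturated, because the dispatcher routes blindly. This is the kind of heterogeneity for which the abstract reports that redundancy can enlarge the stability region.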


research
06/21/2022

Efficient scheduling in redundancy systems with general service times

We characterize the impact of scheduling policies on the mean response t...
research
04/21/2021

Stability and Optimization of Speculative Queueing Networks

We provide a queueing-theoretic framework for job replication schemes ba...
research
03/17/2021

A Survey of Stability Results for Redundancy Systems

Redundancy mechanisms consist in sending several copies of a same job to...
research
05/28/2021

Parallel server systems with cancel-on-completion redundancy

We consider a parallel server system with so-called cancel-on-completion...
research
03/24/2021

Comparison of the FCFS and PS discipline in Redundancy Systems

We consider the c.o.c. redundancy system with N parallel servers where i...
research
04/30/2020

A Lower Bound on the stability region of Redundancy-d with FIFO service discipline

Redundancy-d (R(d)) is a load balancing method used to route incoming jo...
research
06/09/2019

Partial Server Pooling in Redundancy Systems

Partial sharing allows providers to possibly pool a fraction of their re...
