Computing Redundancy in Blocking Systems: Fast Service or No Service

03/01/2023
by Pei Peng, et al.

Redundancy in distributed computing systems reduces job completion time. It is widely employed in practice and studied in theory for queuing systems, often in a low-traffic regime where queues remain empty. Motivated by emerging edge systems, this paper initiates a study of redundancy in blocking systems. Edge nodes often operate in highly unpredictable environments, and replicating job execution reduces the mean job execution time. However, directing more resources to some computing jobs blocks the execution of others, which are then passed to the cloud. We evaluate system performance using two metrics: job computing time and job blocking probability. We show that the job computing time decreases as the replication factor increases, while the job blocking probability increases. Therefore, there is a tradeoff between job computing time and blocking probability. Interestingly, some minimal replication significantly reduces computing time with almost no change in blocking probability. This paper proposes the system service rate as a new combined metric that captures this tradeoff and serves as a single indicator of system performance.
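The tradeoff described in the abstract can be illustrated with a small numerical sketch. The model below is an assumption made for illustration and is not taken from the paper: `n` identical servers, Poisson arrivals at rate `arrival_rate`, exponential service at rate `mu` per replica, each job replicated on `d` servers with all replicas cancelled when the first one finishes, and a job blocked when fewer than `d` servers are idle. Under these assumptions a job occupies `d` servers for an exponential time with mean `1 / (d * mu)`, so the system behaves like an Erlang loss system with `n // d` slots. The function names `erlang_b` and `replication_tradeoff` are hypothetical.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Erlang-B blocking probability, computed with the standard recursion
    B(0) = 1, B(k) = a * B(k-1) / (k + a * B(k-1))."""
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def replication_tradeoff(n: int, d: int, arrival_rate: float, mu: float):
    """Return (mean computing time, blocking probability) for the assumed model.

    Assumption (not from the paper): replica service times are i.i.d.
    exponential with rate mu, so the minimum of d replicas is exponential
    with rate d * mu, and each job holds exactly d servers until it finishes.
    """
    if d < 1 or d > n:
        raise ValueError("replication factor d must satisfy 1 <= d <= n")
    slots = n // d                      # jobs that can run concurrently
    mean_time = 1.0 / (d * mu)          # mean of the fastest of d replicas
    blocking = erlang_b(slots, arrival_rate * mean_time)
    return mean_time, blocking


if __name__ == "__main__":
    # Illustrative parameters only; they are not the paper's settings.
    for d in (1, 2, 4):
        t, p = replication_tradeoff(n=4, d=d, arrival_rate=1.0, mu=1.0)
        print(f"d={d}: mean computing time={t:.3f}, blocking probability={p:.4f}")
```

In this simplified setting, raising `d` shortens the mean computing time while the blocking probability grows, which mirrors the qualitative tradeoff stated in the abstract. Other service-time distributions or cancellation policies would change the numbers.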

research
12/06/2019

Data Replication for Reducing Computing Time in Distributed Systems with Stragglers

In distributed computing systems with stragglers, various forms of redun...
research
10/05/2020

Diversity/Parallelism Trade-off in Distributed Systems with Redundancy

As numerous machine learning and other algorithms increase in complexity...
research
06/03/2020

Efficient Replication for Straggler Mitigation in Distributed Computing

Master-worker distributed computing systems use task replication in orde...
research
10/19/2022

Fries: Fast and Consistent Runtime Reconfiguration in Dataflow Systems with Transactional Guarantees (Extended Version)

A computing job in a big data system can take a long time to run, especi...
research
06/05/2018

Blocking time under basic priority inheritance: Polynomial bound and exact computation

The Priority Inheritance Protocol (PIP) is arguably the best-known proto...
research
05/24/2023

Workrs: Fault Tolerant Horizontal Computation Offloading

The broad development and usage of edge devices has highlighted the impo...