Computing Redundancy in Blocking Systems: Fast Service or No Service
Redundancy in distributed computing systems reduces job completion time. It is widely employed in practice and studied in theory for queueing systems, often in a low-traffic regime where queues remain empty. Motivated by emerging edge systems, this paper initiates a study of redundancy in blocking systems. Edge nodes often operate in highly unpredictable environments, and replicating job execution reduces the mean job execution time. However, directing more resources to some computing jobs blocks the execution of others (which are then passed to the cloud). We evaluate system performance using two metrics: job computing time and job blocking probability. We show that the job computing time decreases with an increasing replication factor, but the job blocking probability increases. There is therefore a tradeoff between job computing time and blocking probability. Interestingly, a small amount of replication significantly reduces computing time with almost no change in blocking probability. This paper proposes the system service rate as a new combined metric that captures this tradeoff and serves as a single indicator of system performance.
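The tradeoff can be illustrated with a small simulation sketch. The model below is an assumption for illustration only, not necessarily the paper's exact model: an edge node with `n_servers` servers, Poisson arrivals, exponential replica service times, and a replication factor `r`. An arriving job seizes `r` idle servers and finishes when its fastest replica finishes; if fewer than `r` servers are idle, the job is blocked and offloaded to the cloud.

```python
import heapq
import random


def simulate(n_servers, lam, mu, r, num_jobs=200_000, seed=0):
    """Toy loss-system simulation (illustrative assumptions, not the paper's model).

    An arriving job seizes r idle servers; if fewer than r are idle it is
    blocked (offloaded to the cloud). The job's computing time is the minimum
    of r i.i.d. Exp(mu) replica service times, i.e. Exp(r*mu); all r servers
    are released when the job finishes. Returns the mean computing time of
    accepted jobs and the blocking probability.
    """
    rng = random.Random(seed)
    busy = 0                # number of currently occupied servers
    departures = []         # min-heap of (finish_time, servers_to_release)
    t = 0.0
    blocked = 0
    total_comp_time = 0.0
    accepted = 0

    for _ in range(num_jobs):
        t += rng.expovariate(lam)                # next arrival time
        # release servers of jobs that finished before this arrival
        while departures and departures[0][0] <= t:
            _, k = heapq.heappop(departures)
            busy -= k
        if n_servers - busy < r:
            blocked += 1                         # not enough idle servers
            continue
        comp_time = rng.expovariate(r * mu)      # fastest of r replicas
        busy += r
        heapq.heappush(departures, (t + comp_time, r))
        total_comp_time += comp_time
        accepted += 1

    mean_comp = total_comp_time / accepted if accepted else float("nan")
    return mean_comp, blocked / num_jobs


if __name__ == "__main__":
    # Sweep the replication factor to expose the tradeoff: computing time
    # drops roughly as 1/r, while blocking probability grows with r.
    for r in range(1, 6):
        mean_comp, p_block = simulate(n_servers=10, lam=4.0, mu=1.0, r=r)
        print(f"r={r}: mean computing time={mean_comp:.3f}, "
              f"blocking probability={p_block:.4f}")
```

Under these assumptions, a moderate `r` already cuts the mean computing time substantially while the blocking probability remains small, which is the regime the abstract highlights; a combined metric such as the proposed system service rate summarizes both effects in a single number.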