Scheduling Stochastic Real-Time Coflows in Unreliable Computing Machines
We consider a distributed computing network consisting of a master machine and multiple computing machines. The master machine runs multiple jobs, each of which stochastically generates real-time coflows with a strict deadline. A coflow is a collection of tasks, each of which can be processed by a corresponding computing machine; it is completed only when all of its tasks are completed within the deadline. Moreover, we consider unreliable computing machines, whose processing speed is uncertain and limited. Because the processing capabilities of the computing machines are limited, an algorithm for scheduling coflows on the unreliable computing machines is critical to maximizing the average number of completed coflows for each job. In this paper, we develop two scheduling algorithms: a feasibility-optimal scheduling algorithm and an approximate scheduling algorithm. The feasibility-optimal scheduling algorithm can fulfill the largest region of jobs' requirements for the average number of completed coflows. However, it suffers from high computational complexity when the number of jobs is large. To address this issue, we propose the approximate scheduling algorithm, which has a guaranteed approximation ratio in the worst-case scenario. The approximate scheduling algorithm is also validated in the average-case scenario via computer simulations.
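The all-or-nothing completion rule for coflows can be illustrated with a minimal Python sketch. It is not the paper's algorithm or model: the function names are hypothetical, and the assumption that each task independently finishes in time with a fixed probability is made here only to give machine unreliability a concrete form.

```python
import random


def coflow_completed(task_finish_times, deadline):
    """Return True only if every task of the coflow finishes by the deadline."""
    return all(t <= deadline for t in task_finish_times)


def simulate_completion_rate(num_coflows, task_success_probs, seed=0):
    """Estimate the fraction of completed coflows by Monte Carlo simulation.

    Assumption (not from the paper): task i finishes within the deadline
    independently with probability task_success_probs[i].
    """
    rng = random.Random(seed)
    completed = 0
    for _ in range(num_coflows):
        # A coflow counts only if all of its tasks succeed in time.
        if all(rng.random() < p for p in task_success_probs):
            completed += 1
    return completed / num_coflows


if __name__ == "__main__":
    print(coflow_completed([1.0, 2.5, 3.0], deadline=3.0))  # every task is on time
    print(coflow_completed([1.0, 4.0], deadline=3.0))       # one task is late
    print(simulate_completion_rate(20000, [0.9, 0.8]))
```

Under the independence assumption, a coflow whose two tasks succeed with probabilities 0.9 and 0.8 completes with probability 0.72, so a single slow machine can sharply reduce the number of completed coflows even when every other task finishes on time.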