Scheduling Stochastic Real-Time Jobs in Unreliable Workers

10/02/2019 ∙ by Yu-Pin Hsu, et al.

We consider a distributed computing network consisting of a master and multiple workers processing tasks of different types. The master is running multiple applications. Each application stochastically generates real-time jobs with a strict job deadline, where each job is a collection of tasks of some types specified by the application. A real-time job is completed only when all its tasks are completed by the corresponding workers within the deadline. Moreover, we consider unreliable workers, whose processing speeds are uncertain. Because of the limited processing abilities of the workers, an algorithm for scheduling the jobs in the workers is needed to maximize the average number of completed jobs for each application. The scheduling problem is not only critical but also practical in distributed computing networks. In this paper, we develop two scheduling algorithms, namely, a feasibility-optimal scheduling algorithm and an approximate scheduling algorithm. The feasibility-optimal scheduling algorithm can fulfill the largest region of applications' requirements for the average number of completed jobs. However, the feasibility-optimal scheduling algorithm suffers from high computational complexity when the number of applications is large. To address the issue, the approximate scheduling algorithm is proposed with a guaranteed approximation ratio in the worst-case scenario. The approximate scheduling algorithm is also validated in the average-case scenario via computer simulations.


I Introduction

Fig. 1: A distributed computing network with two applications and four workers. At the beginning of frame 1, the first application generates a job with tasks for two of the workers, and the second application generates a job with tasks for three of the workers.

Distributed computing networks (such as MapReduce [4]) have become increasingly popular for supporting data-intensive jobs. The underlying idea for processing a data-intensive job is to divide the job into a group of small tasks that can be processed in parallel by multiple workers. In general, a worker can be specialized to process one type of task. For example, MapReduce allows an application to specify its computing network. Another prominent example is distributed computing networks for massively multiplayer online games [16]. The online game system illustrated in Fig. 1 includes one master and four workers processing different types of tasks. The master is serving two players. While the present job of player 1 needs two types of workers to be completed, that of player 2 needs three types of workers.

Moreover, because of the real-time nature of latency-sensitive applications (e.g., online games), a real-time job needs to be completed by a deadline. To maximize the number of jobs that meet the deadline, a scheduling algorithm allocating workers to jobs is needed. Job-level scheduling poses more challenges than packet-level scheduling, because all tasks in a job are dependent in the sense that a job is not completed until all its tasks are completed, whereas packets or tasks in traditional packet-based networks are treated independently.

Most prior research on job-level scheduling considered general-purpose workers. The closest scenario to ours (i.e., specialized workers) is the coflow model proposed in [2], where a coflow is a job consisting of tasks of various types. Since the coflow model was proposed, coflow scheduling has been an active topic, e.g., [3, 11, 14, 15, 8]; see also the recent survey [17]. However, almost all prior research on coflow scheduling focused on deterministic networks; in contrast, little attention was given to stochastic networks. Note that a job can be randomly generated; moreover, a worker can be unreliable because of unpredictable events [1] such as hardware failures. Because of these practical issues, a scheduling algorithm for stochastic real-time jobs in unreliable workers is crucial in distributed computing networks. The most relevant works to ours are [12, 10]. While [12] focused on homogeneous stochastic jobs in the coflow model, [10] extended the analysis to a heterogeneous case. The fundamental difference between those works and ours is that we consider stochastic real-time jobs and unreliable workers.

In this paper, we consider a master and multiple specialized workers. The master is running multiple applications, which stochastically generate real-time jobs with a hard deadline. The workers are unreliable. Our main contribution lies in developing job scheduling algorithms with provable performance guarantees. Leveraging Lyapunov techniques, we propose a feasibility-optimal scheduling algorithm for maximizing the region of achievable requirements for the average number of completed jobs. However, the feasibility-optimal scheduling algorithm turns out to involve an NP-hard combinatorial optimization problem. To tackle the computational issue, we propose an approximate scheduling algorithm that is computationally tractable; furthermore, we prove that its region of achievable requirements shrinks from the largest one by a factor of at most the square root of the number of workers. More surprisingly, our simulation results show that the region of achievable requirements by the approximate scheduling algorithm is close to the largest one.

II System overview

II-A Network model

Consider a distributed computing network consisting of a master and multiple specialized workers. The master is running multiple applications. Fig. 1 illustrates an example network with two applications and four workers. Suppose that data transfer between the master and the workers occurs instantaneously with no error. Note that the prior works on the coflow model focused on the time for data transfer. To investigate the unreliability of the workers, we ignore the time for data transfer and instead focus on the time for computation.

Divide time into frames and index them by the positive integers. At the beginning of each frame, each application stochastically generates a job, where a job is a collection of tasks that can be processed by the corresponding workers. Precisely, we use a binary vector to represent the job generated by an application in a frame, where each element indicates whether the job has a task for the corresponding worker: if the element is 1, then the job has a task for that worker; otherwise, it does not. See Fig. 1 for an example. Each task is also stochastically generated, i.e., each element of the vector is a random variable. The number of 1's in the vector is the number of tasks of the job; in particular, if the vector contains no 1, then the application generates no job in that frame. Suppose that each element of each job vector is independently and identically distributed (i.i.d.) over frames. Suppose also that all tasks generated by a given application for a given worker have the same workload. See Remark 15 later for time-varying workloads. Moreover, the jobs need real-time computations. Suppose that the deadline for each job is one frame. The real-time system has been justified in the literature, e.g., see [7].
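The job model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the name `gen_prob` (per-application, per-worker task generation probabilities) is hypothetical, and the tasks of one job are drawn independently across workers, which the text does not require.

```python
import random


def generate_jobs(gen_prob, rng):
    """Draw the job vectors of all applications for one frame.

    gen_prob[i][j] is the (assumed) probability that application i
    generates a task for worker j; draws are independent across
    applications and workers in this sketch.
    Returns one 0/1 vector per application.
    """
    return [[1 if rng.random() < g else 0 for g in row] for row in gen_prob]


def has_job(job):
    """An application generates a job only if its vector contains a 1."""
    return sum(job) > 0


rng = random.Random(0)
# Two applications, four workers; zeros mark workers an application never uses.
gen_prob = [[0.5, 0.5, 0.0, 0.0],
            [0.0, 0.5, 0.5, 0.5]]
jobs = generate_jobs(gen_prob, rng)
```

Calling `generate_jobs` once per frame with the same `gen_prob` gives job vectors that are i.i.d. over frames, as assumed in the model.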

Consider a time-varying processing speed for each worker. Suppose that the processing speed of each worker is i.i.d. over frames. With the i.i.d. assumption along with the constant workloads, we can assume that a task generated by a given application for a given worker is completed by that worker with a constant probability over frames. At the end of each frame, each worker reports whether its task is completed in that frame. A job is completed only when all its tasks are completed in the arriving frame. If any task of a job cannot be completed in the arriving frame, the job expires and is removed from the application.
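As a small worked example of the completion rule, the sketch below computes the probability that a whole job is completed. It assumes that workers complete their tasks independently of one another, so the job completion probability is the product of the task completion probabilities; the function and parameter names are hypothetical.

```python
from math import prod


def job_completion_prob(job, p_row):
    """Probability that every task of a job is completed in its frame.

    job   : 0/1 vector, job[j] == 1 if the job has a task for worker j
    p_row : p_row[j] is the task completion probability at worker j
            for this application (assumed independent across workers)
    """
    if sum(job) == 0:  # no job generated, nothing to complete
        return 0.0
    return prod(p for c, p in zip(job, p_row) if c == 1)


two_task_job = job_completion_prob([1, 1, 0, 0], [0.9, 0.9, 0.9, 0.9])
```

With two required workers that each succeed with probability 0.9, the job is completed with probability 0.81 under this independence assumption.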

Because the master does not know at the beginning of a frame whether a task will be completed, we suppose that it assigns at most one task to each worker in each frame. If two jobs generated in the same frame need the same worker, then we say that the two jobs interfere. For example, the two jobs in Fig. 1 interfere.

As a result of the interference, the master has to decide a set of interference-free jobs for computing in each frame. For example, the decision in Fig. 1 can contain either the job of the first application or the job of the second application, but not both. If the job of the first application is chosen, then its two required workers are allocated to that job in frame 1; moreover, the job is completed only when the two workers complete their respective tasks in frame 1. A scheduling algorithm is a time sequence of the decisions for all frames.
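The notion of an interference-free decision can be made concrete with a short enumeration. The job vectors below are hypothetical stand-ins for the two jobs of Fig. 1 (one job with two tasks, one with three, sharing a worker); the exact workers in the figure are not assumed.

```python
from itertools import combinations


def interferes(job_a, job_b):
    """Two jobs interfere if they both have a task for some worker."""
    return any(a == 1 and b == 1 for a, b in zip(job_a, job_b))


def interference_free_sets(jobs):
    """List every set of generated jobs in which no two jobs share a worker."""
    generated = [i for i, job in enumerate(jobs) if sum(job) > 0]
    feasible = []
    for r in range(len(generated) + 1):
        for subset in combinations(generated, r):
            if all(not interferes(jobs[a], jobs[b])
                   for a, b in combinations(subset, 2)):
                feasible.append(set(subset))
    return feasible


# Hypothetical vectors: job 0 needs workers 0 and 1, job 1 needs workers 1, 2, 3.
jobs = [[1, 1, 0, 0], [0, 1, 1, 1]]
feasible = interference_free_sets(jobs)
```

Here the feasible decisions are the empty set and the two singletons; scheduling both jobs is ruled out because they share a worker.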

II-B Problem formulation

Let random variable indicate if job is completed in frame  under scheduling algorithm , where if job is generated (i.e., ) and all tasks of the job are completed by the corresponding workers in frame ; otherwise. The random variable depends on the random variables , the task completion probabilities , and a potential randomized scheduling algorithm .

We define the average number of completed jobs for application under scheduling algorithm by

(1)

Let a vector represent the applications' requirement, with one entry per application, for the average numbers of completed jobs. We say that a requirement can be fulfilled (or achieved) by a scheduling algorithm if, for every application, the average number of completed jobs under that algorithm is at least the corresponding entry of the requirement. Moreover, we refer to a requirement as a feasible requirement if there exists a scheduling algorithm that can fulfill the requirement. We define the maximum feasibility region as follows.
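For concreteness, the long-run average in Eq. (1) and the fulfillment condition can be sketched as follows. The symbols are assumptions introduced here for illustration: $e_i(t;\eta)$ is the completion indicator of application $i$ in frame $t$ under scheduling algorithm $\eta$, $\bar{e}_i(\eta)$ is its long-run average, and $q_i$ is the requirement of application $i$. The use of an expectation and of a limit inferior is likewise an assumption.

```latex
\bar{e}_i(\eta) = \liminf_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}\left[ e_i(t;\eta) \right],
\qquad
\text{the requirement is fulfilled by } \eta
\iff \bar{e}_i(\eta) \ge q_i \ \text{for all } i .
```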

Definition 1.

The maximum feasibility region is the (-dimensional) region consisting of all feasible requirements .

We define an optimal scheduling algorithm as follows.

Definition 2.

A scheduling algorithm is called a feasibility-optimal scheduling algorithm (footnote 1) if any requirement interior (footnote 2) of the maximum feasibility region can be fulfilled by that scheduling algorithm.

Footnote 1: The feasibility-optimal scheduling defined in this paper is analogous to the throughput-optimal scheduling (e.g., [13]) or the timely-throughput-optimal scheduling (e.g., [7]).

Footnote 2: We say that a requirement is interior of a region if there exists a positive slack such that the requirement enlarged by that slack still lies in the region. The concept of the strictly feasible requirement has been widely used in throughput-optimal scheduling and timely-throughput-optimal scheduling.

The goal of this paper is to devise a feasibility-optimal scheduling algorithm.

III Scheduling algorithm design

In this section, we develop a feasibility-optimal scheduling algorithm for managing the stochastic real-time jobs in the unreliable workers. To that end, we introduce a virtual queueing network in Section III-A. With the assistance of the virtual queueing network, we propose a feasibility-optimal scheduling design in Section III-B. However, the proposed feasibility-optimal scheduling algorithm involves a combinatorial optimization problem, which we show to be NP-hard. Thus, we develop a tractable approximate scheduling algorithm in Section III-C and establish its approximation ratio.

III-A Virtual queueing network

Given the distributed computing network with scheduling algorithm  and requirement , we construct a virtual queueing network. The virtual queueing network consists of  queues , operating under the same frame system as that in Section II-A. For example, Fig. 2 is the virtual queueing network for the distributed computing network in Fig. 1. We want to emphasize that the virtual queueing network is not a real-world network. It is introduced for the scheduling design in Section III-B.

Fig. 2: Virtual queueing network for the distributed computing network in Fig. 1 with scheduling algorithm  and requirement .

At the beginning of each frame, a fixed number of packets arrive at each queue (the virtual queueing network may hold a fractional number of packets). At the end of each frame, a queue can remove at most one packet: if the job of the corresponding application is completed in that frame, then the queue removes one packet at the end of the frame; otherwise, it removes no packet in that frame. Again, note that those packets are not real-world packets. We summarize the packet arrival rate and the packet service rate as follows.

Proposition 3.

The packet arrival rate for queue  is , and the packet service rate for queue  is .

Proof.

The packet arrival rate for is . The packet service rate for is . ∎

Let be the queue size at queue at the beginning (before new packet arrival) of frame . Then, the queueing dynamics of queue can be expressed by . Let be the vector of all queue sizes at the beginning of frame . We define the notion of a stable queue in Definition 4, followed by introducing a necessary condition for the stable queue in Proposition 5.
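The per-frame evolution of one virtual queue can be sketched as below. This is a minimal illustration under assumptions: the arrival per frame is taken to be the application's requirement (as suggested by the arrival rate in Proposition 3), and the queue size is floored at zero, which the surviving text does not state explicitly. The function name is hypothetical.

```python
def step_queue(queue, arrival, completed):
    """Advance one virtual queue by one frame.

    queue     : queue size at the beginning of the frame (before arrivals)
    arrival   : fixed, possibly fractional, number of packets arriving per frame
    completed : True if the application's job was completed in this frame
    The queue first receives the arrival, then removes at most one packet;
    flooring at zero is an assumption of this sketch.
    """
    queue = queue + arrival
    if completed:
        queue = max(queue - 1.0, 0.0)
    return queue


q = 0.0
q = step_queue(q, 0.5, completed=False)  # arrival only
q = step_queue(q, 0.5, completed=True)   # arrival, then one packet removed
```

A queue that keeps growing under this update signals that the corresponding application completes jobs more slowly than its requirement demands.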

Definition 4.

Queue is stable if the average total queue size is finite.

Proposition 5 ([5], Lemma 3.6).

If queue is stable, then its packet service rate is greater than or equal to its packet arrival rate.

By Propositions 3 and 5, we can turn our attention to developing a scheduling algorithm such that, for any requirement  interior of , all queues in the virtual queueing network are stable.

We want to emphasize that, unlike traditional stochastic networks (e.g., [13, 7]), each packet in our virtual queueing network can be removed only when all associated tasks are completed in its arriving frame. Thus, our paper generalizes to stochastic networks with multiple required servers; in particular, we develop a tractable approximate scheduling algorithm for the scenario in Section III-C.

III-B Feasibility-optimal scheduling algorithm

/* At the beginning of frame , perform as follows: */
1   for all ;
/* At the beginning of each frame , perform as follows: */
2    for all ;
3 Perform a set of interference-free jobs such that
(2)
is maximized;
/* At the end of each frame , perform as follows: */
4 for  to  do
5       if , , and all its workers complete their respective tasks then
6             ;
7            
8       end if
9      
10 end for
Algorithm 1 Feasibility-optimal scheduling algorithm.

In this section, we propose a feasibility-optimal scheduling algorithm in Alg. 1. At the beginning of frame 1, Alg. 1 (in Line 1) initializes all queue sizes to zero. At the beginning of each frame, Alg. 1 (in Line 2) updates each queue with the newly arriving packets; then, Alg. 1 (in Line 3) decides the set of jobs to compute in that frame according to the present queue size vector. The decision maximizes the weighted sum of the queue sizes in Eq. (2). The weight multiplying each queue size in Eq. (2) is the expected packet service rate of that queue: an indicator function indicates whether the corresponding job is generated in the frame, and if so, the job is completed with the probability that all its required workers complete their tasks. The underlying idea of Alg. 1 is to remove as many packets from the virtual queueing network as possible (for stabilizing all queues).

After performing the decision, Alg. 1 (in Line 6) updates each queue at the end of the frame: if a job is scheduled, the job is indeed generated, and all its required workers complete their respective tasks, then one packet is removed from the corresponding queue in the virtual queueing network.
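The decision step of Alg. 1 can be sketched by exhaustive search. This is an illustration rather than the paper's implementation: it assumes that the weight of a job in Eq. (2) is its queue size times the product of the task completion probabilities of its required workers, and all names and numbers are hypothetical. The search is exponential in the number of applications, which is consistent with the hardness discussed later.

```python
from itertools import combinations
from math import prod


def job_weight(queue, job, p_row):
    """Assumed per-job weight in Eq. (2): queue size times completion probability."""
    if sum(job) == 0:  # no job generated in this frame
        return 0.0
    return queue * prod(p for c, p in zip(job, p_row) if c)


def interference_free(subset, jobs):
    """True if no two jobs in the subset need the same worker."""
    used = set()
    for i in subset:
        needed = {j for j, c in enumerate(jobs[i]) if c}
        if used & needed:
            return False
        used |= needed
    return True


def max_weight_decision(queues, jobs, p):
    """Brute-force search for the interference-free set with the largest total weight."""
    active = [i for i, job in enumerate(jobs) if sum(job) > 0]
    best, best_value = set(), 0.0
    for r in range(1, len(active) + 1):
        for subset in combinations(active, r):
            if not interference_free(subset, jobs):
                continue
            value = sum(job_weight(queues[i], jobs[i], p[i]) for i in subset)
            if value > best_value:
                best, best_value = set(subset), value
    return best, best_value


# Hypothetical numbers, not those of Example 6.
jobs = [[1, 1, 0, 0], [0, 1, 1, 1]]
p = [[0.9] * 4, [0.9] * 4]
decision, value = max_weight_decision([1.0, 1.2], jobs, p)
```

With these hypothetical queue sizes, the three-task job has weight 1.2 × 0.9³ ≈ 0.875, which exceeds the weight 0.81 of the two-task job, so the search selects the three-task job.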

Example 6.

Take Figs. 1 and 2 for example. Suppose that and for all , and . According to Line 3, Alg. 1 calculates and . Thus, Alg. 1 decides to compute for frame 1. If workers , , and in Fig. 1 can complete their respective tasks in frame 1, then one packet is removed from queue in Fig. 2 at the end of frame 1, i.e., queue has packet at the end of frame 1.

Leveraging Lyapunov techniques [13], we can establish the optimality of Alg. 1 in the following.

Theorem 7.

Alg. 1 is a feasibility-optimal scheduling algorithm.

Proof.

Let vector represent the state of the virtual queueing network in frame . Note that the state changes over frames but its probability distribution is i.i.d., according to the assumption in Section II-A. Following the standard argument of the Lyapunov theory in [13, Chapter 4] along with the i.i.d. property of the state, we can prove that for any requirement  interior of , all queues in the virtual queueing network (associated with Alg. 1) are stable. That is, Alg. 1 can fulfill the requirement  by Propositions 3 and 5. Thus, Alg. 1 is feasibility-optimal. ∎

Note that Alg. 1 involves a combinatorial optimization problem in Line 3. In the next section, we investigate the computational complexity of solving the combinatorial optimization problem.

III-C Tractable approximate scheduling algorithm

We show (in the next lemma) that the combinatorial optimization problem in Line 3 of Alg. 1 is NP-hard. Therefore, Alg. 1 is computationally intractable when the number of applications is large.

Lemma 8.

The combinatorial optimization problem in Alg. 1 in frame  is NP-hard, for all .

Proof.

We construct a reduction from the set packing problem [6]. See Appendix A for details. ∎

/* At the beginning of frame , perform as follows: */
1   for all ;
/* At the beginning of each frame , perform as follows: */
2   for all ;
3 ;
4 ;
5 Sort all jobs according to the values of
(3)
to obtain the sorted jobs ;
6 for  to  do
7       if  and  then
8             ;
9             ;
10            
11       end if
12      
13 end for
14 Perform the decision ;
/* At the end of each frame , perform as follows: */
15 for  to  do
16       if , , and all its workers complete their respective tasks then
17             ;
18            
19       end if
20      
21 end for
Algorithm 2 Approximate scheduling algorithm.

To study the NP-hard problem, we define two notions of approximation ratios as follows. While Definition 9 studies the resulting value in Eq. (2), Definition 10 investigates the resulting region of achievable requirements.

Definition 9.

Given queue size vector in frame . Let be the value in Eq. (2) computed by Alg. 1 in frame . Let be the value in Eq. (2) computed by scheduling algorithm  in frame . Then, the scheduling algorithm  is called a -approximate scheduling algorithm to Eq. (2) if for all possible and .

Definition 10.

A scheduling algorithm  is called a -approximate scheduling algorithm to if, for any requirement interior of , requirement can be fulfilled by the scheduling algorithm .

In this paper, we propose an approximate scheduling algorithm in Alg. 2. The procedure of Alg. 2 is similar to that of Alg. 1; hence, we point out key differences in the following.

Unlike Alg. 1, which solves the combinatorial optimization problem, Alg. 2 (in Line 5) simply sorts all jobs in descending order of the values computed by Eq. (3). In addition, the second condition in Line 7 uses an indicator of whether a sorted job has a task for a given worker in the frame. While the numerator of Eq. (3) is the weight in Eq. (2) for a job, the denominator reflects the maximum number of jobs with which that job can interfere. The underlying idea of Alg. 2 is to consider jobs in that order, so as to achieve a higher value of Eq. (2) while keeping the interference as low as possible.

More precisely, Alg. 2 uses a set to record (in Line 9) the available workers that are not allocated yet, where the set is initialized in Line 4 to contain all workers. Then, at each iteration of the loop in Line 6, Alg. 2 checks whether the job under consideration satisfies the two conditions in Line 7: the first condition means that the job is generated, and the second condition means that its required workers are all available. If the job meets the conditions, then it is scheduled as in Line 8. In addition, if the job is scheduled, then the set of available workers is updated as in Line 9 by removing the workers allocated to the job. After deciding the set of scheduled jobs, Alg. 2 performs the decision in Line 14 for the frame, followed by updating the queue sizes in Line 17.
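The greedy decision step described above can be sketched as follows. Two assumptions are made and should be read as such: the weight of a job is its queue size times the product of its task completion probabilities, and the sorting key of Eq. (3) divides that weight by the square root of the job's number of tasks. All names and numbers are hypothetical.

```python
from math import prod, sqrt


def greedy_decision(queues, jobs, p):
    """Greedy interference-free decision for one frame.

    Jobs are sorted by weight / sqrt(number of tasks) (assumed form of
    Eq. (3)) and scheduled in that order whenever all required workers
    are still available.
    """
    n_workers = len(jobs[0])

    def score(i):
        weight = queues[i] * prod(p[i][j] for j in range(n_workers) if jobs[i][j])
        return weight / sqrt(sum(jobs[i]))

    generated = [i for i, job in enumerate(jobs) if sum(job) > 0]
    order = sorted(generated, key=score, reverse=True)

    available = set(range(n_workers))  # workers not allocated yet
    decision = set()
    for i in order:
        needed = {j for j in range(n_workers) if jobs[i][j]}
        if needed <= available:        # all required workers are free
            decision.add(i)
            available -= needed        # allocate them to job i
    return decision


# Hypothetical numbers, not those of Example 11.
jobs = [[1, 1, 0, 0], [0, 1, 1, 1]]
p = [[0.9] * 4, [0.9] * 4]
decision = greedy_decision([1.0, 1.2], jobs, p)
```

With these hypothetical numbers the two-task job scores 0.81/√2 ≈ 0.573 and the three-task job scores 0.8748/√3 ≈ 0.505, so the greedy rule schedules the two-task job even though the three-task job has the larger weight. This mirrors how the two algorithms can disagree on the same frame.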

Example 11.

Follow Ex. 6. According to Eq. (3), Alg. 2 calculates and . Thus, Alg. 2 decides to compute for frame 1. Note that the decision is different from that in Ex. 6.

Next, we establish the approximation ratio of Alg. 2 to Eq. (2).

Lemma 12.

Alg. 2 is a -approximate scheduling algorithm to Eq. (2).

Proof.

See Appendix B. ∎
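The bound in Lemma 12 can be probed numerically on small random instances. The sketch below is an empirical check, not a proof, and it rests on the same assumptions as before: weights equal queue size times the product of task completion probabilities, and the greedy key divides the weight by the square root of the number of tasks. It compares the greedy value against a brute-force optimum and records the worst observed ratio; all names are hypothetical.

```python
import random
from itertools import combinations
from math import prod, sqrt


def weights(queues, jobs, p):
    """Assumed per-job weights; a job with no task has weight zero."""
    return [q * prod(pj for c, pj in zip(job, row) if c) if sum(job) else 0.0
            for q, job, row in zip(queues, jobs, p)]


def workers_of(job):
    return {j for j, c in enumerate(job) if c}


def optimal_value(w, jobs):
    """Best total weight over all interference-free sets (brute force)."""
    active = [i for i, job in enumerate(jobs) if sum(job) > 0]
    best = 0.0
    for r in range(1, len(active) + 1):
        for subset in combinations(active, r):
            used, ok = set(), True
            for i in subset:
                if used & workers_of(jobs[i]):
                    ok = False
                    break
                used |= workers_of(jobs[i])
            if ok:
                best = max(best, sum(w[i] for i in subset))
    return best


def greedy_value(w, jobs):
    """Total weight collected by the greedy rule."""
    active = [i for i, job in enumerate(jobs) if sum(job) > 0]
    order = sorted(active, key=lambda i: w[i] / sqrt(sum(jobs[i])), reverse=True)
    available, total = set(range(len(jobs[0]))), 0.0
    for i in order:
        if workers_of(jobs[i]) <= available:
            available -= workers_of(jobs[i])
            total += w[i]
    return total


def worst_ratio(trials, n_apps, n_workers, seed):
    """Largest observed optimum-to-greedy ratio over random instances."""
    rng = random.Random(seed)
    worst = 1.0
    for _ in range(trials):
        jobs = [[rng.randint(0, 1) for _ in range(n_workers)] for _ in range(n_apps)]
        queues = [rng.uniform(0.1, 5.0) for _ in range(n_apps)]
        p = [[rng.uniform(0.5, 1.0) for _ in range(n_workers)] for _ in range(n_apps)]
        w = weights(queues, jobs, p)
        greedy = greedy_value(w, jobs)
        if greedy > 0:
            worst = max(worst, optimal_value(w, jobs) / greedy)
    return worst


ratio = worst_ratio(trials=200, n_apps=5, n_workers=4, seed=1)
```

Under the assumed sorting key, the observed ratio should never exceed the square root of the number of workers (2 for four workers), and on random instances it is typically much closer to 1.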

Remark 13.

We remark that the approximation ratio of is the best approximation ratio to Eq. (2). That is because the combinatorial optimization problem in Alg. 1 is computationally harder than the set packing problem (see Lemma 8) and the best approximation ratio to the set packing problem is the square root (see [6]).

With Lemma 12, we can further establish the approximation ratio of Alg. 2 to .

Theorem 14.

Alg. 2 is a -approximate scheduling algorithm to .

Proof.

See Appendix C. ∎

The computational complexity of Alg. 2 is dominated by sorting all jobs in Line 5, which takes time on the order of n log n for n applications. Thus, Alg. 2 remains tractable even when the number of applications is large.

Remark 15.

We remark that our methodology can apply to the case of time-varying workloads. Let be the workload generated by application for worker in frame . We just need to revise the constant task completion probability in Algs. 1 and 2 to be the probability of completing workload . If workload is i.i.d. over frames  for all and , then Alg. 1 is still a feasibility-optimal scheduling algorithm and Alg. 2 is still a -approximate scheduling algorithm.

IV Numerical results

Fig. 3: Regions of achievable requirements by Algs. 1 and 2 for various task generation probabilities by application , i.e., or for all and .
Fig. 4: Regions of achievable requirements by Algs. 1 and 2 for various task completion probabilities by application , i.e., or for all .
Fig. 5: Maximum achievable requirement (for the case of for all ) versus number of applications, for various task generation probabilities, i.e., , , or , for all , , and .

In this section, we investigate Algs. 1 and 2 via computer simulations. First, we consider two applications and two workers. Fig. 3 displays the regions of achievable requirements by both scheduling algorithms for various task generation probabilities by application , when and for all , , and are fixed. Fig. 4 displays the regions of achievable requirements by both scheduling algorithms for various task completion probabilities by worker , when and for all , , and are fixed. Each result marked in Fig. 3 or Fig. 4 is a requirement such that the average number of completed jobs in 10,000 frames for application is at least and that for application is at least . Both figures show that Alg. 2 is not only computationally efficient but can also fulfill almost all requirements within (achievable by Alg. 1).
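A simulation of this kind can be sketched as follows for the greedy algorithm. The code is an illustrative reconstruction under assumptions, not the authors' simulator: tasks are generated independently per worker, workers complete tasks independently, the greedy key is weight divided by the square root of the number of tasks, and virtual queues are floored at zero. All parameter values are hypothetical.

```python
import random
from math import prod, sqrt


def greedy_schedule(queues, jobs, p):
    """Greedy interference-free decision (assumed form of Alg. 2)."""
    m = len(jobs[0])

    def score(i):
        weight = queues[i] * prod(p[i][j] for j in range(m) if jobs[i][j])
        return weight / sqrt(sum(jobs[i]))

    generated = [i for i, job in enumerate(jobs) if sum(job) > 0]
    available, decision = set(range(m)), []
    for i in sorted(generated, key=score, reverse=True):
        needed = {j for j in range(m) if jobs[i][j]}
        if needed <= available:
            decision.append(i)
            available -= needed
    return decision


def simulate(requirements, gen_prob, p, frames, seed):
    """Return the empirical average number of completed jobs per application."""
    rng = random.Random(seed)
    n = len(gen_prob)
    queues = [0.0] * n
    completed = [0] * n
    for _ in range(frames):
        # Job generation at the beginning of the frame.
        jobs = [[1 if rng.random() < g else 0 for g in row] for row in gen_prob]
        # Virtual packet arrivals, one requirement per application.
        queues = [q + r for q, r in zip(queues, requirements)]
        for i in greedy_schedule(queues, jobs, p):
            needed = [j for j, c in enumerate(jobs[i]) if c]
            # The job is completed only if every required worker succeeds.
            if all(rng.random() < p[i][j] for j in needed):
                completed[i] += 1
                queues[i] = max(queues[i] - 1.0, 0.0)
    return [c / frames for c in completed]


rates = simulate([0.2, 0.2],
                 [[0.5, 0.5], [0.5, 0.5]],
                 [[0.9, 0.9], [0.9, 0.9]],
                 frames=10_000, seed=0)
```

A requirement would be marked as achievable in such an experiment when every entry of `rates` is at least the corresponding entry of the requirement vector; sweeping the requirement over a grid then traces out an empirical region.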

Second, we consider larger networks in which the number of applications equals the number of workers. Moreover, all task completion probabilities are fixed at 0.9. Then, Fig. 5 displays the maximum achievable requirements (for the case of for all ) by Alg. 2, when all task generation probabilities are the same. In this case, an application generates a job in a frame with probability . When in Fig. 5, the lower the task generation probability, the lower the achievable requirement, because a lower task generation probability generates fewer jobs. In contrast, when in Fig. 5, the lower the task generation probability, the higher the achievable requirement, because fewer jobs cause less interference. In other words, the interference becomes severe when . Moreover, from Fig. 5, the maximum achievable requirement by Alg. 2 appears to decrease super-linearly with the number of applications.

V Concluding remarks

In this paper, we provided a framework for studying stochastic real-time jobs in unreliable workers with specialized functions. In particular, we developed two algorithms for scheduling real-time jobs in shared unreliable workers. While the proposed feasibility-optimal scheduling algorithm can support the largest region of applications' requirements, it requires solving an NP-hard problem in every frame. In contrast, the proposed approximate scheduling algorithm is not only simple but also has a provable guarantee for the region of achievable requirements. Moreover, we note that coding techniques have been exploited to alleviate stragglers in distributed computing networks, e.g., [9, 18]. Including coding design in our framework is a promising direction.

Appendix A Proof of Lemma 8

We show a reduction from the set packing problem [6], where given a collection of non-empty sets over a universal set for some positive integers and , the objective is to identify a sub-collection of disjoint sets in that collection such that the number of sets in the sub-collection is maximized.

For the given instance of the set packing problem, we construct applications and workers in the distributed computing network. Consider a fixed frame . In frame , application generates job . With the transformation, the set packing problem is equivalent to identifying a set of interference-free jobs in frame  such that number of jobs in that set is maximized.

Moreover, consider no job until frame , identical requirements for all , and identical task completion probabilities for all and . In this context, Eq. (2) in frame  becomes

(4)

because , (due to non-empty sets for all ), and . As a result of the constant in Eq. (4), the objective of the combinatorial optimization problem in Alg. 1 in frame  becomes identifying a set of interference-free jobs such that the number of jobs in that set is maximized.

Suppose there exists an algorithm that solves the combinatorial optimization problem in Alg. 1 in frame  in polynomial time. Then, the polynomial-time algorithm can identify a set maximizing the value in Eq. (4); in turn, it solves the set packing problem. That contradicts the NP-hardness of the set packing problem (unless P = NP).

Because the above argument is true for all frames , we conclude that the combinatorial optimization problem in Alg. 1 in frame  is NP-hard, for all .
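The reduction can be illustrated with a short sketch that maps a set packing instance to job vectors and then counts the largest interference-free set by brute force. The mapping follows the construction above (one application per set, one worker per element of the universal set); the function names and the example instance are hypothetical.

```python
from itertools import combinations


def set_packing_to_jobs(sets, universe_size):
    """Map each set to a job vector: one application per set, one worker per element."""
    return [[1 if j in s else 0 for j in range(universe_size)] for s in sets]


def max_interference_free_count(jobs):
    """Largest number of jobs that can be scheduled together (brute force)."""
    n = len(jobs)
    for r in range(n, 0, -1):
        for subset in combinations(range(n), r):
            used, ok = set(), True
            for i in subset:
                needed = {j for j, c in enumerate(jobs[i]) if c}
                if used & needed:
                    ok = False
                    break
                used |= needed
            if ok:
                return r
    return 0


# Hypothetical instance: three sets over the universe {0, 1, 2, 3}.
sets = [{0, 1}, {1, 2}, {2, 3}]
jobs = set_packing_to_jobs(sets, 4)
packing_size = max_interference_free_count(jobs)
```

Disjoint sets correspond exactly to interference-free jobs, so with identical weights the largest interference-free set has the same size as the maximum set packing (here 2, from the first and third sets).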

Appendix B Proof of Lemma 12

Consider a fixed queue size vector in a fixed frame . Let for all . Without loss of generality, we can assume that for all and further assume that (by reordering the job indices), i.e., Alg. 2 processes job at the -th iteration of Line 6. Let be the decision of Alg. 2 in frame  for the given queue size vector . Then, we can express the value of Eq. (2) computed by Alg. 2 as

(5)

Let be the decision of Alg. 1 in frame  for the given queue size vector . If the conditions in Line 7 of Alg. 2 hold for the -th iteration (i.e., ), then we let be a set of jobs, where denotes the set of common workers for jobs and . The set has the following properties:

  • For job , we have

    (6)

    since .

  • All jobs in are interference-free, i.e., they need different workers, since . Moreover, job needs at least one of the workers for (i.e., ). Thus, we have

    (7)
  • Since all jobs in need different workers, and there are workers, we have

    (8)

Note that . Thus, we can bound computed by Alg. 1 by

(9)

where for all .

Furthermore, we can bound for each by

(10)

where (a) follows Eq. (6); (b) is due to the Cauchy-Schwarz inequality; (c) follows Eqs. (7) and (8).
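The three steps behind Eq. (10) follow a standard greedy argument, which can be sketched as below. The notation is assumed here for illustration: $w_i$ is the weight of job $i$ in Eq. (2), $|c_i|$ is its number of tasks, $N_i$ is the set of jobs in the optimal decision that share a worker with the greedily scheduled job $i$ and come no earlier in the sorted order, and $M$ is the number of workers. The sorting key $w_i/\sqrt{|c_i|}$ is likewise an assumption about Eq. (3).

```latex
\sum_{j \in N_i} w_j
  \overset{(a)}{\le} \frac{w_i}{\sqrt{|c_i|}} \sum_{j \in N_i} \sqrt{|c_j|}
  \overset{(b)}{\le} \frac{w_i}{\sqrt{|c_i|}} \sqrt{|N_i|}\, \sqrt{\sum_{j \in N_i} |c_j|}
  \overset{(c)}{\le} \frac{w_i}{\sqrt{|c_i|}} \sqrt{|c_i|}\, \sqrt{M}
  = \sqrt{M}\, w_i .
```

Step (a) uses the sorted order, step (b) is the Cauchy-Schwarz inequality, and step (c) uses that the jobs in $N_i$ need different workers, each including at least one worker of job $i$, out of $M$ workers in total. Summing over the greedily scheduled jobs then bounds the optimal value by $\sqrt{M}$ times the greedy value under these assumptions.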

Then, we can bound by

where (a) follows Eq. (9); (b) follows Eq. (10); (c) follows Eq. (5). Because the above argument is true for all and , the approximation ratio is .

Appendix C Proof of Theorem 14

The proof of Theorem 14 needs the following technical lemma, whose proof follows the line of [13, Appendix 4.A] along with the i.i.d. property of state (as discussed in the proof of Theorem 7) and the constant task completion probabilities .

Lemma 16.

There exists a stationary scheduling algorithm (i.e., decision depends on the state in frame  only) such that, for any requirement  interior of , all queues in the virtual queueing network are stable, i.e., the stationary scheduling algorithm can fulfill the requirement .

Moreover, we need the Lyapunov theory [13, Theorem 4.1] as stated in the following lemma, where we consider the Lyapunov function .

Lemma 17.

Given scheduling algorithm  and requirement , if there exist constants and such that

for all frames , then all queues in the virtual queueing network are stable, i.e., the scheduling algorithm  can fulfill the requirement .
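A drift condition of the kind used in Lemma 17 typically takes the following form. This is a sketch in assumed notation, not a transcription of the lemma: $Q_i(t)$ is the size of queue $i$ at the beginning of frame $t$, $\mathbf{Q}(t)$ is the vector of queue sizes, $L$ is taken to be the usual quadratic Lyapunov function, and $B$ and $\epsilon$ are positive constants.

```latex
L(\mathbf{Q}(t)) = \frac{1}{2} \sum_{i} Q_i(t)^2,
\qquad
\mathbb{E}\left[ L(\mathbf{Q}(t+1)) - L(\mathbf{Q}(t)) \,\middle|\, \mathbf{Q}(t) \right]
  \le B - \epsilon \sum_{i} Q_i(t) .
```

Intuitively, whenever the total queue size is large, the expected drift is negative, which keeps the average queue sizes bounded.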

Then, we are ready to prove Theorem 14. Suppose that requirement is interior of . By Lemma 16, there exists a stationary scheduling algorithm that can fulfill the requirement . We denote that stationary scheduling algorithm by . Moreover, since requirement is interior of , requirement for some is also interior of . By Lemma 16 again, the stationary scheduling algorithm  can fulfill requirement , i.e.,

(11)

for all .

Consider requirement where for all . Next, applying Lemma 17 to Alg. 2, we conclude that Alg. 2 can fulfill requirement because

where (a) follows [13, Chapter 4] with some constant ; (b) is because and the approximation ratio of Alg. 2 to Eq. (2) is (as stated in Lemma 12); (c) is because Alg. 1 (in Line 3) maximizes the value of among all possible scheduling algorithms ; (d) is because decision under stationary scheduling algorithm  depends on the state only (regardless of the queue sizes) and also the state is i.i.d. over frames, yielding for all and ; (e) follows Eq. (11).

References

  • [1] G. Ananthanarayanan, A. Ghodsi, S. Shenker, and I. Stoica (2013) Effective Straggler Mitigation: Attack of the Clones. Proc. of NSDI, pp. 185–198.
  • [2] M. Chowdhury and I. Stoica (2012) Coflow: A Networking Abstraction for Cluster Applications. Proc. of ACM HotNets, pp. 31–36.
  • [3] M. Chowdhury, Y. Zhong, and I. Stoica (2014) Efficient Coflow Scheduling with Varys. Proc. of ACM SIGCOMM 44 (4), pp. 443–454.
  • [4] J. Dean and S. Ghemawat (2008) MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM 51 (1), pp. 107–113.
  • [5] L. Georgiadis, M. J. Neely, and L. Tassiulas (2006) Resource Allocation and Cross-Layer Control in Wireless Networks. Vol. 1, Now Publishers, Inc.
  • [6] M. M. Halldórsson, J. Kratochvíl, and J. A. Telle (1998) Independent Sets with Domination Constraints. Proc. of ICALP, pp. 176–187.
  • [7] I-H. Hou and P. R. Kumar (2013) Packets with Deadlines: A Framework for Real-Time Wireless Networks. Vol. 6, Morgan & Claypool Publishers.
  • [8] S. Im, B. Moseley, K. Pruhs, and M. Purohit (2019) Matroid Coflow Scheduling. Proc. of ICALP, pp. 1–14.
  • [9] K. Lee, M. Lam, R. Pedarsani, D. Papailiopoulos, and K. Ramchandran (2017) Speeding Up Distributed Machine Learning using Codes. IEEE Transactions on Information Theory 64 (3), pp. 1514–1529.
  • [10] B. Li, Z. Shi, and A. Eryilmaz (2018) Efficient Scheduling for Synchronized Demands in Stochastic Networks. Proc. of IEEE WiOpt, pp. 1–8.
  • [11] Y. Li, S. H. Jiang, H. Tan, C. Zhang, G. Chen, J. Zhou, and F. Lau (2016) Efficient Online Coflow Routing and Scheduling. Proc. of ACM MobiHoc, pp. 161–170.
  • [12] Q. Liang and E. Modiano (2017) Coflow Scheduling in Input-Queued Switches: Optimal Delay Scaling and Algorithms. Proc. of IEEE INFOCOM, pp. 1–9.
  • [13] M. J. Neely (2010) Stochastic Network Optimization with Application to Communication and Queueing Systems. Vol. 3, Morgan & Claypool Publishers.
  • [14] M. Shafiee and J. Ghaderi (2018) An Improved Bound for Minimizing the Total Weighted Completion Time of Coflows in Datacenters. IEEE/ACM Transactions on Networking 26 (4), pp. 1674–1687.
  • [15] S.-H. Tseng and A. Tang (2018) Coflow Deadline Scheduling via Network-Aware Optimization. Proc. of Allerton, pp. 829–833.
  • [16] A. Tveit, Ø. Rein, J. V. Iversen, and M. Matskin Scalable Agent-Based Simulation of Players in Massively Multiplayer Online Games. Proc. of SCAI, pp. 80–89.
  • [17] S. Wang, J. Zhang, T. Huang, J. Liu, and Y. Liu (2018) A Survey of Coflow Scheduling Schemes for Data Center Networks. IEEE Communications Magazine 56 (6), pp. 179–185.
  • [18] C.-S. Yang, R. Pedarsani, and A. S. Avestimehr (2019) Timely-Throughput Optimal Coded Computing over Cloud Networks. Proc. of ACM MobiHoc, pp. 301–310.