
Allocation of Graph Jobs in Geo-Distributed Cloud Networks

08/13/2018
by   Seyyedali Hosseinalipour, et al.
NC State University

Recently, the processing of big data has drawn tremendous attention, and cloud computing is a natural platform for it. Big-data tasks can be represented as graph jobs consisting of multiple sub-jobs, which must be executed in parallel under predefined communication constraints. This work develops the foundation for allocating graph jobs in geo-distributed cloud networks (GDCNs) of various scales. The problem of graph job allocation in GDCNs is formulated with respect to the incurred power consumption of the tasks, which turns out to be an integer programming problem. Considering the intractability of the solution, a sub-optimal analytical approach, suitable for small-scale GDCNs, is proposed. Further, in medium-scale GDCNs, we address the graph job allocation problem using a distributed algorithm. Finally, for large-scale GDCNs, given the intractability of the feasible set of allocations, we propose a novel algorithm called cloud crawling, which consists of a decentralized crawler exploring the network to determine the "potentially good" feasible allocations for the graph jobs. Based on the strategies suggested by the cloud crawler, we address the problem of graph job allocation from the proxy agents' perspective, removing the burden of job allocation from the cloud datacenters (DCs). We address this problem under robust and adaptive pricing of the DCs, for each of which we propose an effective online learning algorithm.


1 Introduction

Recently, the demand for big-data processing has promoted the popularity of cloud computing platforms due to their reliability, scalability and security [1, 2, 3]. Handling big-data applications requires unique system-level design since these applications, more often than not, cannot be processed via a single PC, server, or even a datacenter (DC). To this end, modern parallel and distributed processing systems (e.g., [4, 5, 6]) have been developed. In this work, we propose a framework for allocating big-data applications represented via graph jobs in geo-distributed cloud networks (GDCNs), explicitly considering the power consumption of the DCs. In the graph job model, each node denotes a sub-task of a big-data application, while the edges impose the required communication constraints among the sub-tasks, as discussed further in Section 2.

1.1 Related Work

There is a body of literature devoted to task and resource allocation in contemporary cloud networks, e.g., [7, 8, 9, 10, 11, 12]. In [7], the task placement and resource allocation plan for embarrassingly parallel jobs, which are composed of a set of independent tasks, is addressed to minimize the job completion time. To this end, three algorithms named TaPRA, TaPRA-fast, and OnTaPRA are proposed, which significantly reduce the job execution time as compared to the state-of-the-art algorithms. In [8], the multi-resource allocation problem in cloud computing systems is addressed through a mechanism called DRFH, where the resource pool is constructed from a large number of heterogeneous servers containing various numbers of slots. It is shown that DRFH leads to much higher resource utilization with considerably shorter job completion times. In [9], a resource allocation scheme is proposed that results in efficient utilization of the resources while increasing the revenue of the mobile cloud service providers. One of the pioneering works addressing resource allocation in GDCNs is [10], where a distributed algorithm, called DGLB, is proposed for real-time geographical load balancing. None of the above works has considered the allocation of big-data jobs composed of multiple sub-tasks with communication constraints among them.

Allocation of big-data jobs represented by graph structures is a complicated process entailing more delicate analysis. Among the limited literature, [11, 12] are the most relevant; they develop randomized algorithms capable of matching the vertices of the graph jobs to the idle slots of the cloud servers, considering the cost of using the communication infrastructure of the network to handle the data flows among the sub-tasks. These algorithms are developed for a fixed network cost configuration, i.e., the cost of job execution using the same allocation strategy is fixed over time. As mentioned in [13], these randomized algorithms suffer from long convergence times. Consequently, they are impractical in scenarios in which i) the job allocation needs to be performed with respect to a time-varying network cost configuration, or ii) the network size is large, leading to an enormous strategy set (see Section 5). In GDCNs, the execution cost is mainly determined by the real-time power consumption of the DCs [14]. Hence, an applicable allocation framework should be capable of fast allocation of incoming graph jobs to the GDCNs, considering the effect of allocation on the DCs' current power consumption state. Also, with the rapid growth in the size of cloud networks, adaptability to large-scale GDCNs is a must for such a framework. These are the main motivations behind this work.

1.2 Contributions

The main goal of this paper is to provide a framework for graph job allocation in GDCNs with various scales. Our main contributions can be summarized as follows:

i) We formulate the problem of graph job allocation in GDCNs considering the incurred power consumption on the cloud network.

ii) We propose a centralized approach to solve the problem suitable for small-scale cloud networks.

iii) We design a distributed algorithm for allocation of graph jobs in medium-scale GDCNs, using the DCs’ processing power in parallel.

iv) For large-scale GDCNs, given the huge size of the strategy set, and extremely slow convergence of the distributed algorithm, we introduce the idea of cloud crawling. In particular, we propose a fast method to address the NP-complete sub-graph isomorphism problem, which is one of the major challenges for graph job allocation in cloud networks. Also, we propose a novel decentralized sub-graph isomorphism extraction algorithm for a cloud crawler to identify “potentially good” strategies for customers while traversing a GDCN.

v) For large-scale GDCNs, considering the suggested strategies of cloud crawlers, we find the best suggested strategies for the customers under adaptive and fixed pricing of the DCs in a distributed fashion. To this end, we model proxy agents’ behavior in a GDCN, based on which we propose two online learning algorithms inspired by the concept of “regret” in the bandit problem [15, 16].

This paper is organized as follows. Section 2 presents the system model. Section 3 contains a sub-optimal approach for graph job allocation in small-scale GDCNs. A distributed graph job allocation mechanism for medium-scale GDCNs is presented in Section 4. Cloud crawling, along with online learning algorithms for large-scale GDCNs, is presented in Section 5. Simulation results are given in Section 6. Finally, Section 7 concludes the paper.

2 System Model

A GDCN comprises various DCs connected through communication links. Inside each DC, there is a set of fully-connected cloud servers, each consisting of multiple fully-connected slots. Without loss of generality, we assume that all the cloud servers have the same number of slots. Each slot corresponds to the same bundle of processing resources, which can be utilized independently. Since all the slots belonging to the same DC are fully-connected, we consider a DC as a collection of slots directly in our study. (Footnote 1: The number of cloud servers does not play a major role in our study except in the energy consumption models.) It is assumed that a DC provider (DCP) is in charge of DC management. Abstracting each DC to a node and each communication link between two DCs to an edge, a GDCN can be represented as a graph whose node set is the set of DCs and whose edge set is the set of inter-DC links. Henceforth, this graph is assumed to be connected; however, due to geographical constraints, it may not be a complete graph.

Each DC has its own set of slots. A connection between two DCs enables communication between all of their slots. Consequently, two slots are called adjacent if and only if both belong to the same DC or there exists a link between their corresponding DCs. The set of edges between adjacent slots contains a pair of slots if and only if one of these two conditions holds. We define the aggregated network graph as the graph whose nodes are all the slots of the GDCN and whose edges are the edges between adjacent slots.
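
The adjacency rule above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original text: slots are represented as hypothetical `(dc, index)` pairs, and the input format (`slots_per_dc`, `dc_links`) and the function name are made up for the example.

```python
from itertools import combinations

def build_aggregated_graph(slots_per_dc, dc_links):
    """Return (slots, edges) of the aggregated network graph.

    slots_per_dc: dict mapping a DC name to its number of slots (assumed format).
    dc_links: iterable of (dc_a, dc_b) pairs, the inter-DC communication links.
    """
    linked = {frozenset(link) for link in dc_links}
    # A slot is identified by its DC and a local index (hypothetical encoding).
    slots = [(dc, k) for dc, count in slots_per_dc.items() for k in range(count)]
    edges = set()
    for a, b in combinations(slots, 2):
        same_dc = a[0] == b[0]
        linked_dcs = frozenset((a[0], b[0])) in linked
        # Two slots are adjacent iff they share a DC or their DCs are linked.
        if same_dc or linked_dcs:
            edges.add(frozenset((a, b)))
    return slots, edges

slots, edges = build_aggregated_graph(
    {"d1": 2, "d2": 2, "d3": 1},
    [("d1", "d2"), ("d2", "d3")],
)
```

In this toy network, `d1` and `d3` are not linked, so no slot of `d1` is adjacent to the slot of `d3`, even though the DC graph itself is connected.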

The system supports a finite set of graph job types, each of which is described by a graph with its own node set and edge set. Each node of a graph job requires one slot from a DC to get executed. An edge exists between two nodes of a graph job if and only if these nodes need to be executed using two adjacent slots of the GDCN.

The system model is depicted in Fig. 1. For the small- and medium-scale GDCNs, the GDCN is assumed to be in charge of finding adequate allocations for the incoming graph jobs from proxy agents (PAs) [17, 18], which act as trusted parties between the GDCN and the customers. In these cases, each graph job is allocated through either a centralized controller (see Section 3) or a distributed algorithm utilizing the communication infrastructure between the DCs (see Section 4). For large-scale GDCNs, cloud crawlers are introduced to explore the GDCN and provide a set of suggested strategies for the PAs. Afterward, PAs allocate their graph jobs with respect to the utility of the suggested strategies (see Section 5). The following definitions are introduced to facilitate our subsequent derivations.

Definition 1.

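A feasible mapping between a graph job and the GDCN is defined as a mapping from the nodes of the graph job to the slots of the GDCN that satisfies the communication constraints of the graph job. This implies that, for every pair of nodes of the graph job, if the two nodes are connected by an edge in the graph job, then the slots they are mapped to are adjacent in the aggregated network graph. Each graph job type has an associated set of all its feasible mappings.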

Definition 2.

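For a graph job, the mapping vector associated with a feasible mapping is defined as a vector with one entry per DC, where each entry denotes the number of used slots from the corresponding DC. Mathematically, each entry is the sum, over the nodes of the graph job, of the indicator function of the event that the node is mapped to a slot of that DC. Each graph job type has an associated set of all its mapping vectors.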

Finding a feasible allocation/mapping between a graph job and a GDCN is similar to the sub-graph isomorphism problem in graph theory [19]. Some examples of feasible allocations for a graph job with three nodes, considering a GDCN with four DCs each consisting of four slots, are depicted in Fig. 2.
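
Definitions 1 and 2 can be made concrete with a small sketch. The code below is an illustration rather than the paper's implementation: slots are hypothetical `(dc, index)` pairs, all helper names are invented for the example, and it assumes that distinct nodes of a job must occupy distinct slots, which is how "each node requires one slot" is read here.

```python
from collections import Counter

def slots_adjacent(slot_a, slot_b, dc_links):
    """Slots are adjacent iff they share a DC or their DCs are linked."""
    dc_a, dc_b = slot_a[0], slot_b[0]
    return dc_a == dc_b or frozenset((dc_a, dc_b)) in dc_links

def is_feasible(mapping, job_edges, dc_links):
    """Check a node -> slot mapping against the job's communication constraints."""
    # Assumption: one slot hosts at most one node of the job.
    if len(set(mapping.values())) != len(mapping):
        return False
    return all(slots_adjacent(mapping[u], mapping[v], dc_links) for u, v in job_edges)

def mapping_vector(mapping, dc_order):
    """Count how many slots the mapping uses from each DC, in dc_order."""
    used = Counter(slot[0] for slot in mapping.values())
    return [used[dc] for dc in dc_order]

dc_links = {frozenset(("d1", "d2")), frozenset(("d2", "d3"))}
triangle = [("a", "b"), ("b", "c"), ("a", "c")]

good = {"a": ("d1", 0), "b": ("d2", 0), "c": ("d2", 1)}
bad = {"a": ("d1", 0), "b": ("d2", 0), "c": ("d3", 0)}  # d1 and d3 are not linked
```

Here `good` is feasible and uses one slot of `d1` and two slots of `d2`, while `bad` violates the constraint on the edge between `a` and `c`.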

Remark 1.

Our aim is to allocate big-data driven applications, e.g., data streams [11], to GDCNs. Due to the nature of these applications, the jobs usually stay in the system so long as they are not terminated. This work can be considered as a real-time allocation of graph jobs to the system, where we find the best currently possible assignment considering the current network status. Hence, we deliberately omit the time index from the following discussions.

Inspired by [20, 14], we model the power consumption of a DC as a function of the number of its utilized slots, the number of cloud servers it comprises, and the idle power consumption of each cloud server:

(1)

In this model, is the so-called Power Usage Effectiveness, which is the ratio of the total power usage of a DC, including cooling, lights, UPS, etc., to the power consumed by its IT equipment, and is chosen in such a way that determines the peak power consumption of a cloud server inside . Also, is a DC-related constant. Subsequently, we define the incurred cost of executing a graph job with type allocated according to the feasible mapping vector as follows:

(2)

where is the original load of DC , indicates the I/O incurred power of using the communication infrastructure of DC per slot, and is the ratio between the cost and power consumption, which is dependent on the DC’s location and infrastructure design. The I/O cost is considered to be proportional to the number of used slots since the data generated at each DC is correlated with that number, and that data should be exchanged using the I/O infrastructure either among adjacent DCs or between DCs and the users.

Fig. 1: System model for graph job allocation in GDCNs with various scales.
Fig. 2: Examples of graph job allocation. The green (blue) color denotes busy (idle) slots. The red color indicates the utilized slots upon allocation.

2.1 Problem Formulation

Our goal is to find an allocation for each arriving graph job to minimize the total incurred cost on the network. Due to the inherent relation between the cost and loads of the DCs, minimizing the cost is coupled with balancing the loads of the DCs. In a GDCN, let denote the number of in the system demanded for execution. Let denote the matrix of mapping vectors of these graph jobs defined as follows:

We formulate the optimal graph job allocation as the following optimization problem ():

(3)
(4)
(5)

In this problem, the objective function is the total incurred cost of execution, the first constraint ensures the stability of the DCs, and the second constraint guarantees the feasibility of the assignment. There are two main difficulties in obtaining the solution: i) Identifying the feasible mappings requires solving the sub-graph isomorphism problem between the graph jobs' topology and the aggregated network graph, which is categorized as NP-complete [19]. Hence, we assume that the feasible mappings are known only in the small- and medium-scale GDCNs. In the large-scale GDCNs, we propose a low-complexity decentralized approach to extract sub-graphs isomorphic to a graph job and implement it in our proposed cloud crawlers. ii) The problem is a nonlinear integer programming problem, which is known to be NP-hard. In small- and medium-scale GDCNs, we tackle this problem by considering a convex relaxed version of it. However, for large-scale GDCNs, we find a "potentially good" subset of feasible mappings as the cloud crawlers traverse the network. Afterward, the strategy selection is carried out using the computing power of the PAs in a decentralized fashion.

Symbol Definition
The GDCN graph
Set of DCs in the GDCN
The DC with index
Number of DCs in the GDCN
Set of slots of DC
Aggregated graph of the GDCN
Set of slots of the entire GDCN
Set of edges between adjacent slots of a GDCN
Set of graph jobs in the system
Number of different types of jobs in the system
Associated graph to the graph job with type
Number of jobs with type in the system
Set of nodes of the graph job with type
Set of edges of the graph job with type
Load of DC
Number of cloud servers in DC
Set of all the mapping vectors for
Set of PAs in the system
Set of cloud crawler’s suggested strategies for
Probability of selection of strategy
TABLE I: Major notations.

3 Graph Job Allocation in Small-Scale GDCNs: Centralized Approach

Solving requires solving an integer programming problem in dimensions. For a small GDCN with three types of graph jobs (), DCs (), and graph jobs of each type in the system, the dimension of the solution becomes rendering the computations impractical. To alleviate this issue, we solve in a sequential manner for available graph jobs in the system. In our approach, at each stage, the best allocation is obtained for one graph job while neglecting the presence of the rest. Afterward, the graph job is allocated to the GDCN and the loads of the utilized DCs are updated. As a result, at each stage, the dimension of the solution is ( in the above example). For a , let the available graph jobs be indexed from to according to their execution order, where preferred customers can be prioritized in practice. For a graph job with type with index , we reformulate as ():
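
The sequential procedure described above (allocate one graph job at a time, then update the loads) can be sketched as follows. This is a simplified illustration, not the paper's method: it enumerates candidate mapping vectors instead of solving the relaxed problem, and `quadratic_cost` is a hypothetical convex stand-in for the cost in Eq. (2), whose exact form is not reproduced here. All names and the capacity check are assumptions.

```python
def quadratic_cost(loads, mapping_vec):
    """Hypothetical convex cost of the loads after adding mapping_vec."""
    return sum((load + used) ** 2 for load, used in zip(loads, mapping_vec))

def allocate_sequentially(jobs, candidates, loads, capacity, cost):
    """Allocate jobs one by one, ignoring jobs that have not been placed yet.

    jobs: job types in execution order.
    candidates: dict mapping a job type to its known mapping vectors.
    loads, capacity: per-DC current load and maximum load (in slots).
    """
    loads = list(loads)
    chosen = []
    for job_type in jobs:
        # Keep only mapping vectors that do not overload any DC.
        feasible = [m for m in candidates[job_type]
                    if all(l + x <= c for l, x, c in zip(loads, m, capacity))]
        if not feasible:
            chosen.append(None)  # no room for this job at the moment
            continue
        best = min(feasible, key=lambda m: cost(loads, m))
        # Update the loads of the utilized DCs before handling the next job.
        loads = [l + x for l, x in zip(loads, best)]
        chosen.append(best)
    return chosen, loads

candidates = {"triangle": [(3, 0), (0, 3), (2, 1), (1, 2)]}
chosen, final_loads = allocate_sequentially(
    ["triangle", "triangle"], candidates, loads=[4, 0], capacity=[6, 6],
    cost=quadratic_cost,
)
```

With these toy numbers the first job goes entirely to the lightly loaded DC, and the second job is split so that both DCs end with the same load, which reflects the coupling between cost minimization and load balancing noted in Section 2.1.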

(6)
(7)
(8)

where denotes the updated load of DC after the previous graph job allocation. The last constraint in forces the solution to be discrete, making the derivation of a tractable solution impossible. In the following, we relax this constraint and provide a tractable method to derive the solution in the set of feasible points. For the moment, we consider . We define as an optimization problem with the same objective function as with three constraints. In this problem, the first constraint is Eq. (7), and the second and third constraints are relaxed versions of Eq. (8) described as:

(9)
(10)

where Eq. (9) ensures the assignment of all the nodes of the graph job to the GDCN, and Eq. (10) guarantees the practicality of the solution. It is easy to verify that is a convex optimization problem. We use the Lagrangian dual decomposition method [21] to solve this problem. Let , , and denote the Lagrangian multipliers associated with the first, the second, and the third constraint, respectively. The Lagrangian function associated with is then given by:

(11)

The corresponding dual function of is given by:

(12)

Finally, the dual problem can be written as ():

(13)

The relaxed problem is a convex optimization problem with differentiable affine constraints; hence, it satisfies the constraint qualifications, implying a zero duality gap. As a result, the solution of the dual problem coincides with the solution of the relaxed problem. It can be verified that the minimum of the Lagrangian function with respect to the primal variables occurs at the following point:

(14)

By replacing this in the Lagrangian function, the dual function is given by: , where . The optimal Lagrangian multipliers can be obtained by solving the dual problem given by:

(15)

Given the solution of Eq. (15), the optimal allocation in is given by . The solution of Eq. (15) can be derived via the iterative gradient ascent algorithm [21].
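
To illustrate how gradient ascent on a dual problem works, the sketch below solves a toy relaxed allocation. It is not the paper's formulation: the objective (a weighted sum of squared post-allocation loads), the function name, and all parameters are assumptions, and only the "all nodes must be assigned" constraint receives a Lagrangian multiplier, while the box constraints are handled by clipping the closed-form minimizer of the Lagrangian.

```python
def dual_ascent(unit_costs, loads, capacities, num_nodes, step=0.5, iterations=200):
    """Toy dual gradient ascent for
         minimize   sum_i a_i * (load_i + x_i)^2
         subject to sum_i x_i = num_nodes,  0 <= x_i <= cap_i.
    nu is the multiplier of the equality constraint.
    """
    nu = 0.0
    x = [0.0] * len(loads)
    for _ in range(iterations):
        # Minimizer of the Lagrangian for the current multiplier, clipped to the box.
        x = [min(max(nu / (2 * a) - load, 0.0), cap)
             for a, load, cap in zip(unit_costs, loads, capacities)]
        # The dual gradient is the residual of the equality constraint.
        nu += step * (num_nodes - sum(x))
    return x, nu

x_relaxed, nu_opt = dual_ascent(
    unit_costs=[1.0, 1.0, 1.0], loads=[2.0, 0.0, 1.0],
    capacities=[3.0, 3.0, 3.0], num_nodes=3,
)
```

For this instance the iteration settles at roughly `x = [0, 2, 1]`, i.e., the three nodes are placed so that all DCs end with the same load. The step size must be small enough for the ascent to converge; the value used here was chosen for this toy instance only.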

Letting denote the derived solution in the continuous space, we obtain the solution of by solving the following weighted mean-square problem:

(16)

where -s are the design parameters, which can be tuned to impose a certain tendency toward utilizing specific DCs.
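
The rounding step can be sketched as a nearest-point search over the known mapping vectors. The snippet below is a minimal illustration that assumes the weighted mean-square problem has the usual form, a weighted sum of squared per-DC differences between a candidate mapping vector and the relaxed solution; the function and argument names are hypothetical.

```python
def round_to_feasible(relaxed, mapping_vectors, weights):
    """Pick the mapping vector closest to the relaxed solution.

    relaxed: continuous per-DC solution of the relaxed problem.
    mapping_vectors: known feasible (integer) mapping vectors.
    weights: per-DC design parameters; a larger weight penalizes
             deviation from the relaxed solution at that DC more.
    """
    def weighted_sq_error(m):
        return sum(w * (x - r) ** 2 for w, x, r in zip(weights, m, relaxed))
    return min(mapping_vectors, key=weighted_sq_error)

vectors = [(3, 0, 0), (2, 1, 0), (1, 2, 0), (0, 3, 0), (1, 1, 1)]
rounded = round_to_feasible([1.4, 1.6, 0.0], vectors, weights=[1.0, 1.0, 1.0])
```

With equal weights, the relaxed point `[1.4, 1.6, 0.0]` is rounded to `(1, 2, 0)`.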

So far, to derive the above solution, it is necessary to have a powerful centralized processor with global knowledge about the state of all the DCs. This is due to the inherent updating mechanism of the gradient ascent method [21], in which iterative update of each Lagrangian multiplier requires global knowledge of the current values of the other Lagrangian multipliers and the DCs’ loads. Obtaining this knowledge may not be feasible for a given GDCN with more than a few DCs. Moreover, multiple powerful backup processors may be needed to avoid the interruption of the allocation process in situations such as overheating of the centralized processor. In the following section, we design a distributed algorithm using the processing power of the DCs in parallel to resolve the above concerns.

4 Graph Job Allocation in Medium-Scale GDCNs: Decentralized Approach with DCs in Charge of Job Allocation

The described dual problem in Eq. (13), given the result of Eq. (14), can be written as follows:

(17)

where

(18)

In Eq. (17), each term can be associated with a DC. For , there are two private (local) variables and a public (global) variable , which is identical for all the DCs. Due to the existence of this public variable, the objective function cannot be directly written as a sum of separable functions. In the following, we propose a distributed algorithm deploying local exchange of information among adjacent DCs to obtain a unified value for the public variable across the network.

4.1 Consensus-based Graph Job Allocation

We propose the consensus-based distributed graph job allocation (CDGA) algorithm consisting of two steps to find the solution of Eq. (17): i) updating the local variables at each DC, ii) updating the global variable via forming a consensus among DCs. We consider each term of Eq. (17) as a (hypothetically) separate term and rewrite the problem as a summation of separable functions, with replaced by in :

(19)

At each iteration of the CDGA algorithm, each DC first derives the value of the following variables locally using the gradient ascent method:

(20)

where -s are the corresponding step-sizes and is a local variable. Afterward, the local copies of the global variable (-s) are derived by employing the consensus-based gradient ascent method [22]:

(21)

where , with the Laplacian matrix of and , and denotes the number of performed consensus iterations among the adjacent DCs. In this method, the adjacent DCs perform consensus iteration with local exchange of -s before updating . The pseudo-code of the CDGA algorithm is given in Algorithm 1. Since the solution is found in the continuous space, similar to Section 3, the last stage of the algorithm is obtaining the solution in the feasible set of allocations. This step requires a centralized processor with the knowledge of the feasible solutions. Nevertheless, as compared to the centralized approach (Section 3), the centralized processor is no longer in charge of deriving the optimal allocations for each graph job.

input : Convergence criteria, maximum number of iterations.
1 At each DC, choose arbitrary initial values for the local variables and for the local copy of the global variable.
2 for each iteration k, up to the maximum number of iterations, do
3        At each DC, derive the values of the local variables for the next iteration (k+1) using Eq. (20).
4        At each DC, update the value of the local copy of the global variable using Eq. (21).
5        if all the convergence criteria are satisfied then
6               Go to line 8.
7
8 Derive the convex relaxed solution described in Eq. (14).
9 Derive the allocation using Eq. (16).
Algorithm 1 CDGA: Consensus-based distributed graph job allocation
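
The consensus step of CDGA can be illustrated in isolation. The sketch below assumes the common weight matrix W = I − εL, where L is the Laplacian of the DC graph and ε is a small positive constant; the exact matrix used in Eq. (21) is not recoverable from the text above, so this choice, along with the function and parameter names, is an assumption. Each DC repeatedly nudges its local copy toward the copies held by its neighbors.

```python
def consensus_rounds(values, neighbors, epsilon, rounds):
    """Apply x <- (I - epsilon * L) x for the given number of rounds.

    values: initial local copies, one per DC.
    neighbors: dict mapping a DC index to the indices of its adjacent DCs.
    epsilon: step size; assumed smaller than 1 / (maximum node degree).
    """
    x = list(values)
    for _ in range(rounds):
        # Every DC only uses the values of its adjacent DCs (local exchange).
        x = [xi + epsilon * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x

path = {0: [1], 1: [0, 2], 2: [1]}  # three DCs connected in a line
after_one = consensus_rounds([3.0, 0.0, 6.0], path, epsilon=0.3, rounds=1)
after_many = consensus_rounds([3.0, 0.0, 6.0], path, epsilon=0.3, rounds=100)
```

On this connected toy graph the local copies converge to the average of the initial values (3.0), and the sum of the copies is preserved in every round, which is what allows the DCs to agree on a single value of the public variable without a central coordinator.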

5 Graph Job Allocation in Large-Scale GDCNs: Decentralized Approach using Cloud Crawling and PAs’ Computing Resources

Large-scale GDCNs consist of an enormous number of PAs and DCs. This fact imposes three challenges for graph job allocation: i) The CDGA algorithm developed above becomes infeasible. In particular, excessive computational burden will be incurred on the DCs due to the large number of arriving jobs. Also, CDGA in large-scale GDCNs will incur a long delay (e.g., a GDCN with DCs involves Lagrangian multipliers and requires hundreds of iterations for convergence), which may render the final solution less effective for the current state of the network. Moreover, continuous communication between the DCs imposes a considerable congestion over the communication links. ii) So far, the inherent assumption in our study is a known set of feasible allocations for the graph jobs. This requires solving the NP-complete problem of sub-graph isomorphism between the graph jobs and the large-scale aggregated network graph, which may take a long time. iii) Even for a given graph job, the size of the feasible allocation set becomes prohibitively large in a large-scale network. For instance, in a fully-connected network of DCs, each with slots, the number of feasible allocations for a simple triangle graph job is . These concerns motivate us to develop cloud crawlers, based on which we address the mentioned challenges through a decentralized framework. Here, we use the term “crawler” to describe the movement between adjacent DCs. This may bear a resemblance to the term web crawler. Nevertheless, the cloud crawlers introduced here are fundamentally different from conventional web crawlers (e.g., [23, 24, 25]). Our cloud crawlers aim to extract suitable sub-graphs from GDCNs for specified graph job structures when traversing the network, while web crawlers are mainly developed to extract information from Internet URLs by looking for keywords and related documents.
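
To get a feel for the third challenge, the following sketch counts the placements of a triangle graph job in a fully-connected network. It assumes that every slot is idle and counts unordered triples of distinct slots, which may differ from the counting convention behind the figure quoted in the text; the network sizes used are hypothetical.

```python
from math import comb

def triangle_slot_placements(num_dcs, slots_per_dc):
    """Number of ways to pick 3 distinct slots when all slots are mutually adjacent."""
    total_slots = num_dcs * slots_per_dc
    # In a fully-connected network any three distinct slots form a triangle.
    return comb(total_slots, 3)

small = triangle_slot_placements(2, 2)
large = triangle_slot_placements(10, 100)
```

Even a modest hypothetical network of 10 DCs with 100 slots each already admits more than 166 million slot-level placements for a three-node job, so enumerating the full feasible set quickly becomes impractical.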

5.1 Strategy Suggestion Using Cloud Crawling

We introduce a cloud crawler (CCR), which carries a collection of structured information while traveling between adjacent DCs. It probes the connectivity among the DCs and their status (power usage, load distribution, etc.), based on which it provides a set of suggested allocations for the graph jobs. For faster network coverage, multiple CCRs for each type of graph job can be deployed. Information gleaned by the CCRs can be shared with the PAs, who act as mediators between the GDCN and customers, using two mechanisms: i) the CCR regularly shares the information with a central database to which the PAs have access; ii) the CCR shares the information with the DCs as it passes through them, and the DCs update the connected PAs accordingly. The goal of a CCR is to find "potentially good" feasible allocations that fulfill a graph job's requirements considering the network status. We consider a potentially good feasible allocation to be a sub-graph of the aggregated network graph that is isomorphic to the considered graph job and leads to a low cost of execution. In the following, we first prove a theorem, based on which we provide a corollary describing a fast decentralized approach to solve the sub-graph isomorphism problem in large-scale GDCNs.

input : Initial server , , the center node and its maximum shortest distance to the nodes of the graph, size of the suggested strategies .
1 Initialize a BST (), as a list of list of lists, , and vector with length .
2
3
4 for  to  do
5       
6       
7 Observe the current pdf of the load of the server
8 Initialize as a list of list of lists.
9 %Completing the incomplete allocations using the slots of current DC:
10 for  to  do
11        %Obtain the last allocation done for each incomplete allocation. This is a list of elements (see line 2)
12        % #assigned nodes of the job
13        %Next neighborhood that needs to be assigned
14       
15        while  do
16               %#used slots from the current DC
17               if  then
18                     
19                      %Initialize a temporary list
20                     
21                      if  then
22                             %Add completed allocations to the BST
23                             %Algorithm 3
24                             %Algorithm 4
25                             %Algorithm 5
26                     else
27                            
28                     
29              
30       
31
32 %Assigning the nodes to the current DC:
33 for  do
34        if  then
35              
36               %Number of unassigned nodes of the job
37               if  then
38                      %The allocation corresponding to assigning all the nodes to the current server is added to the BST
39                      %Algorithm 5
40              else
41                      %A new incomplete allocation is added to as a list of list
42                     
43              
44       
45 if All the adjacent DCs are in the set  then
46        Initialize a new and randomly choose one adjacent DC
47       
48else
49        Randomly choose one adjacent DC
50       
51
52
53 crawl to and go to line 2
Algorithm 2 Cloud crawling
Definition 3.

Two graphs and with vertex sets and are called isomorphic if there exists an isomorphism (bijection mapping) such that any two nodes, , are adjacent in if and only if are adjacent in .
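
Definition 3 can be checked directly on small graphs by trying every bijection between the vertex sets. The sketch below is a brute-force illustration for simple undirected graphs without self-loops; its running time grows factorially with the number of nodes, which is exactly the kind of cost that motivates the faster approach developed in this section. The function name and input format are assumptions.

```python
from itertools import permutations

def are_isomorphic(nodes_g, edges_g, nodes_h, edges_h):
    """Brute-force isomorphism test for two small simple undirected graphs."""
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    if len(nodes_g) != len(nodes_h) or len(eg) != len(eh):
        return False
    nodes_g = list(nodes_g)
    for perm in permutations(nodes_h):
        phi = dict(zip(nodes_g, perm))  # candidate bijection
        # With equal edge counts, preserving every edge of G is enough for "iff".
        if all(frozenset((phi[u], phi[v])) in eh for u, v in eg):
            return True
    return False

path_a = (["a", "b", "c"], [("a", "b"), ("b", "c")])
path_b = ([1, 2, 3], [(1, 3), (3, 2)])
star = (["c", "x", "y", "z"], [("c", "x"), ("c", "y"), ("c", "z")])
line = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
```

The two three-node paths are isomorphic (map `b` to `3`), whereas the star and the four-node line are not, even though they have the same numbers of nodes and edges.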

input : A list
output : The total cost .
1
2
3 while LL[j]!=null do
4        % Sum the incurred costs on all the DCs involved
5       
return C
Algorithm 3
input : A list
output : Allocation strategy
1 Initialize as a list of lists
2
3 while LL[j]!=null do
4        %The DC’s index and its number of used slots
5       
6       
return S
Algorithm 4
input : A binary search tree BST, a cost C and a strategy S, the desired size of the suggested strategy set
output : A binary search tree
1 if BST.length < the desired size of the suggested strategy set then
2        BST=BST.Insert(C,S)
3 else if BST.get_max().key > C then
4        BST=BST.Delete(BST.get_max())
5        BST=BST.Insert(C,S)
return BST
Algorithm 5
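
The update rule of Algorithm 5 keeps only a bounded number of lowest-cost strategies. The sketch below mirrors that logic but, for brevity, swaps the binary search tree for a sorted Python list maintained with `bisect`; this is a substitution made for illustration, not the data structure used in the paper, and the function name is hypothetical. Strategies are assumed to be tuples so that entries with equal cost remain comparable.

```python
import bisect

def offer_strategy(best, cost, strategy, max_size):
    """Keep at most max_size (cost, strategy) pairs, sorted by increasing cost."""
    if len(best) < max_size:
        bisect.insort(best, (cost, strategy))
    elif best[-1][0] > cost:
        # The new strategy beats the current worst one: replace it.
        best.pop()
        bisect.insort(best, (cost, strategy))
    return best

best = []
for cost, strategy in [(5, (1, 2)), (3, (3, 0)), (9, (0, 3)), (4, (2, 1)), (7, (1, 1))]:
    offer_strategy(best, cost, strategy, max_size=3)
```

After these five offers, the container holds the three cheapest strategies, with costs 3, 4, and 5; the strategies with costs 9 and 7 have been evicted or rejected.
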
Theorem 1.

Consider graphs and with vertex sets and , respectively, where . Assume that can be partitioned into multiple complete sub-graphs , , with vertex sets , where , and all the nodes in each pair of sub-graphs with consecutive indices are connected to each other. Consider node and let denote the length of the longest shortest path between and nodes in . Define , , where denotes the length of the shortest path between the two input nodes. Let be a sequence of integer numbers that satisfy the following conditions:

(22)
(23)
(24)

For such a sequence , there is at least one sub-graph isomorphic to , called , in with the corresponding isomorphism mapping , for which at least one of the nodes of , , belongs to , if the following set of conditions is satisfied:

(25)
Proof.

The key to proving this theorem is to consider the following mapping between the nodes of and the sub-graphs in :

(26)

Under this mapping, the mapped nodes form an isomorphic graph to since the connection between all the adjacent nodes in is met in . That is because they are either placed at the same (fully-connected) , or in (fully-connected) adjacent -s, . With a similar justification, it can be proved that concatenation of the mapped nodes to the adjacent -s, , in Eq. (26) preserves the isomorphic property. For instance, all the following mappings form isomorphic graphs to in :

(27)
(28)
(29)
(30)

It can be seen that conditions stated in Eq. (22)-(24) denote the feasible concatenation strategies, where each denotes the number of neighborhoods mapped to , . Also, Eq. (25) ensures the feasibility of the corresponding mappings. ∎

Corollary 1.

For , assume a CCR located at DC allocating at least one node of , , to one slot at , where the length of longest shortest path between and nodes in is . Assume that the CCR’s near future path can be represented as , where . Considering as in Theorem 1, for each realization of the sequence satisfying Eq. (22)-(25), the following described allocation is feasible and is isomorphic to .

(31)

Using the method described in the above corollary, it can be verified that the complexity of obtaining a sub-graph isomorphic to a graph job for a CCR becomes , where is the diameter of the graph job. Henceforth, we refer to the node defined in Corollary 1 as the center node, which can be chosen arbitrarily from the graph job's nodes. The pseudo-code of our algorithm implemented in a CCR is given in Algorithm 2. We use the binary search tree (BST) data structure [26] to organize the carried suggested strategies. To handle the large number of feasible allocations, we limit the number of potentially good strategies that a CCR can carry (the size of the BST) to a finite number for . Some important parts of Algorithm 2 are further illustrated in the following.
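
The idea behind Theorem 1 and Corollary 1 can be sketched as follows: group the nodes of the graph job by their shortest-path distance to the center node, then assign consecutive groups to consecutive DCs on the crawler's path. Since a job edge can only join nodes whose distances to the center differ by at most one, its endpoints always land in the same DC or in two consecutive DCs. The code is a simplified illustration under assumptions: the job is connected, consecutive DCs on the path are linked, `group_sizes` plays the role of the integer sequence in the theorem (assumed here to consist of positive integers summing to the number of distance layers), and the capacity check stands in for Eq. (25). All names are hypothetical.

```python
from collections import deque

def distance_layers(adjacency, center):
    """Group job nodes by shortest-path distance to the center node (BFS)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    layers = [[] for _ in range(max(dist.values()) + 1)]
    for node, d in dist.items():
        layers[d].append(node)
    return layers

def allocate_along_path(layers, group_sizes, idle_slots):
    """Assign group_sizes[k] consecutive layers to the k-th DC on the path.

    Returns one node list per used DC, or None if the grouping is invalid
    or some DC lacks enough idle slots.
    """
    if sum(group_sizes) != len(layers) or any(g < 1 for g in group_sizes):
        return None
    if len(group_sizes) > len(idle_slots):
        return None
    allocation, start = [], 0
    for size, idle in zip(group_sizes, idle_slots):
        nodes = [n for layer in layers[start:start + size] for n in layer]
        if len(nodes) > idle:
            return None  # not enough idle slots in this DC
        allocation.append(nodes)
        start += size
    return allocation

job = {"c": ["a", "b"], "a": ["c", "d"], "b": ["c"], "d": ["a"]}
layers = distance_layers(job, "c")
placed = allocate_along_path(layers, group_sizes=[2, 1], idle_slots=[3, 1])
rejected = allocate_along_path(layers, group_sizes=[1, 1, 1], idle_slots=[1, 1, 1])
```

In this example the center node and its two neighbors are placed in the first DC and the remaining node in the second, while the one-layer-per-DC grouping is rejected because the second DC has a single idle slot for two nodes. Each check costs only a pass over the layers, in contrast to a brute-force isomorphism search.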

A) Initialization: A CCR is initialized at a DC for a certain graph job, , and a specified number of suggested strategies () to be carried. (Footnote 2: Using a simple extension of this algorithm, a CCR can handle the extraction of suggested strategies for multiple graph jobs at the same time.) Each CCR carries a BST, a list [27] of incomplete allocations () and a set of visited neighbors () (which can be implemented as a list). In Fig. 3-a, the topology of a graph job is shown along with three DCs, where each square denotes a slot in a DC. The CCR is initialized at traversing the path .

B) Determining the Graph Job Topology Constraints (lines: 2-2): For a given center node of a graph job, i.e., in Corollary 1, the algorithm calculates the feasible number of nodes allocated to DCs according to Corollary 1. In Fig. 3-a, the center node is denoted by and the different sets of neighbors located at various shortest-path distances from are shown.

C) Allocation Initialization and Completion (lines: 2-2):

Fig. 3: a: The graph job topology and the neighboring nodes to the center node (left); three DCs along with the carried incomplete and complete allocations of the CCR upon arriving at each DC (right). b: Some examples of completed allocations.