Hierarchical Multi-Agent Optimization for Resource Allocation in Cloud Computing

01/12/2020 · by Xiangqiang Gao, et al.

In cloud computing, an important concern is to allocate the available resources of service nodes to the requested tasks on demand so as to optimize an objective function, e.g., maximizing resource utilization, payoffs, and available bandwidth. This paper proposes a hierarchical multi-agent optimization (HMAO) algorithm to maximize resource utilization and minimize bandwidth cost in cloud computing. The proposed HMAO algorithm combines a genetic algorithm (GA) with a multi-agent optimization (MAO) algorithm. To maximize resource utilization, an improved GA finds a set of service nodes on which to deploy the requested tasks. A decentralized MAO algorithm is then presented to minimize the bandwidth cost. We study the effect of the key parameters of the HMAO algorithm by the Taguchi method and evaluate the performance results. The simulation results demonstrate that the HMAO algorithm is more effective than existing solutions, namely GA and the fast elitist non-dominated sorting genetic algorithm (NSGA-II), at solving the resource allocation problem with a large number of requested tasks. Furthermore, we compare the HMAO algorithm with a first-fit greedy approach in online resource allocation.


1 Introduction

Some electronic devices are built from dedicated hardware, e.g., field programmable gate arrays (FPGAs), digital signal processors (DSPs) and integrated circuits (ICs). For such devices, compatibility across different requested tasks is difficult to guarantee, and the systems become more complicated as the number of requested tasks grows. Software defined networking (SDN) and virtualization technology are the foundations of cloud computing and provide a promising, flexible approach to facilitate resource allocation [1, 2, 3]. Cloud service providers can allocate the available resources of service nodes to the requested tasks depending on demand and supply. When a task consists of multiple sub-tasks, these sub-tasks can be deployed on several service nodes and form a service chain, i.e., a data flow that passes through the service nodes in sequence and can be represented as a directed acyclic graph (DAG) [4]. Each sub-task needs physical resources such as central processing unit (CPU), memory, or graphics processing unit (GPU). Besides, bandwidth costs arise when data is transferred between different service nodes. For example, a data transmission task may include five sub-tasks whose service chain is: network receiving → capture → tracking → synchronization → decoding, where each functional module is realized in software and can run on a commodity computer system. Cloud computing can effectively reduce the complexity and development cost of such a system while improving its flexibility and scalability. However, a new challenge in cloud computing is how to effectively allocate the available resources of service nodes to the requested tasks, which leads to a combinatorial optimization problem [5, 6].

1.1 Literature Review

The optimization problems for resource allocation in cloud computing have been widely studied [7, 8, 9, 10, 11]; they are proved to be NP-hard and their complexities are analyzed in [12, 13, 14]. Meta-heuristic algorithms are effective approaches for solving these resource allocation problems. Several variants of the genetic algorithm (GA) are developed in [7, 8, 9] to improve the performance of the resource allocation solution, and the fast elitist non-dominated sorting genetic algorithm (NSGA-II), described in detail in [15], is also used to tackle this problem [10, 11]. Two modified particle swarm optimization (MPSO) algorithms are proposed in [16] and [17], respectively, to reallocate migrated virtual machines and to achieve resource management based on a flexible cost. Moreover, an ant colony optimization (ACO) for the nonlinear resource allocation problem is presented in [18]. To seek the optimal solution more efficiently, the authors in [19] introduce a hybrid optimization algorithm combining simulated annealing and artificial bee colony (ABC-SA).

However, as the scale of the optimization problem grows, a larger feasible solution space needs to be searched and the computational complexity of seeking the optimal solution increases. Hence, the performance of meta-heuristic algorithms in solving the problem may degrade [20, 21]. To address this issue, optimization algorithms based on decomposition and cooperative co-evolution have been introduced, in which a large-scale problem is divided into several small-scale sub-problems and the global optimal solution is obtained by addressing these sub-problems with a cooperative co-evolutionary method [22, 23]. Reference [24] proposes a cooperative co-evolution framework (CCFR) that efficiently allocates computational resources based on the contributions of different sub-populations in order to address large-scale optimization problems.

Compared with centralized optimization methods, distributed optimization algorithms based on multi-agent systems (MAS) [25] have a clear potential advantage for solving task deployment and resource allocation in cloud computing [26, 27, 28]. In [26], the product allocation problem for the supply chain market is considered as a discrete resource allocation problem and solved by a multi-agent based distributed optimization algorithm. In addition, an efficient multi-agent greedy algorithm is proposed to address the task allocation problem in social networks [27], where the agents require only local information about tasks and resources and provide resources for the tasks through an auction mechanism. Another auction-based virtual machine resource allocation approach with a multi-agent system is presented in [28] to save energy cost, where the virtual machines assigned to different agents can be exchanged by a local negotiation-based approach.

1.2 Contributions

In this paper, we assume that the information about the requested tasks and service nodes is given in advance, e.g., task types, the number of tasks, the resource requirements of each sub-task and the resource capacities of the service nodes. Furthermore, we formulate the problem of resource allocation to maximize the resource utilization and minimize the bandwidth cost under resource constraints. The resource utilization is defined as the ratio of the amount of resources used by all the requested tasks to the total amount of resources of the service nodes that are used to deploy the requested tasks. In order to address the optimization problem, we propose a hierarchical multi-agent optimization (HMAO) algorithm, which combines an improved GA with a multi-agent optimization (MAO) algorithm.

Firstly, we decompose the joint objective into two sub-objectives: maximizing the resource utilization and minimizing the bandwidth cost [24]. The two sub-objective optimization problems are in conflict with each other, and we assume that the former has the higher priority. The improved GA is used to seek the optimal solution that maximizes the resource utilization, and this solution can be expressed as a set of service nodes that are used to deploy the requested tasks. In the MAO algorithm, there are two types of agents: service agents and a shared agent. Service agents are assigned to each service node to assist in resource management. The shared agent holds the information about resource allocation for all the service nodes and supports the service agents in accessing and updating it. The agents have environment-aware, autonomy, social behavior and load-balancing properties [29]. The service agents visit the shared agent to obtain the information about resource allocation for all the service agents, and they can migrate and swap their sub-tasks with each other through selection and exchange operators designed on a probabilistic basis. In addition, a priority-based source sub-task selection mechanism for the selection operator is implemented by considering load-balancing on the service nodes and the relationships between different sub-tasks. As a result, a global optimal solution is obtained by seeking the optimal solutions of the service agents with a cooperative co-evolutionary method [23]. The proposed HMAO algorithm provides the following contributions.

  1. Solving the joint problem of optimizing the resource utilization and the bandwidth cost as a single monolithic problem increases the computational complexity and weakens the performance of seeking the optimal solution, especially for high-dimensional problems. In this paper, we decompose the overall optimization problem into two sub-problems. Correspondingly, a hierarchical multi-agent optimization algorithm, which combines an improved GA with the MAO algorithm, is presented to solve the two sub-problems, so that the computational complexity of the overall optimization problem is effectively reduced.

  2. The set of available service nodes, which represents the optimal solution for maximizing the resource utilization, is obtained by the improved GA. Then the MAO algorithm is proposed to solve the sub-problem of minimizing the bandwidth cost. We consider four main characteristics for the agents: environment-aware, autonomy, social behavior and load-balancing. Furthermore, we design the action sets and behavior criteria for the service agents and the shared agent, and the relationships between different agents are illustrated based on an organized architecture. To migrate and swap the sub-tasks on the service nodes, we implement two operators, selection and exchange, where the selection operator consists of a source sub-task selection and a target sub-task selection. Considering load-balancing on the service nodes and the relationships between different sub-tasks, the source sub-task is provided by a priority-based selection mechanism. In the MAO algorithm, a feasible solution is partitioned into several small-scale parts and each service agent represents one part. We can find the global optimal solution by optimizing the objectives of the service agents with a cooperative co-evolutionary method.

  3. To keep diversity in feasible solutions and avoid premature convergence, the selection and exchange operators of the MAO algorithm are implemented based on a probabilistic method. For the former, a random sub-task from the service nodes can be chosen as the target sub-task according to the selection probability. Similarly, if the objective result for a service agent does not improve after an exchange, the exchange can still be executed with the exchange probability. By introducing a probabilistic method into the selection and exchange operators, we further improve the performance of the proposed HMAO algorithm.

Fig. 1: Procedure for resource allocation.

Finally, several experiments for different tasks are carried out to verify the performance of the proposed HMAO algorithm. The results show that the proposed HMAO algorithm is an effective approach to solve the optimization problem of resource allocation. When compared with GA and NSGA-II, the proposed HMAO algorithm performs better for high-dimensional problems in terms of solution quality, convergence time and stability.

The remainder of this paper is organized as follows. Section 2 introduces the system model of resource allocation. In Section 3, we provide the problem formulation for the system model. A hierarchical multi-agent optimization algorithm is proposed in Section 4. Section 5 investigates the effect of key parameters for the proposed HMAO algorithm and evaluates the performance comparison with existing optimization algorithms. The conclusion of this paper is discussed in Section 6.

2 System Model

A cloud computing system can be modeled as a graph G = (V, E), where V = {v_1, v_2, …, v_N} represents the set of N service nodes and v_n is the n-th service node. E describes the set of links between these service nodes, and e_{m,n} indicates the link between service nodes v_m and v_n. Service node v_n contains four resources: CPU, memory, GPU and bandwidth, whose capacities are denoted as C_n^cpu, C_n^mem, C_n^gpu and C_n^bw, respectively. We also assume that any two service nodes can communicate with each other through inter-connected networks, i.e., e_{m,n} ∈ E for all m ≠ n. Let T = {t_1, t_2, …, t_K} be a set of K tasks, where t_i denotes the i-th task. t_i consists of m_i sub-tasks and is expressed as t_i = {t_{i,1}, …, t_{i,m_i}}, which is an ordered list with a precedence relationship between different sub-tasks. That is, t_{i,j} cannot be carried out until all its predecessors are finished. The resource requirements of CPU, memory, GPU and bandwidth for running t_{i,j} on v_n are described as c_{i,j}^cpu, c_{i,j}^mem, c_{i,j}^gpu and c_{i,j}^bw, respectively. In cloud computing, an effective resource allocation approach is used to allocate the available resources of service nodes to the requested tasks. Note that the total amount of resources used on any service node cannot exceed its capacities, and a bandwidth cost is incurred when two adjacent sub-tasks are placed on different service nodes. Moreover, we deploy as many sub-tasks as possible on a service node in order to improve the resource utilization [7, 12, 11].
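As a concrete illustration of this model, service nodes and service chains might be represented as follows (a minimal sketch in Python; all class and field names are our own, not notation from the paper):

```python
from dataclasses import dataclass

# A minimal data model for the system above. All class and field names are
# our own illustration, not notation from the paper.

@dataclass
class ServiceNode:
    name: str
    cpu: int   # CPU capacity
    mem: int   # memory capacity
    gpu: int   # GPU capacity
    bw: int    # bandwidth capacity

@dataclass
class SubTask:
    task_id: int
    index: int  # position in the task's ordered service chain
    cpu: int
    mem: int
    gpu: int
    bw: int     # bandwidth needed towards the next sub-task

def make_task(task_id, reqs):
    """Build an ordered service chain; reqs is a list of (cpu, mem, gpu, bw)."""
    return [SubTask(task_id, j, *r) for j, r in enumerate(reqs)]

node = ServiceNode("v1", cpu=16, mem=32, gpu=2, bw=100)
task = make_task(1, [(2, 4, 0, 10), (1, 2, 1, 10), (2, 2, 0, 0)])
```

The ordered list returned by `make_task` mirrors the precedence relationship of a service chain: sub-task `j` precedes sub-task `j+1`.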

Fig. 2: Example of resource allocation with different schemes.

The procedure for resource allocation in our system model is shown in Fig. 1, where the information about the requested tasks and service nodes is obtained in advance, all sub-tasks from the requested tasks are deployed to multiple service nodes simultaneously, and the neighbouring sub-tasks can be located on the same service node or different ones. Therefore, different resource allocation schemes have an impact on the system performance [12].

Fig. 2 presents an example of resource allocation with different schemes. There are five service nodes with communication links in the network, and two tasks t_1 and t_2 are given. In the first scheme in Fig. 2, some adjacent sub-tasks of task t_1 are deployed on the same service node while the others are placed on different service nodes; a service chain for t_1 is built across these service nodes in order, and there is no bandwidth cost between two adjacent sub-tasks located on the same service node. The bandwidth cost for t_1 is therefore the sum of the costs of the inter-node links of its chain, and the bandwidth cost for t_2 is obtained in the same way. In the second scheme, the sub-tasks of t_1 and t_2 are assigned to the service nodes differently, which yields different bandwidth costs. As shown in Fig. 2, the bandwidth costs vary for different resource allocation schemes.

3 Problem Formulation

Symbol Definition
V: Set of service nodes in cloud computing.
E: Set of network links in cloud computing.
N: Maximum number of service nodes.
V_u: Set of service nodes deployed with tasks.
N_u: Number of service nodes deployed with tasks.
T: Set of tasks.
K: Number of tasks.
m_i: Number of sub-tasks of the i-th task.
t_i: Set of sub-tasks of the i-th task.
t_{i,j}: The j-th sub-task of the i-th task.
v_n: The n-th service node.
C_n^cpu: Resource capacity for CPU on v_n.
C_n^mem: Resource capacity for memory on v_n.
C_n^gpu: Resource capacity for GPU on v_n.
C_n^bw: Resource capacity for bandwidth on v_n.
c_{i,j}^cpu: Resource requirement of t_{i,j} for CPU on v_n.
c_{i,j}^mem: Resource requirement of t_{i,j} for memory on v_n.
c_{i,j}^gpu: Resource requirement of t_{i,j} for GPU on v_n.
c_{i,j}^bw: Resource requirement of t_{i,j} for bandwidth on v_n.
x_{i,j}^n: Indicates whether t_{i,j} is assigned on v_n.
y_{i,j}: Indicates whether a bandwidth cost is incurred for t_{i,j}.
B_{i,j}: Bandwidth cost for running t_{i,j}.
suc(t_{i,j}): Set of successors of t_{i,j}.
pre(t_{i,j}): Set of predecessors of t_{i,j}.
F_1: Objective function of resource utilization.
F_2: Objective function of bandwidth cost.
F: Total objective function.
w_1, w_2: Weight values.
TABLE I: List of Symbols

In this section, a mathematical description of resource allocation is presented for our system model. Our purpose is to maximize the resource utilization and minimize the bandwidth cost in cloud computing under several physical constraints. Moreover, we do not consider network resource constraints, such as those of routers and switches, in this paper. The main symbols used to formulate our problem are summarized in Table I.

To further discuss the problem, we define a binary decision variable x_{i,j}^n that indicates whether sub-task t_{i,j} is deployed on service node v_n, i.e., x_{i,j}^n = 1 means t_{i,j} is allocated on v_n, and x_{i,j}^n = 0 otherwise. In addition, another binary decision variable y_{i,j} describes whether there is a bandwidth cost between sub-tasks t_{i,j} and t_{i,j+1}. Let us denote the set of predecessors of t_{i,j} as pre(t_{i,j}) and the set of successors of t_{i,j} as suc(t_{i,j}). We assume that t_{i,j} and t_{i,j+1} are placed on v_m and v_n, respectively. When m = n, then y_{i,j} = 0, or else y_{i,j} = 1. Note that the bandwidth cost between a sub-task and the source node is not neglected. The bandwidth cost B_{i,j} for t_{i,j} can be written as:

B_{i,j} = y_{i,j} · c_{i,j}^bw    (1)

Due to the physical resource constraints [9], the amount of each resource used by the requested tasks on a service node must not exceed its capacity. As a result, the resource constraints for CPU, memory and GPU are given as follows:

∑_{i=1}^{K} ∑_{j=1}^{m_i} x_{i,j}^n c_{i,j}^cpu ≤ C_n^cpu,  ∑_{i=1}^{K} ∑_{j=1}^{m_i} x_{i,j}^n c_{i,j}^mem ≤ C_n^mem,  ∑_{i=1}^{K} ∑_{j=1}^{m_i} x_{i,j}^n c_{i,j}^gpu ≤ C_n^gpu,  ∀ v_n ∈ V    (2)

Similarly, as the amount of bandwidth used on a service node cannot exceed the capacity, the bandwidth resource constraint is expressed as:

∑_{i=1}^{K} ∑_{j=1}^{m_i} x_{i,j}^n y_{i,j} c_{i,j}^bw ≤ C_n^bw,  ∀ v_n ∈ V    (3)

Furthermore, all the sub-tasks from the task set T are deployed to the service nodes in V, and any sub-task can be allocated only once. Thus, we obtain the following:

∑_{n=1}^{N} x_{i,j}^n = 1,  ∀ i ∈ {1, …, K}, j ∈ {1, …, m_i}    (4)

In this paper, one of our goals is to improve the resource utilization of cloud computing by reducing the number of service nodes used. For service node v_n, the resource utilization includes three parts: CPU utilization, memory utilization and GPU utilization, which are denoted by U_n^cpu, U_n^mem and U_n^gpu, respectively. They can be computed as follows:

U_n^cpu = (1/C_n^cpu) ∑_{i=1}^{K} ∑_{j=1}^{m_i} x_{i,j}^n c_{i,j}^cpu,  U_n^mem = (1/C_n^mem) ∑_{i=1}^{K} ∑_{j=1}^{m_i} x_{i,j}^n c_{i,j}^mem,  U_n^gpu = (1/C_n^gpu) ∑_{i=1}^{K} ∑_{j=1}^{m_i} x_{i,j}^n c_{i,j}^gpu    (5)

According to the preferences for different resource types, we provide the weight values for CPU, memory and GPU as w_cpu, w_mem and w_gpu, respectively. The resource utilization for our system model is obtained by a linear weighted sum method as follows:

F_1(s) = (1/N_u) ∑_{v_n ∈ V_u} (w_cpu U_n^cpu + w_mem U_n^mem + w_gpu U_n^gpu)    (6)

where s is a feasible solution for allocating the available resources of service nodes to the requested tasks in cloud computing. The parameter N_u denotes the number of service nodes that are used to deploy the requested tasks. In addition, w_cpu + w_mem + w_gpu = 1.

Another goal is to optimize the bandwidth cost by deploying adjacent sub-tasks on the same service node. Let us denote the bandwidth cost as F_2(s), which can be expressed as follows:

F_2(s) = ∑_{i=1}^{K} ∑_{j=1}^{m_i} B_{i,j} = ∑_{i=1}^{K} ∑_{j=1}^{m_i} y_{i,j} c_{i,j}^bw    (7)

Next, we need to simultaneously optimize the two objectives F_1(s) and F_2(s). Specifically, our purpose is to seek an optimal solution that maximizes the resource utilization F_1(s) and minimizes the bandwidth cost F_2(s). Thus, the combined objective can be expressed as:

max_s F(s) = w_1 F_1(s) − w_2 F_2(s)    (8)

where w_1 and w_2 are the weight values for F_1(s) and F_2(s), respectively. The preferences for the different objectives can be adjusted by varying w_1 and w_2, and we consider w_1 + w_2 = 1. However, this optimization problem is NP-hard, which implies a high computational complexity for finding an optimal solution. Meta-heuristic optimization algorithms are effective approaches to solve such combinatorial optimization problems, but their solution quality and convergence time degrade as the scale of the problem increases. Distributed optimization algorithms, based on decomposition and multi-agent systems, can improve the optimization solution effectively.
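To make the combined objective concrete, the following sketch evaluates a weighted combination of utilization and bandwidth cost for a toy placement. The function and parameter names, the default weights, and the direct subtraction of an un-normalized bandwidth cost are illustrative assumptions, not the paper's exact formulation:

```python
# A toy evaluation of the combined objective F(s) = w1*F1(s) - w2*F2(s)
# described above. Names, default weights and the un-normalized F2 term
# are our own assumptions.

def objective(nodes, tasks, place, wc=0.5, wm=0.3, wg=0.2, w1=0.5, w2=0.5):
    """nodes: name -> (cpu, mem, gpu) capacities;
    tasks: tasks[i][j] = (cpu, mem, gpu, bw) demands of sub-task j of task i;
    place: (i, j) -> node name."""
    used = {}                                    # node -> [cpu, mem, gpu] consumed
    for (i, j), n in place.items():
        c, m, g, _ = tasks[i][j]
        u = used.setdefault(n, [0, 0, 0])
        u[0] += c; u[1] += m; u[2] += g
    # F1: weighted mean utilization over the nodes actually used.
    f1 = sum(wc * u[0] / nodes[n][0]
             + wm * u[1] / nodes[n][1]
             + wg * u[2] / max(nodes[n][2], 1)   # guard nodes without GPUs
             for n, u in used.items()) / len(used)
    # F2: bandwidth cost whenever adjacent sub-tasks sit on different nodes.
    f2 = sum(tasks[i][j][3]
             for i in range(len(tasks))
             for j in range(len(tasks[i]) - 1)
             if place[(i, j)] != place[(i, j + 1)])
    return w1 * f1 - w2 * f2
```

Co-locating a whole service chain on one node avoids all inter-node bandwidth terms, so such a placement scores strictly higher than a split placement of the same task whenever the chain carries any bandwidth demand.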

In the next section, we propose the HMAO algorithm, which is an optimization approach using hierarchical multi-agent framework, to address the resource allocation problem in cloud computing.

Fig. 3: Proposed hierarchical multi-agent framework.

4 Hierarchical Multi-agent Optimization

In this paper, we address a joint optimization problem of maximizing the resource utilization of cloud computing systems and reducing the bandwidth cost. To reduce the computational complexity of the optimization problem, we decompose the total objective into two optimization sub-problems, maximizing F_1 and minimizing F_2, and these two sub-problems are solved accordingly.

Firstly, an improved GA is introduced to find the optimal solution that maximizes F_1. All sub-tasks from T make up an ordered list regarded as an individual, which represents a feasible solution. We use a roulette wheel selection approach [30] to obtain the individuals with higher fitness values. Besides, two-point crossover and single-point mutation are also used [31]. A set of service nodes used to deploy the requested tasks is provided as the optimal solution by the improved GA.

Then, a multi-agent optimization algorithm is proposed to solve the sub-problem of minimizing F_2. We use a shared agent to hold the information on resource allocation for the service nodes, and assign a service agent to each service node to assist in resource management. Different service agents can cooperate, coordinate and compete with each other to optimize their objectives with respect to the behavior criteria [28]. To keep the diversity of feasible solutions and avoid premature convergence, both the selection and exchange operators are realized by a probabilistic method. In addition, a feasible solution consists of all the sub-tasks from the available service agents, and the sub-tasks on one service agent are considered part of a feasible solution. We can optimize the objectives of the service agents with a cooperative co-evolutionary method to obtain a global optimal solution [23].

Fig. 4: Procedure for decoding an individual.

Fig. 3 shows the proposed hierarchical multi-agent framework. The procedure for addressing the optimization problem is divided into two steps. In the first step, the improved GA is used to find the optimal solution that maximizes F_1, and each agent represents an individual. In the second step, the MAO algorithm is applied to seek the optimal solution that minimizes F_2.

4.1 Improved GA

For the sub-problem of optimizing the resource utilization, the objective is to maximize F_1 while satisfying the resource constraints. The improved GA is introduced to solve this optimization problem.

In the improved GA, a population includes N_p individuals; an individual is encoded by a permutation representation [31, 32] and consists of all sub-task genes in order, where a gene represents a sub-task. During decoding, all service nodes in V are sorted in ascending order by index, and service nodes with a low index have high priority for deployment of the requested tasks. Thus, each sub-task is deployed to the highest-priority service node whose remaining resources satisfy its requirements. By this decoding method, the sub-tasks of an individual are allocated to the service nodes in V in sequence, and the resulting resource allocation represents a feasible solution.

Fig. 4 illustrates the procedure for decoding an individual. There are two tasks, t_1 with 5 sub-tasks and t_2 with 4 sub-tasks, and an individual is expressed as an ordered list of these nine sub-tasks. Three service nodes are denoted as v_1, v_2 and v_3, respectively, and their resource capacities are limited. The first genes of the individual are allocated to service node v_1 until the resource requirements of the next sub-task exceed the remaining capacities of v_1; that sub-task is then moved to service node v_2. Due to the high priority of service nodes with a low index, each subsequent sub-task is still placed on the lowest-indexed service node that can accommodate it, so the remaining sub-tasks are deployed on v_2 and v_3.
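The decoding step above can be sketched as a first-fit pass over the ordered genes. Only CPU and memory are tracked here for brevity, and all names are illustrative:

```python
# A runnable sketch of the first-fit decoding described above: each gene
# (sub-task) is placed on the lowest-indexed service node whose remaining
# capacity fits it. Only CPU and memory are tracked; names are illustrative.

def decode(individual, capacities, demand):
    """individual: ordered list of sub-task ids;
    capacities: list of (cpu, mem) per node, lowest index first;
    demand: sub-task id -> (cpu, mem)."""
    free = [list(c) for c in capacities]
    placement = {}
    for st in individual:
        c, m = demand[st]
        for n, f in enumerate(free):
            if f[0] >= c and f[1] >= m:   # first node that still fits wins
                f[0] -= c
                f[1] -= m
                placement[st] = n
                break
        else:
            raise ValueError(f"no service node can host {st}")
    return placement
```

For example, with two nodes of capacity (4, 4) and demands a:(3, 3), b:(2, 2), c:(2, 2), decoding the individual [a, b, c] places a on node 0 and b and c on node 1, since b no longer fits beside a.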

Three operators for selection, crossover and mutation are used as follows:

  • Selection operator: A roulette wheel selection [30] is applied to obtain the individuals with high fitness values. All individuals are sorted in ascending order by their fitness values and the cumulative distribution function (CDF) is computed accordingly. Then we choose the candidates through the concept of the survival of the fittest, which means that an individual with high fitness is more likely to be chosen.

  • Crossover operator: Two-point crossover [31] is used as the crossover operator and executed with crossover probability p_c. We randomly select two gene points for two parent individuals (mother and father). The segment between the two points is inherited directly from one parent, and the remaining positions are filled from the other parent: its genes are scanned from low to high positions and appended in order, skipping any gene that already appears in the offspring, so that each offspring remains a valid permutation. An example of two-point crossover is described in Fig. 5. Given two parents and two randomly generated gene points, the first offspring keeps the middle segment of the first parent and fills its remaining positions with the unused genes of the second parent in order; the second offspring is produced symmetrically.

    Fig. 5: Example of two-point crossover operator.
  • Mutation operator: The mutation operator [31] is carried out with mutation probability p_m, where we randomly choose two gene points from an individual and exchange their genes to generate a new individual.
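The duplicate-skipping crossover rule above can be sketched as follows (a hedged illustration; the exact fill order of the paper's operator may differ):

```python
# A sketch of duplicate-skipping two-point crossover for permutations:
# the child keeps one parent's middle segment and fills the remaining
# positions with the other parent's genes in order, skipping genes that
# were already inherited.

def two_point_crossover(p1, p2, i, j):
    """Return the child inheriting p1[i:j]; pass random i < j in practice."""
    segment = p1[i:j]
    filler = [g for g in p2 if g not in segment]  # p2's genes, duplicates removed
    return filler[:i] + segment + filler[i:]
```

With p1 = [1, 2, 3, 4, 5], p2 = [5, 4, 3, 2, 1] and cut points (1, 3), the child keeps [2, 3] from p1 and takes 5, 4, 1 from p2 in order, yielding [5, 2, 3, 4, 1], which is again a valid permutation.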

1:  Initialize: Population P of N_p individuals, crossover probability p_c, mutation probability p_m, maximum number of iterations G_max;
2:  for g = 1 to G_max do
3:     Compute the fitness F_1 for all individuals in P;
4:     Obtain the CDF based on the fitness values;
5:     Run selection operator to get candidates;
6:     for k = 1 to N_p/2 do
7:        Randomly choose two individuals;
8:        Generate a random number r ∈ [0, 1];
9:        if r < p_c then
10:           Run two-point crossover operator;
11:        end if
12:     end for
13:     Produce offspring population;
14:     for k = 1 to N_p do
15:        Randomly choose an individual;
16:        Generate a random number r ∈ [0, 1];
17:        if r < p_m then
18:           Run mutation operator;
19:        end if
20:     end for
21:     Update population P;
22:  end for
Algorithm 1 Improved GA.

The improved GA is described in Algorithm 1. Let us denote the maximum number of iterations as G_max. At the beginning, the initial population is randomly produced; we compute the fitness of the objective function for the individuals in the population and obtain the CDF. Then the offspring population is obtained by running the selection, crossover and mutation operators, respectively. The optimal solution maximizing F_1 is produced through this iterative evolution. The termination criterion is met when the number of iterations exceeds G_max.
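An end-to-end loop in the shape of Algorithm 1 might look like the sketch below. The bit-string encoding, the one-max fitness and the bit-flip mutation are placeholders for the paper's permutation encoding, F_1 and swap mutation; they keep the example short and self-checking:

```python
import random

# An illustrative loop in the shape of Algorithm 1: fitness evaluation,
# roulette-wheel selection, crossover with probability pc, mutation with
# probability pm. Encoding and fitness are simplified placeholders.

def roulette(pop, fit):
    """Pick one individual with probability proportional to its fitness."""
    r = random.uniform(0, sum(fit))
    acc = 0.0
    for ind, f in zip(pop, fit):
        acc += f
        if r <= acc:
            return ind
    return pop[-1]

def improved_ga_sketch(n_pop=20, n_genes=8, pc=0.9, pm=0.2, gens=60):
    pop = [[random.randint(0, 1) for _ in range(n_genes)] for _ in range(n_pop)]
    for _ in range(gens):
        fit = [sum(ind) + 1 for ind in pop]      # +1 keeps fitness positive
        nxt = []
        while len(nxt) < n_pop:
            a, b = roulette(pop, fit)[:], roulette(pop, fit)[:]
            if random.random() < pc:             # two-point crossover
                i, j = sorted(random.sample(range(n_genes), 2))
                a[i:j], b[i:j] = b[i:j], a[i:j]
            for child in (a, b):
                if random.random() < pm:         # bit-flip mutation (placeholder)
                    k = random.randrange(n_genes)
                    child[k] = 1 - child[k]
            nxt += [a, b]
        pop = nxt[:n_pop]
    return max(pop, key=sum)
```

The structure (evaluate, select via the fitness CDF, cross with probability p_c, mutate with probability p_m, replace the population) is the same as Algorithm 1; only the problem being optimized is a stand-in.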

According to the improved GA, we obtain an optimal solution that maximizes the resource utilization, where the optimal solution indicates a set of service nodes used to deploy the requested tasks and contains N_u service nodes. However, the bandwidth cost and load-balancing are not considered in this sub-problem. Section 4.2 discusses the sub-problem of minimizing the bandwidth cost and load-balancing over this set of service nodes.

4.2 Multi-agent Optimization

For the set V_u of service nodes, we propose a multi-agent optimization approach to address the optimization sub-problem of minimizing F_2 under the resource constraints. In the MAO algorithm, the agents have four characteristics: environment-aware, autonomy, social behavior and load-balancing [29], which are described in detail as follows:

  1. Environment-aware: The environment in the MAO algorithm consists of all the agents and their relationships, where the agents are designed as a shared agent and service agents in advance, and we use an organized architecture to illustrate their relationships. A service agent can obtain the information about resource allocation by accessing the shared agent, and interact with other service agents to better adapt to the current environment condition, that is, to achieve a higher fitness value. A shared agent provides access services for the service agents in order to help with resource management.

  2. Autonomy: When the environment conditions are changed, an agent can autonomously make a decision for the next actions to adjust its fitness value by a set of actions and the behavior criteria. For a service agent, its action set includes four parts: access, selection, exchange and update. Firstly, it obtains the information about resource allocation by accessing the shared agent. Then selection and exchange operators, which are described later in this section, can be executed, respectively. The results of interaction on service agents will be updated on the shared agent. In addition, the action set of a shared agent can provide three services which are storing, accessing and updating the information.

  3. Social behavior: According to social behavior, the agents share the information about resource allocation, and the sub-tasks deployed on different service agents can be exchanged with each other. There are two kinds of social behavior: between the shared agent and the service agents, and among the service agents themselves. In the former, the shared agent shares the information about resource allocation with all the service agents. In the latter, the aim is to exchange the sub-tasks deployed on different service agents in order to improve their objectives.

  4. Load-balancing: Load-balancing is an important issue to ensure that the service nodes are being used sufficiently. Therefore, considering load-balancing, a service agent runs the exchange operator by cooperating, coordinating and competing with other service agents.

Fig. 6: Architecture for multi-agent optimization.

In order to better describe the MAO algorithm, some important specifications are given as follows:

  • Shared agent: The shared agent, denoted by a_0, holds the information about task deployment and resource allocation for all service agents; it supports the service agents in accessing and updating this information in real time.

  • Service agent: Service agents are assigned to the service nodes in V_u to assist in resource management; they can access and update the information about resource allocation on the shared agent. Moreover, the sub-tasks deployed on different service agents can be migrated and swapped with each other by the selection and exchange operators. The service agent assigned to v_n is indicated as a_n, and all the service agents make up the set A of service agents.

  • Adjacent agent: We assume that two adjacent sub-tasks t_{i,j} and t_{i,j+1} are deployed on service agents a_m and a_n, where m ≠ n; then a_m and a_n are seen as adjacent agents.

  • Active sub-task: For a sub-task t_{i,j}, its adjacent sub-tasks are migrated and swapped towards the service agent hosting t_{i,j} to improve the bandwidth cost by the selection and exchange operators. Sub-task t_{i,j} is then defined as an active sub-task.

  • Host service agent: If an active sub-task t_{i,j} is deployed on a_n, the service agent a_n is considered the host service agent.

  • Feasible solution: All sub-tasks placed on service agent a_n are indicated as s_n, where s_n is considered part of a feasible solution. Therefore, a feasible solution can be expressed as s = {s_1, s_2, …, s_{N_u}}.

  • Objective function: A feasible solution is decomposed into N_u parts; thus the objective can be re-written as F_2(s) = ∑_{n=1}^{N_u} F_2(s_n).
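The two agent roles above can be sketched as two small classes (class, method and attribute names are our own invention, not the paper's):

```python
# A minimal sketch of the shared-agent/service-agent roles described above.
# All names are illustrative.

class SharedAgent:
    """Central registry of sub-task placements, queried by service agents."""
    def __init__(self):
        self.placement = {}                 # sub-task id -> service agent id

    def lookup(self, sub_task):
        return self.placement.get(sub_task)

    def update(self, sub_task, agent_id):
        self.placement[sub_task] = agent_id

class ServiceAgent:
    """Manages one node's sub-tasks (its part s_n of the feasible solution)."""
    def __init__(self, agent_id, shared):
        self.id = agent_id
        self.shared = shared
        self.sub_tasks = set()

    def host(self, sub_task):
        self.sub_tasks.add(sub_task)
        self.shared.update(sub_task, self.id)

    def migrate(self, sub_task, other):
        # move one of our sub-tasks to another agent and sync the registry
        self.sub_tasks.remove(sub_task)
        other.host(sub_task)
```

Every migration goes through `shared.update`, so the shared agent's registry always reflects the current partition {s_1, …, s_{N_u}} of the feasible solution.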

In the MAO algorithm, there are one shared agent and N_u service agents. The agents can communicate with each other by network links, both between the shared agent a_0 and a service agent a_n and between two service agents a_m and a_n. The service agents share the information about resource allocation by accessing the shared agent a_0, and migrate and swap their sub-tasks with each other to reduce their bandwidth costs. To keep the diversity of feasible solutions, the selection and exchange operators are implemented based on a probabilistic method, which is discussed later in this section.

Fig. 7: Source sub-task candidate list.

Generally, the procedure of the MAO algorithm includes four parts: obtaining the adjacent agents, choosing the source and target sub-tasks, executing the exchange operator, and updating the shared information. Firstly, we select a sub-task t_{i,j} as the active sub-task; all its adjacent sub-tasks and their service agents can be obtained by accessing the shared agent. Then the selection and exchange operators are carried out for each adjacent sub-task. For the selection operator, we need to choose the source and target sub-tasks, where the target sub-task is selected by a probabilistic method and the source sub-task is obtained from the host service agent by a priority-based selection mechanism. For the exchange operator, we migrate and swap the source and target sub-tasks among different service agents subject to the resource constraints. Furthermore, the exchange operation can still be carried out with the exchange probability even if the objective fitness is not improved. When the exchange is finished, the shared information is updated.

An example of the MAO algorithm is illustrated in Fig. 6. Consider a host service agent holding an active sub-task that has two adjacent sub-tasks. For each adjacent sub-task, the host service agent obtains its location by visiting the shared agent. The adjacent sub-task is chosen as the target sub-task with the selection probability; otherwise, a random sub-task is chosen as the target. A source sub-task is then selected from the candidate list by a priority-based selection mechanism. After that, we run the exchange operation and update the shared information, and proceed in the same way with the second adjacent sub-task.

Next, we discuss the three main parts of the MAO algorithm.

4.2.1 Target Selection

In the MAO algorithm, to optimize the bandwidth cost, our purpose is to assign the adjacent sub-tasks of a task to the same service agent as far as possible. In addition, a probabilistic method is used to select the target sub-task so as to avoid premature convergence.

For an active sub-task, consider the set of its adjacent sub-tasks, a candidate adjacent sub-task from this set, and a random sub-task drawn from the whole set of sub-tasks. The candidate is chosen as the target sub-task with the selection probability; otherwise, the random sub-task becomes the target. The algorithm for target selection is described in Algorithm 2.

1:  Initialize: Probability ;
2:  For , ;
3:  Generate a random number ;
4:  if  then
5:     Let be the target sub-task;
6:  else
7:     Let be the target sub-task;
8:  end if
Algorithm 2 Target selection.
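A minimal executable sketch of Algorithm 2, assuming the selection probability and sub-task identifiers are as described above; the function and parameter names (`select_target`, `p_select`) are illustrative, not from the paper.

```python
import random


def select_target(candidate, all_sub_tasks, p_select, rng=random):
    """Pick the candidate adjacent sub-task with probability p_select;
    otherwise fall back to a uniformly random sub-task (Algorithm 2)."""
    if rng.random() < p_select:
        return candidate
    return rng.choice(all_sub_tasks)
```

With `p_select` close to 1 the search exploits adjacency aggressively; a smaller value injects random targets to preserve diversity.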

4.2.2 Source Selection

Considering load-balancing and objective optimization, a priority-based source sub-task selection mechanism is proposed. In the host service agent, all available sub-tasks for the active sub-task are sorted in descending order of priority, which takes two criteria into account: the dependence relationships of sub-tasks between the host and adjacent service agents, and the resources used. There are three sub-task types according to the dependence relationships, as follows:

  • Sub-task1: Let us denote active sub-task as , an adjacent service agent as and a set of available sub-tasks as . If , make and , let be sub-task1.

  • Sub-task2: If , make and , let be sub-task2.

  • Sub-task3: If , make , let be sub-task3.

The precedence relations for the three sub-task types are ranked as: sub-task1 > sub-task2 > sub-task3. Furthermore, within the same sub-task type, we calculate, for each candidate sub-task, the difference between the resource utilization and the average value of the system, and sort the candidates in ascending order of this difference.

Fig. 7 describes the construction of a source sub-task candidate list for a host service agent with an active sub-task and an adjacent service agent. In the example, one candidate is sub-task1, one is sub-task2, and two are sub-task3; the sub-task3 candidates are ranked by their utilization differences. As a result, the candidate list concatenates the candidates in this order.

From the sub-task candidate list, the source sub-task is chosen by priority; the choice of source sub-task influences the resulting performance. The algorithm for source selection is shown in Algorithm 3.

1:  For active sub-task , an adjacent service agent ;
2:  Make a set of available sub-tasks ;
3:  for  do
4:     Obtain the set of its adjacent sub-tasks ;
5:     For ;
6:     if  and  then
7:        Let be sub-task1;
8:     else if  and  then
9:        Let be sub-task2;
10:     else
11:        Let be sub-task3;
12:     end if
13:     Assume is a source sub-task and calculate the difference between the resource utilization of and the average value of the system;
14:  end for
15:  Sort the sub-task candidate list by precedence.
Algorithm 3 Source selection.
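The ranking at the heart of Algorithm 3 can be sketched as a single sort with a two-part key: type priority first (sub-task1 before sub-task2 before sub-task3), then the ascending utilization difference within a type. The callables `type_of` and `util_diff` are assumed helpers, not identifiers from the paper.

```python
def build_candidate_list(sub_tasks, type_of, util_diff):
    """Order source sub-task candidates by (type priority, utilization
    difference), both ascending, mirroring the priority rule of
    Algorithm 3. type_of maps a sub-task to 1, 2 or 3; util_diff maps it
    to its distance from the system-average resource utilization."""
    return sorted(sub_tasks, key=lambda t: (type_of(t), util_diff(t)))
```

Because Python's `sorted` is stable and tuple keys compare lexicographically, the single call enforces both the between-type and within-type orderings.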

4.2.3 Exchange Procedure

For an active sub-task, the target sub-task and the source sub-task candidate list are given by the target and source selection operations; the service agents then cooperate, coordinate and compete with each other, migrating and swapping their sub-tasks through the exchange operator to improve the objective. In this context, four situations must be considered:

  • Objective optimization: For the exchange procedure, our aim is to migrate and swap these sub-tasks on different service agents to reduce their bandwidth costs. That is, the objective fitness should be improved for the host service agent after running the exchange operator.

  • Resource constraints: The resources used by each service agent cannot exceed its resource capacities.

  • Load-balancing: Load-balancing is implemented by a priority-based source sub-task selection mechanism and the procedure of migrating and swapping the sub-tasks.

  • Diversity of feasible solutions: The exchange operator uses a probabilistic method: when the objective is not improved, the exchange procedure may still be carried out with a small probability.

Next, we describe the exchange procedure in detail.

Given the source sub-task candidate list, for each candidate we first check whether the exchange operation improves the objective result. If it does not, the exchange procedure may still proceed with the exchange probability. Considering load-balancing across all service agents, if the load-balancing constraint in equation (9) is satisfied, we simply migrate the target sub-task to the host service agent and leave the source sub-task untouched. The migration is also used when there is no available source sub-task in the candidate list. The load-balancing constraint can be expressed as:

(9)

where the two utilization terms denote the resource utilization of the two service agents after the migration, and the remaining parameter is the average resource utilization over all sub-task types. In most cases, we need to swap the target and source sub-tasks between the two service agents. Note that both the migration and the exchange must satisfy the resource constraints. The exchange procedure is shown in Algorithm 4.

1:  For target sub-task , source sub-task candidate list , exchange probability ;
2:  for  do
3:     if The objective optimization is invalid then
4:        Generate a random number ;
5:        if  then
6:           Continue;
7:        end if
8:     end if
9:     if ( or ( and hold equation (9))) and meet with resource constraints then
10:        Run the migration and break;
11:     else if  and hold resource constraints then
12:        Run the exchange and break;
13:     end if
14:  end for
15:  Update the shared information on the shared agent.
Algorithm 4 Exchange procedure.
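The control flow of Algorithm 4 can be sketched as follows, under the assumption that the improvement check, the load-balancing constraint of equation (9), and the resource-capacity checks are supplied as predicates. All function and parameter names are illustrative, not the paper's.

```python
import random


def exchange_procedure(target, sources, improves, balanced,
                       fits_migrate, fits_swap, p_exchange, rng=random):
    """Sketch of Algorithm 4. `sources` is the priority-ordered candidate
    list; when it is empty, only a plain migration of the target is tried.
    improves(target, src): does the move improve the objective?
    balanced(target): does migrating alone satisfy equation (9)?
    fits_migrate / fits_swap: resource-capacity checks."""
    for src in (sources or [None]):
        if not improves(target, src) and rng.random() >= p_exchange:
            continue  # no improvement: proceed only with prob. p_exchange
        if (src is None or balanced(target)) and fits_migrate(target):
            return ("migrate", target)      # move the target alone
        if src is not None and fits_swap(target, src):
            return ("swap", target, src)    # swap target and source
    return None  # no feasible move found for this active sub-task
```

After a returned move is applied, the caller would update the shared agent, matching the last step of Algorithm 4.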

The MAO algorithm is an iterative optimization algorithm based on multi-agent systems, and Algorithm 5 describes it in full. Let the maximum number of iterations be given. During each iteration, all the sub-tasks of every service agent run the selection and exchange operations, so the objective value of that service agent is improved with high probability. All the service agents work together in a cooperative co-evolutionary manner; thus, an approximately globally optimal solution can be approached by increasing the number of iterations.

1:  Input: Maximum number of iterations , selection probability , exchange probability ;
2:  for  to  do
3:     for  do
4:        for  do
5:           Obtain by visiting ;
6:           for  do
7:              Run the target selection operation;
8:              Run the source selection operation;
9:              Run the exchange procedure;
10:           end for
11:        end for
12:        Update for ;
13:     end for
14:     Get the feasible solution and compute the objective fitness value;
15:  end for
16:  Obtain an approximate optimal solution.
Algorithm 5 MAO algorithm.

4.3 HMAO Algorithm in On-line Resource Allocation

1:  Input at time : Set of the tasks coming at , set of the tasks ending at ;
2:  Release the required resources of the old tasks in and update the resource information for the service nodes;
3:  Generate the initial population for ;
4:  Run the GA and obtain the set of service nodes that are used to deploy the new tasks;
5:  Carry out the MAO algorithm for ;
6:  Obtain an approximate optimal solution of allocating the available resources to .
7:  Time: .
Algorithm 6 HMAO Algorithm in on-line resource allocation.

The proposed HMAO algorithm can easily be applied to on-line resource allocation with a few modifications. In this paper, we consider the scenario of allocating the available resources of service nodes to the requested tasks in batch mode. That is, the requested tasks to be performed are collected and handled at fixed time slots. In order to improve the quality of the initial solutions for the GA, one individual is encoded following the inter-dependence relationships of the sub-tasks in sequence, and the rest of the population is randomly encoded to keep the diversity of feasible solutions. In the decoding procedure of the GA, we sort the service nodes used to deploy the requested tasks in ascending order of resource utilization. All the sub-tasks of an individual are assigned to the available service nodes with the first-fit rule [33, 34], and a new service node is activated when no available service node can meet the resource requirements of the sub-task to be assigned. In each time slot, some new requested tasks appear and several old requested tasks finish. Therefore, we can release the resources used by the old tasks and re-assign them to the new requested tasks. Algorithm 6 describes the proposed HMAO algorithm in on-line resource allocation. For each time slot, let us denote the set of new requested tasks and the set of old requested tasks that are ending. Firstly, we terminate the old tasks and release the resources they used. Then, subject to the existing resource usage and the physical resource constraints, we use the proposed HMAO algorithm to allocate the available resources of the service nodes to the new requested tasks.
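The first-fit decoding step above can be sketched for a single scalar resource (the paper actually tracks CPU, memory, GPU and bandwidth vectors); `first_fit_decode` and its arguments are illustrative names under that simplifying assumption.

```python
def first_fit_decode(demands, capacity):
    """Place each sub-task demand on the first open service node with
    enough remaining capacity; activate a new node when none fits.
    Returns the node index chosen per sub-task and the node count."""
    nodes = []       # remaining capacity of each activated node
    placement = []   # node index assigned to each sub-task, in order
    for d in demands:
        for i, rem in enumerate(nodes):
            if rem >= d:            # first node that fits wins
                nodes[i] -= d
                placement.append(i)
                break
        else:                       # no open node fits: activate a new one
            nodes.append(capacity - d)
            placement.append(len(nodes) - 1)
    return placement, len(nodes)
```

For example, demands [6, 5, 4, 3] on capacity-10 nodes pack into two nodes, since the third demand fits back into the first node's leftover capacity.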

5 Performance Evaluation

In this section, we conduct several experiments with different numbers of communication tasks to verify the performance of the proposed HMAO algorithm through computer simulation. The performance of the proposed HMAO algorithm is also analyzed by comparison with two existing baseline algorithms, GA and NSGA-II. The experimental platform is a high-performance server with an i7-4790k CPU, 16 GB memory and Windows 10. Each communication task contains 5 parts: network receiving, capture, tracking, synchronization and decoding; their resource requirements are given in Table II. All service nodes are homogeneous, and the resource capacities for CPU, memory, GPU and bandwidth are 2900 MHz, 96 GB, 8 and 1000 Mbps, respectively. Note that the limitations of network links, such as switches and routers, are not considered in this paper. Equation (8) serves as the performance metric for comparing the proposed HMAO algorithm with the baseline algorithms.

Name               CPU (MHz)   Memory (GB)   GPU   Bandwidth (Mbps)
Network receiving  290         9.6           0     100
Capture            319         11.52         1     97
Tracking           435         12.48         1     95
Synchronization    638         12.48         1     92
Decoding           145         4.8           1     90
TABLE II: Resource requirements for a task.

5.1 Simulation Parameters Setup

Factor                  Level 1   Level 2   Level 3   Level 4
Selection probability   0.01      0.05      0.10      0.15
Exchange probability    0.01      0.05      0.10      0.15
Number of iterations    250       500       750       1000
TABLE III: Parameters for the Taguchi method.

Firstly, we assume that the weights in equation (8) take the same values. The parameters for the improved GA are as given earlier. In the MAO algorithm, there are three main parameters: the selection probability, the exchange probability and the number of iterations, all of which affect the performance results. In order to estimate this impact for different parameter combinations, the Taguchi method of design-of-experiment (DOE) is used to generate the test cases and analyze the experimental results [29]. There are 3 factors, each with 4 levels, and the parameter combinations for the Taguchi method are shown in Table III. Moreover, we develop the orthogonal table with 16 cases, each carried out 10 times for 8, 16, 24 and 32 tasks, respectively, to obtain average results. Table IV provides the orthogonal table and the values of the approximate solutions for the different task counts.
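The main-effects analysis behind the Taguchi study averages the response over every run that shares a given factor level. A small sketch of that computation follows; the data shapes (tuples of factor settings paired with objective values) are illustrative assumptions, not the paper's exact tables.

```python
from collections import defaultdict


def main_effect_means(cases, results):
    """Taguchi main-effect means: for each (factor index, level) pair,
    average the objective over all runs that used that level.
    cases:   list of factor-setting tuples, one per run
    results: matching list of objective values"""
    sums = defaultdict(lambda: [0.0, 0])
    for setting, y in zip(cases, results):
        for factor, level in enumerate(setting):
            acc = sums[(factor, level)]
            acc[0] += y   # running sum of the response at this level
            acc[1] += 1   # number of runs contributing to it
    return {key: total / count for key, (total, count) in sums.items()}
```

Plotting these per-level means for each factor yields a main-effects plot of the kind shown in Fig. 8.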

No.   Selection prob.   Exchange prob.   Iterations   8 tasks   16 tasks   24 tasks   32 tasks
0 0.01 0.01 250 0.7829 0.8008 0.8103 0.8109
1 0.01 0.05 500 0.7853 0.8032 0.8141 0.8133
2 0.01 0.10 750 0.7861 0.8079 0.8138 0.8136
3 0.01 0.15 1000 0.7869 0.8075 0.8152 0.8139
4 0.05 0.01 500 0.7869 0.8088 0.8156 0.8146
5 0.05 0.05 250 0.7869 0.8062 0.8144 0.8136
6 0.05 0.10 1000 0.7869 0.8088 0.8167 0.8148
7 0.05 0.15 750 0.7869 0.8088 0.8164 0.8147
8 0.10 0.01 750 0.7869 0.8088 0.8161 0.8148
9 0.10 0.05 1000 0.7869 0.8088 0.8170 0.8148
10 0.10 0.10 250 0.7869 0.8088 0.8161 0.8141
11 0.10 0.15 500 0.7869 0.8088 0.8170 0.8143
12 0.15 0.01 1000 0.7869 0.8088 0.8170 0.8148
13 0.15 0.05 750 0.7869 0.8088 0.8170 0.8148
14 0.15 0.10 500 0.7869 0.8088 0.8170 0.8148
15 0.15 0.15 250 0.7869 0.8088 0.8167 0.8141
TABLE IV: Orthogonal table and the optimal solutions for 8, 16, 24 and 32 tasks.
Fig. 8: Main effects plot of the means for 8, 16, 24 and 32 tasks.

Fig. 8 shows the main effects plot of the means for the different task counts. It is evident that the objective values improve as the probabilities and the number of iterations increase. For the proposed HMAO algorithm, larger probability values promote the diversity of feasible solutions; however, they also cost more time in seeking an approximate solution. The ideal value of the selection probability for 8, 16, 24 and 32 tasks is 0.15, while the ideal exchange probability is 0.1 for 8 and 24 tasks and 0.15 for 16 and 32 tasks.

5.2 Numerical Simulation

Based on the parameters above, with the maximum number of iterations fixed accordingly, we run several experiments for 8, 16, 24 and 32 tasks to evaluate the convergence of the proposed HMAO algorithm. Fig. 9 illustrates the relationship between iterations and bandwidth utilization for the different task counts. Fig. 9(a) and Fig. 9(c) depict the search for the optimal bandwidth utilization in two of the cases, where the best values 0.1640 and 0.2076 are reached at 49 and 176 iterations, respectively. The results for the other two cases are depicted in Fig. 9(b) and Fig. 9(d), where the optimal values 0.1872 and 0.2137 are obtained at 73 and 121 iterations, respectively.

Fig. 9: Bandwidth utilization for 8, 16, 24 and 32 tasks.
Tasks   Iterations   Min      Max      Mean     Std      Time (min)
8       250          0.7869   0.7869   0.7869   0.0      0.2375
16      250          0.8046   0.8088   0.8084   0.0013   0.4777
24      250          0.8141   0.8170   0.8164   0.0012   0.7205
32      250          0.8104   0.8148   0.8137   0.0015   0.9703
8       500          0.7869   0.7869   0.7869   0.0      0.4995
16      500          0.8088   0.8088   0.8088   0.0      1.0125
24      500          0.8170   0.8170   0.8170   0.0      1.5177
32      500          0.8146   0.8148   0.8147   0.0001   2.0164
TABLE V: Results of simulation for the HMAO algorithm.

Similarly, Fig. 10 describes the evolutionary plots of the iterative optimal solutions for the different task counts, including the best and real-time objective values. Fig. 10(a) and Fig. 10(c) show the simulation results for 8 and 24 tasks, while Fig. 10(b) and Fig. 10(d) show those for 16 and 32 tasks. It can be observed that the objective values for 8, 16, 24 and 32 tasks converge to the approximate solutions 0.7869, 0.8088, 0.8112 and 0.8148 at 49, 73, 176 and 121 iterations, respectively. These results indicate that the proposed HMAO algorithm is an effective optimization algorithm for resource allocation in cloud computing and has good convergence performance.

As the number of iterations has an impact on the performance of the proposed HMAO algorithm, experiments with 250 and 500 iterations are carried out for the different task counts. Each case runs 10 times, and we obtain the minimum, maximum, mean and standard deviation of the optimal solutions along with the average computing time, as shown in Table V. It can be observed from Table V that the performance of the proposed HMAO algorithm improves as the number of iterations increases: the means for 8, 16, 24 and 32 tasks with 250 iterations are 0.7869, 0.8084, 0.8164 and 0.8137, respectively, while with 500 iterations they are 0.7869, 0.8088, 0.8170 and 0.8147. That is, the potential optimal solution is more likely to be found by exploring and exploiting the solution space iteratively. Besides, the results indicate that the computing time of the proposed HMAO algorithm increases almost linearly with the number of iterations.

Fig. 10: Objective values for 8, 16, 24 and 32 tasks.

5.3 Performance Comparison with the Baseline Algorithms

In order to further assess the effectiveness of the proposed HMAO algorithm, we compare it for different numbers of tasks against two existing algorithms, GA and NSGA-II.

GA: The genetic algorithm described in [9] is used in our paper. We randomly select two individuals from the population as parents, and they mate via the two-point crossover operator with the crossover probability to generate their offspring. If the fitness value of an offspring is superior to its parent's, it is kept as a candidate for the next generation. Otherwise, the mutation operator is applied with the mutation probability, and the individual with the better fitness value between the offspring and its parent becomes the new individual in the next generation.
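The acceptance step of this GA baseline can be sketched as follows, assuming higher fitness is better; `next_individual` and its helper arguments are illustrative names, not from [9].

```python
import random


def next_individual(parent, offspring, fitness, mutate, p_m, rng=random):
    """One survivor-selection step of the GA baseline: keep the offspring
    if it beats its parent; otherwise mutate it with probability p_m and
    keep the better of the (possibly mutated) offspring and the parent."""
    if fitness(offspring) > fitness(parent):
        return offspring
    if rng.random() < p_m:
        offspring = mutate(offspring)
    return max(offspring, parent, key=fitness)
```

Because the parent can never be replaced by a worse individual, this acceptance rule makes the GA's best fitness monotonically non-decreasing across generations.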

NSGA-II: We adopt NSGA-II [15] to address our problem, treating the two terms of the objective as two optimization sub-problems. A binary tournament selection method [30] is applied to choose the parents from the population for mating. In the crossover operator, we randomly select two gene points of one individual and exchange the corresponding genes with the other individual under the crossover probability. A bitwise mutation is executed with the mutation probability.
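Binary tournament selection itself is simple enough to sketch directly; note that full NSGA-II compares individuals by non-domination rank and crowding distance, which the scalar `fitness` callable here stands in for as a simplifying assumption.

```python
import random


def binary_tournament(population, fitness, rng=random):
    """Sample two distinct individuals at random and return the fitter
    one: the parent-selection rule used by the NSGA-II baseline [30]."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b
```

Repeated tournaments bias mating toward fitter individuals while still giving weaker ones a chance whenever they avoid being paired against stronger opponents.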

Fig. 11: Objective values for different numbers of tasks with HMAO, GA and NSGA-II.

Based on the above analysis, we set the maximum iterations to 1000 for the proposed HMAO algorithm and 5000 for GA and NSGA-II; the population size is 16 for the proposed HMAO algorithm and 100 for GA and NSGA-II. The number of tasks ranges from 4 to 16, with a corresponding number of initial service nodes. We run each test case 10 times and compute the average results. The parameters used for the proposed HMAO algorithm, GA and NSGA-II are summarized in Table VI.

Parameters                Value
Number of tasks           4 to 16
Number of service nodes
Maximum iterations        HMAO: 1000; GA, NSGA-II: 5000
Population size           HMAO: 16; GA, NSGA-II: 100
Selection probability     0.15
Exchange probability      0.15
Crossover probability     1.0
Mutation probability      0.1
Number of runs            10
TABLE VI: Parameter settings for HMAO, GA and NSGA-II.

Firstly, we simulate all test cases with the proposed HMAO algorithm, GA and NSGA-II using the parameters specified in Table VI, and obtain the average results of the three algorithms, including resource utilization, bandwidth utilization, objective value and time cost. Note that the number of initial service nodes influences the results of GA and NSGA-II.

To investigate the evolution of the optimal solution for the proposed HMAO algorithm over time, we compare its evolutionary performance with that of GA and NSGA-II; the results of the three algorithms are shown in Fig. 11. It can be observed that the performance of the proposed HMAO algorithm is similar to that of GA and NSGA-II for small numbers of tasks. Hence, the effectiveness of the proposed HMAO algorithm is verified by this comparison. Moreover, the convergence of the proposed HMAO algorithm is better than that of GA and NSGA-II as the number of tasks grows.

The reason is that the selection and exchange operators of the proposed HMAO algorithm are executed at a finer-grained sub-task level, so the optimization objective tends toward a good result with high probability during each iterative evolution. Besides, the probabilistic selection and exchange operators maintain the diversity of feasible solutions and avoid becoming trapped early in a local optimum. In contrast, the space of feasible solutions for GA and NSGA-II grows larger with the number of tasks, so they take more time to search for the optimal solution. In addition, Fig. 11 shows that GA outperforms NSGA-II in these cases, because each offspring generated by GA that has a better fitness value than its parent is selected as a new individual in the next generation, which guarantees that the solution quality improves in each iterative evolution.

Fig. 12: Performance comparisons for HMAO, GA and NSGA-II: (a) resource utilization; (b) bandwidth utilization; (c) optimal solution; (d) convergence time.

In order to further analyze the performance of the proposed HMAO algorithm, we report the maximum, minimum, mean, standard deviation and average convergence time of the simulation results for the proposed HMAO algorithm, GA and NSGA-II; the details are given in Table VII. In almost all test cases, the maximum, minimum, mean and standard deviation of the optimal solutions obtained by the proposed HMAO algorithm are better than or equal to those of GA and NSGA-II. Furthermore, the convergence time of the proposed HMAO algorithm is longer than that of GA and NSGA-II for small task counts but shorter for the other cases, and the difference becomes more evident as the number of tasks increases: in the larger cases, the mean results of the proposed HMAO algorithm improve upon both baselines while its convergence time decreases relative to both. These results demonstrate that the proposed HMAO algorithm is an effective optimization approach for resource allocation and has better performance in terms of solution quality, robustness and convergence.

L HMAO GA NSGA-II
Min Max Mean Std Time(min) Min Max Mean Std Time(min) Min Max Mean Std Time(min)
4 0.7869 0.7869 0.7869 0.0000 0.2737 0.7869 0.7869 0.7869 0.0000 0.0231 0.7869 0.7869 0.7869 0.0000 0.0569
5 0.7718 0.7718 0.7718 0.0000 0.0520 0.7718 0.7718 0.7718 0.0000 0.0481 0.7603 0.7718 0.7706 0.0034 0.1408
6 0.8170 0.8170 0.8170 0.0000 0.4752 0.8170 0.8170 0.8170 0.0000 0.4573 0.7628 0.8170 0.7953 0.0265 0.4666
7 0.7989 0.7989 0.7989 0.0000 0.0645 0.7989 0.7989 0.7989 0.0000 0.2841 0.7567 0.7989 0.7937 0.0126 1.0380
8 0.7869 0.7869 0.7869 0.0000 0.1903 0.7869 0.7869 0.7869 0.0000 0.5710 0.7869 0.7869 0.7869 0.0000 0.8126
9 0.8170 0.8170 0.8170 0.0000 0.5219 0.8170 0.8170 0.8170 0.0000 2.3043 0.7782 0.8170 0.7899 0.0177 2.1777
10 0.8041 0.8041 0.8041 0.0000 0.8174 0.7975 0.8041 0.8034 0.0020 1.8199 0.7718 0.8041 0.7976 0.0129 2.1903
11 0.7368 0.8102 0.7874 0.0283 0.8174 0.7887 0.7944 0.7938 0.0017 1.7007 0.7771 0.7944 0.7909 0.0052 2.3302
12 0.8170 0.8170 0.8170 0.0000 0.2446 0.7818 0.8170 0.8045 0.0135 3.1743 0.7628 0.8170 0.7929 0.0161 4.0658
13 0.8070 0.8070 0.8070 0.0000 1.0464 0.7715 0.8069 0.7973 0.0104 3.8402 0.7762 0.8070 0.7977 0.0131 4.2793
14 0.8116 0.8119 0.8118 0.0001 2.1852 0.7677 0.7989 0.7861 0.0127 3.3968 0.7675 0.7989 0.7848 0.0119 5.2505
15 0.8170 0.8170 0.8170 0.0000 0.9704 0.7590 0.7923 0.7811 0.0099 4.3592 0.7680 0.8075 0.7875 0.0115 7.7129
16 0.8088 0.8088 0.8088 0.0000 1.0624 0.7576 0.7869 0.7757 0.0092 4.8812 0.7714 0.7868 0.7815 0.0049 9.3008
TABLE VII: Results of simulation for HMAO, GA and NSGA-II.

Furthermore, we compare the results of the proposed HMAO algorithm with GA and NSGA-II in Fig. 12, covering resource and bandwidth utilization, optimal solution and average convergence time. Fig. 12(a) shows the resource utilization for different task counts: the results obtained by the proposed HMAO algorithm are equal to those of GA and NSGA-II for small cases and better for the others. A solution with lower resource utilization implies that more service nodes are used to deploy the requested tasks. The bandwidth utilization results are described in Fig. 12(b); the performance of the proposed HMAO algorithm is better than or equal to that of GA and NSGA-II in most cases, and where it is lower, this is because the proposed HMAO algorithm uses fewer service nodes than the two baselines. Fig. 12(c) illustrates the optimal solutions: the three algorithms are approximately equivalent for small numbers of tasks, and the proposed HMAO algorithm performs better than GA and NSGA-II as the number of tasks increases. The average convergence times are shown in Fig. 12(d): the proposed HMAO algorithm converges more slowly than GA and NSGA-II when the number of tasks is small, but outperforms both as the number of tasks increases.

5.4 Evaluate HMAO in On-line Resource Allocation

To further investigate the performance of the proposed HMAO algorithm in on-line resource allocation, we implement the resource allocation model in a dynamic environment and design several experiments that dynamically allocate the available resources to the requested tasks with the parameters given in Table VI. We assume that in each time slot some new requested tasks arrive and some old requested tasks end, whose numbers are randomly produced from