AED: An Anytime Evolutionary DCOP Algorithm

09/13/2019 ∙ by Saaduddin Mahmud, et al. ∙ University of Southampton ∙ University of Dhaka ∙ Imperial College London

Evolutionary optimization is a generic population-based metaheuristic that can be adapted to solve a wide variety of optimization problems and has proven very effective for combinatorial optimization problems. However, the potential of this metaheuristic has not been utilized in Distributed Constraint Optimization Problems (DCOPs), a well-known class of combinatorial optimization problems. In this paper, we present a new population-based algorithm, namely Anytime Evolutionary DCOP (AED), that adapts evolutionary optimization to solve DCOPs. In AED, the agents cooperatively construct an initial set of random solutions and gradually improve them through a new mechanism that considers an optimistic approximation of local benefits. Moreover, we propose a new anytime update mechanism for AED that identifies the best among a distributed set of candidate solutions and notifies all the agents when a new best is found. In our theoretical analysis, we prove that AED is anytime. Finally, we present empirical results indicating that AED outperforms the state-of-the-art DCOP algorithms in terms of solution quality.


Introduction

Distributed Constraint Optimization Problems (DCOPs) are a widely used framework for modeling constraint-handling problems in cooperative Multi-Agent Systems (MAS). In this framework, agents need to coordinate value assignments to their variables in such a way as to minimize constraint violations by optimizing the aggregated constraint costs [20]. The framework has been applied to various areas of multi-agent coordination, including distributed meeting scheduling [13], sensor networks [5], and smart grids [7].

Over the last two decades, a number of algorithms have been proposed to solve DCOPs, and they can be broadly classified into two classes: exact and non-exact. The former always provide an optimal solution of a given DCOP; among the exact algorithms, ADOPT [14], DPOP [17], and PT-FB [11] are widely used. Since solving DCOPs optimally is NP-hard, scalability becomes an issue as the system grows. In contrast, non-exact algorithms compromise some solution quality for scalability. As a consequence, diverse classes of non-exact algorithms have been developed to deal with large-scale DCOPs. Among them, local search based algorithms are generally the least expensive in terms of computational and communication cost; some well-known algorithms of this class are DSA [21], MGM & MGM2 [12], and GDBA [15]. In order to further enhance solution quality and incorporate the anytime property into local search based algorithms, the Anytime Local Search (ALS) framework [22] was introduced. Meanwhile, inference-based non-exact approaches such as Max-Sum [6] and Max-Sum_ADVP [23] have also gained attention due to their ability to explicitly handle n-ary constraints and to guarantee optimality on acyclic constraint-graph representations of DCOPs. The third class of non-exact approaches is sample-based algorithms (e.g. DUCT [16] and PD-Gibbs [18]), in which the cooperative agents sample the search space in a decentralized manner to solve DCOPs.

More recently, a new class of non-exact DCOP algorithms has emerged in the literature through the introduction of a population-based algorithm, ACO_DCOP [2]. ACO_DCOP is derived from a centralized population-based approach called Ant Colony Optimization (ACO) [3]. It has been empirically shown that ACO_DCOP produces solutions of better quality than the state-of-the-art DCOP solvers of the previous three classes. It is worth noting that although a wide variety of centralized population-based algorithms exist, ACO is the only such method that has been adapted to solve DCOPs. Among the remaining centralized population-based algorithms, a large portion are evolutionary optimization techniques (e.g. Genetic Algorithms [10], Evolutionary Programming [8]). Evolutionary optimization, as a population-based metaheuristic, has proven very effective in solving combinatorial optimization problems such as the Traveling Salesman Problem [9], the Constraint Satisfaction Problem [19], and many others besides. However, no prior work adapts evolutionary optimization techniques to solve DCOPs. The effectiveness of evolutionary optimization on combinatorial optimization problems, together with the potential of population-based DCOP solvers demonstrated by ACO_DCOP, motivates us to explore this unexplored area.

Against this background, this paper proposes a new population-based algorithm that adapts evolutionary optimization to solve DCOPs, which we call Anytime Evolutionary DCOP (AED). In more detail, AED maintains a set of candidate solutions that are distributed among the agents, and the agents search for new improved solutions by modifying these candidate solutions. This modification is done through a new mechanism that considers an optimistic approximation of local benefits and utilizes the cooperative nature of the agents. Moreover, we introduce a new anytime update mechanism in order to identify the best among this distributed set of candidate solutions and to help the agents coordinate value assignments to their variables based on the best candidate solution. Additionally, our theoretical analysis proves the anytime property of AED. Finally, we empirically evaluate AED, showing superior solution quality compared to the state-of-the-art non-exact DCOP algorithms.

Background

In this section, we first describe DCOPs and evolutionary optimization in more detail. Then, we discuss the challenges that need to be addressed in order to effectively adapt evolutionary optimization to the context of DCOPs.

Distributed Constraint Optimization Problems

Formally, a DCOP is defined by a tuple ⟨A, X, D, F, α⟩ [14] where,

  • A is a set of agents {a_1, a_2, ..., a_n}.

  • X is a set of discrete variables {x_1, x_2, ..., x_m}, which are being controlled by the set of agents A.

  • D is a set of discrete and finite variable domains {D_1, D_2, ..., D_m}, where each D_i is a set containing the values which may be assigned to its associated variable x_i.

  • F is a set of constraints {f_1, f_2, ..., f_l}, where f_i ∈ F is a function of a subset of variables x^i ⊆ X defining the relationship among the variables in x^i. Thus, the function f_i : ×_{x_j ∈ x^i} D_j → ℝ denotes the cost for each possible assignment of the variables in x^i.

  • α : X → A is a variable-to-agent mapping function which assigns the control of each variable x_i ∈ X to an agent α(x_i) ∈ A. Each variable is controlled by a single agent. However, each agent can hold several variables.

Within this framework, the objective of a DCOP algorithm is to produce X*, an assignment to all the variables that minimizes the aggregated cost of the constraints, as shown in Equation 1 (for a maximization problem, the min operator in Equation 1 should be replaced with max).

X* = argmin_X Σ_{f_i ∈ F} f_i(x^i)    (1)

For ease of understanding, we assume that each agent controls one variable and all constraints are binary. Thus, the terms 'variable' and 'agent' are used interchangeably throughout this paper. Figure 1a illustrates an example DCOP using a constraint graph, where each node represents an agent labeled by the variable it controls and each edge represents a constraint function between the variables it connects. Figure 1b shows the corresponding cost tables.

(a) A constraint graph

(b) Cost tables of the four binary constraints; in each table, the row and column headers are the domain values (1 and 2) of the two constrained variables:

      1    2
  1   7   12
  2   3   15

      1    2
  1   2    7
  2  11   18

      1    2
  1   8    4
  2  15    6

      1    2
  1   9   13
  2  12    5

Figure 1: Example of a sample DCOP

Evolutionary Optimization

Evolutionary optimization is a generic population-based metaheuristic inspired by biological evolutionary mechanisms such as Selection, Reproduction, and Migration. The core mechanism of evolutionary optimization techniques can be summarized in three steps. In the first step, an initial population is generated randomly. A population is a set of 'individuals', each of which is a candidate solution of the corresponding optimization problem; a fitness function is defined to evaluate the quality of an individual with respect to the global objective, and the fitness of each individual in the initial population is calculated. In the second step, a subset of the population is selected based on fitness to reproduce new individuals; this process is known as Selection. In the final step, new individuals are created using the selected subset of the population and their fitness is evaluated; a subset of the old individuals is then replaced by the new individuals. Evolutionary optimization performs the second and third steps iteratively, which results in a gradual improvement in the quality of the individuals. An additional step is performed at regular intervals by some parallel/distributed evolutionary optimization models that concurrently maintain multiple sub-populations instead of a single population: individuals are exchanged between sub-populations. This process is known as Migration, and the interval is known as the Migration Interval.
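To make these three steps concrete, here is a minimal, self-contained sketch of the generic loop on a toy bit-string minimization problem; the toy objective and all function names are ours for illustration and are not part of AED.

import random

def fitness(individual):
    # Toy objective: minimize the number of 1-bits in the string.
    return sum(individual)

def select(population, k):
    # Step 2 (Selection): lower fitness -> higher selection weight.
    weights = [1.0 / (1 + fitness(ind)) for ind in population]
    return random.choices(population, weights=weights, k=k)

def reproduce(parent):
    # Step 3 (Reproduction): flip one random bit of a selected parent.
    child = parent[:]
    i = random.randrange(len(child))
    child[i] = 1 - child[i]
    return child

random.seed(0)
# Step 1: random initial population of 30 individuals.
population = [[random.randint(0, 1) for _ in range(20)] for _ in range(30)]
for generation in range(100):
    parents = select(population, k=len(population))
    children = [reproduce(p) for p in parents]
    # Replacement: keep the best 30 of the old and new individuals.
    population = sorted(population + children, key=fitness)[:30]
print(min(fitness(ind) for ind in population))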

Challenges

We need to address the following challenges in order to develop an effective anytime algorithm that adapts evolutionary optimization to solve DCOPs:

  • Individual and fitness: We need to define an individual that represents a solution of a DCOP along with a fitness function to evaluate its quality with respect to Equation 1. We also need to provide a method for calculating this fitness function in a distributed manner.

  • Population: We need to provide a strategy to maintain the population collectively among the agents. Although creating an initial random population is a trivial task for centralized problems, we need to find a distributed method to construct an initial random population for a DCOP.

  • Reproduction mechanism: In the DCOP framework, information related to the entire problem is not available to any single agent. So it is necessary to design a Reproduction method that can utilize information available to a single agent along with the cooperative nature of the agents.

  • Anytime update mechanism: We need to design an anytime update mechanism that can successfully perform the following tasks: (i) identify the best individual in a population that is distributed among the agents; (ii) notify all the agents when a new best individual is found; (iii) help coordinate the variable assignment decisions based on the best individual in the population.

In the following section, we describe our method that addresses the above challenges.

Anytime Evolutionary DCOP Algorithm

AED is a synchronous iterative algorithm that consists of two phases: Initialization and Optimization. During the former, agents first order themselves into a pseudo-tree and initialize the necessary variables and parameters. Finally, they make a random assignment to the variables they control and cooperatively construct the initial population. During the latter phase, agents iteratively improve this initial set of solutions. When an agent detects a better solution, it notifies the other agents. Moreover, all the agents synchronously update their assignments based on the best of the individuals reported to them so far, resulting in a monotonic improvement of the global objective. Algorithm 1 shows the pseudo-code for AED. For ease of understanding, we show the processes of initialization and anytime update separately in Procedure 1 and Procedure 2, respectively. Note that the initialization phase addresses the first two of our challenges, while the optimization phase addresses the rest.

Figure 2: BFS Pseudo-Tree

The Initialization Phase of AED consists of two parts: pseudo-tree construction and running INIT (Procedure 1), which initializes the population, parameters, and variables (Algorithm 1: Lines 1-2). This phase starts by ordering the agents into a pseudo-tree. This ordering serves two purposes: it helps in the construction of the initial population, and it facilitates ANYTIME-UPDATE (Procedure 2) during the optimization phase. Even though either a BFS or a DFS pseudo-tree can be used, AED uses a BFS pseudo-tree. This is because it generally produces a pseudo-tree with smaller height [1], which improves the performance of ANYTIME-UPDATE (see Theoretical Analysis for details). Figure 2 shows an example of a BFS pseudo-tree constructed from the constraint graph shown in Figure 1a. Here, the height H of the pseudo-tree (the length of the longest path in the pseudo-tree; H = 2 in this example) is calculated at construction time and is maintained by all agents. From this point on, an agent's neighbours refer to its neighbours in the constraint graph, while its children and parent refer to its child nodes and parent in the pseudo-tree; Figure 2 shows these relationships for each agent. After the pseudo-tree construction, all the agents synchronously call the procedure INIT (Algorithm 1: Line 2).
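For illustration, the following centralized sketch shows how a BFS pseudo-tree and its height H can be computed from a constraint graph. In AED itself this is done by distributed message passing; the adjacency structure used here for Figure 1a's graph is an assumption, chosen so that H = 2 as in the text.

from collections import deque

def bfs_pseudo_tree(adjacency, root):
    # adjacency: dict mapping each agent to the set of its neighbours
    # in the constraint graph. Returns (parent, children, height).
    parent = {root: None}
    children = {a: [] for a in adjacency}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        a = queue.popleft()
        for b in adjacency[a]:
            if b not in parent:          # first visit => tree edge
                parent[b] = a
                children[a].append(b)
                depth[b] = depth[a] + 1
                queue.append(b)
    height = max(depth.values())         # H: longest root-to-leaf path
    return parent, children, height

# Hypothetical constraint graph with four agents and four edges.
adjacency = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
parent, children, H = bfs_pseudo_tree(adjacency, root=1)
print(parent, children, H)   # H == 2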

 1  Construct pseudo-tree
 2  Every agent a_i calls INIT()                              // Procedure 1
 3  while stop condition not met, each agent a_i do
 4      Select individuals from the local population           // Selection
 5      Partition the selected individuals into equal-size subsets, one per neighbour
 6      for each subset assigned to a neighbour a_j do
 7          Modify the individuals in the subset by Equations 5, 6, 8
 8          Send the subset in a message to a_j
 9      for each subset received from a neighbour a_j do
10          Modify the individuals in the subset by Equations 7, 8
11          Send the subset back to a_j
12      Add the returned individuals to the local population
13      B ← best individual in the local population
14      ANYTIME-UPDATE(B)                                      // Procedure 2
15      Keep a sample of the local population, discard the rest
16      if Itr mod MI = 0 then                                 // Migration
17          for each neighbour a_j do
18              Send a sample of the local population to a_j
19          for each sample received from a neighbour a_j do
20              Add the received individuals to the local population
Algorithm 1: Anytime Evolutionary DCOP

INIT starts by initializing all the parameters and variables to their default values (AED takes a default value for each of the parameters as input; the default values of the variables are discussed later in this section). Then each agent a_i sets its variable x_i to a random value from its domain D_i. Lines 3 to 25 of Procedure 1 describe the initial population construction process. In AED, we define the population P as a set of individuals that are collectively maintained by all the agents, and the local population P_i as the subset of P maintained by agent a_i. An individual in AED is represented by a complete assignment of the variables in X together with a fitness value calculated using the fitness function shown in Equation 2. This function calculates the aggregated cost of the constraints yielded by the assignment. Hence, optimizing this fitness function results in an optimal solution for the corresponding DCOP.

I.fitness = Σ_{f_i ∈ F} f_i(I.x^i)    (2)

Note that the fitness function cannot be calculated by a single agent; rather, it is calculated in parts with the cooperation of all the agents during the construction process. Moreover, the fitness value is included in the representation of an individual because it enables an agent to recalculate the fitness of a newly constructed individual using only local information. Consider a complete individual I from the DCOP shown in Figure 1 as a running example. We use dot (.) notation to refer to a specific element of an individual; for example, I.x_i refers to the value assigned to variable x_i in I. Additionally, we define a Merge operation on two individuals under construction, Merge(I_1, I_2), which constructs a new individual by aggregating the two assignments and setting the fitness to the sum of the two partial fitness values. We extend this Merge operation to two ordered sets of individuals by merging the i-th individual of one set with the i-th individual of the other.
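The following sketch shows one possible encoding of individuals and the Merge operations just described; the dict-based representation and the detail that Merge adds the two partial fitness values are our reading of the construction process, not a prescribed data structure.

from dataclasses import dataclass, field

@dataclass
class Individual:
    assignment: dict = field(default_factory=dict)   # variable name -> value
    fitness: float = 0.0

def merge(i1, i2):
    # Merge(I1, I2): aggregate the two partial assignments and add the
    # partial fitness values accumulated so far by each branch.
    merged = dict(i1.assignment)
    merged.update(i2.assignment)
    return Individual(merged, i1.fitness + i2.fitness)

def merge_sets(e1, e2):
    # Extended Merge over two ordered sets: merge i-th with i-th.
    return [merge(a, b) for a, b in zip(e1, e2)]

# Example: two partial individuals built by different agents.
a = Individual({"x1": 1, "x2": 2}, fitness=7.0)
b = Individual({"x3": 1}, fitness=4.0)
print(merge(a, b))   # assignment={'x1': 1, 'x2': 2, 'x3': 1}, fitness=11.0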

At the beginning of the construction process, each agent sets its local population to a set of empty individuals (individuals with no assignment and fitness set to 0). The size of the initial local population is defined by the parameter IN. Then, for each individual, agent a_i makes a random assignment to its variable x_i. After that, each agent executes a Merge operation between its local population and each local population maintained by its neighbours (Procedure 1: Lines 2-8). At this point, an individual consists of an assignment to the variables controlled by an agent and its neighbours, with fitness set to zero. The fitness of each individual is then set to the local cost according to its current assignment (Procedure 1: Lines 9-10). In the next step, each agent executes a Merge operation between its local population and each local population maintained by its children, and every agent apart from the root then sends its local population to its parent (Procedure 1: Lines 11-18). At the end of this step, the local population maintained by the root consists of complete individuals. However, their fitness is twice its actual value, since each constraint is counted by both of its participating agents. Therefore, the root agent corrects all the fitness values at this stage (Procedure 1: Lines 20-21). Finally, the local population of the root agent is distributed through the network so that all agents can initialize their local populations (Procedure 1: Lines 22-25). This concludes the initialization phase; after that, all the agents synchronously start the optimization phase in order to iteratively improve this initial population.

Procedure 1: INIT

The Optimization Phase of AED consists of five steps, namely Selection, Reproduction, ANYTIME-UPDATE, local population update, and Migration. An agent begins an iteration of this phase by selecting individuals from its local population P_i for the Reproduction step (Algorithm 1: Line 4). Prior to this selection, all the individuals are ranked based on their relative fitness in P_i. The rank of an individual is calculated using Equation 3, where I_best and I_worst are the individuals with the lowest and highest fitness in P_i, respectively (for minimization problems, a lower fitness value is better). We define sampling of size S from a population as the process of taking a sample with replacement (any individual can be selected more than once) based on the probability calculated using Equation 4. As β increases in Equation 4, the fitness vs. selection probability curve gets steeper; as a consequence, individuals with better fitness get selected more often. In this way, β controls the exploration and exploitation dynamics of the Selection mechanism. For example, assume P_i consists of 3 individuals with fitness 16, 30, and 40, respectively; Equations 3 and 4 then assign the highest selection probability to the individual with fitness 16, and that probability grows as β increases. During this step, each agent selects a fixed number of individuals from P_i for Reproduction.

Rank(I) = (I_worst.fitness − I.fitness) / (I_worst.fitness − I_best.fitness)    (3)

P(I) = Rank(I)^β / Σ_{I′ ∈ P_i} Rank(I′)^β    (4)
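A sketch of the Selection sampling, assuming the rank and probability forms given in Equations 3 and 4 above; individuals are any objects with a fitness attribute (as in the earlier sketch), and the guard clauses are ours.

import random

def sample_with_replacement(population, size, beta):
    # Equation 3: rank individuals by relative fitness -- the best
    # (lowest-fitness) individual gets rank 1, the worst rank 0.
    fitnesses = [ind.fitness for ind in population]
    best, worst = min(fitnesses), max(fitnesses)
    span = (worst - best) or 1.0              # guard: all fitnesses equal
    ranks = [(worst - f) / span for f in fitnesses]
    # Equation 4: probability proportional to rank^beta; a larger beta
    # steepens the curve and favours fitter individuals more strongly.
    weights = [r ** beta for r in ranks]
    if sum(weights) == 0:                     # degenerate population
        weights = [1.0] * len(population)
    return random.choices(population, weights=weights, k=size)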

Now, Lines 5 to 11 of Algorithm 1 illustrate our proposed Reproduction mechanism. Agents start this step by partitioning the selected individuals into subsets of size ER. Then each subset is randomly assigned to a unique neighbour. An agent a_i creates a new individual from each individual I in a subset with the cooperation of the neighbour a_j to which that subset was assigned. Initially, agent a_i changes the assignment I.x_i by sampling from its domain using Equations 5 and 6. Then, I is sent to a_j. Agent a_j updates its assignment I.x_j for each received individual using Equation 7. Additionally, agents a_i and a_j update the fitness of the individual I by adding Δ_i and Δ_j to I.fitness, respectively. Here, Δ_i is calculated using Equation 8, where x_i^old and x_i^new are the old and new values of I.x_i, respectively.

c_i(v) = min_{u ∈ D_j} cost_ij(v, u) + Σ_{k ∈ N_i∖{j}} cost_ik(v, I.x_k)    (5)

P(v) = c_i(v)^(−α) / Σ_{u ∈ D_i} c_i(u)^(−α)    (6)

I.x_j = argmin_{u ∈ D_j} [ cost_ij(I.x_i, u) + Σ_{k ∈ N_j∖{i}} cost_jk(u, I.x_k) ]    (7)

Δ_i = Σ_{k ∈ N_i} cost_ik(x_i^new, I.x_k) − Σ_{k ∈ N_i} cost_ik(x_i^old, I.x_k)    (8)

Here, cost_ij denotes the cost table of the constraint between x_i and x_j, and N_i denotes the set of neighbours of agent a_i.

For example, suppose an agent of Figure 1 creates a new individual from an individual I with the help of one of its neighbours, and the domain of both agents is {1, 2}. Initially, the initiating agent calculates P(1) = 0.645 and P(2) = 0.355 using Equations 5 and 6 (with α = 1). It then updates its own value in I by sampling from this probability distribution, and the fitness is updated by adding its Δ (= −11). The updated individual is then sent to the neighbour. Based on Equation 7, the new value of the neighbour's variable should be 1. The neighbour updates its assignment along with the fitness by adding its own Δ (= −16) and sends I back. Hence, the initiating agent receives the completed new individual.

To summarize the Reproduction mechanism, each agent randomly assigns a subset of the selected individuals to each neighbour. The agent then updates its own assignment in each individual by sampling based on the most optimistic cost (i.e. the lowest cost) of the constraint between itself and the assigned neighbour, plus the aggregated cost of the remaining local constraints. This cost represents the optimistic local benefit of each domain value. The neighbour then sets its own assignment to the value that best complements the optimistic change. The key insight of this mechanism is that it takes into account not only the improvement in fitness that the change in the agent's own variable will bring, but also the potential improvement that the change in the neighbour's variable will bring. Moreover, note that the parameter α in Equation 6 plays a similar role to the parameter β in Equation 4. After the newly constructed individuals are added to the local population, the best individual B is sent for ANYTIME-UPDATE (Algorithm 1: Lines 12-14).
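The following sketch illustrates one reproduction step under the reconstructed Equations 5-7, assuming strictly positive costs and cost tables stored as dicts keyed by value pairs; all names, and the pairing of the example numbers, are illustrative.

import random

def optimistic_sample(domain_i, domain_j, cost_ij, other_costs, alpha, rng):
    # Equation 5: optimistic cost of each candidate value v for x_i --
    # best case with respect to the helping neighbour j, plus the
    # aggregated cost of the remaining local constraints (neighbours'
    # values fixed by the individual I).
    c = {v: min(cost_ij[(v, u)] for u in domain_j) + other_costs[v]
         for v in domain_i}
    # Equation 6 (reconstruction): selection probability inversely
    # proportional to the optimistic cost, sharpened by alpha.
    values = list(domain_i)
    weights = [c[v] ** -alpha for v in values]
    return rng.choices(values, weights=weights, k=1)[0]

def best_response(domain_j, cost_ij, x_i_new, other_costs_j):
    # Equation 7: the neighbour deterministically picks the value that
    # best complements the optimistic change made to x_i.
    return min(domain_j,
               key=lambda u: cost_ij[(x_i_new, u)] + other_costs_j[u])

# Toy usage with 2-value domains and one of Figure 1's cost tables.
rng = random.Random(0)
cost_ij = {(1, 1): 7, (1, 2): 12, (2, 1): 3, (2, 2): 15}
new_value = optimistic_sample({1, 2}, {1, 2}, cost_ij,
                              other_costs={1: 4, 2: 9}, alpha=1, rng=rng)
print(new_value, best_response({1, 2}, cost_ij, new_value, {1: 0, 2: 0}))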

To facilitate the anytime update mechanism, each agent maintains four variables: LB, GB, FM, and UM. LB (Local Best) and GB (Global Best) are initialized to empty individuals with fitness set to infinity, and FM and UM are initialized to null. Additionally, GB is stored with a version tag, and each agent maintains recent previous versions of GB (see the theoretical section for details); here, Itr refers to the current iteration number, and GB^j refers to the latest version of GB with a version tag not exceeding j. Our proposed anytime update mechanism works as follows. Each agent keeps track of two different bests, LB and GB. Whenever the fitness of LB becomes less than that of GB, LB has the potential to be the global best solution, so it gets reported to the root through the propagation of a Found message up the pseudo-tree. Since the root gets reports from all the agents, it can identify the true global best solution and notify all the agents by propagating an Update message down the pseudo-tree. The root also adds the version tag to the Update message to help coordinate variable assignment. Now, ANYTIME-UPDATE starts by keeping LB updated with the best individual B in the local population. In Line 3 of Procedure 2, an agent checks whether LB is a potential global best. If so and the agent is the root, LB is the true global best and an Update message UM is constructed; if the agent is not the root, LB is a potential global best and a Found message FM is constructed (Procedure 2: Lines 4-8). Each agent forwards the message UM to its children and the message FM to its parent. Upon receiving these messages, an agent takes the following actions:

  • If an Update message is received, the agent updates both its GB and LB. Additionally, the agent saves the Update message in UM and sends it to all of its children during the next iteration (Procedure 2: Lines 12-15).

  • If a Found message is received and its individual is better than LB, only LB is updated. If it remains a potential global best, it will be sent to the parent during the next iteration (Procedure 2: Lines 16-17).

Procedure 2: ANYTIME-UPDATE

An agent then updates the assignment of its variable using GB^{Itr−H} (Procedure 2: Lines 18-19). Agents make decisions based on GB^{Itr−H} instead of the potentially newer GB so that all decisions are made based on the same version of GB; GB^{Itr−H} is guaranteed to be the same for all agents, since it takes at most H iterations for an Update message to propagate to all the agents. For example, assume an agent in Figure 2 finds a potential best individual I at some iteration t. Unless it gets replaced by a better individual, I reaches the root through Found messages within at most H iterations. The root then constructs an Update message, which reaches all the agents within another H iterations, and the agents save it as a new version of GB. After a further H iterations, all agents assign their variables using this version, which is the best individual found at iteration t.
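A minimal sketch of the version bookkeeping behind GB^j, assuming versions are tagged by the iteration at which the root announced them and that only a bounded window of recent versions is kept; the class and method names are ours.

class GlobalBestStore:
    # Stores recent versions of GB keyed by the iteration (version tag)
    # at which the root announced them, so an agent can later look up
    # GB^j: the latest version whose tag does not exceed j.
    def __init__(self, window):
        self.window = window        # how many recent iterations to keep
        self.versions = {}          # version tag -> individual

    def save(self, tag, individual):
        self.versions[tag] = individual
        for old in [t for t in self.versions if t < tag - self.window]:
            del self.versions[old]  # drop versions that can no longer be used

    def latest_not_exceeding(self, j):
        tags = [t for t in self.versions if t <= j]
        return self.versions[max(tags)] if tags else None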

After ANYTIME-UPDATE, each agent updates its local population by keeping a sample of a fixed size and discarding the rest. This sample is taken using the same rank-based scheme as in the Selection step, except that agents sample without replacement (each individual can be selected at most once). This sampling method keeps the population diverse by selecting a unique set of individuals.
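A sketch of the without-replacement variant of the rank-based sampler, used for the local population update and for Migration below; re-ranking the remaining pool after every draw is our interpretation of "sampling without replacement" here.

import random

def sample_without_replacement(population, size, beta, rng):
    # Same rank-based weighting as the Selection step, but each
    # individual can be chosen at most once. rng is a random.Random.
    pool = list(population)
    chosen = []
    for _ in range(min(size, len(pool))):
        fitnesses = [ind.fitness for ind in pool]
        best, worst = min(fitnesses), max(fitnesses)
        span = (worst - best) or 1.0
        # Small epsilon keeps the worst individual selectable.
        weights = [((worst - f) / span) ** beta + 1e-9 for f in fitnesses]
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
    return chosen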

Finally, the Migration step takes place every MI iterations (each agent records the iteration at which the last Migration occurred); we sketch it in Lines 16-20 of Algorithm 1. Migration is a simple process of local population exchange among neighbours through which individuals get to traverse the network. Since the Reproduction mechanism only utilizes local information and local cooperation, this step plays an important role in the global optimization process. During this step, an agent selects a sample of size ER for each neighbour, using the same without-replacement sampling as above, and sends a copy of those individuals to that neighbour. Upon collecting individuals from all the neighbours, an agent adds them to its local population. This concludes an iteration of the optimization phase, and every step repeats during the following iteration.

Theoretical Analysis

As discussed in the previous section, we introduced a new mechanism that incorporates the anytime property into AED. In this section, we first prove that AED is anytime, that is, that the quality of the solutions found by AED improves monotonically. Then we analyze the complexity of AED in terms of communication, computation, and memory requirements.

Lemma 1. At iteration i+H, the root agent is aware of the best individual in P at least up to iteration i.

Suppose the best individual up to iteration i is found at some iteration i′ ≤ i by an agent at level l of the pseudo-tree. Afterwards, one of the following two cases will occur at each iteration.

  • Case 1. The individual is reported to the parent of the current agent through a Found message.

  • Case 2. The individual gets replaced on its way to the root by a better individual found at some iteration i″ by an agent at level l″.

When only Case 1 occurs, the individual reaches the root by iteration i′ + l ≤ i + H (since l can be at most H). If Case 2 occurs, the replacing individual reaches the root by iteration i″ + l″ = i′ + l ≤ i + H, and the same argument applies if that individual is itself replaced. In either case, at iteration i + H, the root either becomes aware of the best individual in P up to iteration i or becomes aware of a better individual in P found at some iteration i″ > i; in both cases, the root is aware of the best individual in P at least up to iteration i.

Lemma 2. The variable assignment decisions taken by all the agents at iteration i+2H−1 yield a global cost equal to the fitness of the best individual in P at least up to iteration i.

At iteration i+2H−1, all the agents make their variable assignment decisions using GB^{i+H}. However, GB^{i+H} is the best individual known to the root up to iteration i+H, and we know from Lemma 1 that, at iteration i+H, the root is aware of the best individual in P at least up to iteration i. Hence, the fitness of GB^{i+H} is at least as good as that of the best individual in P up to iteration i. Therefore, at iteration i+2H−1, the decisions yield a global cost equal to the fitness of the best individual in P at least up to iteration i.

Proposition 1. AED is anytime.

From Lemma 2, the variable assignment decisions at iterations i+2H−1 and j+2H−1 yield global costs equal to the fitness of the best individuals in P at least up to iterations i and j (j > i), respectively. Now, the fitness of the best individual in P up to iteration j is at least as good as that up to iteration i. So the global cost at iteration j+2H−1 is less than or equal to the global cost at iteration i+2H−1. As a consequence, the quality of the solution monotonically improves as the number of iterations increases. Hence, AED is anytime.

Complexity Analysis

Assume n is the number of agents, |N| is the number of neighbours of an agent, and |D| is its domain size. In every iteration, an agent sends |N| messages during the Reproduction step, and at most |N| additional messages are passed for the ANYTIME-UPDATE and Migration steps. Now, |N| can be at most n (a complete graph). Hence, the total number of messages transmitted per agent during an iteration is O(n). Since the main component of a message in AED is a set of individuals, the size of a single message can be calculated as the size of an individual multiplied by the number of individuals. During the Reproduction, Migration, and ANYTIME-UPDATE steps, at most ER individuals, each of size O(n), are sent in a single message. As a result, the size of a single message is O(n · ER), which makes the total message size per agent during an iteration O(n² · ER).

Before Reproduction, the local population can contain at most IN + |N| · ER individuals (if Migration occurred in the previous iteration), and Reproduction will add at most |N| · ER more. Since each individual has size O(n), the memory requirement per agent is O((IN + |N| · ER) · n). Finally, reproducing one individual using Equations 5, 6, 7, and 8 requires O(|D| · (|D| + |N|)) operations, and in total O(|N| · ER) individuals are reproduced per agent during an iteration. Hence, the total computation complexity per agent during an iteration is O(|N| · ER · |D| · (|D| + |N|)).

Experimental Results

In the previous section, we proved that AED is anytime. We now empirically evaluate the quality of the anytime solutions produced by AED against six state-of-the-art DCOP algorithms, selected to represent all four classes of non-exact algorithms. First, among the local search algorithms, we pick DSA (P = 0.8, the value of P that yielded the best performance in our settings), MGM2, and GDBA (N, NM, T; reported to perform best [15]). Second, among the inference-based non-exact algorithms, we compare with Max-Sum_ADVP, as it has been empirically shown to perform significantly better than Max-Sum [23]. Third, we consider a sampling-based algorithm, namely PD-Gibbs, which is the only such algorithm that is suitable for large-scale DCOPs [18]. Finally, we compare with ACO_DCOP (i.e. the pipelined version), as it is the only available population-based DCOP algorithm; to evaluate ACO_DCOP, we use the same parameter values as recommended in [2]. For our proposed algorithm AED, we use a single parameter configuration across all experimental settings, since it yielded the best results. Additionally, the ALS framework is used for non-monotonic algorithms having no anytime update mechanism.

We compare these algorithms in three different experimental settings. For the first setting, we consider random DCOPs: we set the number of agents to 70 and the domain size to 10, use Erdős-Rényi topology (i.e. random graphs) with a small edge probability to generate sparse constraint graphs [4], and take constraint costs uniformly at random from a fixed range. Our second setting is identical to the first except for a larger edge probability (i.e. dense graphs). For our final setting, we consider weighted graph coloring problems with 120 agents, 3 colors per agent, Erdős-Rényi topology, and constraint violation costs selected uniformly at random. In all three settings, we run the algorithms on 50 independently generated problems, 30 times on each problem, and we stop each algorithm after the 1000th iteration. It is worth noting that all differences shown in Figures 3, 4, and 5 at the 1000th iteration are statistically significant.

Figure 3: Comparison of AED and the competing algorithms on sparse configurations of random DCOPs

Figure 3 shows a comparison between AED and the benchmarking algorithms in the sparse random DCOP setting. The closest competitor to AED is ACO_DCOP, which provides a slightly better anytime solution until around 230 iterations. Although both algorithms keep improving the solution, the improvement rate of AED stays steadier than that of ACO_DCOP; this trend helps AED produce a 1.8% better solution by the 1000th iteration. On the other hand, most of the local search algorithms converge to local optima after 400 iterations, with GDBA producing the best performance among them. AED starts outperforming GDBA after 90 iterations and ultimately finds a 9.3% better solution. Finally, the other two representative algorithms, Max-Sum_ADVP and PD-Gibbs, are outperformed by even larger margins. The superiority of AED in this experiment indicates that the Selection method, along with the new Reproduction mechanism based on optimistic local benefit, achieves a better balance between exploration and exploitation than the other state-of-the-art algorithms. It is worth noting that we have further experimented with different values of the parameters α and β; changing them alters the steepness of the anytime curve and the solution quality, which indicates that the dynamics between exploration and exploitation can be tuned using α and β.

Figure 4: Comparison of AED and the competing algorithms on dense configurations of random DCOPs

Figure 4 shows a comparison between AED and the other competing algorithms in the dense random DCOP setting, and it clearly shows the advantage of AED over its competitors: AED outperforms all the benchmarking algorithms. In this setting, most of the algorithms find results of similar quality with slight variations. Among the competitors, GDBA outperforms ACO_DCOP by a slight margin and is the closest to AED, with AED still improving on its solution quality. PD-Gibbs fails to explore much through sampling and converges quickly, producing the largest performance difference with AED. It is also worth noting that, in this setting, AED produces a better solution than ACO_DCOP from the beginning.

Figure 5: Comparison of AED and the competing algorithms on weighted graph coloring problems

Figure 5 shows a comparison between AED and the other competing algorithms on weighted graph coloring problems. In this experiment, AED demonstrates excellent performance, outperforming the other algorithms by a significant margin. Among the competing algorithms, ACO_DCOP comes closest but is still outperformed by AED by over a 30% margin. Even though ACO_DCOP produces a better anytime cost for the first 250 iterations, AED starts outperforming it afterwards. This is because the initial heuristic of ACO_DCOP is able to construct good solutions from the start, whereas AED starts from an initial random solution and has to improve it gradually. Among the local search algorithms, GDBA is the most competitive, but AED still finds solutions that are 53% better. Finally, AED improves the quality of the solution several times over compared with some of its competitors, namely DSA, Max-Sum_ADVP, and PD-Gibbs. This experiment also makes it evident that AED is an effective algorithm for DCSPs.

Conclusions

In this paper, we introduce a new algorithm called AED that effectively adapts evolutionary optimization to solve DCOPs. For Reproduction in AED, we devise a new mechanism that captures not only local benefits but also the potential benefits resulting from a neighbour's cooperation. We also develop a new anytime update mechanism to incorporate the anytime property into AED. In our theoretical evaluation, we prove that AED is anytime. Finally, we present empirical results showing that AED produces noticeably improved solutions on random DCOPs and several-times-better solutions on weighted graph coloring problems, compared to the state-of-the-art non-exact DCOP algorithms. These results demonstrate the significance of applying evolutionary optimization techniques to solving DCOPs. In the future, we intend to investigate whether this algorithm can be applied to solve continuous-valued and multi-objective DCOPs. We would also like to explore ways to adapt evolutionary optimization to solve asymmetric DCOPs.

References

  • [1] Z. Chen, Z. He, and C. He (2017) An improved DPOP algorithm based on breadth first search pseudo-tree for distributed constraint optimization. Applied Intelligence 47, pp. 607–623.
  • [2] Z. Chen, T. Wu, Y. Deng, and C. Zhang (2018) An ant-based algorithm to solve distributed constraint optimization problems. In AAAI.
  • [3] M. Dorigo, M. Birattari, and T. Stützle (2006) Ant colony optimization: artificial ants as a computational intelligence technique.
  • [4] P. Erdős and A. Rényi (1960) On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5 (1), pp. 17–60.
  • [5] A. Farinelli, A. Rogers, and N. R. Jennings (2014) Agent-based decentralised coordination for sensor networks using the max-sum algorithm. Autonomous Agents and Multi-Agent Systems 28 (3), pp. 337–380.
  • [6] A. Farinelli, A. Rogers, A. Petcu, and N. R. Jennings (2008) Decentralised coordination of low-power embedded devices using the max-sum algorithm. In AAMAS.
  • [7] F. Fioretto, W. Yeoh, E. Pontelli, Y. Ma, and S. J. Ranade (2017) A distributed constraint optimization (DCOP) approach to the economic dispatch with demand response. In AAMAS.
  • [8] D. B. Fogel (1966) Artificial intelligence through simulated evolution.
  • [9] D. B. Fogel (1988) An evolutionary approach to the traveling salesman problem. Biological Cybernetics 60, pp. 139–144.
  • [10] J. H. Holland et al. (1975) Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT Press.
  • [11] O. Litov and A. Meisels (2017) Forward bounding on pseudo-trees for DCOPs and ADCOPs. Artificial Intelligence 252, pp. 83–99.
  • [12] R. T. Maheswaran, J. P. Pearce, and M. Tambe (2004) Distributed algorithms for DCOP: a graphical-game-based approach. In ISCA PDCS, pp. 432–439.
  • [13] R. T. Maheswaran, M. Tambe, E. Bowring, J. P. Pearce, and P. Varakantham (2004) Taking DCOP to the real world: efficient complete solutions for distributed multi-event scheduling. In AAMAS, pp. 310–317.
  • [14] P. J. Modi, W. Shen, M. Tambe, and M. Yokoo (2005) ADOPT: asynchronous distributed constraint optimization with quality guarantees. Artificial Intelligence 161 (1-2), pp. 149–180.
  • [15] S. Okamoto, R. Zivan, and A. Nahon (2016) Distributed breakout: beyond satisfaction. In IJCAI, pp. 447–453.
  • [16] B. Ottens, C. Dimitrakakis, and B. Faltings (2012) DUCT: an upper confidence bound approach to distributed constraint optimization problems. ACM TIST 8, pp. 69:1–69:27.
  • [17] A. Petcu and B. Faltings (2005) A scalable method for multiagent constraint optimization. In IJCAI.
  • [18] D. T. Nguyen, W. Yeoh, H. Lau, and R. Zivan (2019) Distributed Gibbs: a linear-space sampling-based DCOP algorithm. Journal of Artificial Intelligence Research 64, pp. 705–748.
  • [19] E. P. Tsang and T. Warwick (1990) Applying genetic algorithms to constraint satisfaction optimization problems. In Proceedings of the 9th European Conference on Artificial Intelligence, pp. 649–654.
  • [20] M. Yokoo, E. H. Durfee, T. Ishida, and K. Kuwabara (1998) The distributed constraint satisfaction problem: formalization and algorithms. IEEE Transactions on Knowledge and Data Engineering 10 (5), pp. 673–685.
  • [21] W. Zhang, G. Wang, Z. Xing, and L. Wittenburg (2005) Distributed stochastic search and distributed breakout: properties, comparison and applications to constraint optimization problems in sensor networks. Artificial Intelligence 161, pp. 55–87.
  • [22] R. Zivan, S. Okamoto, and H. Peled (2014) Explorative anytime local search for distributed constraint optimization. Artificial Intelligence 212, pp. 1–26.
  • [23] R. Zivan and H. Peled (2012) Max/min-sum distributed constraint optimization through value propagation on an alternating DAG. In AAMAS.