A Particle Swarm Inspired Approach for Continuous Distributed Constraint Optimization Problems

10/20/2020 ∙ by Moumita Choudhury, et al. ∙ 0

Distributed Constraint Optimization Problems (DCOPs) are a widely studied framework for coordinating interactions in cooperative multi-agent systems. In classical DCOPs, variables owned by agents are assumed to be discrete. However, in many applications, such as target tracking or sleep scheduling in sensor networks, continuous-valued variables are more suitable than discrete ones. To better model such applications, researchers have proposed Continuous DCOPs (C-DCOPs), an extension of DCOPs that can explicitly model problems with continuous variables. The state-of-the-art approaches for solving C-DCOPs suffer from either onerous memory or computation overhead and are unsuitable for non-differentiable optimization problems. To address this issue, we propose a new C-DCOP algorithm, namely Particle Swarm Optimization Based C-DCOP (PCD), which is inspired by Particle Swarm Optimization (PSO), a well-known centralized population-based approach for solving continuous optimization problems. In recent years, population-based algorithms have gained significant attention in classical DCOPs due to their ability to produce high-quality solutions. Nonetheless, to the best of our knowledge, this class of algorithms has not been utilized to solve C-DCOPs, and there has been no work evaluating the potential of PSO in solving classical DCOPs or C-DCOPs. In light of this observation, we adapt PSO, a centralized algorithm, to solve C-DCOPs in a decentralized manner. The resulting PCD algorithm not only produces good-quality solutions but also finds solutions without any requirement for derivative calculations. Moreover, we design a crossover operator that PCD can use to further improve the quality of the solutions found. Finally, we theoretically prove that PCD is an anytime algorithm and empirically evaluate it against the state-of-the-art C-DCOP algorithms on a wide variety of benchmarks.


1 Introduction

Distributed Constraint Optimization Problems (DCOPs) are an important constraint-handling framework for multi-agent systems in which multiple agents communicate with each other in order to optimize a global objective. The global objective is defined as the aggregation of cost functions (i.e., constraints) among the agents. Each cost function involves a set of variables controlled by the corresponding agents. The structure of DCOPs makes them suitable for deployment in various real-world settings, and they have been widely applied to multi-agent coordination problems including multi-agent task scheduling sultanik2007modeling, sensor networks farinelli2014agent, and multi-robot coordination Yedidsion2016ApplyingDT.

Over the years, several algorithms have been proposed to solve DCOPs, and they are broadly categorized into exact and non-exact algorithms. Exact algorithms, such as ADOPT modi2005adopt, DPOP Petcu2005ASM; rashik2020speeding, and PT-FB litov2017forward, are designed to provide the globally optimal solution of a given DCOP. However, since DCOPs are NP-hard, exact algorithms incur exponential memory requirements and/or exponential computational costs as the system grows. On the contrary, non-exact algorithms such as DSA zhang2005distributed, MGM & MGM2 Maheswaran2004Distributed, Max-Sum farinelli2008decentralised; khangdp; khan2018speeding, CoCoA Leeuwen2017CoCoAAN, ACO_DCOP chen2018ant, and AED mahmud2019aed trade some solution quality for scalability.

In general, DCOPs assume that the variables of the participating agents are discrete. Nevertheless, many real-world applications (e.g., target tracking sensor orientation fitzpatrick2003distributed, sleep scheduling of wireless sensors hsin2004network) are best modeled with continuous variables. Therefore, for discrete DCOPs to be applied to such problems, the continuous domains of the variables must be discretized. However, the discretization must be coarse enough for the problem to remain tractable, yet sufficiently fine to find high-quality solutions stranders2009decentralised. To overcome this issue, a continuous version of DCOPs has been proposed stranders2009decentralised, later referred to both as Functional DCOPs choudhury2020particle and as Continuous DCOPs (C-DCOPs) hoang2020new. In this paper, we will refer to it as C-DCOPs, following the most popular convention. There are two main differences between C-DCOPs and DCOPs. Firstly, instead of discrete decision variables, C-DCOPs have continuous variables that can take any value within a given range. Secondly, the constraint functions are represented in functional form in C-DCOPs rather than in the tabular form used in DCOPs.

In order to cope with this modification of the DCOP formulation, several C-DCOP algorithms have been proposed. Similar to DCOP algorithms, C-DCOP algorithms are classified as exact and non-exact approaches (detailed discussions can be found in Section 2). In this paper, we focus on the latter class, as the ensuing exponential growth of the search space can make exact algorithms computationally infeasible in practice. The state-of-the-art algorithms for C-DCOPs are based on either inference stranders2009decentralised; voice2010hybrid; hoang2020new or local search hoang2020new. In the inference-based C-DCOP algorithms, discrete inference-based algorithms, such as Max-Sum and DPOP, are combined with continuous non-linear optimization methods. In the only local search-based C-DCOP algorithm, the discrete local search-based algorithm DSA is extended with continuous optimization methods. However, continuous optimization methods, such as gradient-based optimization, require derivative calculations and are thus not suitable for non-differentiable optimization problems.

Against this background, we propose a Particle Swarm Optimization (PSO) based C-DCOP algorithm called PSO-Based C-DCOP (PCD). (Preliminary versions of this research have appeared previously choudhury2020particle; this paper presents a more efficient approach and a comprehensive description of the algorithm, along with broader theoretical and experimental comparisons to other state-of-the-art C-DCOP algorithms.) PSO is a stochastic optimization technique inspired by the social metaphor of bird flocking eberhart1995particle. It has been successfully applied to many optimization problems, such as function minimization shi1999empirical, neural network training zhang2007hybrid, and power-system stabilizer design abido2002optimal. However, to the best of our knowledge, no previous work has incorporated PSO in distributed scenarios similar to DCOPs or C-DCOPs. In PCD, agents cooperatively maintain a set of particles, where each particle represents a candidate solution, and iteratively update the solutions using a series of update equations over time. Since PSO requires only primitive mathematical operators, such as addition and multiplication, it is computationally less expensive (both in memory and speed) than gradient-based optimization methods. Furthermore, PSO is a widely studied technique with a variety of parameter choices and variants developed over the years. Hence, the broad opportunity for developing PCD into a robust population-based algorithm has inspired us to analyze the challenges and opportunities of PSO in C-DCOPs. Our main contributions are as follows.

  • We develop a new algorithm PCD by tailoring PSO. In so doing, we redesign a series of update equations that utilize the communication topology in a distributed scenario.

  • We introduce a new crossover operator that further improves the quality of solutions found and name the version PCD_CrossOver.

  • We analyze the various parameter choices of PCD that balance exploration and exploitation.

  • We provide a theoretical proof of the anytime convergence of our algorithm and show empirical evaluations of PCD and PCD_CrossOver on various C-DCOP benchmarks. The results show that the proposed approach finds better-quality solutions than existing C-DCOP solvers by exploring a larger search space.

In Section 2, we briefly review related work. In Section 3, we formulate the DCOP and C-DCOP frameworks as well as introduce PSO. Section 4 illustrates the details of our proposed PCD framework. Section 5 provides a theoretical proof of the anytime property and complexity analyses of PCD. In Section 6, we show empirical evaluations of PCD against existing C-DCOP algorithms. Finally, Section 7 concludes the findings of the paper and provides insights for future work.

2 Related Work

In this section, we discuss existing state-of-the-art exact and non-exact C-DCOP algorithms. The only exact algorithm for C-DCOPs is Exact Continuous DPOP (EC-DPOP), which provides exact solutions only for linear and quadratic cost functions and is defined over tree-structured graphs only hoang2020new. While several non-exact algorithms exist, the first was the Continuous Max-Sum (CMS) algorithm, which extends the discrete Max-Sum by approximating constraint cost functions as piece-wise linear functions stranders2009decentralised. Subsequently, researchers introduced Hybrid Continuous Max-Sum (HCMS), which extends CMS by combining it with continuous non-linear optimization methods voice2010hybrid. However, continuous optimization methods, such as gradient-based optimization, require derivative calculations and are thus not suitable for non-differentiable optimization problems. The most recent contributions to this field are due to hoang2020new, who proposed four algorithms: one exact and three non-exact C-DCOP solvers. The exact algorithm is EC-DPOP, discussed above. The non-exact algorithms are Approximate Continuous DPOP (AC-DPOP), Clustered AC-DPOP (CAC-DPOP), and Continuous DSA (C-DSA). Both AC-DPOP and CAC-DPOP combine the discrete DPOP algorithm with non-linear optimization techniques. Discrete DPOP Petcu2005ASM is an inference-based DCOP algorithm that performs dynamic programming on a pseudo-tree representation of the given problem; it requires only a linear number of messages but has an exponential memory requirement and sends exponentially large messages. Since the underlying algorithm of AC-DPOP is DPOP, it suffers from the same exponentially large message sizes, which is a limiting factor for communication-constrained applications. Although CAC-DPOP bounds the message size by limiting the number of tuples sent in messages, each agent still needs to maintain the original set of tuples in memory for accuracy, so CAC-DPOP still incurs an exponential memory requirement. Finally, the authors also provide C-DSA, a local search algorithm based on DSA. Unlike the DPOP variants, C-DSA's memory requirement is linear in the number of variables of the problem, and it sends constant-size messages.

3 Background and Problem Formulation

In this section, we formulate the problem and discuss the background necessary to understand our proposed method. We first describe the general DCOP framework and then move to the C-DCOP framework, which is our problem of interest in this paper. We then discuss the centralized PSO algorithm and the challenges in incorporating PSO with the C-DCOP framework.

3.1 Distributed Constraint Optimization Problems

A Distributed Constraint Optimization Problem (DCOP) can be defined as a tuple ⟨A, X, D, F, α⟩ modi2005adopt where,

  • A is a set of agents {a_1, a_2, …, a_n}.

  • X is a set of discrete variables {x_1, x_2, …, x_m}, where each variable x_j is controlled by one of the agents a_i ∈ A.

  • D is a set of discrete domains {D_1, D_2, …, D_m}, where each D_j corresponds to the domain of variable x_j.

  • F is a set of cost functions {f_1, f_2, …, f_l}, where each f_i ∈ F is defined over a subset X^i = {x_{i_1}, x_{i_2}, …, x_{i_k}} of X, called the scope of the function, and the cost of f_i is defined for every possible value assignment of X^i, that is, f_i : D_{i_1} × D_{i_2} × ⋯ × D_{i_k} → ℝ, where the arity of f_i is k. In this paper, we consider only binary cost functions (i.e., there are exactly two variables in the scope of every function).

  • α : X → A is a variable-to-agent mapping function khan2018near that assigns the control of each variable x_j to an agent α(x_j). Each agent can hold several variables; however, for ease of understanding, we assume each agent controls exactly one variable in this paper.

An optimal solution of a DCOP is an assignment X* that minimizes the sum of the cost functions, as shown in Equation 1 (for a maximization problem, the argmin operator is replaced by argmax):

(1)   $X^{*} = \operatorname*{argmin}_{X} \sum_{f_i \in F} f_i(X^{i})$

3.2 Continuous Distributed Constraint Optimization Problems

(a) Constraint Graph

(b) Cost Functions
Figure 1: Example of a C-DCOP.

Similar to the DCOP formulation, C-DCOPs can be defined as a tuple ⟨A, X, D, F, α⟩ hoang2020new. In C-DCOPs, A, F, and α are the same as defined in DCOPs. Nonetheless, the set of variables X and the set of domains D are defined as follows:

  • X is the set of continuous variables {x_1, x_2, …, x_m}, where each variable x_j is controlled by one of the agents a_i ∈ A.

  • D is a set of continuous domains {D_1, D_2, …, D_m}, where each D_j = [LB_j, UB_j] corresponds to the domain of variable x_j. In other words, variable x_j can take on any value in the range LB_j to UB_j.

As discussed in the previous section, a notable difference between DCOPs and C-DCOPs lies in the representation of the cost functions. In DCOPs, the cost functions are conventionally represented as tables, while in C-DCOPs, they are represented in functional form hoang2020new. However, the goal of a C-DCOP remains the same as depicted in Equation 1. Figure 1 presents an example C-DCOP: Figure 1a shows a constraint graph with four variables, each controlled by an agent, and each edge represents a cost function whose definition is shown in Figure 1b. In this particular example, the domains of all variables are identical.
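To make the functional representation concrete, a C-DCOP instance of this shape can be encoded as continuous domains plus binary cost functions. The sketch below is hypothetical: the variable names, edge set, and cost functions are illustrative assumptions, not the functions of Figure 1b (which are not reproduced in this text).

```python
# Illustrative C-DCOP instance: four continuous variables, each owned by one
# agent, with binary cost functions attached to the edges of a constraint
# graph. All functions and edges here are hypothetical.
DOMAIN = (-10.0, 10.0)

COST_FUNCTIONS = {
    ("x1", "x2"): lambda x1, x2: (x1 - x2) ** 2,
    ("x1", "x3"): lambda x1, x3: x1 * x3,
    ("x3", "x4"): lambda x3, x4: (x3 + x4) ** 2,
}

def global_cost(assignment):
    """Sum the binary cost functions under a complete value assignment."""
    return sum(f(assignment[u], assignment[v])
               for (u, v), f in COST_FUNCTIONS.items())

# A candidate solution assigns one value within DOMAIN to every variable.
solution = {"x1": 1.0, "x2": 1.0, "x3": -1.0, "x4": 1.0}
```

Unlike a tabular DCOP, nothing is enumerated here: the solver may query any real-valued assignment inside the domain bounds.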

3.3 Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a population-based optimization technique inspired by the movement of a bird flock or a fish school eberhart1995particle (for simplicity, we use the terms ‘optimization’ and ‘minimization’ interchangeably throughout the paper). In PSO, each individual of the population is called a particle. PSO solves the problem by moving the particles in a multi-dimensional search space, adjusting the position and velocity of each particle. As shown in Algorithm 1, each particle is initially assigned a random position and velocity. A fitness function is defined to evaluate the position of each particle. In each iteration, the movement of a particle is guided both by its local best position found so far in the search space and by the global best position found by the entire swarm. The combination of the local and global best positions ensures that when a better global position is found during the search, the particles move closer to that position and explore the surrounding search space more thoroughly. The local best position of each particle and the global best position of the entire population are then updated when necessary. Over the last couple of decades, several versions of PSO have been developed. The standard PSO often converges to a sub-optimal solution because the velocity component of the global best particle tends to zero after some iterations; consequently, the global best position stops moving, and the swarm behavior leads all other particles to follow it. To cope with this premature convergence, Guaranteed Convergence PSO (GCPSO) has been proposed, which guarantees convergence to a local optimum van2002new.

Algorithm 1: Particle Swarm Optimization
  Generate a population of particles in the n-dimensional search space
  Randomly initialize the position and velocity of each particle
  while termination condition is not met do
      foreach particle p in the population do
          calculate the current velocity of p
          calculate the next position of p given its current velocity
          move p to the next position
          if fitness of current position < fitness of local best then
              update local best
          if fitness of current position < fitness of global best then
              update global best
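Algorithm 1 can be sketched in a few lines. This is a generic centralized PSO for minimization, not the authors' implementation; the parameter values (w = 0.7, c1 = c2 = 1.5) are common illustrative choices rather than values taken from the paper.

```python
import random

def pso(fitness, dim, bounds, num_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal centralized PSO (minimization), following Algorithm 1."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions; zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(num_particles)]
    vel = [[0.0] * dim for _ in range(num_particles)]
    pbest = [p[:] for p in pos]                   # local best positions
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(num_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best
    for _ in range(iters):
        for k in range(num_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + cognition + social components.
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                # Move, clamped to the domain.
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            f = fitness(pos[k])
            if f < pbest_fit[k]:                  # update local best
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f < gbest_fit:                 # update global best
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit
```

On a low-dimensional sphere function over [-10, 10], this sketch reliably drives the global best fitness close to zero within a few hundred iterations.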

3.4 Challenges

Over the years, PSO and its improved variant Guaranteed Convergence PSO (GCPSO) have shown promising performance in centralized continuous optimization problems shi1999empirical; van2002new. Motivated by its success, we seek to explore its potential in solving C-DCOPs. However, there are several challenges that must be addressed when developing an anytime C-DCOP algorithm using GCPSO:

  • Particles and Fitness Representation: We need to define a representation for the particles in which each particle represents a solution of the C-DCOP. Moreover, a distributed method for calculating the fitness of each particle needs to be devised.

  • Creating the Population: In centralized optimization problems, creating the initial population is a trivial task. In C-DCOPs, however, different agents control different variables. Hence, a method needs to be devised to generate the initial population cooperatively.

  • Evaluation: Centralized PSO deals with an n-dimensional optimization task. In C-DCOPs, each agent holds one variable and is responsible for solving only the optimization task related to that variable, while the global objective remains an n-dimensional optimization process. Thus, a decentralized evaluation method needs to be devised.

  • Maintaining the Anytime Property: To maintain the anytime property in a C-DCOP approach, we need to identify the global best particle and the local best position of each particle. A distributed method needs to be devised to notify all the agents when a new global best particle or local best position is found. Finally, a decentralized coordination method is needed among the agents to update positions and velocities considering the current best position.

In the following section, we devise a novel method that addresses the above challenges and applies PSO to solve C-DCOPs.

Figure 2: A sample BFS pseudo-tree representation of the C-DCOP depicted in Figure 1.

4 The PCD Algorithm

              Agent a_1        Agent a_2        Agent a_3        Agent a_4
Particle P_1  x_{1,1}, v_{1,1}  x_{2,1}, v_{2,1}  x_{3,1}, v_{3,1}  x_{4,1}, v_{4,1}
Particle P_2  x_{1,2}, v_{1,2}  x_{2,2}, v_{2,2}  x_{3,2}, v_{3,2}  x_{4,2}, v_{4,2}
Particle P_3  x_{1,3}, v_{1,3}  x_{2,3}, v_{2,3}  x_{3,3}, v_{3,3}  x_{4,3}, v_{4,3}
Table 1: Population Representation in PCD

We now turn to describing our proposed Particle Swarm Optimization Based C-DCOP (PCD) algorithm. To facilitate an easier understanding, we first describe what each particle represents in the context of C-DCOPs. As in PSO, each particle in PCD has two attributes: position and velocity. The position of a particle corresponds to a value assignment to all variables in the C-DCOP; in other words, it is a solution to the given C-DCOP. Each agent also maintains the local best position of each particle. The velocity of a particle defines the step size that the particle takes in each iteration to change its position and is influenced by the directions of both its local best and the global best position. However, unlike in PSO, where a centralized entity controls all particles, each particle in PCD is controlled in a decentralized manner by all deployed agents. Specifically, for each particle, each agent controls only the position and velocity components corresponding to its own variable.

In PCD, we define the population P as a set of K particles that are collectively maintained by all the agents, and the local population P^i as the subset of the population attributes maintained by an agent a_i. For further clarification, we present an example population in Table 1. Each row represents a particle P_k, which is a solution of the problem. Each column represents an agent and the corresponding attributes that it holds for each particle. For example, in the table, each agent a_i holds two attributes, namely the position attribute x_{i,k} and the velocity attribute v_{i,k}, for each particle P_k. Additionally, we use the following notations:

  • X_k and V_k to represent the complete position and velocity assignments of particle P_k, respectively.

  • X^i and V^i to represent the position and velocity assignments of agent a_i for all the particles, respectively.

  • F_i(P_k) to represent the local fitness of particle P_k at agent a_i, that is, the aggregated cost of the constraints associated with the neighbors of agent a_i.

  • fitness_k and fitness_k^i to represent the complete fitness of particle P_k and the fitness that agent a_i calculates for P_k, respectively.

  • FITNESS^i to represent the set of fitness_k^i for all the particles.

  • B_k and fitness(B_k) to represent the best position of particle P_k thus far and the fitness value of that position, respectively.

  • P_gb to represent the global best particle among all particles.

  • X* and fitness* to represent the position attribute of the global best particle and the fitness value of that position, as maintained by each agent, respectively.

PCD is a PSO-based iterative algorithm that first constructs a Breadth First Search (BFS) pseudo-tree chen2017improved, which orders the agents, in a pre-processing step. Figure 2 illustrates a BFS pseudo-tree constructed from the constraint graph shown in Figure 1 with a_1 as the root (we use a_i and x_i interchangeably throughout the paper since each agent controls exactly one variable). From this point, we use the notation N_i to refer to the set of neighboring agents of agent a_i in the constraint graph, and the notations PR_i and CH_i to refer to the parent agent and the set of children agents of a_i in the pseudo-tree, respectively. For example, for agent a_3 of Figure 2, N_3 = {…}, PR_3 = …, and CH_3 = {…}.
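The pre-processing step above amounts to a plain breadth-first traversal of the constraint graph. A minimal sketch follows, assuming a hypothetical four-agent graph with a_1 as the root; the exact edge set of Figure 1 is not reproduced in the text, so the graph below is illustrative.

```python
from collections import deque

def bfs_pseudo_tree(graph, root):
    """Compute parent (PR) and children (CH) for each agent via BFS.

    graph: dict mapping each agent to its set of neighbors (N) in the
    constraint graph. Returns (parent, children) dicts.
    """
    parent = {root: None}
    children = {a: [] for a in graph}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(graph[u]):      # deterministic visiting order
            if v not in parent:         # not yet placed in the tree
                parent[v] = u
                children[u].append(v)
                queue.append(v)
    return parent, children

# Hypothetical constraint graph over four agents.
graph = {"a1": {"a2", "a3", "a4"}, "a2": {"a1"},
         "a3": {"a1", "a4"}, "a4": {"a1", "a3"}}
```

Each agent's N_i is simply `graph[a_i]`, while PR_i and CH_i are read from the returned dictionaries.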

Input: K   -- number of particles
       w   -- inertia weight
       c_1 -- cognitive constant
       c_2 -- social constant
       s_c -- threshold for success count
       f_c -- threshold for failure count

foreach agent a_i ∈ A do
    INITIALIZATION()
    while termination condition is not met do
        EVALUATION()
        BEST_UPDATE()
        VARIABLE_UPDATE()
Algorithm 2: The PCD Algorithm


The pseudocode of our PCD algorithm can be found in Algorithm 2. After constructing the pseudo-tree, it runs the following three phases:

  • Initialization Phase: The agents create an initial population of particles and initialize their parameters.

  • Evaluation Phase: The agents calculate the fitness value for each particle in a distributed way.

  • Update Phase: Each agent keeps track of the best solution found so far, propagates this information to the other agents, and updates its value assignment according to that information.

The agents repeat these last two phases in a loop until some termination condition is met.

We now describe these phases in more detail. In the initialization phase, each agent executes the INITIALIZATION procedure, which consists of the following: it first creates a set of K particles and initializes the cycle counter as well as three other variables (ρ, the success count, and the failure count) that are used to update the velocity of the particles. It then initializes the velocity and position of each particle to 0 and a random value in the variable's domain, respectively. This initialization aims to distribute the initial positions of the particles randomly throughout the search space. It then initializes the best position and the corresponding fitness value of each particle to null and infinity, respectively, since the position has not been evaluated yet. Similarly, it initializes the best global position and the corresponding fitness value to null and infinity as well. Finally, it sends its position assignments for all particles in a VALUE message to each of its neighboring agents.

Next, in the evaluation phase, the agents collectively calculate the complete fitness of each particle in the EVALUATION procedure, using the fitness function shown in Equation 2.

(2)   $\mathit{fitness}_k = \frac{1}{2} \sum_{a_i \in A} F_i(P_k), \qquad F_i(P_k) = \sum_{f_j \in \mathcal{F}_i} f_j(X_k^{j})$
(3)   $\mathcal{F}_i = \{\, f_j \in F \mid x_i \in \mathit{scope}(f_j) \,\}$

Here, $\mathcal{F}_i$ is the set of constraints whose scope includes x_i (see Equation 3), and X_k^j is the value assignment of the set of variables in the scope of function f_j under particle P_k. Note that a single agent cannot calculate the complete fitness value. Instead, it is calculated in a decentralized way by all the agents and then accumulated up the BFS tree towards the root. Specifically, each agent a_i is in charge of computing only F_i(P_k) for each particle P_k. Further, note that the cost of each function is summed up twice, once by each of the two agents in its scope (recall that we consider only binary cost functions in this paper). Therefore, the complete fitness value is divided by two.

To calculate the complete fitness value in a decentralized way, each agent first waits for VALUE messages from its neighboring agents. Upon receiving all the VALUE messages, it calculates the costs of all its functions and aggregates them into its local fitness values. If the agent has no children, it assigns its local fitness values as its aggregated fitness values for all particles and sends them in a COST message to its parent agent. If an agent does have children, it waits for COST messages from all of them, aggregates the received fitness values with its own local fitness values, and sends the set of aggregated fitness values of all particles in a COST message to its parent agent.

This process repeats until the root agent has received the COST messages from all its children and calculated the aggregated fitness values. At this point, the cost of each constraint has been counted twice, because the local fitness values of both agents in the scope of the constraint are aggregated together. Thus, the root agent divides the aggregated fitness values by two before starting the next phase.
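The bottom-up COST aggregation and the final halving at the root can be sketched as a short recursion. The tree shape below (a_1 as root with the other agents as its children) is an assumption consistent with Figure 2, and the per-agent local fitness values for particles P_1 and P_2 are taken from Table 2.

```python
def aggregate_fitness(agent, children, local_fitness):
    """Per-particle values this agent would send upward in its COST message:
    its own local fitness plus the aggregated values from its children."""
    totals = list(local_fitness[agent])
    for child in children.get(agent, []):
        child_costs = aggregate_fitness(child, children, local_fitness)
        totals = [t + c for t, c in zip(totals, child_costs)]
    return totals

def complete_fitness(root, children, local_fitness):
    """At the root, every binary constraint has been counted twice,
    so the aggregated values are divided by two."""
    return [t / 2.0 for t in aggregate_fitness(root, children, local_fitness)]

# Assumed tree from Figure 2: a1 is the root, the other agents its children.
children = {"a1": ["a2", "a3", "a4"]}
# Local fitness of particles P1 and P2 at each agent (values from Table 2).
local_fitness = {"a1": [-1.44, 14.0], "a2": [-0.44, 0.0],
                 "a3": [21.0, 12.0], "a4": [10.0, 10.0]}
```

With these inputs, the root obtains 14.56 and 18.00 for the two particles, matching the corresponding entries in the Agent a_1 column of Table 3.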

Finally, in the update phase, the agents synchronize on their best local and global particles in the BEST_UPDATE procedure and update the positions and velocities of their particles in the VARIABLE_UPDATE procedure.

To synchronize their best local and global particles, the root agent first checks, for each particle, whether a better local position has been found. If so, it updates that particle's best position and corresponding fitness value and stores the particle in a set of updated particles. The root agent also checks whether a better global position has been found; if so, it updates the best global position and its corresponding fitness value and records the corresponding particle. The root agent then sends both pieces of information in a BEST message to each of its children agents.

When a child agent receives the BEST message, it iterates over the particles reported in the message and adopts their positions as the best positions of the corresponding particles in its local copy. Similarly, if a better global particle has been found, it adopts that particle's position as its best global position. It then propagates the received information in a BEST message to each of its own children. This process repeats down the pseudo-tree until all agents have synchronized their best local and global particles. Finally, the agents increment their cycle counters by one.

To update the positions and velocities of the particles, we adapt the update equations used by Guaranteed Convergence PSO (GCPSO) van2002new. At a high level, each agent uses Equations 4 and 5 to update the velocities of the global best particle and of the other particles, respectively, and uses Equation 6 to update the positions of all particles:

(4)   $V_{gb}^{t+1} = -X_{gb}^{t} + X^{*t} + w\,V_{gb}^{t} + \rho^{t}\,(1 - 2r^{t})$
(5)   $V_{k}^{t+1} = w\,V_{k}^{t} + c_1 r_1 \,(B_{k}^{t} - X_{k}^{t}) + c_2 r_2 \,(X^{*t} - X_{k}^{t})$
(6)   $X_{k}^{t+1} = X_{k}^{t} + V_{k}^{t+1}$

In these equations, the superscript t denotes the value of a variable at the t-th cycle. Here, w, c_1, and c_2 are user-defined input parameters of the algorithm; r_1 and r_2 (as well as r in Equation 4) are random values uniformly sampled from the range [0, 1] by each agent in each cycle; and ρ is defined using Equation 7:

(7)   $\rho^{t+1} = \begin{cases} 2\rho^{t} & \text{if } s^{t} > s_c \\ 0.5\,\rho^{t} & \text{if } f^{t} > f_c \\ \rho^{t} & \text{otherwise} \end{cases}$

where s_c and f_c are user-defined input parameters of the algorithm; and the consecutive success and failure counts s^t and f^t are calculated using Equations 8 and 9, respectively:

(8)   $s^{t+1} = \begin{cases} s^{t} + 1 & \text{if } \mathit{fitness}^{*} \text{ improves in cycle } t+1 \\ 0 & \text{otherwise} \end{cases}$
(9)   $f^{t+1} = \begin{cases} f^{t} + 1 & \text{if } \mathit{fitness}^{*} \text{ does not improve in cycle } t+1 \\ 0 & \text{otherwise} \end{cases}$

Intuitively, w is an inertia weight that defines the influence of the previous cycle's velocity on the velocity in the current cycle. The constants c_1 and c_2 are called the cognitive and social constants, respectively, in the literature because they weight the terms (B_k^t - X_k^t) and (X^{*t} - X_k^t), which are called the cognition and social components. The cognition component considers only the particle's own attributes, while the social component involves interaction between two particles. Together, c_1 and c_2 define the influence of the local and global best positions on the velocity of the particles in the current cycle.

The parameter ρ represents the diameter of an area around the global best particle that the particles can explore. Its value is determined by the counts of consecutive successes (s^t) and failures (f^t). A success occurs when the fitness value of the global best particle improves, and a failure occurs when it remains unchanged. When there are more consecutive successes than the threshold s_c, the diameter is doubled to increase random exploration, because the current location of the best particle is promising. Conversely, when there are more consecutive failures than the threshold f_c, the diameter is halved to focus the search closer around the location of the best particle.
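The diameter-adaptation rule and the consecutive success/failure counters described above reduce to a few lines. The helper names below are hypothetical, and the reset-to-zero behavior of the counters follows the standard GCPSO formulation van2002new.

```python
def update_rho(rho, successes, failures, s_c, f_c):
    """Search-diameter adaptation: double on sustained success,
    halve on sustained failure, otherwise leave unchanged."""
    if successes > s_c:
        return 2.0 * rho
    if failures > f_c:
        return 0.5 * rho
    return rho

def update_counts(successes, failures, improved):
    """Consecutive success/failure counters: an improvement of the global
    best fitness extends the success streak and resets the failure streak,
    and vice versa."""
    if improved:
        return successes + 1, 0
    return 0, failures + 1
```

Because the counters track *consecutive* events, a single improvement is enough to cancel a long failure streak, which keeps the search diameter from collapsing permanently.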

4.1 Crossover

Although PCD provides reasonable anytime solution quality on several benchmark problems (see details in Section 6.3), there remains scope for incorporating other genetic operators. Hence, in this section, we introduce a new crossover operator that further improves the solution quality of PCD. We refer to this version of PCD as PCD_CrossOver. In centralized hybrid PSO models, arithmetic crossover of position and velocity vectors has shown promising results lovbjerg2001hybrid. In a centralized scenario, algorithms can execute crossover operations simultaneously for all the variables. In a distributed scenario, however, either the agents must agree on a cooperative crossover execution chen2020genetic and exchange information, or each agent can execute the crossover operation only for the variables that it holds. In this paper, we follow the latter approach to avoid additional messaging and synchronization overheads. In the CROSSOVER procedure, each agent uses the local fitness value of each particle from the evaluation phase to calculate a crossover probability for each particle using Equation 10.

(10)   $p_k = \dfrac{F_i(P_k)}{\sum_{l=1}^{K} F_i(P_l)}$

Then, each agent selects two particles P_a and P_b at random from its local population according to the crossover probabilities, and updates their positions using the following crossover operations:

(11)   $X_a^{i} \leftarrow r\,X_a^{i} + (1 - r)\,X_b^{i}$
(12)   $X_b^{i} \leftarrow r\,X_b^{i} + (1 - r)\,X_a^{i}$

where r is a random number drawn from the range [0, 1]. When the crossover is applied, their velocities are likewise updated using the following crossover operations:

(13)
(14)

Otherwise, the velocities are updated using the regular update operations described in Equations 4 and 5.
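The position and velocity crossover described above can be sketched as a standard arithmetic crossover on two selected particles. This is a hedged illustration: the variable names, the form of the convex combination, and the `threshold` condition for crossing velocities are assumptions in the spirit of Equations 11-14, not the paper's exact equations.

```python
import random

def crossover(x_j, x_k, v_j, v_k, threshold=0.5):
    """Arithmetic crossover of two particles' positions (and, conditionally,
    their velocities). Returns the updated (x_j, x_k, v_j, v_k)."""
    r = random.random()                      # random number in [0, 1]
    # Position crossover: each child is a convex combination of the parents,
    # so both children stay inside the segment between the parent positions.
    new_x_j = r * x_j + (1.0 - r) * x_k
    new_x_k = r * x_k + (1.0 - r) * x_j
    if r < threshold:
        # Velocity crossover, mirroring the position update.
        new_v_j = r * v_j + (1.0 - r) * v_k
        new_v_k = r * v_k + (1.0 - r) * v_j
    else:
        # Fall back to the regular PSO velocity update (Equations 4 and 5);
        # velocities are left untouched here for brevity.
        new_v_j, new_v_k = v_j, v_k
    return new_x_j, new_x_k, new_v_j, new_v_k
```

A useful property of this convex-combination form is that it preserves the sum of the two parents' positions, keeping the children inside the feasible segment spanned by the parents.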

4.2 Example Partial Trace

We now provide a partial trace of our PCD algorithm on the example C-DCOP of Figure 1. Assume that the number of particles . In the initialization phase, the agents cooperatively build the BFS pseudo-tree shown in Figure 2, after which each agent is aware of its set of neighboring agents , its set of children agents , and its parent agent :

(15)
(16)
(17)
(18)

Each agent then creates a set of particles and initializes their position and velocity attributes and for all particles . Assume that they are initialized using the assignments below:

(19)
(20)
(21)
(22)
(23)

Then, each agent sends its position assignments in a VALUE message to each of its neighboring agents in :

  • Agent sends a VALUE() message to each of its neighboring agents , , and , where .

  • Agent sends a VALUE() message to its neighboring agent , where .

  • Agent sends a VALUE() message to each of its neighboring agents and , where .

  • Agent sends a VALUE() message to each of its neighboring agents and , where .
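The VALUE-message fan-out above can be sketched as follows. The message layout, the helper names, and the constraint graph in the usage below are assumptions chosen to match the neighbour counts in the trace (the first agent neighbours the other three, and the third and fourth agents also share an edge), not the paper's pseudocode.

```python
def send_value_messages(agent_id, neighbours, positions, outbox):
    """Append one (sender, receiver, payload) VALUE message per neighbour.

    neighbours -- {agent_id: list of neighbouring agent ids}
    positions  -- {agent_id: that agent's particle position assignments}
    """
    for n in neighbours[agent_id]:
        outbox.append((agent_id, n, ('VALUE', positions[agent_id])))
```

With the hypothetical graph above, running this for every agent produces eight VALUE messages in total, matching the per-agent message counts in the trace.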

In the evaluation phase, each agent waits for the VALUE messages from its neighboring agents. Upon receiving the VALUE messages from all of its neighboring agents, it calculates the local fitness value for each particle . For example, after receiving VALUE() and VALUE() from agents and , respectively, agent calculates for particle as follows (see Figure 1 for the set of cost functions of our example C-DCOP):

(24)
(25)
(26)
(27)
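The local fitness computation above can be sketched as follows: an agent sums the binary cost functions it shares with its neighbours, evaluated at a particle's position assignments. All names here are illustrative, not the paper's notation.

```python
def local_fitness(agent, neighbour_costs, value_of):
    """Sum the costs of an agent's shared constraints for one particle.

    neighbour_costs -- {neighbour_id: cost function f(x_agent, x_neighbour)}
    value_of        -- {agent_id: this particle's value at that agent}
    """
    return sum(f(value_of[agent], value_of[n])
               for n, f in neighbour_costs.items())
```

For instance, with two hypothetical cost functions f(x, y) = x*y and g(x, y) = x + y and values 2, 3, and -1 for the agent and its two neighbours, the local fitness is 2*3 + (2 + (-1)) = 7.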

Table 2 tabulates the values of for each particle of each agent .

Agent Agent Agent Agent
Particle -1.44 -0.44 21.00 10.00
Particle 14.00 0.00 12.00 10.00
Particle -9.00 -1.00 16.00 8.00
Particle 6.64 0.21 7.51 4.92
Table 2: Local Fitness Scores
Agent Agent Agent Agent
Particle 14.56 -0.44 21.00 10.00
Particle 18.00 0.00 12.00 10.00
Particle 7.00 -1.00 16.00 8.00
Particle 9.60 0.21 7.51 4.92
Table 3: Fitness Scores

After computing the local fitness values, since agent does not have any child agent, it assigns its local fitness value of each particle to that particle’s regular fitness value and sends that information to its parent agent in a COST message. Similarly, agents and also do the same as they too do not have any child agent. Table 3 tabulates the values of for each particle of each agent , and the COST messages sent by the agents are below:

  • Agent sends a COST() message to its parent agent , where .

  • Agent sends a COST() message to its parent agent , where .

  • Agent sends a COST() message to its parent agent , where .

Since the root agent has child agents, it waits for the COST messages from them. Upon receiving the COST messages from all of its children, it calculates the fitness value for each particle . For example, it calculates for particle as follows:

(28)
(29)
(30)

Since the cost of each cost function in the C-DCOP is counted twice, once by each of the two agents that share it, the root agent divides the fitness value of each of its particles by two. For example, it updates for particle as follows:

(31)
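The root's aggregation step can be sketched as follows: the root adds the costs reported by its children to its own local fitness and halves the result, since every cost function is counted by both of its endpoint agents. The function name is illustrative; the numbers in the usage reproduce the first particle's values from Tables 2 and 3.

```python
def root_fitness(local_fitness, child_costs):
    """Combine the root's local fitness with its children's reported costs,
    halving the total to undo the double counting of each cost function."""
    return (local_fitness + sum(child_costs)) / 2.0
```

For example, with the first particle's local fitness values from Table 2, `root_fitness(-1.44, [-0.44, 21.00, 10.00])` yields 14.56, the root's fitness for that particle in Table 3.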

In the update phase, since this is the first iteration, each particle of the root agent has found a better local position. Thus, the best position of each particle is updated to the particle’s current position, and all the particles are added to the set . Similarly, a better global position is found, so the best global position is updated to the position of the best particle, and that particle is assigned to the variable . The agent then sends both and in a BEST message to its children agents , , and :

  • Agent sends a BEST(, ) message to its children agents , , and , where and .

All non-root agents , , and wait for the BEST messages from their parent agent . Upon receiving the BEST message, for each particle in the BEST message, each agent assigns the position as the best local position of the corresponding particle . Since contains all four particles, the best local positions of all four particles are updated. Similarly, each agent also updates the best global position to the position of the best particle in the BEST message. Table 4 tabulates the values of the best local positions for each particle and the best global position of each agent .
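The update phase at the root can be sketched as below, assuming a minimisation objective: every particle whose fitness improves on its previous best joins the set of changed particles, and the global best is the particle with the lowest fitness. The function and argument names are illustrative.

```python
def update_bests(fitness, best_fitness):
    """Return (indices of improved particles, index of the global best).

    fitness      -- list of current fitness values, one per particle
    best_fitness -- {particle index: best fitness seen so far}; empty on
                    the first iteration, so every particle improves
    """
    improved = [i for i, f in enumerate(fitness)
                if f < best_fitness.get(i, float('inf'))]
    gbest = min(range(len(fitness)), key=fitness.__getitem__)
    return improved, gbest
```

Applied to the root's fitness values from Table 3, all four particles improve in the first iteration, and the third particle (fitness 7.00, the lowest) becomes the global best, as in the trace.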

Agent Agent Agent Agent
Particle -1.00 1.20 -2.00 2.00
Particle -2.00 2.00 -1.00 1.00
Particle 0.00 1.00 2.00 -2.00
Particle 1.10 -1.00 1.50 0.50
(a) for each Agent
Agent Agent Agent Agent
Particle 0.00 1.00 2.00 -2.00
(b) for each Agent
Table 4: and of Agent