Distributed Constraint Optimization Problems (DCOPs) [petcu2005dpop] are a powerful framework for modeling cooperative multi-agent systems in which multiple agents (or robots) communicate with each other directly or indirectly. The agents act autonomously in a shared environment in order to optimize a global objective that is an aggregation of their constraint cost functions. Each of these functions is associated with a set of variables controlled by the corresponding agents. Agents in a DCOP must coordinate value assignments to their variables in order to maximize their aggregated utility or minimize the overall cost. DCOPs have been successfully applied to many multi-agent coordination problems, including multi-robot coordination [zivan2015distributed], sensor networks [stranders2009decentralised], smart home automation [fioretto2017multiagent, rust2016using], smart grid optimization [miller2012optimal, kumar2009distributed], and cloud computing applications [hoang2019new].
In general, DCOPs assume that variables are discrete and that the constraint utilities of the system are represented in tabular form. However, several problems, such as target-tracking sensor orientation [fitzpatrick2003distributed] and sleep scheduling of wireless sensors [hsin2004network], are better modeled with variables that have continuous domains. Even though DCOPs can deal with continuous-valued variables through discretization, the discretization must be coarse enough for the problem to remain tractable yet fine enough to produce high-quality solutions [stranders2009decentralised]. To address this issue, [stranders2009decentralised] suggest a framework called Continuous DCOPs (C-DCOPs), which extends general DCOPs to operate with variables that take values from a range. Additionally, in C-DCOPs, constraint utilities are specified by functions instead of the tabular form used in traditional DCOPs.
Over the last few years, a number of algorithms have been introduced to solve C-DCOPs [stranders2009decentralised, voice2010hybrid, choudhury2020particle, hoang2020new, sarker2020c]. Initially, Continuous Max-Sum (CMS) [stranders2009decentralised], an extension of the discrete Max-Sum algorithm [farinelli2008decentralised], was proposed to solve C-DCOPs. CMS approximates the utility functions of the system with piecewise linear functions. Afterward, Hybrid CMS (HCMS) [voice2010hybrid] used discrete Max-Sum as the underlying algorithmic framework, with the addition of a continuous non-linear optimization method. A major issue with CMS and HCMS is that they cannot provide quality guarantees for their solutions. [sarker2020c] proposed a non-iterative solution for C-DCOPs; however, it cannot provide anytime solutions. Recently, a population-based anytime algorithm based on Particle Swarm Optimization (PSO), namely PFD [choudhury2020particle], was proposed, which has shown better results than other state-of-the-art algorithms. However, scalability remains a major issue for PFD as the number of agents in the system increases.
More recently, a variety of exact and non-exact algorithms were introduced by [hoang2020new]. Specifically, the inference-based DPOP [petcu2005dpop] has been extended into three algorithms: Exact Continuous DPOP (EC-DPOP), Approximate Continuous DPOP (AC-DPOP), and Clustered AC-DPOP (CAC-DPOP). EC-DPOP presents an exact solution where a system's agents are grouped in a tree-structured network. However, this is not a feasible assumption in most problems. AC-DPOP and CAC-DPOP, on the other hand, give approximate solutions to the C-DCOP problem for general constraint graphs. In addition, they develop Continuous DSA (C-DSA) by extending the search-based Distributed Stochastic Algorithm (DSA) [zhang2005distributed]. As these three approximate algorithms use continuous optimization techniques such as gradient-based optimization, they require derivative calculations and are therefore not suitable for non-differentiable optimization problems such as rectilinear data fitting [bertsekas1973descent, elhedhli1999nondifferentiable]. Moreover, recent experiments comparing them against other C-DCOP algorithms also show that these algorithms produce non-exact solutions of poor quality [choudhury2020particle].
Against this background, we propose a new non-exact anytime algorithm for C-DCOPs, the Distributed Artificial Bee Colony algorithm, which is inspired by a recent variant of the well-known Artificial Bee Colony (ABC) algorithm [karaboga2005idea, xiao2019improved]. Similar to the ABC algorithm, our approach is a population-based stochastic algorithm that stores multiple candidate solutions amongst the agents. It improves the global solution by iteratively adjusting the candidate solutions and discards solutions that do not improve after a certain period. Additionally, we have designed a novel exploration mechanism for our algorithm, which we call ABCD-E, with the intent of further improving the solution quality. We also theoretically prove that ABCD-E is an anytime algorithm. Finally, our empirical results show that ABCD-E outperforms the state-of-the-art algorithms by up to .
In this section, we first describe the C-DCOP framework. We then briefly discuss the Artificial Bee Colony (ABC) algorithm on which our contribution is based.
II-A Continuous Distributed Constraint Optimization Problems
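A Continuous Distributed Constraint Optimization Problem can be defined as a tuple $\langle A, X, D, F, \alpha \rangle$ [stranders2009decentralised, choudhury2020particle] where,
$A = \{a_i\}_{i=1}^{n}$ is a set of agents.
$X = \{x_i\}_{i=1}^{m}$ is a set of variables. In this paper, we use the terms "agent" and "variable" interchangeably, i.e., $\alpha(x_i) = a_i$.
$D = \{D_i\}_{i=1}^{m}$ is a set of continuous domains. Each variable $x_i$ takes values from a range $D_i = [LB_i, UB_i]$.
$F = \{f_i\}_{i=1}^{l}$ is a set of utility functions. Each $f_i \in F$ is defined over a subset $x^i = \{x_{i_1}, \ldots, x_{i_k}\}$ of the variables $X$, and the utility for that function is defined for every possible value assignment of $x^i$, that is, $f_i : D_{i_1} \times \cdots \times D_{i_k} \rightarrow \mathbb{R}$, where the arity of the function $f_i$ is $k$. In this paper, we only consider binary quadratic functions for the sake of simplicity.
$\alpha : X \rightarrow A$ is a mapping function that associates each variable $x_j \in X$ to one agent $a_i \in A$.
The solution of a C-DCOP is an assignment $X^*$ that maximizes the aggregated utility of the constraint functions, as shown in Equation 1.
$$X^* = \operatorname*{argmax}_{X} \sum_{f_i \in F} f_i(x^i) \tag{1}$$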
Figure 1 shows an example C-DCOP. Figure 1a depicts a constraint graph in which the connections among the variables are given and each variable $x_i$ is controlled by the corresponding agent $a_i$. Figure 1b shows the utility functions defined for each edge in the constraint graph. In this example, every variable $x_i$ takes values from its domain $D_i$.
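To make the objective concrete, the following minimal sketch evaluates the aggregated utility of one complete assignment on a three-variable chain. The helper names (`quadratic`, `aggregated_utility`), the coefficient values, and the quadratic form $ax^2 + bxy + cy^2$ are illustrative assumptions and are not taken from Figure 1.

```python
from typing import Callable, Dict, Tuple

Edge = Tuple[int, int]
Utility = Callable[[float, float], float]


def quadratic(a: float, b: float, c: float) -> Utility:
    """Binary quadratic utility f(x, y) = a*x^2 + b*x*y + c*y^2 (assumed form)."""
    return lambda x, y: a * x * x + b * x * y + c * y * y


def aggregated_utility(functions: Dict[Edge, Utility],
                       assignment: Dict[int, float]) -> float:
    """Sum the utility of every constraint under a complete assignment."""
    return sum(f(assignment[i], assignment[j]) for (i, j), f in functions.items())


# Hypothetical chain x0 - x1 - x2 with made-up coefficients.
functions = {
    (0, 1): quadratic(-1.0, 2.0, -1.0),
    (1, 2): quadratic(-1.0, 0.0, -1.0),
}
assignment = {0: 1.0, 1: 1.0, 2: 0.5}
total = aggregated_utility(functions, assignment)
```

Solving the C-DCOP amounts to searching for the assignment that makes this sum as large as possible while every value stays inside its domain.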
II-B Artificial Bee Colony Algorithm
Artificial Bee Colony (ABC) [karaboga2005idea] is a population-based stochastic algorithm that has been used to find the minimum or maximum of a multi-dimensional numeric function. It is worth noting that ABC is inspired by honey bees’ search for a better food source in nature. Algorithm 1 depicts the steps that ABC follows.
Initially, a population is created with several random solutions (Line 1). Afterwards, for each solution in the population, ABC creates a new candidate solution (Line 3), and if the candidate is better, the population is updated with it (Line 4). It also creates new candidates from particular solutions of the population, which are selected based on their quality (Line 5). The population is then updated with these candidates: each one replaces the solution it was produced from if it is better than that solution (Line 6). Solutions that have not been updated for a certain period are replaced with new random solutions (Line 7). These operations are repeated until the termination criteria are met (Line 8). Recently, a variant of ABC has been proposed that improves the overall solution quality by introducing an elite set and dimension learning (see [xiao2019improved] for more detail).
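The loop described above can be sketched as a small centralized maximizer. This is a sketch of the basic ABC scheme only, not of the elite-set variant in [xiao2019improved]; the function name `abc_maximize`, the parameter names, the neighbour-move formula, and the shift-by-minimum selection weights are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def abc_maximize(f: Callable[[List[float]], float],
                 bounds: Sequence[Tuple[float, float]],
                 pop_size: int = 20, limit: int = 20,
                 iters: int = 200, seed: int = 0) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    dim = len(bounds)

    def random_solution() -> List[float]:
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    pop = [random_solution() for _ in range(pop_size)]
    fit = [f(s) for s in pop]
    stale = [0] * pop_size  # iterations without improvement, per solution
    i_best = max(range(pop_size), key=fit.__getitem__)
    best, best_val = pop[i_best][:], fit[i_best]

    def try_improve(i: int) -> None:
        # Perturb one dimension of solution i relative to a random partner k.
        k = rng.choice([j for j in range(pop_size) if j != i])
        d = rng.randrange(dim)
        cand = pop[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (pop[i][d] - pop[k][d])
        lo, hi = bounds[d]
        cand[d] = min(max(cand[d], lo), hi)
        val = f(cand)
        if val > fit[i]:  # greedy replacement
            pop[i], fit[i], stale[i] = cand, val, 0
        else:
            stale[i] += 1

    for _ in range(iters):
        for i in range(pop_size):  # update every solution once
            try_improve(i)
        low = min(fit)  # quality-proportional re-selection
        weights = [v - low + 1e-12 for v in fit]
        for i in rng.choices(range(pop_size), weights=weights, k=pop_size):
            try_improve(i)
        for i in range(pop_size):  # replace solutions that stopped improving
            if stale[i] > limit:
                pop[i] = random_solution()
                fit[i], stale[i] = f(pop[i]), 0
        i_best = max(range(pop_size), key=fit.__getitem__)
        if fit[i_best] > best_val:
            best, best_val = pop[i_best][:], fit[i_best]
    return best, best_val


def concave(x: List[float]) -> float:
    """Toy objective with its maximum value 0 at (1, -2)."""
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)


best, best_val = abc_maximize(concave, [(-5.0, 5.0), (-5.0, 5.0)])
```

Note that the whole population and the objective live in one process here, which is exactly what is not available in a C-DCOP.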
ABC optimizes a single function in a centralized system. C-DCOP, on the other hand, is a framework designed for multi-agent environments. In this setting, the population generated by the algorithm needs to be stored in a distributed manner among the agents. Hence, tailoring ABC for C-DCOPs is not a trivial task. Moreover, it is also challenging to incorporate the anytime property, as this requires identifying the global best solution within the whole population. In addition, whenever the global best gets updated, a distributed method is needed to propagate that global best to all the agents. In the following section, we describe our algorithm ABCD-E, which addresses all of these challenges.
III The ABCD-E Algorithm
ABCD-E is a population-based anytime C-DCOP algorithm that is inspired by the popular Artificial Bee Colony algorithm (we use the recent variant of the ABC algorithm [xiao2019improved]). ABCD-E works with a population, that is, a set of solutions to a given problem. In a C-DCOP framework, the population has to be stored in a distributed manner; hence, it is distributed amongst the agents. Each agent maintains a set of objects whose size equals the population size (we discuss the properties of an object shortly). For further clarification, Table I shows how the population is distributed. In this table, each row represents a single solution, whereas each column represents what a single agent stores. Each table entry denotes the object that is stored by one agent and is part of one solution. Each solution thus consists of a set of objects that hold various values. A single object holds the following values:
A candidate value for the decision variable held by the agent.
The aggregated utility that the agent calculates only with its neighbors.
The utility calculated for the solution that the object is part of. From now on, we use the terms fitness and utility interchangeably.
ABCD-E keeps an elite set, which stores a copy of the best solutions from the population. Similar to the population, the elite set is maintained in a distributed way. In addition, the root agent maintains, for each solution in the population, a set of Boolean values with one value per agent. If the value for an agent is TRUE, that agent has explored its search space for that solution. The root agent also keeps a few further values for each solution, which we discuss later in this section. ABCD-E takes two parameters as inputs: the population size, i.e., the number of solutions in the population that has to be created, and the elite size, i.e., the number of solutions in the elite set.
III-A The Algorithm Description
ABCD-E starts by constructing a BFS-pseudo-tree (see [chen2017improved] for more details) (Line 44, in an initialization procedure called from Line 1). The BFS-pseudo-tree is used as a communication structure to transmit messages among the agents. Each agent has a unique priority associated with it. To be precise, an agent with a lower depth has a higher priority than an agent with a higher depth. When two agents have the same depth, their priorities are set randomly. Each agent knows its neighboring agents as well as its parent and its children in the pseudo-tree. The root agent has no parent, and the leaf agents have no children.
After constructing the BFS-pseudo-tree, Line 45 executes the INITIALIZATION procedure, which initializes the population using Equation 2. In Equation 2, each agent draws a random floating-point number from a given range for each solution of the population. The procedure then sets every explored value to FALSE (Line 48). It also initializes the utility of the global best solution (Gbest) to $-\infty$, as we are searching for a solution with maximum utility (Line 47).
The EVALUATE procedure calculates the local utility of each object in a decentralized manner and then uses these local utilities to determine the aggregated utility of every solution in a given population. Each agent first waits for the objects of its neighboring agents (Line 49). It then calculates its utility with each of its neighbors: for each function, it passes two values, its own candidate value and that of the neighbor. It aggregates all of these utilities and stores the result as its local utility (Lines 50-51). Then, it waits for the values from its child agents (Line 52). After receiving them, it sums those values and adds the sum to its own (Lines 53-54). Any agent other than the root agent sends its values to its parent (Lines 58-59). This process continues until the root agent receives all the values from its child agents. At this point, the utility of each constraint has been added twice to the aggregated utility values. This is because the local utility values of both agents in the scope of a constraint are aggregated together and we only consider binary functions in this paper. Hence, the root agent divides each aggregated utility by 2 (Lines 55-57).
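The bottom-up aggregation and the final division by 2 can be illustrated with a centralized simulation of the message flow. The function `evaluate`, the `coeffs` table, the quadratic form of the utilities, and the triangle-shaped example are all illustrative assumptions; in the actual algorithm each step runs on a separate agent and the partial sums travel as messages.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical triangle x0 - x1 - x2 - x0 with made-up quadratic coefficients.
coeffs = {(0, 1): (-1.0, 2.0, -1.0), (1, 2): (-1.0, 0.0, -1.0), (0, 2): (1.0, 1.0, 0.0)}


def utility(i: int, j: int, xi: float, xj: float) -> float:
    """Utility of the constraint between agents i and j; symmetric in the agents."""
    if (i, j) in coeffs:
        a, b, c = coeffs[(i, j)]
        return a * xi * xi + b * xi * xj + c * xj * xj
    a, b, c = coeffs[(j, i)]
    return a * xj * xj + b * xj * xi + c * xi * xi


def evaluate(values: Dict[int, float],
             neighbours: Dict[int, List[int]],
             util: Callable[[int, int, float, float], float],
             parent: Dict[int, Optional[int]]) -> float:
    """Simulate EVALUATE for one solution and return its aggregated utility."""
    children: Dict[int, List[int]] = {i: [] for i in values}
    root = None
    for agent, par in parent.items():
        if par is None:
            root = agent
        else:
            children[par].append(agent)

    def local_utility(i: int) -> float:
        # Each agent sums the utilities of the constraints it shares with neighbours.
        return sum(util(i, j, values[i], values[j]) for j in neighbours[i])

    def subtree_sum(i: int) -> float:
        # An agent adds the sums reported by its children to its own local utility.
        return local_utility(i) + sum(subtree_sum(c) for c in children[i])

    # Every binary constraint was counted by both of its agents, so halve the sum.
    return subtree_sum(root) / 2.0


values = {0: 1.0, 1: 1.0, 2: 0.5}
neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
parent = {0: None, 1: 0, 2: 0}  # BFS tree rooted at agent 0
result = evaluate(values, neighbours, utility, parent)
```

The edge between agents 1 and 2 is not a tree edge, yet it is still counted, because local utilities are computed over all neighbors while only the partial sums follow the tree.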
After the INITIALIZATION phase, ABCD-E calls the BUILD procedure for the population (Line 3). It first calculates the aggregated utility of each solution in the population (Line 60). Afterwards, the root agent selects the best solutions among them for the elite set, which is stored in a distributed way (Lines 63-64). It also updates Gbest if any of those solutions has a greater utility than the current Gbest (Line 62). Each agent receives the propagated information and stores the values that concern it (Line 65). Afterwards, each agent sets its decision variable according to Gbest to achieve the anytime property.
ABCD-E now moves on to updating each solution in the population. It first creates a copy of the main population (Line 4). The root agent then chooses a random agent, sends it a request to update its object, and sets the corresponding explored attribute to TRUE (Lines 7-9). An agent that receives the request updates its object as follows. It selects a random solution and another random agent other than itself (Lines 11-12). It then waits until the required values are received from that agent (Line 13) and calculates the new candidate value from Equation 3, which uses two random numbers drawn from their respective ranges (Line 14). Note that an update may take a value outside the range of its domain; Equation 4 is used to bring the value back inside that range. After completing the updates, each agent executes the EVALUATE procedure for the copied population (Line 15). The root agent then checks for better solutions: if an updated solution is better than the original one, the original is replaced by all agents (Line 17). The root agent also tries to update Gbest if there is any solution better than the current Gbest (Line 18). Whenever a solution gets updated, ABCD-E resets its explored values to FALSE.
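Two small helpers illustrate the domain handling mentioned above. Equations 2 and 4 are not reproduced in this text, so both functions are assumptions: `random_value` uses the common uniform initialization, and `clamp` uses simple clipping to the domain bounds, which is only one of several possible repair rules. The bounds in the example are arbitrary.

```python
import random


def random_value(lower: float, upper: float, rng: random.Random) -> float:
    """Draw a starting value uniformly from [lower, upper) (assumed form of the initialization)."""
    r = rng.random()  # random float in [0, 1)
    return lower + r * (upper - lower)


def clamp(value: float, lower: float, upper: float) -> float:
    """Bring an updated value back inside its domain by clipping (assumed repair rule)."""
    return max(lower, min(upper, value))


rng = random.Random(0)
start = random_value(-10.0, 10.0, rng)   # arbitrary example domain
repaired_high = clamp(12.5, -10.0, 10.0)
repaired_low = clamp(-31.0, -10.0, 10.0)
unchanged = clamp(3.0, -10.0, 10.0)
```

Whatever repair rule is used, the important property is that every candidate value an agent stores remains a legal value of its variable.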
Next, the probability of each solution being chosen for a re-search is calculated. First, Equation 5 converts every fitness value to a positive value, because the population can also contain negative fitness values (Line 19). Equation 6 then determines the probability of each solution being chosen.
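A minimal sketch of this two-step computation follows. Equations 5 and 6 are not reproduced in this text, so the shift-by-minimum transformation and the name `selection_probabilities` are assumptions; the only properties taken from the description are that fitness values are first made positive and then normalized into selection probabilities.

```python
from typing import List


def selection_probabilities(fitness: List[float], eps: float = 1e-9) -> List[float]:
    """Map possibly negative fitness values to selection probabilities."""
    low = min(fitness)
    # Step 1 (assumed form): shift so that every value is strictly positive.
    shifted = [value - low + eps for value in fitness]
    # Step 2: normalize so that the values sum to 1.
    total = sum(shifted)
    return [value / total for value in shifted]


probs = selection_probabilities([-3.0, 0.0, 5.0])
```

With this choice, better solutions receive proportionally more re-search attempts, while the worst solution keeps a small but non-zero chance of being selected.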
Afterward, ABCD-E runs an update process a given number of times. Each time, the root agent selects a solution from the population according to the probabilities it previously calculated (Line 22). It then creates new solutions by changing the selected solution: it selects a random agent (Line 25) and sets the corresponding explored value to TRUE (Line 26). The root agent sends that agent a request to update its values, passing along the index of the selected solution in the population and the index of the new solution (Line 27). Every agent makes copies of its object for the update process (Line 28). An agent that receives the update request starts by selecting another random agent other than itself (Line 30) and a random solution (Line 31). It then receives the required values from that agent (Line 32). Equation 7 determines the new candidate value using two random numbers drawn from their respective ranges (Line 33). Equation 4 brings values back inside the domain when they fall outside its range. Following the update process, each agent runs the EVALUATE procedure for the new solutions (Line 34). The root agent updates the population and Gbest with the new solutions whenever they offer an improvement (Lines 36-37). Whenever an update occurs, it resets the corresponding explored values.
Finally, the root agent inspects each solution to check whether every one of its explored values is TRUE (Line 40). When they are all TRUE, every agent has explored that solution. In that case, the root agent resets the explored values to FALSE and sends a request to each agent to replace the values of that solution with random values using Equation 2 (Lines 41-42). Once the agents receive the request from the root agent, the update operation is executed (Line 43).
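The bookkeeping behind this exploration mechanism can be sketched as follows. The sketch reads the mechanism as: a solution is replaced only once every agent has tried to improve it without success since its last update. The function `scout_step`, the list-of-lists layout for the flags, and the uniform re-initialization are illustrative assumptions, and everything is held in one process rather than distributed.

```python
import random
from typing import List, Tuple


def scout_step(explored: List[List[bool]],
               population: List[List[float]],
               domains: List[Tuple[float, float]],
               rng: random.Random) -> List[int]:
    """Replace fully explored solutions with random ones; return their indices."""
    replaced = []
    for k, flags in enumerate(explored):
        if all(flags):  # every agent has explored solution k
            population[k] = [rng.uniform(lo, hi) for lo, hi in domains]
            explored[k] = [False] * len(flags)  # the new solution starts unexplored
            replaced.append(k)
    return replaced


# Two agents, two solutions: only solution 0 has been explored by both agents.
explored = [[True, True], [True, False]]
population = [[0.0, 0.0], [1.0, 1.0]]
domains = [(-10.0, 10.0), (-10.0, 10.0)]  # arbitrary example domains
replaced = scout_step(explored, population, domains, random.Random(0))
```

Because the flags are reset whenever a solution improves, a solution that keeps improving is never discarded by this step.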
IV Theoretical Analysis
In this section, we prove that the ABCD-E algorithm is anytime, that is, the quality of the solutions found by ABCD-E never decreases as the algorithm runs. We also evaluate both the time complexity and the memory complexity of ABCD-E.
The ABCD-E algorithm is anytime.
At the beginning of iteration $i+1$, the complete assignment to X is updated according to Gbest (Algorithm 2: Line 66). Now, Gbest is updated according to the best solution in the population found up to iteration $i$ (Algorithm 2: Lines 18, 37, and 62). Hence, the utility of Gbest either stays the same or increases. Since X is updated according to Gbest, the global utility of the given C-DCOP instance at iteration $i+1$ is greater than or equal to the utility at iteration $i$. As a consequence, the quality of the solution never decreases as the number of iterations increases. Hence, the ABCD-E algorithm is anytime. ∎
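The argument can be checked on a toy trace. The sketch below keeps a running global best over per-iteration fitness values; the function name `track_gbest` and the sample numbers are made up for illustration, and the initial value of negative infinity is an assumption that matches a maximization setting.

```python
import math
from typing import List


def track_gbest(fitness_per_iteration: List[List[float]]) -> List[float]:
    """Return the utility of the global best after each iteration."""
    gbest = -math.inf  # assumed initial value for a maximization problem
    trace = []
    for fitness in fitness_per_iteration:
        # Gbest changes only when some solution is strictly better.
        gbest = max(gbest, max(fitness))
        trace.append(gbest)
    return trace


trace = track_gbest([[1.0, -2.0], [0.5, 0.0], [3.0, 2.0]])
```

Even though the best solution of the second iteration is worse than that of the first, the reported utility does not drop, which is the behavior the proof relies on.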
In terms of complexity, the population that are created in ABCD-E are stored in a distributed manner. So each agent holds elements for , elements for and elements for . Hence, in total, each agent has values stored in them. But holds two extra variables: and . Because of that, has extra values. Each agent first updates each solution once; and it then updates a solution times. For this reason, each agent would do operations. As the value of is very small, this does not create an issue for time consumption.
V Experimental Results
This section provides empirical results to demonstrate the superiority of ABCD-E over the current state-of-the-art C-DCOP algorithms, namely AC-DPOP, C-DSA, and PFD. The results reported under the name ABCD-C are the results of our algorithm without our proposed exploration mechanism. We include them to observe the impact of ABCD-E's exploration mechanism.
Specifically, we first show the impact of population size and elite size on ABCD-E's performance and select the best pair of parameter values. Then we benchmark ABCD-E and the competing algorithms on Random DCOP Problems [choudhury2020particle] with binary quadratic constraints, i.e., constraints in the form of $ax^2 + bxy + cy^2$. Although ABCD-E can operate with functions of any kind, we consider this form of constraint to be consistent with prior work. In each problem, we set each variable's domain to . Furthermore, we consider three different types of constraint graph topology: scale-free graphs, small-world graphs, and random graphs. Finally, we run these benchmarks by varying both the number of agents and the constraint density.
The performance of ABCD-E depends on its two parameters, i.e., the population size and the elite size. To demonstrate the effect of population size on the solution quality, we benchmark ABCD-E on the Erdos-Renyi topology [erdds1959random] with constraint density 0.3. Figure 2 presents the population size vs. solution quality results of ABCD-E.
It can be observed from the results that, as we increase , solution quality quickly improves up to . Increasing does not, however, result in a significant improvement in solution quality after the population size surpasses . We run a similar experiment for elite size and present the results in Figure 3. Similar to , increasing after a certain point does not significantly improve the solution quality. Further, the computational complexity and the number of messages of ABCD-E depend on both and (). Considering this, we select and for ABCD-E. It is worth mentioning that selecting parameter values for ABCD-E is considerably easier than its main competitor algorithm PFD. This is because both and are only constrained by available resources, and increasing them results in a consistent improvement of solution quality.
We now compare ABCD-C and ABCD-E with the state-of-the-art C-DCOP algorithms, namely, PFD, AC-DPOP, and C-DSA. We select the parameters for these algorithms as suggested in their corresponding papers. In all settings, we run all algorithms on 50 independently generated problems and 50 times on each problem. We run all the algorithms for an equal amount of time. It is worth noting that all differences shown in Figures 2, 3, 4, and 5 are statistically significant for .
We first consider the random network benchmark. To construct random networks, we use the Erdos-Renyi topology [erdds1959random]. We vary the constraint density from to . In all cases, both ABCD-E and ABCD-C outperform the competing algorithms. Due to space constraints, we only present the results for density (sparse, Figure 5) and density (dense, Figure 4). In sparse settings, ABCD-C produces better solutions than its closest competitor PFD. On the other hand, ABCD-E improves the solutions generated by ABCD-C by . In dense settings, we see a similar trend.
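For readers who want to reproduce a comparable setup, a random constraint graph of this kind can be generated as sketched below. The sketch assumes that the constraint density is used as the edge probability of the Erdos-Renyi model; the function name `erdos_renyi_edges` and the example sizes are illustrative, and the sketch does not check that the resulting graph is connected.

```python
import itertools
import random
from typing import List, Tuple


def erdos_renyi_edges(n: int, density: float, rng: random.Random) -> List[Tuple[int, int]]:
    """Include each of the n*(n-1)/2 possible edges independently with probability `density`."""
    return [(i, j)
            for i, j in itertools.combinations(range(n), 2)
            if rng.random() < density]


rng = random.Random(0)
sparse_edges = erdos_renyi_edges(50, 0.3, rng)  # example: 50 agents, density 0.3
empty_edges = erdos_renyi_edges(10, 0.0, rng)
full_edges = erdos_renyi_edges(10, 1.0, rng)
```

Each generated edge would then be assigned one binary utility function to obtain a benchmark instance.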
We similarly run the experiment using scale-free network topology. To create scale-free sparse networks, we used Barabasi-Albert [albert2002statistical] topology where the number of edges to connect from a new node to an existing node is . We present the result for this setting in Figure 6. We see a similar performance improvement by ABCD-C and ABCD-E as in the random network experiment above. It is worth mentioning a similar trend in performance gain continues as we increase the graph density.
Finally, we run our experiments on small-world networks. To construct small-world networks, we use the Watts-Strogatz topology [watts1998collective] model where the number of nodes to join with a new node is , and the likelihood of rewiring is . Figure 7 depicts that ABCD-C offers better performance than PFD. Moreover, ABCD-E enhances results generated by ABCD-C by .
The experiments demonstrate the superiority of ABCD-E over the competing algorithms. We see a consistent performance gain by ABCD-E over the competing algorithms under different constraint graph topologies, sizes, and densities. We see a similar trend of performance gain when comparing ABCD-E with ABCD-C. ABCD-E outperforms ABCD-C because each solution in the population gets explored by every agent in ABCD-E. As different agents can modify different parts of a solution due to the factored nature of C-DCOPs, this significantly improves exploration. In ABCD-C, on the other hand, each solution gets explored a limited number of times, and it is not ensured that all the agents are given the opportunity to improve each solution. As a result, many potentially good solutions are prematurely discarded.
VI Conclusions and Future Work
We develop a C-DCOP solver, namely ABCD-E, by tailoring a well-known population-based algorithm (i.e., Artificial Bee Colony). More importantly, we introduce a new exploration mechanism with the intent of further improving the quality of the solution. We theoretically prove that ABCD-E is an anytime algorithm. Finally, we present empirical results that show that ABCD-E outperforms the state-of-the-art non-exact C-DCOP algorithms by a notable margin. In the future, we would like to investigate whether ABCD-E can be applied to solve multi-objective C-DCOPs.