I Introduction
Distributed Constraint Optimization Problems (DCOPs) [petcu2005dpop] are a powerful framework for modeling cooperative multiagent systems in which multiple agents (or robots) communicate with each other directly or indirectly. The agents act autonomously in a shared environment in order to optimize a global objective that is an aggregation of their constraint cost functions. Each of the functions is associated with a set of variables controlled by the corresponding agents. Agents in a DCOP must coordinate the value assignments to their variables in order to maximize their aggregated utility or minimize the overall cost. DCOPs have been successfully applied to many multiagent coordination problems, including multirobot coordination [zivan2015distributed], sensor networks [stranders2009decentralised], smart home automation [fioretto2017multiagent, rust2016using], smart grid optimization [miller2012optimal, kumar2009distributed], and cloud computing applications [hoang2019new].
In general, DCOPs assume that these variables are discrete and that the constraint utilities of the system are represented in tabular form. However, several problems, such as target tracking sensor orientation [fitzpatrick2003distributed] and sleep scheduling of wireless sensors [hsin2004network], are better modeled with variables that have continuous domains. Even though DCOPs can deal with continuous-valued variables through discretization, for such a problem to remain tractable, the discretization process must be coarse, yet it must be sufficiently fine to produce high-quality solutions [stranders2009decentralised]. To address this issue, [stranders2009decentralised] propose a framework called Continuous DCOPs (CDCOPs), which extends general DCOPs to operate with variables that take values from a continuous range. Additionally, in CDCOPs, constraint utilities are specified by functions instead of the tabular form of traditional DCOPs.
Over the last few years, a number of algorithms have been introduced to solve CDCOPs [stranders2009decentralised, voice2010hybrid, choudhury2020particle, hoang2020new, sarker2020c]. Initially, Continuous Max-Sum (CMS) [stranders2009decentralised] was proposed, an extension of the discrete Max-Sum algorithm [farinelli2008decentralised] that approximates the utility functions of the system with piecewise linear functions. Afterward, Hybrid CMS (HCMS) [voice2010hybrid] utilized discrete Max-Sum as the underlying algorithmic framework with the addition of a continuous non-linear optimization method. A major issue with CMS and HCMS is that they are not capable of providing quality guarantees on their solutions. [sarker2020c] proposed a non-iterative solution for CDCOPs; however, it cannot provide anytime solutions. Recently, a population-based anytime algorithm based on Particle Swarm Optimization (PSO), namely PFD [choudhury2020particle], was proposed, and it has shown better results than the other state-of-the-art algorithms. However, scalability remains a big issue for PFD as the number of agents in the system increases. More recently, a variety of exact and non-exact algorithms were introduced by [hoang2020new]. Specifically, the inference-based DPOP [petcu2005dpop] has been extended into three algorithms: Exact Continuous DPOP (ECDPOP), Approximate Continuous DPOP (ACDPOP), and Clustered ACDPOP (CACDPOP). ECDPOP provides an exact solution when the system's agents are arranged in a tree-structured network; however, this is not a feasible assumption in most problems. ACDPOP and CACDPOP, on the other hand, give approximate solutions for general constraint graphs. In addition, the authors develop Continuous DSA (CDSA) by extending the search-based Distributed Stochastic Algorithm (DSA) [zhang2005distributed]. As these three approximate algorithms use continuous optimization techniques such as gradient-based optimization, they require derivative calculations and are therefore not suitable for non-differentiable optimization problems such as rectilinear data fitting [bertsekas1973descent, elhedhli1999nondifferentiable]. Moreover, recent experiments comparing against other CDCOP algorithms also show that these algorithms produce non-exact solutions of poor quality [choudhury2020particle].
Against this background, we propose a new non-exact anytime algorithm, a Distributed Artificial Bee Colony algorithm for CDCOPs, that is inspired by a recent variant of the well-known Artificial Bee Colony (ABC) algorithm [karaboga2005idea, xiao2019improved]. Similar to the ABC algorithm, our approach is a population-based stochastic algorithm that stores multiple candidate solutions amongst the agents. It improves the global solution by iteratively adjusting the candidate solutions and discards solutions that do not improve after a certain period. Additionally, we design a novel exploration mechanism for our algorithm, which we call ABCDE, with the intent to further improve the solution quality. We also theoretically prove that ABCDE is an anytime algorithm. Finally, our empirical results show that ABCDE outperforms the state-of-the-art algorithms by a notable margin.
II Background
In this section, we first describe the CDCOP framework. We then briefly discuss the Artificial Bee Colony (ABC) algorithm on which our contribution is based.
II-A Continuous Distributed Constraint Optimization Problems
A Continuous Distributed Constraint Optimization Problem can be defined as a tuple $\langle A, X, D, F, \alpha \rangle$ [stranders2009decentralised, choudhury2020particle] where:

- $A = \{a_1, \ldots, a_n\}$ is a set of agents.
- $X = \{x_1, \ldots, x_m\}$ is a set of variables. In this paper, we use the terms "agent" and "variable" interchangeably, i.e., $m = n$.
- $D = \{D_1, \ldots, D_m\}$ is a set of continuous domains. Each variable $x_i$ takes values from a range $D_i = [LB_i, UB_i]$.
- $F = \{f_1, \ldots, f_k\}$ is a set of utility functions, each $f_i$ defined over a subset $x^i$ of the variables, and the utility for that function is defined for every possible value assignment of $x^i$, that is, $f_i : D_{i_1} \times \cdots \times D_{i_h} \to \mathbb{R}$, where $h$ is the arity of the function. In this paper, we only consider binary quadratic functions for the sake of simplicity.
- $\alpha : X \to A$ is a mapping function that associates each variable $x_j$ to one agent $a_i$.
The solution of a CDCOP is an assignment $X^*$ that maximizes the aggregated constraint utility functions, as shown in Equation 1.

$X^* = \operatorname{argmax}_{X} \sum_{f_i \in F} f_i(x^i)$ (1)
Figure 1 shows an example CDCOP: Figure 1a depicts a constraint graph that gives the connections among the variables, where each variable $x_i$ is controlled by the corresponding agent $a_i$, and Figure 1b shows the utility function defined for each edge in the constraint graph. In this example, every variable takes values from its domain.
II-B Artificial Bee Colony Algorithm
Artificial Bee Colony (ABC) [karaboga2005idea] is a populationbased stochastic algorithm that has been used to find the minimum or maximum of a multidimensional numeric function. It is worth noting that ABC is inspired by honey bees’ search for a better food source in nature. Algorithm 1 depicts the steps that ABC follows.
Initially, a population is created with several random solutions (Line 1). Afterwards, for each solution in the population, the algorithm creates a new candidate solution (Line 3), and if the candidate is better, the original solution is replaced with it (Line 4). It also creates new solutions from particular solutions of the population, which are selected based on their quality (Line 5); the population is then updated with these new solutions, each replacing the solution it was produced from if it is better (Line 6). Solutions that have not been updated after a certain period are replaced with new random solutions (Line 7). These operations run until the termination criteria are met (Line 8). Recently, a variant of ABC has been proposed that improves the overall solution quality by introducing an elite set and dimension learning (see [xiao2019improved] for more detail).
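As a point of reference, a minimal centralized ABC loop might look as follows. This is a generic textbook-style sketch, not the improved variant of [xiao2019improved] and not the paper's code; parameter values, names, and the neighbour move are illustrative assumptions.

```python
import random

def abc_minimize(f, lb, ub, dim, pop_size=20, limit=10, iters=200, seed=0):
    """Minimal centralized ABC sketch. f is minimized over [lb, ub]^dim;
    'limit' is how long a solution may stagnate before a scout replaces it."""
    rng = random.Random(seed)
    rand_sol = lambda: [lb + rng.random() * (ub - lb) for _ in range(dim)]
    pop = [rand_sol() for _ in range(pop_size)]
    cost = [f(s) for s in pop]
    trials = [0] * pop_size

    def try_improve(i):
        # Perturb one random dimension toward/away from a random partner k,
        # clamp into the domain, and keep the neighbour only if it is better.
        k = rng.choice([j for j in range(pop_size) if j != i])
        d = rng.randrange(dim)
        v = pop[i][:]
        v[d] = min(ub, max(lb, v[d] + rng.uniform(-1, 1) * (v[d] - pop[k][d])))
        c = f(v)
        if c < cost[i]:
            pop[i], cost[i], trials[i] = v, c, 0
        else:
            trials[i] += 1

    for _ in range(iters):
        for i in range(pop_size):               # employed-bee phase (Lines 3-4)
            try_improve(i)
        m = max(cost)                           # onlooker phase (Lines 5-6):
        weights = [m - c + 1e-9 for c in cost]  # lower cost => higher weight
        for _ in range(pop_size):
            try_improve(rng.choices(range(pop_size), weights=weights)[0])
        for i in range(pop_size):               # scout phase (Line 7)
            if trials[i] > limit:
                s = rand_sol()
                pop[i], cost[i], trials[i] = s, f(s), 0

    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]
```

For example, minimizing the sphere function sum(x_i^2) over [-5, 5]^2 drives the best cost close to zero within a few hundred iterations.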
II-C Challenges
ABC optimizes a single centralized function in a centralized system. CDCOPs, on the other hand, constitute a framework designed for multiagent environments. In this setting, the population generated by the algorithm needs to be stored in a distributed manner among the agents. Hence, tailoring ABC for CDCOPs is not a trivial task. Moreover, it is also challenging to incorporate the anytime property, as it is necessary to identify the global best solution within the whole population, and whenever the global best gets updated, a distributed method needs to be devised to propagate it to all the agents. In the following section, we describe our algorithm, ABCDE, which addresses all of these challenges.
III The ABCDE Algorithm
ABCDE is a population-based anytime CDCOP algorithm that is inspired by the popular Artificial Bee Colony algorithm (we use the recent variant of the ABC algorithm [xiao2019improved]). ABCDE works with a population, that is, a set of candidate solutions of a given problem. In the CDCOP framework, we have to store the population in a distributed manner; hence, the population is distributed amongst the agents. Each agent maintains a set of objects whose size equals the population size (we will discuss the properties of an object shortly). For further clarification, we show the population distribution in Table I. In this table, each row represents a single solution, whereas each column represents what a single agent stores. Here, $O_{ij}$ denotes the object that is stored by agent $a_j$ and is part of solution $i$. As we have stated earlier, each solution consists of a set of objects that hold various values. A single object holds the following values:
- the candidate value for $x_i$, the decision variable held by agent $a_i$;
- the local utility, i.e., the aggregated utility calculated by agent $a_i$ only with its neighbors;
- the fitness, i.e., the utility calculated by an agent for the solution. From now on, we use the terms fitness and utility interchangeably.
Table I: Distribution of the population amongst the agents.

             Agent $a_1$   Agent $a_2$   …   Agent $a_n$
Solution 1   $O_{1,1}$     $O_{1,2}$     …   $O_{1,n}$
Solution 2   $O_{2,1}$     $O_{2,2}$     …   $O_{2,n}$
…            …             …             …   …
Solution K   $O_{K,1}$     $O_{K,2}$     …   $O_{K,n}$
ABCDE keeps an elite set that stores a copy of the best solutions from the population. Similar to the population, the elite set is also maintained in a distributed way. Each agent also maintains, for each solution in the population, a set of boolean flags, one per agent. If the flag for agent $a_j$ is TRUE, it means that for that solution, agent $a_j$ has explored its search space. Each solution additionally carries two values that we will discuss later in this section. ABCDE takes two parameters as input: the number of solutions in the population that have to be created, and the number of solutions in the elite set.
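The per-agent bookkeeping described above can be sketched as plain data structures. The class and field names below are our own illustrative choices, not the paper's notation; the initialization mirrors the uniform random draw of Equation 2.

```python
import random
from dataclasses import dataclass, field

@dataclass
class SolutionObject:
    """One agent's share of one population solution."""
    value: float = 0.0                # candidate value of this agent's variable
    local_utility: float = 0.0        # utility computed with neighbours only
    fitness: float = float("-inf")    # aggregated utility of the whole solution

@dataclass
class AgentState:
    agent_id: int
    lb: float
    ub: float
    pop_size: int
    elite_size: int
    population: list = field(default_factory=list)  # one object per solution
    elite: list = field(default_factory=list)       # copies of the best ones
    explored: list = field(default_factory=list)    # explored[j][a]: flag per agent

    def init_population(self, rng, n_agents):
        # Equation (2)-style init: uniform random values in [lb, ub]. Fitness
        # starts at -inf since we maximize, and no exploration has happened.
        self.population = [
            SolutionObject(value=self.lb + rng.random() * (self.ub - self.lb))
            for _ in range(self.pop_size)
        ]
        self.explored = [{a: False for a in range(n_agents)}
                         for _ in range(self.pop_size)]
```

A solver would create one `AgentState` per agent, so no single agent ever holds a full solution, only its own column of Table I.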
III-A The Algorithm Description
ABCDE starts by constructing a BFS-pseudo-tree (see [chen2017improved] for more details) (Line 44, an initialization procedure called from Line 1). The BFS-pseudo-tree is used as a communication structure to transmit messages among the agents. Each agent has a unique priority associated with it: an agent with a lower depth has a higher priority than an agent with a higher depth, and when two agents have the same depth, their priorities are set randomly. We use $N_i$ to refer to agent $a_i$'s neighboring agents, and corresponding notations for the parent and the children of agent $a_i$. The root agent has no parent, and the leaf agents have no children.
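A BFS-pseudo-tree of this kind can be computed with an ordinary breadth-first traversal. The sketch below breaks depth ties deterministically rather than randomly (for reproducibility) and returns the parent, children, and depth maps an agent would use for message routing; it is an illustration, not the paper's procedure.

```python
from collections import deque

def build_bfs_pseudotree(edges, root):
    """Build a BFS tree over the constraint graph. Lower depth means higher
    priority; agents at equal depth would be ordered randomly in the paper."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    parent, depth = {root: None}, {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj.get(u, ())):
            if v not in depth:                 # first visit fixes the parent
                parent[v], depth[v] = u, depth[u] + 1
                queue.append(v)
    children = {u: sorted(v for v in parent if parent[v] == u) for u in depth}
    return parent, children, depth
```

Non-tree edges of the constraint graph remain as ordinary neighbour links; only the tree edges carry the bottom-up utility aggregation described next.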
After constructing the BFS-pseudo-tree, Line 45 executes the INITIALIZATION procedure, which initializes the population using Equation 2. In Equation 2, $r$ is a random floating-point number from the range $[0, 1]$ chosen by agent $a_i$ for the $j$th solution of the population. The agent then sets each exploration flag to FALSE (Line 48). It also initializes the fitness values to $-\infty$, as we are searching for a solution with maximum utility (Line 47).

$x_{ij} = LB_i + r \times (UB_i - LB_i)$ (2)
Now, the procedure EVALUATE is used to calculate the local utility for each object in a decentralized manner and, using the local utilities, to determine the aggregated utility for each solution of the population. An agent first waits for the objects of its neighboring agents (Line 49). It then calculates its utility for each of its neighbors: for each utility function, the two candidate values are passed in to compute the utility. It aggregates all of these values and stores the result as its local utility (Lines 50-51). Then, it waits for the aggregated values from its child agents (Line 52). After receiving them, it sums those values and adds the sum to its own (Lines 53-54). Any agent other than the root sends its aggregated value to its parent (Lines 58-59). This process continues until the root receives all the values from its child agents. At this point, the utility of each constraint has been added twice to the aggregated utility, because the local utilities of both agents in the scope of a constraint are aggregated together and we are only considering binary quadratic functions in this paper. Hence, the root divides each aggregated utility by 2 (Lines 55-57).
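The double-counting correction can be seen in a tiny centralized simulation of the bottom-up aggregation; the recursion stands in for the actual messages sent up the tree, and all names are our own.

```python
def evaluate_global_utility(values, neighbours, children, utilities, root):
    """Each agent sums the utilities of its own constraints, so every binary
    constraint is counted once by each of its two endpoints; the subtree sums
    travel up the BFS tree and the root halves the grand total."""
    def local(a):
        return sum(utilities[frozenset((a, b))](values[a], values[b])
                   for b in neighbours[a])

    def subtree(a):  # stands in for the UTIL-style messages sent to the parent
        return local(a) + sum(subtree(c) for c in children.get(a, []))

    return subtree(root) / 2.0

# Two agents linked by one constraint f(x, y) = -(x - y)^2:
total = evaluate_global_utility(
    values={1: 0.0, 2: 2.0},
    neighbours={1: [2], 2: [1]},
    children={1: [2], 2: []},
    utilities={frozenset((1, 2)): lambda x, y: -(x - y) ** 2},
    root=1,
)
```

Here both agents report -4 for the single shared constraint, so the root's halving recovers the true global utility of -4.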
After the INITIALIZATION phase, ABCDE calls the BUILD procedure for the population (Line 3). It first calculates the aggregated utility of the population (Line 60). Afterwards, the best solutions among them are selected and stored in a distributed way as the elite set (Lines 63-64). Gbest is also updated if any of those solutions has a greater utility than the utility of Gbest (Line 62). Each agent receives the propagated values and stores its share of them (Line 65). Afterwards, each agent's decision variable is set to its value in Gbest to achieve the anytime property.
ABCDE now moves on to updating each solution in the population. It first creates a copy of the main population (Line 4). A random agent is then chosen for each solution and sent a request to update it, and the corresponding exploration flag is set to TRUE (Lines 7-9). An agent who receives the request then updates the object: it selects a random solution and another random agent excluding itself (Lines 11-12), waits until the required values are received from that agent (Line 13), and then calculates the new value from Equation 3, where the coefficients are two random numbers drawn from fixed ranges (Line 14). Note that updating a value might take it outside its domain; Equation 4 is used to bring the value back inside the range. After completing the updates, each agent executes EVALUATE for the copied population (Line 15). If an updated solution is better than the original, the original gets replaced by all agents (Line 17). The agents also try to update Gbest if there is any solution better than the current Gbest (Line 18). Whenever a solution gets updated, ABCDE resets its exploration flags to FALSE.
(3) 
$x_{ij} \leftarrow \min(UB_i, \max(LB_i, x_{ij}))$ (4)
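Since the exact coefficient ranges of Equation 3 are specific to the paper and are not reproduced here, the sketch below uses the classic ABC neighbour move as a stand-in, followed by the Equation 4-style projection back into the domain.

```python
import random

def update_value(x_ij, x_kj, lb, ub, rng):
    """Perturb agent i's value for solution j using a random partner's value
    x_kj (classic ABC move; the paper's Eq. (3) may weight its terms
    differently), then clamp the result into [lb, ub] as in Eq. (4)."""
    phi = rng.uniform(-1.0, 1.0)          # random perturbation coefficient
    v = x_ij + phi * (x_ij - x_kj)
    return min(ub, max(lb, v))            # Eq. (4): project back into the domain
```

Whatever the perturbation does, the clamp guarantees the candidate value stays a legal assignment for the continuous variable.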
Next, the probability of each solution being chosen for further search is calculated. First, Equation 5 converts every fitness value to a positive value, because the population can also contain negative-valued fitness (Line 19), while Equation 6 determines the probability of any solution being chosen.

(5)
$p_j = fit_j / \sum_{k} fit_k$ (6)
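One plausible realization of the fitness shift (Equation 5) and the selection rule (Equation 6) is a min-shift followed by normalization; the exact shift the paper uses is not reproduced here, so treat this as an illustrative assumption.

```python
import random

def selection_probabilities(fitnesses):
    """Shift possibly-negative fitness values so they are non-negative, then
    normalize them into a probability distribution (roulette wheel)."""
    shifted = [f - min(fitnesses) for f in fitnesses]
    total = sum(shifted)
    if total == 0:  # all solutions equally fit: select uniformly
        return [1.0 / len(fitnesses)] * len(fitnesses)
    return [s / total for s in shifted]

def roulette_select(fitnesses, rng):
    """Pick a solution index with probability proportional to its shifted
    fitness, as in the onlooker-style phase."""
    return rng.choices(range(len(fitnesses)),
                       weights=selection_probabilities(fitnesses))[0]
```

Under this shift the worst solution gets probability zero; a variant that adds a small constant to every shifted value would keep it selectable.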
Afterward, ABCDE runs an update process a fixed number of times. Each time, a solution is selected from the elite set according to the probabilities previously calculated (Line 22). New solutions are then created by changing the selected solution: a random agent is chosen (Line 25), and its exploration flag for that solution is set to TRUE (Line 26). That agent is sent a request to update the values, given the index of the solution in the elite set and the index of the corresponding solution in the population (Line 27). Every agent makes copies of the elite set for the update process (Line 28). An agent who receives the request for updating values starts by selecting another random agent other than itself (Line 30) and a random solution from the elite copy (Line 31). It then receives the required values from the selected agent (Line 32). Equation 7 determines the new value, where the coefficients are two random numbers drawn from fixed ranges (Line 33), and Equation 4 fits the values inside the domain when they fall outside the range. Following the update process, each agent runs the EVALUATE procedure for the updated copies (Line 34). The population and Gbest are updated with the new solutions whenever there is any scope of improvement (Lines 36-37). And, whenever an update occurs, the exploration flags are reset.
Finally, each solution is inspected to check whether every exploration flag for it is TRUE (Line 40). When they all are, it means that all agents have explored that solution. In that case, the flags are reset to FALSE and a request is sent to each agent to replace that solution's values with random values using Equation 2 (Lines 41-42). Once the agents receive the request, the replacement is executed (Line 43).
(7) 
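The exploration bookkeeping of this final step can be sketched as follows (all names are illustrative): a solution is re-randomized only once every agent's flag for it is TRUE, after which its flags are cleared.

```python
import random

def scout_phase(population, explored, lb, ub, rng):
    """population[j] is the list of agent values for solution j; explored[j]
    maps each agent id to a flag saying whether that agent has already tried
    (and failed) to improve solution j."""
    for j, flags in enumerate(explored):
        if all(flags.values()):  # every agent has explored solution j
            for a in flags:
                flags[a] = False               # clear the flags, then
            population[j] = [lb + rng.random() * (ub - lb)
                             for _ in population[j]]  # re-randomize (Eq. (2))
    return population
```

Note how a solution with even one FALSE flag is left untouched, which is exactly what lets every agent attempt an improvement before the solution is discarded.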
IV Theoretical Analysis
In this section, we prove that the ABCDE algorithm is anytime, that is, the quality of the solutions found by ABCDE increases monotonically. We also evaluate both the time complexity and the memory complexity of ABCDE.
Proposition 1.
The ABCDE algorithm is anytime.
Proof.
At the beginning of each iteration, the complete assignment X is updated according to Gbest (Algorithm 2: Line 66). Gbest, in turn, is updated according to the best solution in the population found so far (Algorithm 2: Lines 18, 37, and 62); hence, the utility of Gbest either stays the same or increases. Since X is updated according to Gbest, the global utility of the given CDCOP instance at any iteration is greater than or equal to the utility at the previous iteration. As a consequence, the quality of the solution monotonically improves as the number of iterations increases. Hence, the ABCDE algorithm is anytime. ∎
In terms of complexity, the population created in ABCDE is stored in a distributed manner, so each agent holds one element per solution of the population, one per solution of the elite set, and one per set of exploration flags; in addition, two extra variables are held per elite solution, resulting in a few extra values. In each iteration, every agent first updates each solution once, and then updates selected elite solutions a fixed number of times. As the elite set is very small, this does not create an issue for time consumption.
V Experimental Results
This section provides empirical results that demonstrate the superiority of ABCDE over the current state-of-the-art CDCOP algorithms, namely ACDPOP, CDSA, and PFD. The results reported under the name ABCDC represent our algorithm without the proposed exploration mechanism; we report them to observe the impact of ABCDE's exploration mechanism.
Specifically, we first show the impact of the population size and the elite size on ABCDE's performance and select the best pair of parameter values. Then we benchmark ABCDE and the competing algorithms on Random DCOP Problems [choudhury2020particle] with binary quadratic constraints. It is worth mentioning that, although ABCDE can operate with functions of any kind, we consider this form of constraint to be consistent with prior work. In each problem, we set every variable's domain to the same range. Furthermore, we consider three different types of constraint graph topology: scale-free graphs, small-world graphs, and random graphs. Finally, we run these benchmarks by varying both the number of agents and the constraint density.
The performance of ABCDE depends on its two parameters, the population size and the elite size. To demonstrate the effect of the population size on the solution quality, we benchmark ABCDE on the Erdos-Renyi topology [erdds1959random] with constraint density 0.3. Figure 2 presents the population size vs. solution quality results of ABCDE.
It can be observed from the results that, as we increase the population size, the solution quality quickly improves at first; however, increasing it further does not result in a significant improvement once the population size surpasses a certain point. We run a similar experiment for the elite size and present the results in Figure 3. Similar to the population size, increasing the elite size after a certain point does not significantly improve the solution quality. Further, the computational complexity and the number of messages of ABCDE depend on both parameters. Considering this, we select the smallest parameter values beyond which the improvement plateaus. It is worth mentioning that selecting parameter values for ABCDE is considerably easier than for its main competitor, PFD, because both parameters are only constrained by available resources, and increasing them results in a consistent improvement of solution quality.
We now compare ABCDC and ABCDE with the state-of-the-art CDCOP algorithms, namely PFD, ACDPOP, and CDSA. We select the parameters for these algorithms as suggested in their corresponding papers. In all settings, we run all algorithms on 50 independently generated problems and 50 times on each problem, for an equal amount of time. It is worth noting that all differences shown in Figures 2, 3, 4, and 5 are statistically significant.
We first consider the random network benchmark. To construct random networks, we use the Erdos-Renyi topology [erdds1959random] and vary the constraint density from sparse to dense. In all cases, both ABCDE and ABCDC outperform the competing algorithms. Due to space constraints, we only present the results for a sparse setting (Figure 5) and a dense setting (Figure 4). In sparse settings, ABCDC produces better solutions than its closest competitor, PFD, while ABCDE further improves on the solutions generated by ABCDC. In dense settings, we see a similar trend.
We similarly run the experiment using the scale-free network topology. To create scale-free sparse networks, we use the Barabasi-Albert [albert2002statistical] topology, where each new node connects to a fixed small number of existing nodes. We present the results for this setting in Figure 6. We see a similar performance improvement by ABCDC and ABCDE as in the random network experiment above. It is worth mentioning that a similar trend in performance gain continues as we increase the graph density.
Finally, we run our experiments on small-world networks. To construct small-world networks, we use the Watts-Strogatz [watts1998collective] model, where each new node joins a fixed number of existing nodes and edges are rewired with a fixed probability. Figure 7 shows that ABCDC offers better performance than PFD. Moreover, ABCDE further enhances the results generated by ABCDC.
These experiments demonstrate the superiority of ABCDE over the competing algorithms. We see a consistent performance gain by ABCDE under different constraint graph topologies, sizes, and densities, and a similar trend when comparing ABCDE with ABCDC. ABCDE outperforms ABCDC because, in ABCDE, each solution in the population gets explored by every agent; as different agents possess the ability to modify different parts of a solution due to the factored nature of CDCOPs, this significantly improves exploration. In ABCDC, on the other hand, each solution gets explored only a limited number of times, and it is not ensured that all agents are given the opportunity to improve each solution. As a result, many potentially good solutions are prematurely discarded.
VI Conclusions and Future Work
We develop a CDCOP solver, namely ABCDE, by tailoring a well-known population-based algorithm (i.e., Artificial Bee Colony). More importantly, we introduce a new exploration mechanism with the intent to further improve the quality of the solutions. We theoretically prove that ABCDE is an anytime algorithm. Finally, we present empirical results showing that ABCDE outperforms the state-of-the-art non-exact CDCOP algorithms by a notable margin. In the future, we would like to investigate whether ABCDE can be applied to solve multi-objective CDCOPs.