Solving DCOPs with Distributed Large Neighborhood Search

02/22/2017 ∙ by Ferdinando Fioretto, et al. ∙ Ben-Gurion University of the Negev

The field of Distributed Constraint Optimization has gained momentum in recent years, thanks to its ability to address various applications related to multi-agent cooperation. Nevertheless, solving Distributed Constraint Optimization Problems (DCOPs) optimally is NP-hard. Therefore, in large-scale, complex applications, incomplete DCOP algorithms are necessary. Current incomplete DCOP algorithms suffer from one or more of the following limitations: they (a) find local minima without providing quality guarantees; (b) provide loose quality assessments; or (c) are unable to benefit from the structure of the problem, such as domain-dependent knowledge and hard constraints. Therefore, capitalizing on strategies from the centralized constraint solving community, we propose a Distributed Large Neighborhood Search (D-LNS) framework to solve DCOPs. The proposed framework (with its novel repair phase) provides guarantees on solution quality, refining upper and lower bounds during the iterative process, and can exploit domain-dependent structures. Our experimental results show that D-LNS outperforms other incomplete DCOP algorithms on both structured and unstructured problem instances.




1 Introduction

In a Distributed Constraint Optimization Problem (DCOP), multiple agents coordinate their value assignments to maximize the sum of resulting constraint utilities [13, 28]. DCOPs represent a powerful approach to the description and solution of many practical problems in a variety of application domains, such as distributed scheduling, coordination of unmanned air vehicles, smart grid electrical networks, and sensor networks [19, 31, 10, 23].

In many cases, the coordination protocols required for the complete resolution of DCOPs demand a vast amount of resources and/or communication, making them infeasible for solving complex real-world problems. In particular, complete DCOP algorithms find optimal solutions at the cost of a large runtime or network load, while incomplete approaches trade optimality for a lower usage of resources. Since finding optimal DCOP solutions is NP-hard, incomplete algorithms are often necessary to solve large interesting problems. Unfortunately, several local search algorithms (e.g., DSA [29], MGM [12]) and local inference algorithms (e.g., Max-Sum [3]) do not provide guarantees on the quality of the solutions found. More recent developments, such as region-optimal algorithms [15, 26], Bounded Max-Sum [20], and DaC algorithms [25, 8] alleviate this limitation. Region-optimal algorithms allow us to specify regions with a maximum size of k agents or a maximum distance of t hops from each agent, and they optimally solve the subproblem within each region. Solution quality bounds are provided as a function of k and/or t. Bounded Max-Sum is an extension of Max-Sum, which optimally solves an acyclic version of the DCOP graph, bounding its solution quality as a function of the edges removed from the cyclic graph. DaC-based algorithms use Lagrangian decomposition techniques to solve agent subproblems sub-optimally. Good quality assessments are essential for sub-optimal solutions. However, many incomplete DCOP approaches can provide arbitrarily poor quality assessments (as confirmed in our experimental results). In addition, they are unable to exploit domain-dependent knowledge or the hard constraints present in problems.

In this paper, we address these limitations by introducing the Distributed Large Neighborhood Search (D-LNS) framework. D-LNS solves DCOPs by building on the strengths of centralized LNS [22], a meta-heuristic that iteratively explores complex neighborhoods of the search space to find better candidate solutions. LNS has been shown to be very effective in solving a number of optimization problems [6, 21]. While typical LNS approaches focus on iteratively refining lower bounds of a solution, we propose a method that can iteratively refine both lower and upper bounds of a solution, imposing no restrictions (e.g., linearity or convexity) on the objective function and constraints.

This work advances the state of the art in DCOP resolution: (1) We provide a novel distributed local search framework for DCOPs, which provides quality guarantees by refining both upper and lower bounds of the solution found during the iterative process; (2) We introduce two novel distributed search algorithms, DPOP-DBR and T-DBR, built within the D-LNS framework, and characterized by the ability to exploit problem structure and to offer low network usage, with T-DBR also providing a low computational complexity per agent; and (3) Our evaluation against representatives of search-based, inference-based, and region-optimal-based incomplete DCOP algorithms shows that T-DBR converges faster to better solutions, provides tighter solution quality bounds, and is more scalable.

The rest of the paper is organized as follows. In the next section, we introduce DCOPs and review centralized LNS. Section 3 presents our novel D-LNS schema. Section 4 presents a general algorithmic framework, based on D-LNS, that iteratively refines lower and upper bounds of the DCOP solutions. We further describe two implementations of such a framework, offering different trade-offs between agent complexity and solution quality. Before concluding the section, we report an example trace of the proposed repair algorithm, aimed at elucidating its behavior within the D-LNS framework. Section 5 discusses the theoretical properties of the algorithms presented, with particular emphasis on the correctness of the solution bounds returned during the iterative process. We present the related work in Section 6, and summarize our evaluation of the proposed framework against search-based, inference-based, and region-optimal-based incomplete DCOP algorithms in Section 7. Finally, Section 8 concludes the paper.

2 Background

(a) Constraint Graph           (b) Pseudo-tree        (c) Constraints

Figure 1: Example DCOP

Distributed Constraint Optimization Problems.

A Distributed Constraint Optimization Problem (DCOP) is a tuple ⟨X, D, F, A, α⟩, where: X = {x_1, …, x_n} is a set of variables; D = {D_1, …, D_n} is a set of finite domains (i.e., x_i ∈ D_i); F is a set of utility functions (also called constraints), where each f ∈ F is defined over scope(f) ⊆ X, the set of the variables (also called the scope) relevant to f; A is a set of agents; and α : X → A is a function that maps each variable to one agent. Each f specifies the utility of each combination of values assigned to the variables in scope(f). Following common conventions, we restrict our attention to binary utility functions and assume that each agent controls exactly one variable. Thus, we will use the terms “variable” and “agent” interchangeably and assume that α(x_i) = a_i. We assume at most one constraint between each pair of variables, thus making the order of variables in the scope of a constraint irrelevant.

A partial assignment σ is a value assignment to a set of variables X_σ ⊆ X that is consistent with the variables’ domains. The utility F(σ) is the sum of the utilities of all the utility functions applicable in σ. A solution is a partial assignment for all the variables of the problem, i.e., with X_σ = X. We will denote with x a solution, while x_i is the value of the i-th variable in x. The goal is to find an optimal solution x* = argmax_x F(x).
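To make the definitions above concrete, the following self-contained Python sketch evaluates the utility of a solution as the sum of its binary utility functions and finds an optimal solution by enumeration. The variables, domains, and utility tables are hypothetical, introduced only for illustration:

```python
def utility(assignment, functions):
    """Sum of f(x_i, x_j) over all binary functions; -inf if a hard
    constraint is violated."""
    total = 0
    for (i, j), f in functions.items():
        u = f[(assignment[i], assignment[j])]
        if u == float('-inf'):          # violated hard constraint
            return float('-inf')
        total += u
    return total

# Two variables with domain {0, 1} and one utility function.
functions = {('x1', 'x2'): {(0, 0): 5, (0, 1): 2, (1, 0): 0, (1, 1): 8}}
best = max(({'x1': a, 'x2': b} for a in (0, 1) for b in (0, 1)),
           key=lambda s: utility(s, functions))
# The optimal solution assigns x1 = 1, x2 = 1, with utility 8.
```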

Given a DCOP P, G_P = (X, E_P) is the constraint graph of P, where (x_i, x_j) ∈ E_P iff there exists f ∈ F s.t. scope(f) = {x_i, x_j}. A DFS pseudo-tree arrangement for G_P is a spanning tree T = (X, E_T) of G_P s.t. if f ∈ F and {x_i, x_j} = scope(f), then x_i and x_j appear in the same branch of T. Edges of G_P that are in (resp. out of) E_T are called tree edges (resp. backedges). Tree edges connect a node with its parent and its children, while backedges connect a node with its pseudo-parents and its pseudo-children. We use N_i = {x_j | (x_i, x_j) ∈ E_P} to denote the neighbors of the agent a_i. We denote with G^k = (X, E^k), with E^k ⊆ E_P, the subgraph of G_P used in iteration k of our iterative algorithms.
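The pseudo-tree construction can be sketched as follows; the example graph, node names, and the DFS tie-breaking order (sorted neighbors) are illustrative assumptions. A DFS traversal of the constraint graph yields the tree edges, and graph edges not used by the DFS become backedges:

```python
def dfs_pseudo_tree(graph, root):
    """Return (tree_edges, backedges) for a DFS spanning tree of `graph`."""
    tree_edges, visited = [], set()

    def visit(u):
        visited.add(u)
        for v in sorted(graph[u]):          # deterministic tie-breaking
            if v not in visited:
                tree_edges.append((u, v))
                visit(v)

    visit(root)
    backedges = [(u, v) for u in graph for v in graph[u]
                 if u < v and (u, v) not in tree_edges
                 and (v, u) not in tree_edges]
    return tree_edges, backedges

# A cyclic 4-node graph: x1-x2, x1-x3, x2-x3, x3-x4.
graph = {'x1': ['x2', 'x3'], 'x2': ['x1', 'x3'],
         'x3': ['x1', 'x2', 'x4'], 'x4': ['x3']}
tree, back = dfs_pseudo_tree(graph, 'x1')
# tree edges: (x1,x2), (x2,x3), (x3,x4); backedge: (x1,x3)
```

Note that the backedge (x1, x3) connects two nodes on the same branch of the DFS tree, as required by the pseudo-tree property.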

Fig. 1(a) depicts the constraint graph of a DCOP in which each agent a_i controls a variable x_i with domain {0,1}. Fig. 1(b) shows a possible pseudo-tree (solid lines identify tree edges, dotted lines refer to backedges). Fig. 1(c) shows the DCOP constraints.

Large Neighborhood Search.

In (centralized) Large Neighborhood Search (LNS), an initial solution is iteratively improved by repeatedly destroying it and repairing it. Destroying a solution means selecting a subset of variables whose current values will be discarded. The set of such variables is referred to as large neighborhood (LN). Repairing a solution means finding a new value assignment for the destroyed variables, given that the other non-destroyed variables maintain their values from the previous iteration.

The peculiarity of LNS, compared to other local search techniques, is the (larger) size of the neighborhood to explore at each step. It relies on the intuition that searching over a larger neighborhood allows the process to escape local optima and find better candidate solutions.
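As an illustration of the centralized LNS loop described above, here is a minimal sketch with assumed destroy and repair operators (random destruction of half the variables and a greedy one-pass repair) on a toy objective; none of these operators are prescribed by the framework itself:

```python
import random

def lns(initial, utility, domains, iterations=50, destroy_frac=0.5, seed=0):
    rng = random.Random(seed)
    best = dict(initial)
    for _ in range(iterations):
        # destroy: pick a large neighborhood of variables to reassign
        k = max(1, int(destroy_frac * len(best)))
        destroyed = rng.sample(sorted(best), k)
        candidate = dict(best)
        for var in destroyed:   # repair: greedy single-pass reassignment
            candidate[var] = max(domains[var],
                                 key=lambda v: utility({**candidate, var: v}))
        if utility(candidate) >= utility(best):  # accept non-degrading moves
            best = candidate
    return best

# Toy objective: a bonus of 10 when x1 != x2, plus the sum of the values.
def u(s):
    return (10 if s['x1'] != s['x2'] else 0) + s['x1'] + s['x2']

domains = {'x1': [0, 1], 'x2': [0, 1]}
best = lns({'x1': 0, 'x2': 0}, u, domains)   # reaches the optimum, u = 11
```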

3 The D-LNS Framework

In this section, we introduce D-LNS, a general distributed LNS framework to solve DCOPs. Our D-LNS solutions need to take into account factors that are critical for the performance of distributed systems, such as network load (i.e., number and size of messages exchanged by agents) and the restriction that each agent is only aware of its local subproblem (i.e., its neighbors and the constraints whose scope includes its variables). Such properties make typical centralized LNS techniques unsuitable and infeasible for DCOPs.

1  k ← 0;
2  ⟨x_i, LB, UB⟩ ← Value-Initialization();
3  while termination condition is not met do
4      k ← k + 1;
5      z_i ← Destroy-Algorithm();
6      if z_i = destroyed then x_i ← ∅; else x_i ← x_i^(k−1);
7      ⟨x_i, LB, UB⟩ ← Repair-Algorithm();
8      if Accept(x_i, LB, UB) then update the incumbent solution;
Algorithm 1 D-LNS

Algorithm 1 shows the general structure of D-LNS, as executed by each agent a_i. After initializing its iteration counter k (line 1), its current value assignment (done by randomly assigning values to variables or by exploiting domain knowledge when available), and its current lower and upper bounds, LB and UB, on the optimal utility (line 2), the agent, as in LNS, iterates through the destroy and repair phases (lines 3-8) until a termination condition occurs (line 3). Possible termination conditions include reaching a maximum value of k, a timeout limit, or a confidence threshold on the error of the reported best solution.

Destroy Phase. The result of this phase is the generation of an LN, which we refer to as LN^k, for each iteration k. This step is executed in a distributed fashion, with each agent calling a Destroy-Algorithm to determine whether its local variable should be destroyed or preserved, as indicated by a flag (line 5). We say that destroyed (resp. preserved) variables are (resp. are not) in LN^k. In a typical destroy process, such decisions can be either random or made by exploiting domain knowledge. For example, in a scheduling problem, one may choose to preserve the start times of each activity and destroy the other variables. D-LNS allows the agents to use any destroy schema to achieve the desired outcome. Once the destroyed variables are determined, the agents reset their values, while the preserved variables keep their values from the previous iteration (line 6).

Repair Phase. The agents start the repair phase, which seeks new value assignments for the destroyed variables, by calling a Repair-Algorithm (line 7). The goal of this phase is to find an improved solution by searching over the LN, a search carried out exclusively by the destroyed agents. However, the step that computes the solution bounds requires the cooperation of all agents in the problem. D-LNS is general in that it does not impose any restriction on the way agents coordinate to solve this problem. We propose, in the next section, two distributed repair algorithms that provide quality guarantees and online bound refinements. Once the agents find and evaluate a new solution, they either accept it or reject it (line 8). In our proposed distributed algorithms, the agents accept the solution if it does not violate any hard constraints, that is, if its utility is not −∞.
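The destroy/repair/accept loop can be sketched as a synchronous, centralized simulation; the coin-flip destroy schema, the exhaustive repair, and the toy hard constraint below are illustrative assumptions, not the paper's algorithms:

```python
import itertools
import random

def d_lns_round(solution, domains, utility, rng):
    # destroy phase: each "agent" independently flags its variable
    destroyed = [v for v in sorted(solution) if rng.random() < 0.5]
    if not destroyed:
        return solution
    # repair phase: exhaustively reassign only the destroyed variables
    best_cand, best_u = None, float('-inf')
    for values in itertools.product(*(domains[v] for v in destroyed)):
        cand = {**solution, **dict(zip(destroyed, values))}
        if utility(cand) > best_u:
            best_cand, best_u = cand, utility(cand)
    # accept only consistent (utility > -inf), non-degrading solutions
    if best_u > float('-inf') and best_u >= utility(solution):
        return best_cand
    return solution

# Toy problem: hard constraint x1 != x2 (violation yields -inf),
# plus a soft preference for large x2 + x3.
def util(s):
    return float('-inf') if s['x1'] == s['x2'] else s['x2'] + s['x3']

rng = random.Random(1)
domains = {v: [0, 1] for v in ('x1', 'x2', 'x3')}
sol = {'x1': 0, 'x2': 1, 'x3': 0}
for _ in range(20):
    sol = d_lns_round(sol, domains, util, rng)
```

Because the acceptance test rejects any solution with utility −∞, the incumbent always satisfies the hard constraint and its utility never decreases.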

While most current incomplete DCOP algorithms fail to guarantee the consistency of the returned solution w.r.t. the hard constraints of the problem [15], D-LNS can accommodate consistency checks during the repair phase.

4 Distributed Bounded Repair

We now introduce Distributed Bounded Repair (DBR), a general Repair-algorithm framework that, within D-LNS, iteratively refines the lower and upper bounds of the DCOP solution. Its general structure is illustrated in the flow chart of Figure 2. At each iteration k, each DBR agent checks whether its local variable was preserved or destroyed. In the former case, the agent waits for the Bounding phase to start, which is algorithm dependent. In the latter case, the agent executes, in order, the following phases:

Relaxation Phase. Given a DCOP P, this phase constructs two relaxations of P, P_LB^k and P_UB^k, which are used to compute, respectively, a lower and an upper bound on the optimal utility for P. Let G^k = (LN^k, E_LN^k) be the subgraph of G_P in iteration k, where E_LN^k is the subset of edges of E_P (defined in Section 2) whose elements involve exclusively nodes in LN^k. Both problem relaxations P_LB^k and P_UB^k are solved using a relaxation graph G̃^k = (LN^k, Ẽ^k), computed from G^k, where Ẽ^k ⊆ E_LN^k depends on the algorithm adopted.

In the problem P_LB^k, we wish to find a partial assignment x_LB^k for the variables in LN^k using

x_LB^k = argmax [ Σ_{f_ij ∈ Ẽ^k} f_ij(x_i, x_j) + Σ_{f_ij ∈ E_P : x_i ∈ LN^k, x_j ∉ LN^k} f_ij(x_i, x̄_j^{k−1}) ]    (1)

where x̄_j^{k−1} is the value assigned to the preserved variable x_j in the previous iteration. The first summation is over all functions listed in Ẽ^k, while the second is over all functions between destroyed and preserved variables. Thus, solving P_LB^k means optimizing over all the destroyed variables given that the preserved ones take on their previous values, and ignoring the (possibly empty) set of edges that are not part of the relaxation graph. This partial assignment is used to compute lower bounds during the bounding phase.
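A brute-force reading of this relaxation, under an assumed data layout for the utility tables and edge sets, can be sketched as:

```python
import itertools

def solve_lb(destroyed, preserved, domains, relax_edges, boundary_edges, f):
    """Optimize destroyed variables; preserved ones are fixed at their
    previous-iteration values; edges among destroyed variables outside
    the relaxation graph are ignored."""
    best, best_u = None, float('-inf')
    for values in itertools.product(*(domains[v] for v in destroyed)):
        x = dict(zip(destroyed, values))
        # first summation: edges of the relaxation graph
        u = sum(f[(a, b)][(x[a], x[b])] for (a, b) in relax_edges)
        # second summation: edges between destroyed and preserved vars
        u += sum(f[(a, b)][(x[a], preserved[b])] for (a, b) in boundary_edges)
        if u > best_u:
            best, best_u = x, u
    return best, best_u

# x1, x2 destroyed; x3 preserved at its previous value 0.
f = {('x1', 'x2'): {(0, 0): 1, (0, 1): 3, (1, 0): 2, (1, 1): 5},
     ('x2', 'x3'): {(0, 0): 4, (1, 0): 0, (0, 1): 1, (1, 1): 6}}
best, lb = solve_lb(['x1', 'x2'], {'x3': 0}, {'x1': [0, 1], 'x2': [0, 1]},
                    [('x1', 'x2')], [('x2', 'x3')], f)
# best assignment: x1 = 1, x2 = 0, with utility 2 + 4 = 6
```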

In the problem P_UB^k, we wish to find a partial assignment x_UB^k using

x_UB^k = argmax Σ_{f_ij ∈ Ẽ^k} f_ij(x_i, x_j)    (2)

Thus, solving P_UB^k means optimizing over all the destroyed variables considering exclusively the set of edges that are part of the relaxation graph. This partial assignment is used to compute upper bounds during the bounding phase.

Notice that the partial assignments returned by solving these two relaxed problems involve exclusively the variables in LN^k.

Solving Phase. Next, DBR solves the relaxed DCOPs P_LB^k and P_UB^k using the equations above. At a high level, one can use any complete DCOP algorithm to solve them. Below, we describe two inference-based DBR algorithms, defined over different relaxation graphs. The outputs of this phase are the values for the agent’s local variable associated with eqs. (1) and (2).

Figure 2: DBR flow chart. The dotted area illustrates the T-DBR solving phase.

Bounding Phase. Once the relaxed problems are solved, all agents start the bounding phase, which computes the lower and upper bounds based on the partial assignments x_LB^k and x_UB^k. To do so, the solutions to the problems P_LB^k and P_UB^k are both extended to full solutions for P, where the preserved variables are assigned their values from the previous iteration.

The lower bound is thus computed by evaluating LB^k = F(x_LB^k) on the extended solution. The upper bound is computed by evaluating UB^k = Σ_{f ∈ F} F̂^k(f), where

F̂^k(f) = { max_{x_i, x_j} f(x_i, x_j)                      if I(f) = ∅ and f ∉ Ẽ^k   (case 1)
         { max( ρ̂^k / |Ẽ^k|, max_{l ∈ I(f)} F̂^l(f) )      if f ∈ Ẽ^k                 (case 2)
         { F̂^{k−1}(f)                                       otherwise                   (case 3)    (3)

with ρ̂^k the optimal utility on the relaxation graph G̃^k, and I(f) the set of past iteration indices for which the function f was an edge in the relaxation graph. Specifically, I(f) = {l | l < k, f ∈ Ẽ^l}.

Therefore, the utility of UB^k is composed of three parts. The first part involves all functions that have never been part of a relaxation graph up to the current iteration, the second part involves all the functions optimized in the current iteration, and the third part involves all the remaining functions. The utility of each function in the first part is the maximal utility over all possible pairs of value combinations of the variables in the scope of that function. The utility of each function in the second part is the largest utility among the mean utility of the functions optimized in the current iteration (i.e., those in Ẽ^k) and the utilities of that function as optimized in past iterations. The utility of each function in the third part is equal to the utility assigned to that function in the previous iteration. In particular, imposing that the edges optimized in the current iteration contribute no less than the mean utility of Ẽ^k to the final utility of UB^k allows us not to underestimate the solution upper bound within the iterative process (see Lemma 1). As we show in Section 5, LB^k ≤ F(x*) ≤ UB^k. Therefore, UB^k / LB^k is a guaranteed approximation ratio for P.
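The three-case bookkeeping just described can be sketched as follows; the argument names (`past`, `rho`, `m`, `max_f`, `prev`) are assumptions introduced for illustration, not the paper's notation:

```python
def f_hat(f, in_current_graph, past, rho, m, max_f, prev):
    """Three-case utility estimate for function f (a sketch of eq. (3)).
    past[f]: estimates from iterations where f was in the relaxation graph;
    rho: optimal utility on the current relaxation graph; m: its edge count;
    max_f[f]: max utility of f over all value pairs; prev[f]: last estimate."""
    if in_current_graph:               # case 2: optimized in this iteration
        return max([rho / m] + past.get(f, []))
    if not past.get(f):                # case 1: never optimized so far
        return max_f[f]
    return prev[f]                     # case 3: carry over last estimate

max_f = {'f12': 8, 'f13': 7}
past = {'f12': [3.0]}
prev = {'f12': 3.0}
# f12 is in the current relaxation graph (rho = 10 over m = 2 edges):
# it contributes max(10/2, 3.0) = 5.0; f13 was never optimized: 7.
```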

The significance of this Repair framework is that it enables D-LNS to iteratively refine both lower and upper bounds of the solution, without imposing any restrictions on the form of the objective function and of the constraints adopted. (Note, however, that this does not imply that the lower bound and the upper bound will converge to the same value.) Below, we introduce two implementations of the DBR framework, summarized in the flow chart of Figure 2, whose solving phase is shown in the dotted area.

4.1 DPOP-based DBR Algorithm

DPOP-based DBR (DPOP-DBR) solves the relaxed DCOPs P_LB^k and P_UB^k over the relaxation graph G̃^k = G^k. Thus, Ẽ^k = E_LN^k, and solving the problems means optimizing over all the destroyed variables while ignoring no edges in E_LN^k.

The DPOP-DBR solving phase uses DPOP [17], a complete inference-based algorithm composed of two phases operating on a DFS pseudo-tree. In the utility propagation phase, each agent, starting from the leaves of the pseudo-tree, projects out its own variable and sends its projected utilities to its parent. These utilities are propagated up the pseudo-tree induced from the relaxation graph until they reach the root. The hard constraints of the problem can be naturally handled in this phase, with each agent pruning all inconsistent values before sending a message to its parent. Once the root receives the utilities from all its children, it starts the value propagation phase, where it selects the value that maximizes its utility and sends it to its children, which repeat the same process. The problem is solved as soon as the values reach the leaves.
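The two propagation phases can be sketched, for a tree-structured relaxation and a single (lower-bound) problem, as follows; the data layout and names are assumptions, and the sketch omits the parallel bookkeeping for the second relaxed problem:

```python
def dpop_tree(children, f, domains, root):
    """DPOP-style solving on a tree: project variables out bottom-up
    (utility propagation), then choose values top-down (value propagation)."""
    util = {}    # util[u][pv]: best utility of u's subtree given parent=pv
    choice = {}  # choice[u][pv]: u's best value given parent=pv

    def up(u, parent):
        for c in children.get(u, []):
            up(c, u)
        if parent is None:
            return
        util[u], choice[u] = {}, {}
        for pv in domains[parent]:
            best_v = max(domains[u],
                         key=lambda v: f[(parent, u)][(pv, v)]
                         + sum(util[c][v] for c in children.get(u, [])))
            choice[u][pv] = best_v
            util[u][pv] = (f[(parent, u)][(pv, best_v)]
                           + sum(util[c][best_v] for c in children.get(u, [])))

    up(root, None)
    # value propagation: the root picks its best value, children follow
    assignment = {root: max(domains[root],
                            key=lambda v: sum(util[c][v]
                                              for c in children.get(root, [])))}

    def down(u):
        for c in children.get(u, []):
            assignment[c] = choice[c][assignment[u]]
            down(c)

    down(root)
    return assignment

children = {'x1': ['x2']}
f = {('x1', 'x2'): {(0, 0): 5, (0, 1): 2, (1, 0): 0, (1, 1): 8}}
domains = {'x1': [0, 1], 'x2': [0, 1]}
assignment = dpop_tree(children, f, domains, 'x1')
# optimal assignment on this chain: x1 = 1, x2 = 1 (utility 8)
```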

Note that the relaxation process may create a forest, in which case one should execute the algorithm in each tree of the forest. As a technical note, DPOP-DBR solves the two relaxed DCOPs in parallel. In the utility propagation, each agent computes two sets of utilities, one for each relaxed problem, and sends them to its parent. In the value propagation phase, each agent selects two values, one for each relaxed problem, and sends them to its children.

DPOP-DBR has the same worst-case order complexity as DPOP, that is, exponential in the induced width of the relaxed graph. Thus, we next introduce another algorithm, characterized by a smaller complexity and a low network load.

4.2 Tree-based DBR Algorithm

Tree-based DBR (T-DBR) defines the relaxed DCOPs P_LB^k and P_UB^k using a pseudo-tree structure T^k computed from the subgraph G^k. Thus, Ẽ^k is the set of tree edges of T^k, and solving the problems means optimizing over all the destroyed variables while ignoring backedges. Its general solving schema is similar to that of DPOP, in that it uses Utility and Value propagation phases; however, the different underlying relaxation graph imposes several important differences. Algorithm 2 shows the T-DBR pseudocode. At iteration k, each agent a_i maintains its local view of the pseudo-tree T^k, consisting of its parent, its set of children, and its pseudo-parents, as well as two contexts recording the values of each of its neighbors w.r.t. problems P_LB^k and P_UB^k, respectively. We assume that by the end of the destroy phase (line 6) each agent knows its current contexts, as well as which of its neighboring agents have been destroyed or preserved. In each iteration k, T-DBR executes the following phases:

9   T^k ← Relaxation();
10  Util-Propagation();
11  Value-Propagation();
12  Bound-Propagation();
13  return ⟨x_i, LB, UB⟩;
Algorithm 2 T-DBR


Relaxation Phase. It constructs a pseudo-tree T^k (line 9), which ignores the preserved variables as well as the functions involving these variables in their scopes. The construction prioritizes tree edges that have not been chosen in previous pseudo-trees.

Figure 3: D-LNS with T-DBR example trace.

Solving Phase. Similarly to DPOP-DBR, the T-DBR solving phase is composed of two phases operating on the relaxed pseudo-tree T^k, executed synchronously:

  • Utility Propagation Phase. After the pseudo-tree is constructed (line 10), each leaf agent computes the optimal sum of utilities in its subtree, considering exclusively tree edges and edges with destroyed variables. Each leaf agent computes the lower- and upper-bound utilities for each pair of values of its variable and its parent’s variable (lines 15-17), in preparation for retrieving the solutions of P_LB^k and P_UB^k used during the bounding phase. The agent then projects itself out (lines 18-19) and sends the projected utilities to its parent in a Util message (line 20). Each agent, upon receiving a Util message from each of its children, performs the same operations. Thus, these utilities propagate up the pseudo-tree until they reach the root agent.

  • Value Propagation Phase. This phase starts after the utility propagation (line 11), with the root agent computing its optimal values for the relaxed DCOPs P_LB^k and P_UB^k, respectively (line 22). It then sends its values to all its neighbors in a Value message (line 23). When a child receives this message, it also computes its optimal values and sends them to all its neighbors (lines 31-33). Thus, these values propagate down the pseudo-tree until they reach the leaves, at which point every agent has chosen its respective values. In this phase, in preparation for the bounding phase, when an agent receives a Value message from a neighbor, it also updates the value of that neighbor in both of its contexts (lines 24-26 and 29-30).

Bounding Phase. Once the relaxed DCOPs P_LB^k and P_UB^k have been solved, the algorithm starts the bound propagation phase (line 12). This phase starts with each leaf agent of the pseudo-tree computing its contributions to the lower and upper bounds LB^k and UB^k (lines 36-37). These bounds are sent to its parent (line 38). When a parent receives this message from all its children (line 35), it performs the same operations. The lower and upper bounds of the whole problem are determined when the bounds reach the root agent.
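The bound-propagation pass reduces to summing per-agent contributions up the tree, as in this sketch; the tree shape and the per-agent contributions are illustrative assumptions:

```python
def propagate_bounds(children, local_lb, local_ub, root):
    """Aggregate per-agent bound contributions up the pseudo-tree; the
    pair returned at the root is the global (LB, UB)."""
    def up(u):
        lb, ub = local_lb[u], local_ub[u]
        for c in children.get(u, []):
            clb, cub = up(c)
            lb, ub = lb + clb, ub + cub
        return lb, ub
    return up(root)

children = {'r': ['a', 'b'], 'a': ['c']}
local_lb = {'r': 1, 'a': 2, 'b': 3, 'c': 4}
local_ub = {'r': 2, 'a': 3, 'b': 4, 'c': 5}
bounds = propagate_bounds(children, local_lb, local_ub, 'r')
# global bounds: LB = 10, UB = 14
```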

4.3 T-DBR Example Trace

Figure 3 illustrates a running example of T-DBR during the first two D-LNS iterations, using the DCOP of Figure 1. The pseudo-trees are represented by bold solid lines (functions in Ẽ^k); all other functions are represented by dotted lines. The preserved variables in each iteration are shaded gray. At each step, the resolution of the relaxed problems involves the functions represented by bold lines: one relaxed problem is solved optimizing over the blue-colored functions, and the other over the red ones. We recall that while solving P_UB^k focuses solely on the functions in Ẽ^k, solving P_LB^k further accounts for the functions involving a destroyed and a preserved variable. The nodes illustrating destroyed variables are labeled with red values (in our example, solving P_LB^k and P_UB^k yields the same solution), and nodes representing preserved variables are labeled with black values. Each edge is labeled with a pair of values representing the utilities of the corresponding function in the two relaxed problems (top, in blue; bottom, in red). The lower and upper bounds of each iteration are shown below.

At initialization (k = 0), each agent randomly assigns a value to its variable; the utility of the resulting solution gives the initial lower bound. Moreover, solving the upper-bound relaxation yields a solution whose utility is the initial upper bound.

In the first iteration (k = 1), the destroy phase preserves one variable, which is thus excluded from LN^1. The algorithm then builds a spanning tree over the remaining variables, choosing two of their connecting functions as tree edges. The relaxation graphs of the two relaxed problems involve the edges shown in red and in blue, respectively. Solving P_LB^1 yields a partial assignment whose utility, extended with the preserved value, gives the lower bound LB^1. Solving P_UB^1 yields a solution whose utility is the current upper bound UB^1; recall that the values for the functions in Ẽ^1 are computed as in eq. (3).

Finally, in the second iteration (k = 2), the destroy phase retains a different variable, assigning it its value from the previous iteration, and the repair phase builds a new spanning tree over the remaining variables, choosing new tree edges. Solving P_LB^2 and P_UB^2 yields two partial assignments whose utilities result in the lower bound LB^2 and the upper bound UB^2.

5 Theoretical Properties

We report below the theoretical results on the bounds provided by our D-LNS framework with the DBR Repair algorithm, as well as the agent complexity and network load of T-DBR. Due to space constraints, we report proof sketches.

For each k, LB^k ≤ F(x*). Proof (Sketch). The result follows from the fact that x_LB^k is an optimal solution of the relaxed problem P_LB^k, whose functions are a subset of F.

Lemma 1

For each k, Σ_{f_ij ∈ Ẽ^k} F̂^k(f_ij) ≥ Σ_{f_ij ∈ Ẽ^k} f_ij(x_i*, x_j*), where x_i is the value assignment to variable x_i when solving the relaxed DCOP P_UB^k and x_i* is the value assignment to variable x_i when solving the original DCOP P.

Proof (Sketch). For each iteration k, it follows that:

Σ_{f_ij ∈ Ẽ^k} F̂^k(f_ij) ≥ Σ_{f_ij ∈ Ẽ^k} ρ̂^k / |Ẽ^k|    (by def. of F̂^k (case 2))
                          = ρ̂^k ≥ Σ_{f_ij ∈ Ẽ^k} f_ij(x_i*, x_j*)    (by def. of ρ̂^k)

The last step follows from the fact that, in each iteration k, the functions associated with the edges in Ẽ^k are solved optimally. Since their utility is maximized, it is also no smaller than the corresponding utility when evaluated on the optimal solution for the problem P.

Lemma 2

For each k, Σ_{f ∈ F̃^k} F̂^k(f) ≥ Σ_{f ∈ F̃^k} f(x*), where F̃^k is the set of functions that have been chosen as edges of the relaxation graph in some iteration up to k.

Proof (Sketch). We prove the claim by induction on the iteration k.

For k = 0, F̃^0 = ∅, and thus the statement vacuously holds. Assume the claim holds up to iteration k − 1. For iteration k, it follows that,

(by def. of F̂^k)

The last step follows from cases 2 and 3 of eq. (3). Additionally, the following inequalities hold:

(by inductive hypothesis)
(by Lemma 1)

Thus, combining the above inequalities, the claim follows, which concludes the proof.

The lemma above ensures that the utility associated with the functions optimized in the relaxed problems, up to iteration k, is an upper bound on the evaluation of the same set of functions under the optimal solution for P. The proof relies on the observation that the functions in F̃^k include exclusively those associated with the optimization of the problems P_UB^l, with l ≤ k, and that the functions over which the optimization process operates multiple times are evaluated with their maximal value observed so far.

For each k, UB^k ≥ F(x*). Proof (Sketch). By definition of UB^k, it follows that,

(by def. of F̂^k)
(by Lemma 2)

which concludes the proof.

Figure 4: Normalized solution quality for the upper bounds and lower bounds, on regular grids (left), random graphs (center), and scale-free (right) networks, as the maximum time (top rows) and network load (bottom rows) allotted to the algorithms vary.
Corollary 1

An approximation ratio for the problem P is ρ^k = UB^k / LB^k.

Proof (Sketch). This result follows directly from the lower-bound and upper-bound theorems above.

In each iteration, T-DBR requires O(|E_P|) messages, of size O(d), where d = max_i |D_i|. Proof (Sketch). The number of messages required at each iteration is bounded by the Value Propagation Phase of Algorithm 2, where each agent sends a message to each of its neighbors (lines 23 and 33). In contrast, all other phases use up to O(|X|) messages (which are circulated from the leaves to the root of the pseudo-tree and vice versa). The size of the messages is bounded by the Utility Propagation Phase, where each agent (excluding the root agent) sends a message containing a value for each element of its domain (line 20). All other messages exchanged contain two values (lines 23, 33, and 38). Thus, the maximum size of the messages exchanged at each iteration is at most O(d).

In each iteration, the number of constraint checks performed by each T-DBR agent is O(d²), where d = max_i |D_i|. Proof (Sketch). The number of constraint checks performed by each agent in each iteration is bounded by the operations performed during the Util-Propagation Phase. In this phase, each agent (except the root agent) computes the lower and upper bound utilities for each pair of values of its variable and its parent’s variable (lines 16-17).

6 Related Work

Aside from the incomplete algorithms described in the introduction, researchers have also developed extensions to complete algorithms that trade solution quality for faster runtime. For example, complete search algorithms have mechanisms that allow users to specify absolute or relative error bounds [13, 27]. Researchers have also worked on non-iterative versions of inference-based incomplete DCOP algorithms, with and without quality guarantees [20, 14, 16]. Such methods are, however, unable to refine the initial solution returned. Finally, the algorithm most similar to ours is LS-DPOP [18], which operates on a pseudo-tree performing a local search. However, unlike D-LNS, LS-DPOP operates only in a single iteration, does not change its neighborhood, and does not provide quality guarantees.

7 Experimental Results

We evaluate the D-LNS framework against state-of-the-art incomplete DCOP algorithms, with and without quality guarantees, choosing representative search-, inference-, and region-optimal-based solution approaches. We select the Distributed Stochastic Algorithm (DSA) as a representative of incomplete search-based DCOP algorithms; Max-Sum (MS) and Bounded Max-Sum (BMS) as representatives of inference-based DCOP algorithms; and k- and t-optimal algorithms (KOPT and TOPT) as representatives of region-optimal-based DCOP methods. All algorithms are selected based on their performance and popularity. We run the algorithms using the following implementations: we use the FRODO framework [11] to run MS and DSA (we modified DSA-C in FRODO to DSA-B and set its activation probability), the authors’ code of BMS [20], and the DALO framework [9] for KOPT and TOPT. We systematically evaluate the runtime, solution quality, and network load of the algorithms on binary constraint networks with random, scale-free, and grid topologies, and we evaluate the ability of D-LNS to exploit domain knowledge over distributed meeting scheduling problems.

Table: Approximation ratio, normalized solution quality, and runtime (in ms) reported by each algorithm for problems with 10, 20, 50, and 100 agents; entries are absent for the algorithms that did not complete on the larger instances within the allotted limits.