Learning to Resolve Conflicts for Multi-Agent Path Finding with Conflict-Based Search

12/10/2020 ∙ by Taoan Huang, et al. ∙ University of Southern California

Conflict-Based Search (CBS) is a state-of-the-art algorithm for multi-agent path finding. At the high level, CBS repeatedly detects conflicts and resolves one of them by splitting the current problem into two subproblems. Previous work chooses the conflict to resolve by categorizing the conflict into three classes and always picking a conflict from the highest-priority class. In this work, we propose an oracle for conflict selection that results in smaller search tree sizes than the one used in previous work. However, the computation of the oracle is slow. Thus, we propose a machine-learning framework for conflict selection that observes the decisions made by the oracle and learns a conflict-selection strategy represented by a linear ranking function that imitates the oracle's decisions accurately and quickly. Experiments on benchmark maps indicate that our method significantly improves the success rates, the search tree sizes and runtimes over the current state-of-the-art CBS solver.


1  Introduction

Multi-Agent Path Finding (MAPF) is the problem of finding a set of conflict-free paths for a given number of agents on a given graph that minimizes the sum of costs or the makespan. Although MAPF is NP-hard to solve optimally Yu and LaValle (2013), significant research effort has been devoted to MAPF to support its application in distribution centers Ma et al. (2017a); Hönig et al. (2019), traffic management Dresner and Stone (2008), airplane taxiing Morris et al. (2015); Balakrishnan and Jung (2007) and computer games Ma et al. (2017b).

Conflict-Based Search (CBS) Sharon et al. (2015) is one of the leading algorithms for solving MAPF optimally, and a number of enhancements to CBS have been developed Boyarski et al. (2015); Li et al. (2019a); Felner et al. (2018); Barer et al. (2014). The key idea behind CBS is a bi-level search that resolves conflicts by adding constraints at the high level and replanning paths for agents subject to these constraints at the low level. The high level of CBS is a best-first search on a binary search tree called the constraint tree (CT). To expand a CT node (which consists of a set of paths and a set of constraints on these paths), CBS has to choose a conflict in the current set of paths to resolve and add constraints that prevent this conflict in the child nodes. Picking good conflicts is important: a good conflict-selection strategy can substantially improve the efficiency of CBS by reducing both the CT size and the runtime. Boyarski et al. (2015) proposes to prioritize conflicts by categorizing them into three classes and always picking one from the top class. This strategy has proven efficient Boyarski et al. (2015) and is commonly used for conflict selection in recent research Li et al. (2019a); Felner et al. (2018); Li et al. (2019c). In this paper, we propose a new conflict-selection oracle that results in smaller CT sizes than the one used in previous work but is much more computationally expensive, since it has to compute 1-step lookahead heuristics for each conflict.

To overcome the high computational cost of the oracle, we leverage insights from studies on variable selection for branching in Mixed Integer Linear Programming (MILP) solving and propose to use machine learning (ML) techniques for designing conflict-selection strategies that imitate the oracle's decisions to speed up CBS. Variable selection for branching in MILP is analogous to conflict selection in CBS. In the branch-and-bound algorithm for MILP Wolsey and Nemhauser (1999), a non-leaf node of the search tree is expanded into two child nodes by selecting one of the unassigned variables and splitting its domain with new constraints, while CBS chooses and splits on conflicts. Recent studies Khalil et al. (2016, 2017); He et al. (2014) have shown that data-driven ML approaches for MILP solving are competitive with and can even outperform state-of-the-art commercial solvers.

We borrow such ML tools from MILP solving Khalil et al. (2016) and propose a data-driven framework for designing conflict-selection strategies for CBS. In the first phase of our framework, we observe and record decisions made by the oracle on a set of instances and collect data on features that characterize the conflicts at each CT node. In the second phase, we learn a ranking function for conflicts in a supervised fashion that imitates the oracle but is faster to evaluate. In the last phase, we use the learned ranking function to replace the oracle and select conflicts in CBS to solve unseen instances. Compared to previous work on conflict selection in CBS, our approach is able to discover more efficient rules for conflict selection that significantly improve the success rate and reduce the CT size and the runtime of the search. Our method is flexible since we are able to customize the conflict-selection strategies easily for different environments and do not need to hard-code different rules for different scenarios. Different from recent work on ML-guided MILP solving, we utilize problem-specific features which contain essential information about the conflicts, while previous work only takes MILP-level features (e.g., counts and statistics of variables) into account Khalil et al. (2016, 2017). Another advantage of our offline learning method over training an instance-specific model on-the-fly is that our learned ranking function is able to generalize to instances and graphs unseen during training.

2  MAPF

Given an undirected unweighted underlying graph G = (V, E), the Multi-Agent Path Finding (MAPF) problem is to find a set of conflict-free paths for a set of agents {a_1, ..., a_k}. Each agent a_i is assigned a start vertex s_i and a goal vertex g_i. Time is discretized into time steps, and, at each time step, every agent can either move to an adjacent vertex or wait at its current vertex. The cost of an agent is the number of time steps until it reaches its goal vertex and no longer moves. We consider two types of conflicts: i) a vertex conflict ⟨a_i, a_j, v, t⟩ occurs when agents a_i and a_j are at the same vertex v at time step t; and ii) an edge conflict ⟨a_i, a_j, u, v, t⟩ occurs when agents a_i and a_j traverse the same edge (u, v) in opposite directions between time steps t and t+1. Our objective is to find a set of conflict-free paths that move all agents from their start vertices to their goal vertices with the optimal cost, that is, the minimum sum of all agents' costs.
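As a concrete illustration, the two conflict types above can be detected by scanning a pair of paths step by step. This is a hypothetical sketch; the path and conflict encodings are ours, not the paper's:

```python
# A path is a list of vertices, one per time step; an agent that has reached
# its goal is assumed to stay there forever.

def position(path, t):
    """Vertex of an agent at time step t (agents wait at the goal)."""
    return path[t] if t < len(path) else path[-1]

def first_conflict(path_i, path_j):
    """Return ('vertex', v, t) or ('edge', u, v, t), or None if conflict-free."""
    horizon = max(len(path_i), len(path_j))
    for t in range(horizon):
        vi, vj = position(path_i, t), position(path_j, t)
        if vi == vj:                          # both at the same vertex at time t
            return ('vertex', vi, t)
        if t + 1 < horizon:
            ni, nj = position(path_i, t + 1), position(path_j, t + 1)
            if vi == nj and vj == ni:         # swap along the same edge
                return ('edge', vi, vj, t)
    return None

print(first_conflict([(0, 0), (0, 1), (0, 2)], [(1, 1), (0, 1), (1, 1)]))
# both agents occupy vertex (0, 1) at time step 1
```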

3  Background and Related Work

In this section, we first provide a brief introduction to CBS and its variants. Then, we summarize other related work using ML in MAPF and MILP solving.

Conflict-Based Search (CBS)

CBS is a bi-level tree search algorithm. It records the following information for each CT node N:

  1. N.constraints: There are two types of constraints: i) a vertex constraint ⟨a_i, v, t⟩, corresponding to a vertex conflict, prohibits agent a_i from being at vertex v at time step t; and ii) an edge constraint ⟨a_i, u, v, t⟩, corresponding to an edge conflict, prohibits agent a_i from moving from vertex u to vertex v between time steps t and t+1.

  2. N.solution: A solution of N consists of a set of individually cost-minimal paths for all agents respecting the constraints in N.constraints. An individually cost-minimal path for an agent is the cost-minimal path between its start and goal vertices assuming it is the only agent in the graph.

  3. N.cost: the cost of N, defined as the sum of the costs of the paths in N.solution.

  4. N.conflicts: the set of conflicts in N.solution.

At the high level, CBS starts with a root node whose set of constraints is empty and expands the CT in a best-first manner by always expanding a node N with the lowest cost N.cost. After choosing a node N to expand, CBS identifies the set of conflicts N.conflicts in N.solution. If there are none, CBS terminates and returns N.solution. Otherwise, CBS (by default) randomly picks one of the conflicts to resolve and adds two child nodes of N to the CT: depending on the type of the conflict, it imposes a vertex or edge constraint on one of the two conflicting agents in one child node and the corresponding constraint on the other conflicting agent in the other child node. At the low level, it replans the paths in N.solution to accommodate the newly added constraints, if necessary. CBS guarantees completeness by exploring both ways of resolving each conflict and optimality by performing best-first searches at both its high and low levels.
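The expansion step described above, where one chosen conflict yields one constraint per child node, can be sketched as follows. The tuple encodings of conflicts and constraints are illustrative assumptions, not the paper's data structures:

```python
def split(conflict):
    """Return the two constraints (one per child CT node) that resolve a conflict."""
    if conflict[0] == 'vertex':               # ('vertex', a_i, a_j, v, t)
        _, a_i, a_j, v, t = conflict
        # child 1 forbids a_i from v at t; child 2 forbids a_j from v at t
        return [('vertex', a_i, v, t), ('vertex', a_j, v, t)]
    else:                                     # ('edge', a_i, a_j, u, v, t)
        _, a_i, a_j, u, v, t = conflict
        # the edge is traversed in opposite directions, so u and v are swapped for a_j
        return [('edge', a_i, u, v, t), ('edge', a_j, v, u, t)]

print(split(('vertex', 1, 2, 'C3', 4)))
# [('vertex', 1, 'C3', 4), ('vertex', 2, 'C3', 4)]
```

Exploring both children is exactly what makes CBS complete: every way of resolving the conflict is covered by one of the two constraints.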

The Random Map The Game Map
Runtime CT Size Oracle Time Search Time Runtime CT Size Oracle Time Search Time
CBSH2 + MDD oracle 9.95s 2,362 nodes 0.00s 9.95s 2.3min 952 nodes 0.0min 2.3min
CBSH2 + WDG-lookahead oracle 24.89s 746 nodes 21.34s 3.55s 19.8min 565 nodes 19.0min 0.8min
CBSH2 + conflict-count oracle 12.13s 632 nodes 9.52s 2.61s 27.4min 2,252 nodes 23.4min 4.0min
Our Solver 6.19s 998 nodes 0.88s 5.31s 1.6min 754 nodes 0.2min 1.4min
Table 1: Performance of CBSH2 with the three conflict-selection oracles and of our solver. Oracle time is the total time that the oracle takes per instance. Search time is the runtime minus the oracle time. All entries are averages taken over the instances that are solved by all solvers.

Variants of CBS

CBS chooses conflicts randomly, but this conflict-selection strategy can be improved. Improved CBS (ICBS) Boyarski et al. (2015) categorizes conflicts into three types to prioritize them. A conflict is cardinal iff, when CBS uses the conflict to split CT node N, the costs of both resulting child nodes are strictly larger than N.cost. A conflict is semi-cardinal iff the cost of one child node is strictly larger than N.cost and the cost of the other child node is equal to N.cost. A conflict is non-cardinal otherwise. By first resolving cardinal conflicts, then semi-cardinal conflicts and finally non-cardinal conflicts, CBS improves its efficiency since it increases the lower bound on the optimal cost more quickly by generating child nodes with larger costs. ICBS uses Multi-Valued Decision Diagrams (MDDs) to classify conflicts. An MDD for agent a_i is a directed acyclic graph consisting of all cost-minimal paths from s_i to g_i of a given cost that respect the current constraints. The nodes at depth t of the MDD are exactly the vertices that agent a_i could occupy at time step t when following one of its cost-minimal paths. A vertex (edge) conflict is cardinal iff the contested vertex (edge) is the only vertex at depth t (the only edge from depth t to depth t+1) in the MDDs of both agents. Li et al. (2019b) proposes to add disjoint constraints to the two child nodes when expanding a CT node in CBS and to prioritize conflicts based on the number of singletons in, or the widths of, the MDDs of both agents.
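The MDD-based classification just described can be sketched for vertex conflicts as follows, representing an MDD simply as a list of per-depth vertex sets (an assumed simplification of the actual MDD data structure):

```python
def level(mdd, t):
    # past the last level, the agent sits at its goal (the sole final vertex)
    return mdd[t] if t < len(mdd) else mdd[-1]

def classify_vertex_conflict(mdd_i, mdd_j, v, t):
    """Return 'cardinal', 'semi-cardinal' or 'non-cardinal' for a conflict at (v, t)."""
    narrow_i = level(mdd_i, t) == {v}   # v is the only vertex at depth t for a_i
    narrow_j = level(mdd_j, t) == {v}   # ... and for a_j
    if narrow_i and narrow_j:
        return 'cardinal'               # both children's costs must increase
    if narrow_i or narrow_j:
        return 'semi-cardinal'          # only one agent is forced through v
    return 'non-cardinal'

mdd_i = [{'A'}, {'B'}, {'C'}]           # a_i has a single cost-minimal path
mdd_j = [{'D'}, {'B', 'E'}, {'C'}]      # a_j can avoid B at depth 1
print(classify_vertex_conflict(mdd_i, mdd_j, 'B', 1))   # semi-cardinal
```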

Another line of research focuses on speeding up CBS by calculating a tighter lower bound on the optimal cost to guide the high-level search. When expanding a tree node N, CBSH Felner et al. (2018) uses the CG heuristic, which builds a conflict graph (CG) whose vertices represent agents and whose edges represent cardinal conflicts in N.conflicts. The lower bound on the optimal cost within the subtree rooted at N is then guaranteed to increase by at least the size of the minimum vertex cover of this CG. We refer to this increment as the h-value of the CT node. Based on CBSH, CBSH2 Li et al. (2019a) uses the DG and WDG heuristics that generalize CG and compute h-values for CT nodes using (weighted) pairwise dependency graphs that take semi-cardinal and non-cardinal conflicts into account besides cardinal ones. CBSH2 with the WDG heuristic is the current state-of-the-art CBS solver for MAPF Li et al. (2019a).

To the best of our knowledge, conflict prioritization other than via MDDs has not yet been explored. Barer et al. (2014) proposes a number of heuristics to prioritize CT nodes for the high-level search, including ones based on the number of conflicts, the number of conflicting agents and the number of conflicting pairs of agents. However, that work uses conflict-related metrics to select CT nodes, while we learn to select conflicts.

Other Related Work

ML techniques are not often applied to MAPF. Sartoretti et al. (2019) proposes a reinforcement-learning framework for learning decentralized policies for agents offline to avoid the cost of planning online. Our work differs since we focus on search algorithms and use ML to find efficient and flexible conflict-selection strategies to speed them up. Furthermore, our ML model is simple and easy to implement, without the need to train and fine-tune a deep neural network.

Using ML to speed up search has been explored in the context of MILP solving. Khalil et al. (2016) uses ML to design strategies for branching that mimic strong branching. Our overall framework is similar to Khalil et al. (2016) but different in several aspects. Instead of collecting training data and learning a model online, we collect training data and learn a model offline. We leverage insights from existing heuristics for computing -values to design problem-specific labels and features for learning. Finally, once our model is learned, it performs well on unseen instances while Khalil et al. (2016) learns instance-specific models. Their line of work also includes learning when to run primal heuristics to find incumbents in a tree search Khalil et al. (2017) and learning how to order nodes adaptively for branch-and-bound algorithms He et al. (2014).

4  Oracles for Conflict Selection

Given a MAPF instance, at a particular CT node N with the set of conflicts N.conflicts, an oracle for conflict selection is a ranking function that takes N.conflicts as input, calculates a real-valued score per conflict and outputs the ranks determined by the scores. We say that CBS follows an oracle for conflict selection iff CBS builds the CT by always resolving the conflict with the highest rank. We define the MDD oracle to be the one proposed by Boyarski et al. (2015) that uses MDDs to rank conflicts. Given a CT node N, the MDD oracle ranks the conflicts in N.conflicts in the order of cardinal conflicts, semi-cardinal conflicts and non-cardinal conflicts, breaking ties in favor of conflicts at the smallest time step and breaking remaining ties randomly.

Next, we define the WDG-lookahead oracle and the conflict-count oracle, which both calculate 1-step lookahead scores by using, for each conflict, the two child nodes of N that would result if the conflict were resolved at N. Given a CT node N, the WDG-lookahead oracle computes a score for each conflict from c_1 + h_1 and c_2 + h_2, where c_1 and c_2 would be the costs of the two child nodes of N and h_1 and h_2 would be the h-values given by the WDG heuristic of the two child nodes of N if the conflict were resolved at N. It then outputs the ranks determined by the descending order of the scores (i.e., the highest rank for the highest score). The WDG-lookahead oracle thus chooses the conflict that results in the tightest lower bound on the optimal cost in the child nodes. We use the WDG heuristic to compute the h-values since it is the state of the art. The intuition behind this oracle is that the sum of the cost and the h-value of a node is a lower bound on the cost of any solution found in the subtree rooted at the node, and, thus, we want CBS to increase the lower bound as much as possible to find a solution quickly.

Given a CT node N, the conflict-count oracle computes the score n_1 + n_2 for each conflict, where n_1 and n_2 would be the numbers of conflicts in the two child nodes of N if the conflict were resolved at N. It then outputs the ranks determined by the increasing order of the scores (i.e., the highest rank for the lowest score). The conflict-count oracle thus chooses the conflict that results in the smallest number of conflicts in the child nodes.
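Both lookahead oracles can be sketched as ranking functions over a node's conflicts. The helper `expand_children`, which returns (cost, h-value, number of conflicts) for each hypothetical child, and the exact way the two children's values are combined into one score (a sum here) are assumptions for illustration:

```python
def rank_by_lower_bound(conflicts, node, expand_children):
    """WDG-lookahead-style oracle: prefer the conflict whose children have the
    largest lower bounds cost + h (tightest bound on the optimal cost)."""
    def score(conflict):
        (c1, h1, _), (c2, h2, _) = expand_children(node, conflict)
        return (c1 + h1) + (c2 + h2)
    return sorted(conflicts, key=score, reverse=True)   # highest score first

def rank_by_conflict_count(conflicts, node, expand_children):
    """Conflict-count-style oracle: prefer the conflict whose children contain
    the fewest remaining conflicts."""
    def score(conflict):
        (_, _, n1), (_, _, n2) = expand_children(node, conflict)
        return n1 + n2
    return sorted(conflicts, key=score)                 # lowest score first

# toy 1-step lookahead table: conflict -> ((c1, h1, n1), (c2, h2, n2))
lookahead = {'X': ((10, 2, 5), (10, 1, 7)), 'Y': ((11, 3, 2), (10, 2, 1))}
expand = lambda node, c: lookahead[c]
print(rank_by_lower_bound(['X', 'Y'], None, expand))    # ['Y', 'X']
```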

We use CBSH2 with the WDG heuristic as our search algorithm and run it with the three oracles on (1) a random map, a four-neighbor grid map with randomly generated blocked cells, and (2) the game map “lak503d” Sturtevant (2012), a four-neighbor grid map with blocked cells. The figures of the maps are shown in Table 4. The experiments are conducted on 2.4 GHz Intel Core i7 CPUs with 16 GB RAM. We set the runtime limit to 20 minutes for the random map and 1 hour for the game map. We fix the number of agents for each map and run the solvers on 50 instances per map. Following Stern et al. (2019), the start and goal vertices are randomly paired among all vertices in each map's largest connected component for each instance throughout the paper. In Table 1, we present the performance of the three oracles as well as our solver. All entries are averages taken over the instances that are solved by all solvers. We consider the CT size since a small CT size implies a small runtime, and we first look at the performance of CBSH2 with the three oracles. The conflict-count oracle is the best for the random map, followed closely by the WDG-lookahead oracle. The WDG-lookahead oracle is the best for the game map. Overall, the WDG-lookahead oracle is the best. Therefore, in the rest of the paper, we mainly focus on learning a ranking function to imitate the WDG-lookahead oracle. Table 1 shows that, by learning to imitate this oracle, our solver achieves the best performance in terms of runtime, even though it induces a larger CT than CBSH2 with the WDG-lookahead oracle. We introduce our machine learning methodology in Section 5 and show experimental results in Section 6. We use the ML-guided solver introduced in Section 6 to generate the results of our solver in Table 1.

Feature Descriptions Count
Types of the conflict: binary indicators for edge conflicts, vertex conflicts, cardinal conflicts, semi-cardinal conflicts and non-cardinal conflicts. 5
Number of conflicts involving agent a_i (a_j) that have been selected and resolved so far during the search: their min., max. and sum. 3
Number of conflicts that have been selected and resolved so far during the search at vertex v (u): their min., max. and sum. 3
Number of conflicts that agent a_i (a_j) is involved in: their min., max. and sum. 3
Time step t of the conflict. 1
Ratio of t and the makespan of the solution. 1
Cost of the path of agent a_i (a_j): their min., max., sum, absolute difference and ratio. 5
Difference of the costs of the path of agent a_i (a_j) and its individually cost-minimal path: their min. and max. 2
Ratio of the cost of the path of agent a_i (a_j) and the cost of its individually cost-minimal path: their min. and max. 2
Difference of the cost of the path of agent a_i (a_j) and : their min. and max. 2
Ratio of the cost of the path of agent a_i (a_j) and : their min. and max. 2
Ratio of the cost of the path of agent a_i (a_j) and : their min. and max. 2
Binary indicators whether none (at least one) of agents a_i and a_j has reached its goal by time step t. 2
Number of conflicts such that (). 6
Number of agents such that there exists and such that (). 6
Number of conflicts such that (). 6
Widths of levels of the MDDs for agents a_i and a_j: their min. and max. Li et al. (2019b). 10
Weight of the edge between agents a_i and a_j in the weighted dependency graph Li et al. (2019a). 1
Number of vertices in graph such that (). 5
Table 2: Features of a conflict of a CT node N. Given the underlying graph G = (V, E), let , and define the time-expanded graph as an unweighted graph . Let be the cost of the cost-minimal path between vertices and in and be the minimum distance between vertices and in . For a conflict in N.conflicts, define () and (). For an agent , define . The counts are the numbers of features contributed by the corresponding entries, which add up to 67.

5  Machine Learning Methodology

We now introduce our framework for learning which conflict to resolve in CBS. The key idea is that, by observing and recording the features and ranks of conflicts determined by the scores given by the oracle, we learn a ranking function that ranks the conflicts as similarly as possible to the oracle without actually probing the oracle. Our framework consists of three phases:

  1. Data collection. We obtain two sets of instances, a training set and a test set. For each instance, we obtain a dataset by running the oracle.

  2. Model learning. The training dataset is fed into a machine learning algorithm to learn a ranking function that maximizes the prediction accuracy.

  3. ML-guided search. We replace the oracle with the learned ranking function to rank conflicts in the CBSH2 solver. We run the new solver on randomly generated instances on the same graphs seen during training or unseen graphs.

Data Collection

The first task in our pipeline is to construct a training dataset from which we can learn a model that imitates the oracle's output. We first fix the graph underlying the instances that we want to solve and the number of agents. The number of agents is only fixed during the data collection and model learning phases. We obtain two sets of instances, one for training and one for testing. A dataset is obtained for each instance, and the final training (test) dataset is obtained by concatenating these datasets. To obtain the dataset for an instance, the WDG-lookahead oracle is run at each CT node to produce the ranking of its conflicts. The data consists of: (i) a set of CT nodes; (ii) the set of conflicts of each CT node; (iii) binary labels for all conflicts, transformed from the oracle's ranking of the conflicts; and (iv) a feature map that describes each conflict at each CT node with 67 features. The test dataset is used to evaluate the prediction accuracy of the learned model.

Game Random Maze Room Warehouse City
Number of agents in instances for data collection 100 18 30 22 30 180
Training on
the same map
Swapped pairs (%) 4.40 10.89 4.5 12.58 5.78 2.89
Top pick accuracy (%) 60.16 69.03 87.69 67.56 84.93 83.05
Training on
the other maps
Swapped pairs (%) 7.45 19.64 21.98 15.24 6.08 7.66
Top pick accuracy (%) 53.13 50.44 49.90 66.80 86.85 78.57

Table 3: Numbers of agents in instances for data collection, test losses and accuracies. The swapped pairs (%) are the fractions of swapped pairs averaged over all test CT nodes, and the top-pick accuracies are the accuracies of the ranking function picking the conflicts labeled 1 in the test dataset.
Map Success Rate (%) Runtime (min) CT Size (nodes) PAR10 Score (min)
CBSH2 ML-S ML-O CBSH2 ML-S ML-O CBSH2 ML-S ML-O CBSH2 ML-S ML-O
Warehouse 30 93 96 (93) 96 (93) 0.20 0.06 0.07 1,154 294 378 7.18 4.14 4.25
36 72 86 (71) 88 (71) 0.54 0.24 0.19 3110 980 977 28.46 14.56 12.81
42 55 68 (55) 70 (55) 1.27 0.65 0.38 6,834 2,874 1,781 45.70 32.61 30.56
48 17 32 (17) 32 (17) 1.99 1.12 0.56 9,646 5,357 2,221 83.34 68.64 68.48
54 6 16 (6) 15 (6) 2.82 1.70 1.23 12,816 8,886 6,427 94.17 84.42 85.36
Improvement over CBSH2 0 49.8% 64.4% 0 56.6% 68.2% 0 22.9% 22.7%
Room 22 83 91 (83) 91 (83) 0.61 0.49 0.51 7,851 5,648 5,888 17.51 9.76 9.83
26 47 57 (47) 55 (46) 1.32 1.01 1.14 15,791 11,087 12,108 53.68 43.97 45.91
30 28 36 (28) 34 (28) 2.08 1.21 1.45 21,279 10,284 12,117 73.22 65.32 67.28
32 17 24 (17) 24 (17) 1.88 1.39 1.70 22,152 13,943 16,327 83.77 77.02 77.14
34 9 14 (9) 14 (9) 3.99 2.70 3.24 39,447 22,611 28,392 91.36 86.56 86.63
Improvement over CBSH2 0 26.6% 21.3% 0 35.2% 32.0% 0 14.0% 11.6%
Maze 30 90 91 (90) 90 (90) 0.54 0.47 0.42 500 373 289 10.49 9.51 10.38
32 84 87 (84) 87 (84) 0.49 0.39 0.42 519 427 397 16.42 13.59 13.60
36 80 81 (80) 82 (79) 0.73 0.65 0.57 1,200 1,067 910 20.66 19.68 18.68
40 56 60 (56) 62 (56) 0.85 0.80 0.75 1,194 1,099 1,026 44.47 40.79 38.85
44 45 49 (45) 50 (45) 1.08 1.06 0.87 1,389 1,343 1,055 54.49 50.82 49.75
Improvement over CBSH2 0 10.3% 18.3% 0 13.0% 24.4% 0 6.3% 8.2%
Random 18 95 95 (95) 94 (94) 0.32 0.23 0.31 5032 3,105 4,148 5.32 5.27 6.29
20 88 91 (88) 91 (88) 0.43 0.30 0.36 7,834 3,829 4,595 12.38 9.37 9.48
23 74 80 (74) 80 (74) 0.96 0.56 0.78 17,952 8,118 11,555 26.71 20.60 20.81
26 39 48 (39) 45 (39) 1.27 0.87 1.24 19,236 8,053 13,301 61.50 52.75 55.82
29 17 27 (17) 24 (17) 4.04 2.74 3.39 63,661 35,485 44,179 83.69 74.07 77.02
Improvement over CBSH2 0 33.4% 17.6% 0 49.3% 35.8% 0 15.1% 10.3%
City 180 78 85 (76) 84 (75) 3.53 2.43 2.46 859 468 476 134.99 93.04 99.87
200 76 82 (75) 83 (75) 4.78 5.08 4.13 849 702 490 147.96 113.53 106.78
230 57 68 (56) 64 (54) 4.86 4.26 4.24 835 444 449 261.36 196.99 220.50
260 44 54 (44) 54 (43) 11.69 9.75 9.55 1,883 1,178 1,219 341.37 282.00 282.58
290 18 27 (16) 28 (17) 11.65 8.45 8.75 1,966 1,372 1,429 494.10 441.87 436.70
Improvement over CBSH2 0 24.0% 25.2% 0 47.3% 46.4% 0 19.3% 20.2%
Game 100 68 77 (68) 75 (68) 6.76 5.49 5.94 4,100 3,114 3,341 196.66 145.09 156.74
110 59 67 (59) 67 (59) 6.58 6.03 5.92 3,978 3,652 3,596 249.89 202.61 202.39
120 35 44 (35) 44 (34) 9.59 8.76 8.80 5,351 4,643 4,691 393.27 341.63 341.82
125 34 41 (34) 42 (34) 9.32 7.77 7.58 5,145 4,153 4,054 399.18 358.91 353.32
130 19 26 (19) 25 (18) 4.83 5.00 4.85 2,486 2,498 2,338 487.01 447.22 453.05
Improvement over CBSH2 0 16.6% 17.3% 0 22.7% 23.3% 0 16.1% 16.4%
Table 4: Success rates, average runtimes and CT sizes of instances solved by all solvers, and PAR10 scores for different numbers of agents on 6 maps. For the success rates of ML-S and ML-O, the fractions of instances solved by both our solver and CBSH2 are given in parentheses (bolded if the solver solves all instances that CBSH2 solves). For each map, we report the percentage improvement of our solvers over CBSH2 on the runtime and CT size on instances solved by all solvers and on the PAR10 score.

Features

We collect a 67-dimensional feature vector that describes each conflict with respect to CT node N. The features of a conflict in our implementation are summarized in Table 2. They consist of (1) the properties of the conflict, (2) statistics of CT node N, the conflicting agents a_i and a_j and the contested vertex or edge w.r.t. the current solution, (3) the frequency of a conflict being resolved for a vertex or an agent, and (4) features of the MDDs and the weighted dependency graph. For each feature, we normalize its value to the range [0, 1] across all conflicts in N.conflicts. All features of a given conflict can be computed efficiently.
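The per-node min-max normalization can be sketched as follows (a hypothetical helper, applied to the raw feature vectors of all conflicts of one CT node):

```python
def normalize_features(feature_vectors):
    """feature_vectors: one list of raw feature values per conflict.
    Rescale each feature dimension to [0, 1] across the node's conflicts."""
    dims = len(feature_vectors[0])
    normalized = []
    for phi in feature_vectors:
        row = []
        for d in range(dims):
            column = [v[d] for v in feature_vectors]
            lo, hi = min(column), max(column)
            # a constant feature carries no ranking signal at this node
            row.append((phi[d] - lo) / (hi - lo) if hi > lo else 0.0)
        normalized.append(row)
    return normalized

print(normalize_features([[2, 10], [4, 10], [6, 10]]))
# [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
```

Normalizing per node keeps the learned linear weights comparable across CT nodes whose raw feature scales differ.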

Labels

We aim to label each conflict in N.conflicts such that conflicts ranked higher by the oracle have larger labels. Instead of using the full ranking provided by the oracle, we use a binary labeling scheme similar to the one proposed by Khalil et al. (2016). We assign label 1 to each conflict strictly among the top 20% of the full ranking and label 0 to the rest, with one exception: when more than 20% of the conflicts share the same highest score, we assign label 1 to those conflicts and label 0 to the rest. By doing so, we ensure that at least one conflict is labeled 1 and that conflicts with the same score have the same label. This labeling scheme relaxes the definition of "top" conflicts, which allows the learning algorithm to focus only on high-ranking conflicts and avoids the irrelevant task of learning the correct ranking of conflicts with low scores.
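One plausible reading of this labeling scheme is sketched below; the exact tie handling at the 20% cutoff is our assumption based on the description above:

```python
import math

def label_conflicts(scores):
    """scores: oracle score per conflict (higher = better). Returns 0/1 labels."""
    n = len(scores)
    k = max(1, math.ceil(0.2 * n))                # size of the top 20%, at least 1
    cutoff = sorted(scores, reverse=True)[k - 1]  # k-th highest score
    top = max(scores)
    if scores.count(top) > k:                     # >20% tie for the best score
        cutoff = top                              # label exactly those conflicts 1
    return [1 if s >= cutoff else 0 for s in scores]

print(label_conflicts([5, 5, 3, 2, 1]))   # [1, 1, 0, 0, 0]
```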

Model Learning

We learn a linear ranking function c ↦ wᵀφ(c) with parameters w that minimizes the loss function

    Σ_N ℓ(y, s) + λ‖w‖²,

where y is the ground-truth label vector, s is the vector of predicted scores resulting from applying the ranking function to the feature vectors of every conflict in N.conflicts, ℓ is a loss function measuring the difference between the ground-truth labels and the predicted scores, and λ is a regularization parameter. The loss function is based on a pairwise loss that has been used in the literature Joachims (2002). Specifically, we consider the set of pairs P = {(c_i, c_j) : y_i > y_j}, where y_i is the ground-truth label of conflict c_i in label vector y. The loss function is the fraction of swapped pairs, defined as

    ℓ(y, s) = |{(c_i, c_j) ∈ P : s_i ≤ s_j}| / |P|.
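The fraction of swapped pairs can be computed directly from the label and score vectors. Whether tied predicted scores count as swapped is a convention; we count them here:

```python
def swapped_pair_fraction(labels, scores):
    """Fraction of ordered pairs (labels[i] > labels[j]) whose predicted
    scores disagree with that order (ties counted as swapped)."""
    pairs = [(i, j) for i in range(len(labels)) for j in range(len(labels))
             if labels[i] > labels[j]]
    if not pairs:
        return 0.0
    swapped = sum(1 for i, j in pairs if scores[i] <= scores[j])
    return swapped / len(pairs)

print(swapped_pair_fraction([1, 1, 0, 0], [0.9, 0.2, 0.5, 0.1]))  # 0.25
```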

We use an open-source package made available by Joachims (2006) that implements a Support Vector Machine (SVM) approach Joachims (2002); it minimizes an upper bound on the loss, since the loss itself is NP-hard to minimize.

ML-Guided Search

After offline data collection and ranking-function learning, we replace the oracle for conflict selection in CBS with the learned function. At each CT node N, we first compute the feature vector φ(c) for each conflict c in N.conflicts and pick the conflict with the maximum score wᵀφ(c). The time for conflict selection at a node thus grows linearly with the number of its conflicts. Even though conflict selection with the MDD oracle is cheaper, we will show in our experiments that we are able to outperform CBSH2 with the MDD oracle in terms of both the CT size and the runtime.
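Conflict selection with the learned function then reduces to an arg-max of the linear score; `compute_features` is an assumed helper implementing the features of Table 2:

```python
def select_conflict(node, conflicts, w, compute_features):
    """Pick the conflict with the maximum learned score w . phi(c)."""
    def score(conflict):
        phi = compute_features(node, conflict)
        return sum(wi * fi for wi, fi in zip(w, phi))   # linear score
    return max(conflicts, key=score)

# toy example with 2 features instead of 67
w = [1.0, -0.5]
features = {'X': [0.2, 0.8], 'Y': [0.9, 0.4]}
print(select_conflict(None, ['X', 'Y'], w, lambda n, c: features[c]))   # Y
```

No lookahead expansion is needed at selection time, which is exactly where the speedup over the lookahead oracle comes from.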

Figure 1: Success rates within the runtime limit.

6  Experimental Results

In this section, we demonstrate the efficiency and effectiveness of our solver, ML-guided CBS, through extensive experiments. We use the C++ code for CBSH2 with the WDG heuristic made available by Li et al. (2019a) as our CBS version. We compare against CBSH2 with the MDD oracle as the baseline since it is the most commonly used conflict-selection oracle. The reason why we choose CBSH2 with the WDG heuristic over CBS, ICBS or CBSH2 with the CG or DG heuristics is that it performs best, as demonstrated in Li et al. (2019a). All reported results are averaged over 100 randomly generated instances.

Our experiments provide answers to the following questions: i) If the graph underlying the instances is known in advance, can we learn a model that performs well on unseen instances on the same graph with different numbers of agents? ii) If the graph underlying the instances is unknown, can we learn a model from other graphs that performs well on instances on that graph?

We use a set of six four-neighbor grid maps of different sizes and structures as the graphs underlying the instances and evaluate our algorithms on them. The set includes (1) a warehouse map Li et al. (2020), a grid map with 100 rectangle obstacles; (2) the room map “room-32-32-4” Stern et al. (2019), a grid map with 64 rooms connected by single-cell doors; (3) the maze map “maze-128-128-2” Stern et al. (2019), a grid map with two-cell-wide corridors; (4) the random map; (5) the city map “Paris_1_256” Stern et al. (2019), a grid map of Paris; and (6) the game map. The figures of the maps are shown in Table 4. For each grid map, we collect data from randomly generated training instances and test instances with a fixed number of agents. We learn two ranking functions for each map: one trained on 5,000 CT nodes i.i.d. sampled from the training dataset collected by solving instances on the same map, and another trained on 5,000 CT nodes sampled from the training datasets collected by solving instances on the other maps, with 1,000 i.i.d. CT nodes sampled for each of the five other maps. For each map, we denote our solver that uses the ranking function trained on the same map as ML-S and the solver that uses the one trained on the other maps as ML-O. We train an SVM with a linear kernel Joachims (2002) to obtain each of the ranking functions. We varied the regularization parameter and found that the resulting models perform similarly. We test the learned ranking functions on the test dataset. The numbers of agents in the instances used for data collection, the test losses and the test accuracies of picking the conflicts labeled 1 are reported in Table 3. We varied the numbers of agents for data collection and found that they led to similar performances. In general, the losses of the ranking functions for ML-O are larger, and their accuracies of picking "good" conflicts are lower, than those for ML-S.

We run CBSH2, ML-S and ML-O on randomly generated instances on each of the six maps and vary the number of agents. The runtime limits are set to 60 minutes for the two largest maps (the city and game maps) and 10 minutes for the others. In Table 4, we report the success rates, the average runtimes and CT sizes of instances solved by all solvers, and the PAR10 scores (a commonly used metric for scoring solvers, where runs that exceed the given runtime limit count as 10 times the limit when computing the average runtimes) Bischl et al. (2016) for some numbers of agents on each map, and defer the full table to Appendix A. We plot the success rates on the warehouse and city maps in Figure 1 and the rest in Appendix A. ML-S and ML-O dominate CBSH2 in all metrics on all maps in almost all cases. Overall, CBSH2, ML-S and ML-O solve 3,326 (55.43%), 3,779 (62.98%) and 3,758 (62.63%) of the 6,000 instances we tested, respectively. The improvement of ML-S and ML-O over CBSH2 on instances commonly solved by all solvers is 10.3% to 64.4% for the runtime and 13.0% to 68.2% for the CT size across different maps. For ML-S, even though we learn the ranking function from data collected on instances with a fixed number of agents, the learned function generalizes to instances with larger numbers of agents on the same map and outperforms CBSH2. ML-O, without seeing the actual map being tested on during training, is competitive with ML-S and even outperforms it sometimes on the warehouse, city, maze and game maps. The results suggest that our method, when focusing on solving instances on a particular grid map, can outperform CBSH2 significantly and, when faced with a new grid map, still gains an advantage. To demonstrate the efficiency of our solver further, we show the success rates for different runtime limits, varying the number of agents on each map, in Appendix A. Typically, the three solvers tie on the easy instances, but ML-S and ML-O gain an advantage on the hard ones, even more so for larger numbers of agents.
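The PAR10 score used in Table 4 can be computed as follows (a small sketch of the metric, not the paper's evaluation code):

```python
def par10(runtimes, limit):
    """runtimes: runtime per instance, or None if the run exceeded the limit.
    Unsolved runs are charged 10x the runtime limit, then runtimes are averaged."""
    penalized = [t if t is not None else 10 * limit for t in runtimes]
    return sum(penalized) / len(penalized)

print(par10([2.0, 5.0, None], limit=10.0))   # (2 + 5 + 100) / 3
```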

Next, we look at the feature importance of the learned ranking functions. For ML-O, the six ranking functions have nine features in common among their eleven features with the largest absolute weights; thus, they are similar with respect to the important features. We take the average of each weight over the six functions and sort the averages in decreasing order of their absolute values. The plot and the full list of features with their indices are included in Appendix B. The top eight features are (1) the weight of the edge between the two conflicting agents in the weighted dependency graph (WDG) (feature 67); (2) the binary indicator for non-cardinal conflicts (feature 5); (3) the maximum over the two conflicting agents of the difference between the cost of the agent's path and the cost of its individually cost-minimal path (feature 23); (4) the binary indicator for cardinal conflicts (feature 3); (5) the minimum over the two conflicting agents of the number of conflicts the agent is involved in (feature 12); (6-8) the minimum (feature 6), the maximum (feature 7) and the sum (feature 8) of the numbers of conflicts involving each of the two conflicting agents that have been selected and resolved. These features mainly belong to three categories: features related to cardinal conflicts (features 3 and 5), to the WDG (feature 67) and to the frequency with which a conflict has been resolved for an agent (features 6, 7 and 8), where the first category is commonly used in previous work on CBS and the third is an analogue of the branching-variable pseudocosts Achterberg et al. (2005) in MILP solving. For ML-S, we plot the weights of the features in decreasing order of their absolute values for each of the six ranking functions in Appendix B. For the random map and the room map, features 5, 67 and 3 are the top three features. Among the top five features for the two largest maps (the city and game maps), three are features 6, 7 and 8. For the maze map, the top feature is the maximum of the widths of a level of the MDDs of the two conflicting agents, followed by features 67 and 23. For the warehouse map, the top three features are features 6, 23 and 8. As can be seen, most of the important features for each individual map also belong to the three categories above. In Appendix B, we present the results for feature selection. We show that we are able to achieve slightly better results with certain combinations of the important features.
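The importance analysis above amounts to sorting the features of a linear ranking function by the absolute values of their learned weights. A sketch, with illustrative (not actual) feature names and weights:

```python
def top_features(weights, names, k=5):
    """Return the k features with the largest absolute weights,
    in decreasing order of |weight|."""
    order = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))
    return [(names[i], weights[i]) for i in order[:k]]

# Hypothetical weights for four of the features discussed in the text:
weights = [0.2, -1.5, 0.7, 0.1]
names = ["cardinal", "non_cardinal", "wdg_edge_weight", "time_step"]
print(top_features(weights, names, k=2))
# [('non_cardinal', -1.5), ('wdg_edge_weight', 0.7)]
```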

7  Conclusions and Future Directions

In this paper, we proposed the first ML framework for conflict selection in CBS. Our objective was to imitate the decisions made by an oracle that picks the conflict producing the tightest lower bound on the optimal cost in its child nodes. The extensive experimental results showed that our learned ranking functions generalize across different numbers of agents on a fixed graph (map) as well as to unseen graphs. We are also interested in discovering better oracles for conflict selection from which to learn. We expect our method to work well with other newly developed techniques, such as symmetry-breaking techniques Li et al. (2020), and it remains future work to incorporate those techniques into the framework of CBSH2 together with our ML-guided conflict selection.

Acknowledgments

We thank Jiaoyang Li, Peter J. Stuckey and Daniel Harabor for helpful discussions. The research at the University of Southern California was supported by the National Science Foundation (NSF) under grant numbers 1409987, 1724392, 1817189, 1837779 and 1935712 as well as a gift from Amazon.

References

  • T. Achterberg, T. Koch, and A. Martin (2005) Branching rules revisited. Operations Research Letters 33 (1), pp. 42–54. Cited by: 6  Experimental Results.
  • H. Balakrishnan and Y. Jung (2007) A framework for coordinated surface operations planning at Dallas-Fort Worth International Airport. In AIAA Guidance, Navigation and Control Conference and Exhibit, pp. 6553. Cited by: 1  Introduction.
  • M. Barer, G. Sharon, R. Stern, and A. Felner (2014) Suboptimal variants of the conflict-based search algorithm for the multi-agent pathfinding problem. In Annual Symposium on Combinatorial Search, pp. 19–27. Cited by: 1  Introduction, Variants of CBS.
  • B. Bischl, P. Kerschke, L. Kotthoff, M. Lindauer, Y. Malitsky, A. Fréchette, H. Hoos, F. Hutter, K. Leyton-Brown, K. Tierney, et al. (2016) ASlib: a benchmark library for algorithm selection. Artificial Intelligence 237, pp. 41–58. Cited by: 6  Experimental Results.
  • E. Boyarski, A. Felner, R. Stern, G. Sharon, D. Tolpin, O. Betzalel, and E. Shimony (2015) ICBS: Improved conflict-based search algorithm for multi-agent pathfinding. In International Joint Conference on Artificial Intelligence, pp. 442–449. Cited by: 1  Introduction, Variants of CBS, 4  Oracles for Conflict Selection.
  • K. Dresner and P. Stone (2008) A multiagent approach to autonomous intersection management. Journal of Artificial Intelligence Research 31, pp. 591–656. Cited by: 1  Introduction.
  • A. Felner, J. Li, E. Boyarski, H. Ma, L. Cohen, T. S. Kumar, and S. Koenig (2018) Adding heuristics to conflict-based search for multi-agent path finding. In International Conference on Automated Planning and Scheduling, pp. 83–87. Cited by: 1  Introduction, Variants of CBS.
  • H. He, H. Daumé III, and J. M. Eisner (2014) Learning to search in branch and bound algorithms. In Advances in Neural Information Processing Systems, pp. 3293–3301. Cited by: 1  Introduction, Other Related Work.
  • W. Hönig, S. Kiesel, A. Tinka, J. W. Durham, and N. Ayanian (2019) Persistent and robust execution of MAPF schedules in warehouses. IEEE Robotics and Automation Letters 4 (2), pp. 1125–1131. Cited by: 1  Introduction.
  • T. Joachims (2002) Optimizing search engines using clickthrough data. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 133–142. Cited by: Model Learning, 6  Experimental Results.
  • T. Joachims (2006) Training linear SVMs in linear time. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 217–226. Cited by: Model Learning.
  • E. B. Khalil, B. Dilkina, G. L. Nemhauser, S. Ahmed, and Y. Shao (2017) Learning to run heuristics in tree search. In International Joint Conference on Artificial Intelligence, pp. 659–666. Cited by: 1  Introduction, 1  Introduction, Other Related Work.
  • E. B. Khalil, P. Le Bodic, L. Song, G. L. Nemhauser, and B. Dilkina (2016) Learning to branch in mixed integer programming. In AAAI Conference on Artificial Intelligence, pp. 724–731. Cited by: 1  Introduction, 1  Introduction, Other Related Work, Labels.
  • J. Li, A. Felner, E. Boyarski, H. Ma, and S. Koenig (2019a) Improved heuristics for multi-agent path finding with conflict-based search. In International Joint Conference on Artificial Intelligence, pp. 442–449. Cited by: Appendix C, Table 2, 1  Introduction, Variants of CBS, 6  Experimental Results.
  • J. Li, G. Gange, D. Harabor, P. J. Stuckey, H. Ma, and S. Koenig (2020) New techniques for pairwise symmetry breaking in multi-agent path finding. In Proceedings of the International Conference on Automated Planning and Scheduling, Vol. 30, pp. 193–201. Cited by: 6  Experimental Results, 7  Conclusions and Future Directions.
  • J. Li, D. Harabor, P. J. Stuckey, H. Ma, and S. Koenig (2019b) Disjoint splitting for multi-agent path finding with conflict-based search. In International Conference on Automated Planning and Scheduling, pp. 279–283. Cited by: Table 2, Variants of CBS.
  • J. Li, P. Surynek, A. Felner, H. Ma, T. S. Kumar, and S. Koenig (2019c) Multi-agent path finding for large agents. In AAAI Conference on Artificial Intelligence, pp. 7627–7634. Cited by: 1  Introduction.
  • H. Ma, J. Li, T. S. Kumar, and S. Koenig (2017a) Lifelong multi-agent path finding for online pickup and delivery tasks. In International Conference on Autonomous Agents and Multi-Agent Systems, pp. 837–845. Cited by: 1  Introduction.
  • H. Ma, J. Yang, L. Cohen, T. S. Kumar, and S. Koenig (2017b) Feasibility study: moving non-homogeneous teams in congested video game environments. In Artificial Intelligence and Interactive Digital Entertainment Conference, pp. 270–272. Cited by: 1  Introduction.
  • R. Morris, M. L. Chang, R. Archer, E. V. Cross, S. Thompson, J. Franke, R. Garrett, W. Malik, K. McGuire, and G. Hemann (2015) Self-driving aircraft towing vehicles: a preliminary report. In AI for Transportation Workshop at the AAAI Conference on Artificial Intelligence, Cited by: 1  Introduction.
  • G. Sartoretti, J. Kerr, Y. Shi, G. Wagner, T. S. Kumar, S. Koenig, and H. Choset (2019) PRIMAL: pathfinding via reinforcement and imitation multi-agent learning. IEEE Robotics and Automation Letters 4 (3), pp. 2378–2385. Cited by: Other Related Work.
  • G. Sharon, R. Stern, A. Felner, and N. R. Sturtevant (2015) Conflict-based search for optimal multi-agent pathfinding. Artificial Intelligence 219, pp. 40–66. Cited by: 1  Introduction.
  • R. Stern, N. Sturtevant, A. Felner, S. Koenig, H. Ma, T. Walker, J. Li, D. Atzmon, L. Cohen, T. Kumar, et al. (2019) Multi-agent pathfinding: definitions, variants, and benchmarks. In Annual Symposium on Combinatorial Search, pp. 151–158. Cited by: Appendix C, 4  Oracles for Conflict Selection, 6  Experimental Results.
  • N. R. Sturtevant (2012) Benchmarks for grid-based pathfinding. IEEE Transactions on Computational Intelligence and AI in Games 4 (2), pp. 144–148. Cited by: Appendix C, 4  Oracles for Conflict Selection.
  • L. A. Wolsey and G. L. Nemhauser (1999) Integer and Combinatorial Optimization. Vol. 55, John Wiley & Sons. Cited by: 1  Introduction.
  • J. Yu and S. M. LaValle (2013) Planning optimal paths for multiple robots on graphs. In IEEE International Conference on Robotics and Automation, pp. 3612–3617. Cited by: 1  Introduction.

Appendix:
Learning to Resolve Conflicts for Multi-Agent Path Finding with Conflict-Based Search


Appendix A A  Additional Experimental Results

The success rates on the room, maze, random and game maps are shown in Figure 3, where we can see that the success rates of ML-S and ML-O are both marginally higher than those of CBSH2. Table 5 includes the results on all data points in Figures 1 and 3. The success rates for different runtime limits, varying the number of agents, on the warehouse, room, maze, random, city and game maps are shown in Figures 4, 5, 6, 7, 8 and 9, respectively.

Appendix B B   Feature Importance

For ML-S, we plot the weights of the features in decreasing order of their absolute values for each of the six ranking functions in Figures 10 and 11. For ML-O, we take the average weight of each feature over the six ranking functions and sort the averages in decreasing order of their absolute values. The plot is shown in Figure 12.

The top five features for the warehouse map are: (1) the minimum over the two conflicting agents of the number of conflicts involving the agent that have been selected and resolved; (2) the maximum over the two conflicting agents of the difference between the cost of the agent's path and the time step of the conflict; (3) the sum of the numbers of conflicts involving the two conflicting agents that have been selected and resolved; (4) the minimum over the two conflicting agents of the difference between the cost of the agent's path and the time step of the conflict; (5) the binary indicator for non-cardinal conflicts.

The top five features for the room map are: (1) the binary indicator for non-cardinal conflicts; (2) the weight of the edge between the two conflicting agents in the weighted dependency graph; (3) the binary indicator for cardinal conflicts; (4) the number of empty cells that are two steps away from where the conflict occurs; (5) the minimum over the two conflicting agents of the number of conflicts involving the agent that have been selected and resolved.

The top five features for the maze map are: (1) the maximum of the widths of one level of the MDDs of the two conflicting agents; (2) the weight of the edge between the two conflicting agents in the weighted dependency graph; (3) the maximum over the two conflicting agents of the difference between the cost of the agent's path and the time step of the conflict; (4) the binary indicator for semi-cardinal conflicts; (5) the maximum of the widths of another level of the MDDs of the two conflicting agents.

The top five features for the random map are: (1) the binary indicator for non-cardinal conflicts; (2) the weight of the edge between the two conflicting agents in the weighted dependency graph; (3) the binary indicator for cardinal conflicts; (4) the maximum of the widths of a level of the MDDs of the two conflicting agents; (5) the maximum over the two conflicting agents of the difference between the cost of the agent's path and the time step of the conflict.

The top five features for the city map are: (1) the binary indicator for non-cardinal conflicts; (2) the maximum over the two conflicting agents of the number of conflicts involving the agent that have been selected and resolved; (3) the minimum over the two conflicting agents of the number of conflicts involving the agent that have been selected and resolved; (4) the binary indicator for cardinal conflicts; (5) the sum of the numbers of conflicts involving the two conflicting agents that have been selected and resolved.

The top five features for the game map are: (1) the minimum over the two conflicting agents of the difference between the cost of the agent's path and the time step of the conflict; (2) the maximum over the two conflicting agents of that difference; (3) the maximum over the two conflicting agents of the number of conflicts involving the agent that have been selected and resolved; (4) the minimum of those numbers; (5) the sum of those numbers.

Figure 2: Success rates with feature selection on the warehouse map. The curve for ML-S is partially hidden by the curves for (5) and (-5).

Results on Feature Selection

In this subsection, we present preliminary results on feature selection. We select five categories of features: (1) features related to cardinal conflicts (features 3, 4 and 5); (2) features related to the frequency with which a conflict has been resolved for an agent (features 6, 7 and 8); (3) features related to the numbers of conflicts that the agents are involved in (features 12 and 13); (4) features related to the difference between the cost of the path of an agent and the cost of its individually cost-minimal path; (5) features related to the MDDs and the WDG (features 62 and 67). The selected features cover the top ten features for ML-O (as shown in Figure 12) that can be computed in constant time and four of the top five features for each individual map (as shown in Figures 10 and 11). We train a ranking function with all five categories of features and denote the corresponding solver by (5). We then hold out each of the five categories in turn and train a ranking function with the remaining four categories; we denote the solver trained without the k-th category by (-k). Since we now use only 9 to 12 features for training, we can afford to train an SVM with a polynomial kernel of degree 2 for each solver while keeping the other parameters the same. We show the success rates on the warehouse map in Figure 2. (5) performs similarly to ML-S, implying that, when using only the selected features, we are still able to achieve performance as good as that of ML-S, which uses all features. (-1), trained without the first category (features related to cardinal conflicts), performs the worst among our solvers, only slightly better than CBSH2. (-3) is the best among our solvers and even dominates ML-S.
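The hold-one-category-out setup above can be expressed as building one feature set per ablated category. A sketch with a hypothetical grouping that mirrors categories (1) and (2) in the text (the function name and data are ours):

```python
def ablation_feature_sets(categories):
    """Given feature indices grouped by category, return the full feature
    set (the '(5)' solver) plus one variant per held-out category
    (the '(-k)' solvers)."""
    all_feats = sorted(f for feats in categories.values() for f in feats)
    variants = {"(5)": all_feats}
    for held_out in categories:
        kept = [f for name, feats in categories.items()
                if name != held_out for f in feats]
        variants[f"(-{held_out})"] = sorted(kept)
    return variants

# Two example categories: cardinality indicators and resolution frequency.
sets = ablation_feature_sets({"1": [3, 4, 5], "2": [6, 7, 8]})
print(sets["(-1)"])  # [6, 7, 8]
```

Each variant's feature set is then used to train its own ranking function, so a drop in success rate for `(-k)` indicates that category k carries useful signal.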

Appendix C C  Code and Data for Reproducibility

We provide the core of our code in the supplementary material, which is based on the open-source code from Li et al. (2019a). We also include the random and warehouse maps used in the experiments; the other maps can be found in Stern et al. (2019) and Sturtevant (2012). We do not include the training and test data due to the file-size limit.

Figure 3: Success rates within the runtime limit.
Figure 4: The warehouse map: percentage of solved instances for different runtime limits and numbers of agents.
Figure 5: The room map: percentage of solved instances for different runtime limits and numbers of agents.
Figure 6: The maze map: percentage of solved instances for different runtime limits and numbers of agents.
Figure 7: The random map: percentage of solved instances for different runtime limits and numbers of agents.
Figure 8: The city map: percentage of solved instances for different runtime limits and numbers of agents.
Figure 9: The game map: percentage of solved instances for different runtime limits and numbers of agents.
Map Success Rate (%) Runtime (min) CT Size (nodes) PAR10 Score
CBSH2 ML-S ML-O CBSH2 ML-S ML-O CBSH2 ML-S ML-O CBSH2 ML-S ML-O
Warehouse 30 93 96 (93) 96 (93) 0.20 0.06 0.07 1,154 294 378 7.18 4.14 4.25
33 92 96 (91) 95 (91) 0.37 0.21 0.14 2,367 1,053 721 8.37 4.38 5.27
36 72 86 (71) 88 (71) 0.54 0.24 0.19 3,110 980 977 28.46 14.56 12.81
39 59 77 (57) 75 (59) 0.77 0.39 0.49 4,064 1,395 2,000 41.52 23.83 25.84
42 55 68 (55) 70 (55) 1.27 0.65 0.38 6,834 2,874 1,781 45.70 32.61 30.56
45 36 55 (36) 53 (36) 2.10 0.99 0.68 9,887 4,980 3,333 64.76 45.94 47.79
48 17 32 (17) 32 (17) 1.99 1.12 0.56 9,646 5,357 2,221 83.34 68.64 68.48
51 11 22 (11) 23 (11) 1.71 0.93 0.16 7,685 3,768 429 89.19 78.31 77.26
54 6 16 (6) 15 (6) 2.82 1.70 1.23 12,816 8,886 6,427 94.17 84.42 85.36
City 170 86 92 (84) 94 (86) 3.85 2.87 2.80 669 355 361 87.77 50.52 38.92
180 78 85 (76) 84 (75) 3.53 2.43 2.46 859 468 476 134.99 93.04 99.87
190 70 76 (69) 77 (68) 4.42 2.96 3.07 878 326 325 192.05 155.37 150.13
200 76 82 (75) 83 (75) 4.78 5.08 4.13 849 702 490 147.96 113.53 106.78
210 67 75 (65) 76 (66) 5.40 3.21 3.27 956 339 335 201.66 153.96 147.72
220 46 60 (45) 59 (45) 8.24 6.91 5.74 1,546 648 638 328.21 246.88 251.54
230 57 68 (56) 64 (54) 4.86 4.26 4.24 835 444 449 261.36 196.99 220.50
240 37 55 (35) 55 (34) 11.11 4.73 6.87 2,368 541 968 382.45 276.16 277.02
250 34 51 (34) 50 (34) 6.34 5.44 6.25 1,288 1,057 1,170 398.18 299.93 305.62
260 44 54 (44) 54 (43) 11.69 9.75 9.55 1,883 1,178 1,219 341.37 282.00 282.58
270 26 35 (24) 38 (24) 8.38 5.81 6.48 1,293 682 822 446.51 392.84 375.70
280 23 29 (19) 33 (22) 7.42 7.09 4.97 1,099 811 650 464.48 429.63 406.60
290 18 27 (16) 28 (17) 11.65 8.45 8.75 1,966 1,372 1,429 494.10 441.87 436.70
Room 20 89 96 (89) 94 (89) 0.22 0.17 0.18 1,823 1,038 1,101 11.19 4.38 6.33
22 83 91 (83) 91 (83) 0.61 0.49 0.51 7,851 5,648 5,888 17.51 9.76 9.83
24 79 86 (79) 84 (79) 0.69 0.53 0.55 8,392 5,007 5,160 21.55 14.83 16.68
26 47 57 (47) 55 (46) 1.32 1.01 1.14 15,791 11,087 12,108 53.68 43.97 45.91
28 45 52 (45) 50 (44) 1.45 0.95 0.96 15,951 10,184 10,294 55.70 48.93 50.81
30 28 36 (28) 34 (28) 2.08 1.21 1.45 21,279 10,284 12,117 73.22 65.32 67.28
32 17 24 (17) 24 (17) 1.88 1.39 1.70 22,152 13,943 16,327 83.77 77.02 77.14
34 9 14 (9) 14 (9) 3.99 2.70 3.24 39,447 22,611 28,392 91.36 86.56 86.63
Maze 30 90 91 (90) 90 (90) 0.54 0.47 0.42 500 373 289 10.49 9.51 10.38
32 84 87 (84) 87 (84) 0.49 0.39 0.42 519 427 397 16.42 13.59 13.60
34 80 82 (80) 84 (80) 0.58 0.50 0.52 908 763 780 20.46 18.59 16.73
36 80 81 (80) 82 (79) 0.73 0.65 0.57 1,200 1,067 910 20.66 19.68 18.68
38 64 65 (64) 65 (64) 0.68 0.57 0.53 900 740 663 36.44 35.40 35.38
40 56 60 (56) 62 (56) 0.85 0.80 0.75 1,194 1,099 1,026 44.47 40.79 38.85
42 54 56 (54) 57 (54) 1.86 1.69 1.47 2,223 1,973 1,580 47.00 45.11 44.03
44 45 49 (45) 50 (45) 1.08 1.06 0.87 1,389 1,343 1,055 54.49 50.82 49.75
46 37 40 (37) 40 (37) 1.61 1.50 1.31 2,021 1,743 1,506 63.60 60.77 60.71
Random 17 96 97 (96) 97 (96) 0.11 0.08 0.10 1,195 743 902 4.10 3.13 3.19
18 95 95 (95) 94 (94) 0.32 0.23 0.31 5,032 3,105 4,148 5.32 5.27 6.29
19 92 93 (92) 93 (92) 0.44 0.32 0.36 7,208 4,264 4,677 8.41 7.38 7.43
20 88 91 (88) 91 (88) 0.43 0.30 0.36 7,834 3,829 4,595 12.38 9.37 9.48
21 79 83 (79) 81 (79) 0.59 0.49 0.56 8,814 5,244 6,927 21.47 17.53 19.54
22 74 80 (74) 77 (74) 0.90 0.48 0.59 15,884 6,286 7,577 26.66 20.69 23.54
23 74 80 (74) 80 (74) 0.96 0.56 0.78 17,952 8,118 11,555 26.71 20.60 20.81
24 62 71 (62) 66 (60) 1.14 0.77 0.92 20,433 10,413 12,422 38.87 29.82 34.71
25 55 62 (55) 59 (55) 1.19 0.85 1.02 21,725 12,137 14,874 45.66 38.84 41.74
26 39 48 (39) 45 (39) 1.27 0.87 1.24 19,236 8,053 13,301 61.50 52.75 55.82
27 27 38 (27) 37 (27) 1.36 0.99 1.05 26,642 16,597 17,130 73.41 62.72 63.79
28 24 31 (24) 29 (23) 1.66 0.83 1.10 26,597 9,239 13,423 76.45 69.41 71.46
29 17 27 (17) 24 (17) 4.04 2.74 3.39 63,661 35,485 44,179 83.69 74.07 77.02
Game 95 85 91 (85) 91 (85) 4.75 3.22 3.02 3,006 1,714 1,662 94.04 57.23 57.13
100 68 77 (68) 75 (68) 6.76 5.49 5.94 4,100 3,114 3,341 196.66 145.09 156.74
105 64 72 (64) 72 (64) 6.59 5.63 5.32 3,959 3,130 2,896 220.22 173.47 172.21
110 59 67 (59) 67 (59) 6.58 6.03 5.92 3,978 3,652 3,596 249.89 202.61 202.39
115 50 61 (50) 64 (50) 6.99 5.66 5.64 3,864 2,713 2,705 303.50 240.79 223.41
120 35 44 (35) 44 (34) 9.59 8.76 8.80 5,351 4,643 4,691 393.27 341.63 341.82
125 34 41 (34) 42 (34) 9.32 7.77 7.58 5,145 4,153 4,054 399.18 358.91 353.32
130 19 26 (19) 25 (18) 4.83 5.00 4.85 2,486 2,498 2,338 487.01 447.22 453.05
Table 5: Success rates, average runtimes and CT sizes of instances solved by all solvers, and PAR10 scores (calculated using the runtimes in minutes) for different numbers of agents. For the success rates of ML-S and ML-O, the fractions of instances solved by both the solver and the baseline are given in parentheses (bolded if the solver solves all instances that CBSH2 solves).
Figure 10: Feature importance plots for the warehouse, room, maze and random maps.
Figure 11: Feature importance plots for the city and game maps.
Figure 12: Feature importance plot for ML-O.
Index Feature
1 Binary indicator for edge conflicts.
2 Binary indicator for vertex conflicts.
3 Binary indicator for cardinal conflicts.
4 Binary indicator for semi-cardinal conflicts.
5 Binary indicator for non-cardinal conflicts.
6 Minimum of the numbers of conflicts involving agent () that have been selected and resolved.
7 Maximum of the numbers of conflicts involving agent () that have been selected and resolved.
8 Sum of the numbers of conflicts involving agent () that have been selected and resolved.
9 Minimum of the numbers of conflicts at vertex () that have been selected and resolved.
10 Maximum of the numbers of conflicts at vertex () that have been selected and resolved.
11 Sum of the numbers of conflicts at vertex () that have been selected and resolved.
12 Minimum of the numbers of conflicts that agent () is involved in.
13 Maximum of the numbers of conflicts that agent () is involved in.
14 Sum of the numbers of conflicts that agent () is involved in.
15 Time step of the conflict.
16 Ratio of and the makespan of the solution.
17 Minimum of the costs of the path of agent ().
18 Maximum of the costs of the path of agent ().
19 Sum of the costs of the path of agent ().
20 Absolute difference of the costs of the path of agent ().
21 Ratio of the costs of the path of agent ().
22 Minimum of the differences of the cost of the path of agent () and its individually cost-minimal path.
23 Maximum of the differences of the cost of the path of agent () and its individually cost-minimal path.
24 Minimum of the ratios of the cost of the path of agent () and the cost of its individually cost-minimal path.
25 Maximum of the ratios of the cost of the path of agent () and the cost of its individually cost-minimal path.
26 Minimum of the ratios of the cost of the path of agent () and .
27 Maximum of the ratios of the cost of the path of agent () and .
28 Binary indicator whether none of agents and has reached its goal by time step .
29 Binary indicator whether at least one of agents and has reached its goal by time step .
30 Minimum of the differences of the cost of the path of agent () and .
31 Maximum of the differences of the cost of the path of agent () and .
32 Minimum of the ratios of the cost of the path of agent () and .
33 Maximum of the ratios of the cost of the path of agent () and .
34 Number of conflicts such that .
35 Number of conflicts such that .
36 Number of conflicts such that .
37 Number of conflicts such that .
38 Number of conflicts such that .
39 Number of conflicts such that .
40 Number of agents such that there exists and such that .
41 Number of agents such that there exists and such that .
42 Number of agents such that there exists and such that .
43 Number of agents such that there exists and such that .
44 Number of agents such that there exists and such that .
45 Number of agents such that there exists and such that .
46 Number of conflicts such that .
47 Number of conflicts such that .
48 Number of conflicts such that .
49 Number of conflicts such that .
50 Number of conflicts such that .
51 Number of conflicts such that .
52 Minimum of the widths of level of the MDDs for agent .
53 Maximum of the widths of level of the MDDs for agent .
54 Minimum of the widths of level of the MDDs for agent .
55 Maximum of the widths of level of the MDDs for agent .
56 Minimum of the widths of level of the MDDs for agent .
57 Maximum of the widths of level of the MDDs for agent .
58 Minimum of the widths of level of the MDDs for agent .
59 Maximum of the widths of level of the MDDs for agent .
60 Minimum of the widths of level of the MDDs for agent .
61 Maximum of the widths of level of the MDDs for agent .
62 Number of vertices in graph such that .
63 Number of vertices in graph such that .
64 Number of vertices in graph such that .
65 Number of vertices in graph such that .
66 Number of vertices in graph such that .
67 Weight of the edge between agents and in the weighted dependency graph.
Table 6: Features with their indices.