In the Multi-Agent Path Finding (MAPF) problem, we are given a graph and a set of agents. Each agent has a start position and a goal position. At each time step an agent can either move to a neighboring location or wait in its current location, at some cost. The objective is to return a least-cost set of actions for all agents that moves every agent from its start position to its goal position without conflicts (i.e., without any pair of agents being in the same node or crossing the same edge at the same time). MAPF has practical applications in robotics, video games, aviation, vehicle routing, and other domains [Silver2005, Wang, Botea, and Kilby2011]. In its general form, MAPF is NP-complete, since it is a generalization of the sliding tile puzzle, an NP-complete problem [Ratner and Warmuth1986].
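The notion of a conflict used above can be made concrete with a short sketch. The code below is an illustration only, not part of any MA-CBS implementation: the function names, the representation of a path as a list of positions indexed by time step, and the convention that an agent stays at its last position after arriving are all assumptions made for the example.

```python
from itertools import combinations


def position_at(path, t):
    """Position of an agent at time t; the agent is assumed to stay at its
    last position once the path ends."""
    return path[min(t, len(path) - 1)]


def find_conflicts(paths):
    """Return (kind, agent_a, agent_b, time) for every vertex and edge conflict."""
    horizon = max(len(p) for p in paths)
    conflicts = []
    for (a, pa), (b, pb) in combinations(enumerate(paths), 2):
        for t in range(horizon):
            # Vertex conflict: both agents occupy the same node at time t.
            if position_at(pa, t) == position_at(pb, t):
                conflicts.append(("vertex", a, b, t))
            # Edge conflict: the agents swap positions between t and t + 1.
            if (t + 1 < horizon
                    and position_at(pa, t) != position_at(pa, t + 1)
                    and position_at(pa, t) == position_at(pb, t + 1)
                    and position_at(pa, t + 1) == position_at(pb, t)):
                conflicts.append(("edge", a, b, t))
    return conflicts
```

For instance, two agents walking towards each other along a corridor of three cells collide in the middle cell, while two agents exchanging adjacent cells cross the same edge.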
In this paper we consider a particular variant of MAPF, for which Meta-Agent Conflict-Based Search (MA-CBS) [Sharon et al.2012b], the algorithm explored here, was formulated. The total solution cost is the sum of costs of all actions (and hence the sum of costs of solutions for each of the agents). Any single action, as well as waiting during a single time step in a non-goal position, has unit cost. Waiting in the goal position has zero cost. The problem is solved in the centralized computing setting, where a single program controls all of the agents. (This setting is tantamount to a decentralized cooperative setting with full knowledge sharing and free communication [Sharon et al.2012b].)
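The cost model just described can be sketched in a few lines. This is an illustrative reading of the rules stated above (unit cost for every action and for waiting away from the goal, zero cost for waiting in the goal); the function names and the list-of-positions path representation are assumptions of the example.

```python
def path_cost(path, goal):
    """Cost of one agent's path: every step costs 1, except waiting in the goal."""
    return sum(0 if (u == v == goal) else 1 for u, v in zip(path, path[1:]))


def sum_of_costs(paths, goals):
    """Total solution cost: the sum of the costs of the individual paths."""
    return sum(path_cost(p, g) for p, g in zip(paths, goals))
```

With this convention, padding a path with extra waits in the goal does not change its cost, so paths of different lengths can be compared directly.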
MA-CBS is a generalization of Conflict-Based Search (CBS) [Sharon et al.2012a]. MA-CBS may serve as a bridge between CBS and completely coupled solvers, such as A*, A*+OD [Standley2010], or EPEA* [Felner et al.2012]. MA-CBS starts as a regular CBS solver, where the low-level search is performed by a single-agent search algorithm. At every search step MA-CBS employs a heuristic: if the number of conflicts for a pair of agents exceeds a certain threshold, MA-CBS merges the two agents into a combined agent. Experimental results showed that for certain values of the threshold MA-CBS outperforms both CBS and single-agent search. However, the threshold used in the heuristic has to be determined empirically, and varies both with the size and shape of the graph and with the number of agents. The difficulty of choosing the 'right' value for the threshold limits the practical usability of MA-CBS.
Generally, a heuristic represents an abstraction or approximation of a phenomenon associated with the problem or algorithm. Understanding why a particular heuristic works helps make better decisions involving the heuristic. One way to discover powerful heuristics for a particular problem is to design them systematically [Hernávölgyi and Holte2004]. However, a heuristic can also come as an insight, and in this case explaining why the heuristic is successful helps further improve the algorithm.
In this paper we look at the heuristic decision-making of MA-CBS, in which a fixed threshold on the number of conflicts between a pair of agents is used to replace the agents with a single combined agent. Based on observations of the dependence of the threshold on features of the problem, we suggest an explanation for the threshold, and propose a model problem where the decision can be made optimal in a certain sense of optimality. Based on the model problem, we empirically investigate variants of MA-CBS. The investigation provides further support for the hypothesis regarding the root cause behind the fixed threshold, and allows improving the MA-CBS algorithm through better use of the heuristic.
Finally, we propose more efficient decision rules, based both on well-known and new results, for merging agents in MA-CBS. We empirically compare the variants of MA-CBS on different problem domains to illustrate a steady increase of performance.
The contributions of the paper are:
- Justification of heuristic decision-making in MA-CBS.
- Improvement of MA-CBS based on understanding of the phenomenon behind the use of a fixed threshold.
- Derivation of variants of MA-CBS with further improved performance.
2 Background and Related Work
The pseudocode for MA-CBS is shown in Algorithm 1. Like CBS, MA-CBS maintains a list of nodes, sorted by the increasing sum of costs of individual solutions (SIC). At every step of the main loop (lines 3–15) a node with the lowest SIC is removed from the node list (line 6). If the solutions in the node do not have any conflicts, this set of solutions is returned as the solution for the problem (line 15). In case of conflicts there are two possibilities. In the first, MA-CBS adds, just like CBS, two nodes to the node list. The nodes are created according to a single conflict between a pair of agents; each of the nodes has the solution for one of the agents updated to avoid the conflict with the other agent (lines 12–14). In the second, MA-CBS merges the two agents into a combined agent and adds a single node to the node list with the combined agent instead of the pair of agents (lines 9–10). The decision whether to split or to merge is based on a threshold parameter: the agents are merged if the number of conflicts encountered between the agents since the beginning of the search is at least the threshold.
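The split-or-merge rule at the end of this description can be sketched as follows. This is a minimal illustration of the bookkeeping only, under the assumption that conflicts are counted per unordered pair of agents over the whole search; the class and function names are hypothetical and do not come from Algorithm 1.

```python
from collections import defaultdict


class ConflictCounter:
    """Counts conflicts per unordered pair of agents since the start of the search."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = defaultdict(int)

    def record(self, a, b):
        """Register one more conflict between agents a and b; return the new count."""
        key = frozenset((a, b))
        self.counts[key] += 1
        return self.counts[key]

    def should_merge(self, a, b):
        """Merge once the number of conflicts is at least the threshold."""
        return self.counts[frozenset((a, b))] >= self.threshold


def resolve(counter, a, b):
    """Decide how to handle a newly found conflict between agents a and b."""
    counter.record(a, b)
    return "merge" if counter.should_merge(a, b) else "split"
```

With a threshold of 3, the first two conflicts between a given pair lead to splits and the third leads to a merge; conflicts between other pairs are counted separately.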
Both CBS and MA-CBS solve MAPF optimally; however, suboptimal variants of CBS have also been introduced [Barrer et al.2014]. Other algorithms for solving MAPF optimally are also being pursued. Some of them bear similarities to MA-CBS, such as Independence Detection (ID) [Standley2010], which for every pair of conflicting agents tries to find an alternative solution for each agent that avoids the conflicts and, if that fails, merges the conflicting agents into a combined agent. A suboptimal variant of ID offers a trade-off between running time and solution quality [Standley and Korf2011]. Other algorithms, such as A*+OD [Standley2010], EPEA* [Felner et al.2012], or ICTS [Sharon et al.2013], can be used for the lower-level search in MA-CBS.
3 Justification of Fixed Threshold
The authors of MA-CBS summarized the results of their empirical evaluation of the algorithm with evidence that:
- The best value of the threshold decreases with the hardness of the problem instances.
- The advantage of MA-CBS is more prominent on harder instances.
Such behavior is characteristic of online competitive algorithms [Manasse, McGeoch, and Sleator1988], and in particular is reminiscent of the ski rental problem, also known as the snoopy caching problem in a more general setting [Karlin et al.1988]. In the ski rental problem a tourist at a ski resort may either pay a fixed rent for each day of ski rental, or buy the skis, obviously at a higher price. The tourist does not know in advance how many days they are going to spend at the resort, and must decide every morning whether to rent or to buy. The famous result for this problem is that the tourist should rent the skis for as long as the total rent paid stays below the purchase price, and buy the skis on the next day if still at the resort. This algorithm is 2-competitive, that is, the tourist will spend at most twice as much money as if the number of days were known in advance, and this is the best competitive ratio a deterministic algorithm can achieve. There are randomized algorithms with a lower competitive ratio though [Karlin et al.1994].
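The break-even strategy for ski rental is easy to check numerically. The sketch below is a textbook illustration rather than anything specific to MA-CBS: prices are integers, the rent is 1 per day by default, and the function names are chosen for the example.

```python
def break_even_cost(days_at_resort, buy_price, rent=1):
    """Total spent by a tourist who rents while the rent paid, including today's,
    stays below the purchase price, and buys on the first day it would not."""
    paid = 0
    for _ in range(days_at_resort):
        if paid + rent >= buy_price:
            return paid + buy_price  # buy today instead of renting again
        paid += rent
    return paid


def offline_cost(days_at_resort, buy_price, rent=1):
    """Cheapest option when the length of the stay is known in advance."""
    return min(days_at_resort * rent, buy_price)


def worst_case_ratio(buy_price, horizon):
    """Largest online/offline cost ratio over all stays of 1..horizon days."""
    return max(break_even_cost(d, buy_price) / offline_cost(d, buy_price)
               for d in range(1, horizon + 1))
```

For a purchase price of 10 daily rents, the worst case is a stay of exactly 10 days or more: the tourist pays 9 rents plus the purchase price, which stays below twice the offline optimum.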
Consequently, we conjectured that the fixed threshold in MA-CBS plays a role similar to the threshold in the online algorithm for the ski rental problem. Both theoretical analysis and empirical evaluation confirmed this conjecture.
3.1 Model Problem: 2 agents
Consider the MAPF problem for 2 agents as the simplest non-trivial case. If MA-CBS is used, and the number of conflicts reaches the threshold, some number of merges between 1 and the number of nodes currently in the node list solves the problem instance. If the time to find a solution for the combined agent does not become much shorter when constraints are added, it may be better to just remove all constraints and compute the solution for the combined agent once, rather than multiple times for each node in the node list. We shall call a version of MA-CBS that restarts the search upon a merge MA-CBS/R. A comparative evaluation of MA-CBS and MA-CBS/R is provided in Table 1. The problem instance is shown in Figure 1.a. For all but the extreme threshold values (1 and 8), the number of merges performed by MA-CBS is greater than 1 (the number of restarts in MA-CBS/R), and the number of single-agent nodes expanded by MA-CBS is greater than the number expanded by MA-CBS/R. (Let us note that the number of expanded single-agent nodes is, along with the search time, an adequate measure of the performance of CBS, MA-CBS, and variants. Evaluation of the distance heuristic for a single agent can be memoized, and the total heuristic evaluation time is thus negligible compared to the time spent expanding single-agent nodes and generating children satisfying the constraints.)
The intuition behind MA-CBS/R is formalized by the following two lemmas about competitiveness of both MA-CBS/R and MA-CBS for 2 agents:
Let us consider the time to find the shortest path for the combined agent, and the time to find the shortest paths for both agents independently, ignoring conflicts between the agents. Under the assumptions that
- both times are constant for a given problem instance at any point of the algorithm,
- the ratio of the first time to the second is known in advance,
MA-CBS/R is 2-competitive, and the competitive ratio is achieved when the threshold is set to this ratio.
Since merging two agents solves a 2-agent problem at the cost of one search for the combined agent, and splitting on a conflict may or may not solve the problem at the cost of finding the shortest paths for both agents independently, this problem is equivalent to the ski rental or two caches and one block snoopy caching problem [Karlin et al.1988]. ∎
Under the assumptions of Lemma 1 MA-CBS is -competitive, and the competitive ratio is achieved for .
After splits there are nodes in the node list (Algorithm 1) for any . Hence, MA-CBS performs at most splits and then at most merges, and the worst-case time is . Just like in the proof for the ski rental problem, the competitive ratio is
for . ∎
According to the assumptions of Lemma 1, is at least 1, hence MA-CBS/R is competitive with a lower ratio (that is, in the worst case finds a solution in a shorter time) than MA-CBS.
The worst-case approach is apparently a reasonable option for designing an algorithm for 2-agent MAPF. Table 2 shows solution costs and the amount of computation spent by CBS and MA-CBS/R to find the solutions for two problem instances in Figure 1. Both instances have agents at the same locations, as well as the same number of passable cells, and the same position of the bottleneck. Nonetheless, the cost of an optimal solution for the instance in Figure 1.a is 11, and CBS has to resolve 7 conflicts, but for the instance in Figure 1.b the cost is 9, and only 1 conflict has to be resolved before a solution is found. MA-CBS/R is more efficient for 1.a but not for 1.b, where CBS is faster.
4 MA-CBS/R for Any Number of Agents
MA-CBS/R can be extended to an arbitrary number of agents. The pseudocode of MA-CBS/R is shown in Algorithm 2. MA-CBS/R differs from MA-CBS (Algorithm 1) in lines 8–10. Firstly, Merge/R creates a node with unconstrained solutions for individual agents. Secondly, the node list is re-initialized to contain just the new node (line 10). Effectively, the search is restarted with the two agents replaced by a combined agent.
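The restart step can be sketched as below. This is a schematic illustration and not a transcription of Algorithm 2: agents and combined agents are represented as frozensets of agent identifiers, a node is a plain dictionary, and `solve` is a hypothetical low-level solver that returns a `(cost, paths)` pair for an agent or a combined agent with no constraints.

```python
import heapq
from itertools import count

_tie = count()  # tie-breaker so that heap entries never compare nodes directly


def make_root(groups, solve):
    """Build a constraint-free node: one unconstrained solution per (combined) agent."""
    solutions = {g: solve(g) for g in groups}
    cost = sum(c for c, _ in solutions.values())
    return {"groups": groups, "constraints": [], "solutions": solutions, "cost": cost}


def merge_and_restart(open_list, node, a, b, solve):
    """Replace agents a and b by their union and restart the search from a new root."""
    groups = [g for g in node["groups"] if g not in (a, b)] + [a | b]
    root = make_root(groups, solve)
    open_list.clear()  # discard every node generated so far
    heapq.heappush(open_list, (root["cost"], next(_tie), root))
    return root
```

The key difference from a plain merge is the call to `open_list.clear()`: all constraints accumulated by earlier splits are dropped, and the combined agent is solved once rather than once per node remaining in the node list.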
The decision whether to merge two agents and restart the search is again based on a fixed threshold. Given the suggested interpretation of the threshold as an estimate of the ratio between the time to find a solution for the combined agent and the time to find solutions for the agents independently, merging combined, rather than single, agents into a still larger agent should require a threshold that depends on the sizes of the agents to be merged. This was confirmed by preliminary experiments on the partial sliding tile puzzle (see below), which showed that using the same threshold for merging both single and combined agents, as in the original version of MA-CBS [Sharon et al.2012b], slows down the search compared to merging just single agents. Indeed, the number of children grows exponentially with the number of single agents in a combined agent, and thus the search time grows at least exponentially with the size of the combined agent, demanding a higher threshold. In the experiments, we limited the maximum size of a combined agent to 2, that is, only single agents would be merged, effectively setting an infinite threshold for producing combined agents consisting of more than 2 single agents. (The program code and the problem instances for the experiments in this paper are attached to the submission. In the camera-ready version a URL pointing at a public source code repository will be provided instead.) A more advanced implementation would be based on different threshold values for different sizes of agents to be merged.
4.1 Exploring MA-CBS/R with Partial Sliding Tile Puzzle
The sliding tile puzzle is quite obviously an example of MAPF, and the NP-completeness of MAPF is shown through reduction from the sliding tile puzzle [Sharon et al.2012b]. However, a modified version of the puzzle can also be used to empirically explore MAPF algorithms and, in the case of MA-CBS and MA-CBS/R, to understand the influence of the number of agents and the threshold on the search time.
We used the partial sliding tile puzzle, in which only some of the tiles are present on the board, for the exploration. A problem instance with 9 tiles is shown in Figure 2. The original locations of the tiles are marked by solid circles with the uppercase letter identifying the tile. Dashed circles with lowercase letters are the goal locations for each of the tiles. Any instance with fewer than 15 tiles is solvable [Johnson and Story1879]. The partial sliding tile puzzle with a given number of tiles can be viewed as a MAPF problem with the same number of agents. A solution of the partial sliding tile puzzle is translated to a solution of the MAPF problem with each sequence of moves of different tiles (at most one move per tile) translated to a simultaneous move of the agents (where some of the agents may be waiting). Hence, any solvable instance of the puzzle is also solvable as a MAPF instance. The solution costs and optimal solutions can be different though, since waiting in a non-goal position is free in the partial sliding tile puzzle but not in the MAPF problem.
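The translation from a puzzle solution to a MAPF solution can be sketched as a grouping of consecutive tile moves into time steps. The greedy grouping below is an assumption made for illustration (the text only states that sequences of moves of different tiles become simultaneous moves); the function name and the `(tile, destination)` move format are hypothetical.

```python
def moves_to_steps(moves):
    """Group a sequence of (tile, destination) moves into simultaneous time steps.

    Each step maps the tiles that move in it to their destinations; tiles not
    mentioned in a step wait. A new step starts whenever a tile would otherwise
    move twice in the same step.
    """
    steps, current = [], {}
    for tile, dest in moves:
        if tile in current:  # this tile already moves in the current step
            steps.append(current)
            current = {}
        current[tile] = dest
    if current:
        steps.append(current)
    return steps
```

For example, the move sequence A, B, A, C is grouped into two time steps: A and B move together while C waits, then A and C move together while B waits.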
Since solving the sliding tile puzzle optimally requires problem-specific algorithms [Korf1985], we needed to determine the number of tiles for which problem instances are hard enough but still solvable using CBS with an A* variant as the low-level solver. We generated a set of 100 random scenes (agent locations) of the partial sliding tile puzzle for every number of agents from 2 to 9, spreading the agents in such a way that conflicts between the agents are likely. Table 3 shows the total running time of CBS, the number of expanded nodes, and the number of splits for each number of agents. The growth of the number of expanded nodes is the highest for 6–8 agents (the bold curve segment), where we should expect MA-CBS to become better than CBS. Indeed, for up to 6 agents CBS is faster than MA-CBS for any value of the threshold on the problem sets used. However, for 7 agents MA-CBS becomes comparable to CBS, and for 8 agents MA-CBS/R outperforms CBS for a range of threshold values, as one can see in Table 4.
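The threshold can be learned offline on a subset of problem instances. However, a more efficient approach is to directly estimate the ratio between the time to find a solution for a combined agent and the time to find solutions for the same agents independently from a few runs of CBS and MA-CBS/R, and to use this estimate as the threshold. Indeed, for problem instances with 8 agents MA-CBS/R performs reasonably well in the vicinity of the threshold value given by this estimate (Table 4).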
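One way to turn such timing measurements into a threshold is sketched below. The sketch assumes that the threshold is taken to be the ratio of the average low-level search time for a combined agent to the average time for the same agents solved independently, rounded to an integer number of conflicts; the function name, the rounding, and the lower bound of 1 are choices made for the example and are not prescribed by the text.

```python
from statistics import mean


def estimate_threshold(combined_times, independent_times):
    """Estimate the merge threshold from measured low-level search times.

    combined_times: times to solve for a combined agent (hypothetical samples).
    independent_times: times to solve for the same agents independently.
    """
    ratio = mean(combined_times) / mean(independent_times)
    # A threshold below 1 is meaningless: at least one conflict must be seen.
    return max(1, round(ratio))
```

The samples can come from a handful of profiling runs; times in any unit work as long as both lists use the same unit.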
The relative performance of the original MA-CBS is consistent with the results for 2 agents. Table 5 shows the running time, the number of expanded nodes, and the numbers of splits and merges for the same parameters as Table 4. The difference between MA-CBS and MA-CBS/R is more prominent for higher values of the threshold, where MA-CBS/R exhibits much lower running times and numbers of expanded nodes. Only for sufficiently high thresholds do the search time and the number of expanded nodes of MA-CBS begin to decrease; this, as in the case of 2 agents, can be explained by the decrease in the number of search branches reaching this number of conflicts.
4.2 An Estimate on Competitive Ratio
An important question is how competitive an MA-CBS/R algorithm can be in the general case. One answer to the question is the following lemma, which can be proven by construction.
Under the assumption that the cost of finding a solution for a single meta-agent of any size is known in advance, MA-CBS/R for an arbitrary number of agents can achieve competitive ratio.
MA-CBS/R for an arbitrary number of agents merges at most agents into a combined agent (that is, all the agents), and performs at least merges. The algorithm should merge two agents every time the total cost of computations performed from the beginning of the algorithm is at least the cost of the next merge. Two cases are possible:
After a merge, the total cost of computations performed so far is no more than the cost of the next merge.
The total cost of computations is greater than the cost of the next merge, in which case the agents should be merged immediately.
Assuming that the cost of a merge is much higher than the cost of a split (a basic assumption behind MA-CBS), the second case takes place when there are several merges of the same cost (that is, when several combined agents of the same size must be constructed). The number of such subsequent merges of the same cost is at most since the size of a combined agent is at least 2, and the total overhead is . Thus, an algorithm with a competitive ratio of can be, at least theoretically, constructed within the framework of MA-CBS/R. ∎
For the case of two agents, the competitive ratio coincides with Lemma 1. In practice, since MA-CBS/R uses a heuristic suggesting to merge two agents based on the number of conflicts between these agents only rather than the total number of conflicts encountered, the performance should be better than what follows from the worst-case analysis.
5 Further improvements to MA-CBS/R
There are several directions in which to look for further improvements in the performance of MA-CBS/R. Here we introduce two improved variants of MA-CBS/R based on different decision rules whether to split the search on a conflict or to merge the agents and restart.
The first variant is Randomized MA-CBS/R, which is derived from the randomized algorithm for the snoopy caching problem [Karlin et al.1994]. According to the randomized algorithm, instead of deterministically merging once the number of conflicts reaches the threshold, the decision to merge can be made randomly for any number of conflicts between 1 and the threshold inclusive, and the probability of merging after a given number of conflicts is given by (2).
One can see that the probability grows with the number of conflicts and reaches 1 when the number of conflicts reaches the threshold. For the ski rental problem and, consequently, for the 2-agent case, the randomized algorithm yields a competitive ratio of $e/(e-1) \approx 1.58$ [Karlin et al.1994], compared to the competitive ratio of 2 for deterministic MA-CBS/R.
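The randomized rule can be illustrated with the textbook randomized strategy for ski rental. The sketch below is an assumption-laden illustration: it uses the classical distribution for an integer purchase price, maps days to conflicts and the purchase price to the threshold, and charges 1 for a split and `threshold` for a merge. Its notation may differ from the expression used for Randomized MA-CBS/R, and all function names are hypothetical.

```python
import random


def merge_probabilities(threshold):
    """p[i - 1] is the probability of merging at exactly the i-th conflict."""
    q = 1.0 - 1.0 / threshold
    norm = threshold * (1.0 - q ** threshold)
    return [q ** (threshold - i) / norm for i in range(1, threshold + 1)]


def sample_merge_point(threshold, rng=random):
    """Draw the number of conflicts at which the pair of agents will be merged."""
    probs = merge_probabilities(threshold)
    return rng.choices(range(1, threshold + 1), weights=probs)[0]


def expected_ratio(threshold, conflicts_needed):
    """Expected online cost divided by the offline optimum, when splitting
    alone would have solved the instance after `conflicts_needed` splits."""
    expected = 0.0
    for i, p in enumerate(merge_probabilities(threshold), start=1):
        if i <= conflicts_needed:
            expected += p * ((i - 1) + threshold)  # i - 1 splits, then a merge
        else:
            expected += p * conflicts_needed       # splits alone were enough
    return expected / min(conflicts_needed, threshold)
```

Under this cost model the expected ratio is the same for every value of `conflicts_needed` and equals 1 / (1 - (1 - 1/threshold)^threshold), which approaches e/(e-1) as the threshold grows.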
The second variant is Delayed MA-CBS/R, which decides whether to merge or to split based both on the number of conflicts for a particular pair of agents and on the values of the heuristic cost estimates of the first and the next node in the node list. The algorithm is based on the observation that the node obtained after a merge is likely to have a higher heuristic cost estimate, so that instead of exploring the subtree rooted in the node, the search will consider other nodes. Therefore it might make sense to merge a pair of agents in a node only if the current cost estimate is sufficiently lower than the cost estimate of the next node in the node list.
A full analysis of the utility of merging agents given the cost estimates of nodes in the node list is beyond the scope of this paper. However, the simplest case for discrete cost domains (to which CBS, MA-CBS, and other MAPF algorithms are applied) is that of the first two nodes having the same cost estimate. In this case it should be beneficial to delay the merge (and split instead) until the cost estimate of the first node becomes strictly lower than that of the second node, even if the number of conflicts has reached the threshold. Indeed, this decision rule was implemented in Delayed MA-CBS/R.
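The delayed decision rule can be written down in a few lines. This is a sketch of the rule as described above, not the code used in the experiments; the function name, the argument names, and the treatment of a node list with a single node (merge is allowed) are assumptions of the example.

```python
def delayed_decision(conflict_count, threshold, first_cost, second_cost=None):
    """Merge only if the threshold is reached and the first node in the node
    list is strictly cheaper than the next one; otherwise split.

    second_cost is None when the node list holds no other node, in which case
    there is nothing to wait for and the merge is not delayed (an assumption).
    """
    reached = conflict_count >= threshold
    strictly_best = second_cost is None or first_cost < second_cost
    return "merge" if reached and strictly_best else "split"
```

When the two cheapest nodes tie, the rule keeps splitting even past the threshold; the merge happens once the first node becomes strictly cheaper than the second.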
The two variants were empirically compared to the basic MA-CBS/R on the partial sliding tile puzzle with 8 agents, for the threshold value at which MA-CBS/R has the best performance (Table 6). Both Randomized and Delayed MA-CBS/R showed shorter search times and lower numbers of expanded nodes. Even better performance might be achieved through combining the ideas of both Randomized and Delayed MA-CBS/R, as well as through more informed decision-making in Delayed MA-CBS/R.
6 Experiments on Benchmark Maps
Following the empirical evaluation in [Sharon et al.2012b], we used the same three maps from the game Dragon Age: Origins [Sturtevant2012]. As with the puzzle, 100 random instances were generated, and the reported numbers are the totals over the 100 instances. Table 7 shows the results of CBS, MA-CBS, and MA-CBS/R on 16-agent scenes for a range of threshold values.
Again, MA-CBS/R shows the best performance (bold in the table). The advantage of MA-CBS/R over MA-CBS depends on the map. den520d consists mostly of open spaces, and MA-CBS/R is 5 times faster than MA-CBS for the tested threshold values. ost003d is a combination of open spaces and bottlenecks; with the threshold taken separately for each algorithm, MA-CBS/R is 10% faster than MA-CBS among the tested values, but for the intermediate threshold values, 16 and 64, MA-CBS/R is 4 times faster than MA-CBS. Hence the best threshold value that would be estimated from test runs of the low-level search on single and combined agents might give a 4-fold increase in performance. brc202d mostly consists of narrow paths resulting in many bottlenecks; here, too, MA-CBS/R is faster than MA-CBS for the tested threshold values. For all maps and for all threshold values, MA-CBS/R is faster than MA-CBS. Moreover, the search time of MA-CBS for intermediate threshold values is often longer than for extreme (either low or high) values, which further supports the advantage of restarting the search upon a merge. Delayed or Randomized MA-CBS/R can be used to further improve the performance for the best threshold value found.
7 Summary and Future Work
This paper has several contributions:
- We provided a justification for the use of a fixed threshold for decision-making in MA-CBS; the justification was based on the worst-case analysis of a two-agent MAPF problem.
- Using a model problem based on this justification we introduced a more efficient version of MA-CBS, MA-CBS/R, where the search is restarted after a merge. MA-CBS/R exhibits shorter search times and lower numbers of expanded nodes than MA-CBS on both the partial sliding tile puzzle and computer game scenes.
- We also introduced two improved variants of MA-CBS/R: Randomized MA-CBS/R, based on a known randomized algorithm, and Delayed MA-CBS/R, based on the analysis of decision utilities. Both algorithms show even better performance than MA-CBS/R.
There is room for further improvement of MA-CBS variants. Firstly, the decision to merge a pair of agents can be made based on the history of conflict occurrence and resolution through splitting, rather than just the number of conflicts. Secondly, tie-breaking, such as the selection of the conflicting agents to split on or merge, and of the conflict to resolve in case of a split, can be improved using heuristic decision rules. We believe that metareasoning techniques [Russell and Wefald1991, Russell2014] can be applied successfully to the MAPF domain in general and to MA-CBS variants in particular to design the heuristics.
- [Barrer et al.2014] Barrer, M.; Sharon, G.; Stern, R.; and Felner, A. 2014. Suboptimal variants of the conflict-based search algorithm for the multi-agent pathfinding problem. In Proceedings of the Seventh Annual Symposium on Combinatorial Search, SOCS 2014, Prague, Czech Republic, 15-17 August 2014.
- [Felner et al.2012] Felner, A.; Goldenberg, M.; Sharon, G.; Stern, R.; Beja, T.; Sturtevant, N. R.; Schaeffer, J.; and Holte, R. 2012. Partial-expansion A* with selective node generation. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, July 22-26, 2012, Toronto, Ontario, Canada.
- [Hernávölgyi and Holte2004] Hernávölgyi, I., and Holte, R. C. 2004. Steps towards the automatic creation of search heuristics. Technical Report TR04-02, Computer Science Department, University of Alberta.
- [Johnson and Story1879] Johnson, W. W., and Story, W. E. 1879. Notes on the "15" puzzle. American Journal of Mathematics 2(4):397–404.
- [Karlin et al.1988] Karlin, A. R.; Manasse, M. S.; Rudolph, L.; and Sleator, D. D. 1988. Competitive snoopy caching. Algorithmica 3:77–119.
- [Karlin et al.1994] Karlin, A. R.; Manasse, M. S.; McGeoch, L. A.; and Owicki, S. S. 1994. Competitive randomized algorithms for nonuniform problems. Algorithmica 11(6):542–571.
- [Korf1985] Korf, R. E. 1985. Depth-first iterative-deepening: An optimal admissible tree search. Artificial Intelligence 27(1):97–109.
- [Manasse, McGeoch, and Sleator1988] Manasse, M. S.; McGeoch, L. A.; and Sleator, D. D. 1988. Competitive algorithms for on-line problems. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing, May 2-4, 1988, Chicago, Illinois, USA, 322–333.
- [Ratner and Warmuth1986] Ratner, D., and Warmuth, M. K. 1986. Finding a shortest solution for the N×N extension of the 15-puzzle is intractable. In Proceedings of the 5th National Conference on Artificial Intelligence. Philadelphia, PA, August 11-15, 1986. Volume 1: Science, 168–172.
- [Russell and Wefald1991] Russell, S., and Wefald, E. 1991. Principles of metareasoning. Artificial Intelligence 49:361–395.
- [Russell2014] Russell, S. 2014. Rationality and intelligence: A brief update. In Vincent C. Müller (ed.), Fundamental Issues of Artificial Intelligence (Synthese Library). Berlin: Springer. To appear.
- [Sharon et al.2012a] Sharon, G.; Stern, R.; Felner, A.; and Sturtevant, N. R. 2012a. Conflict-based search for optimal multi-agent path finding. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, July 22-26, 2012, Toronto, Ontario, Canada.
- [Sharon et al.2012b] Sharon, G.; Stern, R.; Felner, A.; and Sturtevant, N. R. 2012b. Meta-agent conflict-based search for optimal multi-agent path finding. In SOCS.
- [Sharon et al.2013] Sharon, G.; Stern, R.; Goldenberg, M.; and Felner, A. 2013. The increasing cost tree search for optimal multi-agent pathfinding. Artif. Intell. 195:470–495.
- [Silver2005] Silver, D. 2005. Cooperative pathfinding. In AIIDE, 117–122.
- [Standley and Korf2011] Standley, T. S., and Korf, R. E. 2011. Complete algorithms for cooperative pathfinding problems. In IJCAI 2011, Proceedings of the 22nd International Joint Conference on Artificial Intelligence, Barcelona, Catalonia, Spain, July 16-22, 2011, 668–673.
- [Standley2010] Standley, T. S. 2010. Finding optimal solutions to cooperative pathfinding problems. In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2010, Atlanta, Georgia, USA, July 11-15, 2010.
- [Sturtevant2012] Sturtevant, N. R. 2012. Benchmarks for grid-based pathfinding. IEEE Trans. Comput. Intellig. and AI in Games 4(2):144–148.
- [Wang, Botea, and Kilby2011] Wang, K.-H. C.; Botea, A.; and Kilby, P. 2011. On improving the quality of solutions in large-scale cooperative multi-agent pathfinding. In SOCS.