Coordinated Multi-Agent Pathfinding for Drones and Trucks over Road Networks

by Shushman Choudhury, et al.
Stanford University

We address the problem of routing a team of drones and trucks over large-scale urban road networks. To conserve their limited flight energy, drones can use trucks as temporary modes of transit en route to their own destinations. Such coordination can yield significant savings in total vehicle distance traveled, i.e., truck travel distance and drone flight distance, compared to operating drones and trucks independently. But it comes at the potentially prohibitive computational cost of deciding which trucks and drones should coordinate and when and where it is most beneficial to do so. We tackle this fundamental trade-off by decoupling our overall intractable problem into tractable sub-problems that we solve stage-wise. The first stage solves only for trucks, by computing paths that make them more likely to be useful transit options for drones. The second stage solves only for drones, by routing them over a composite of the road network and the transit network defined by truck paths from the first stage. We design a comprehensive algorithmic framework that frames each stage as a multi-agent path-finding problem and implement two distinct methods for solving them. We evaluate our approach on extensive simulations with up to 100 agents on the real-world Manhattan road network containing nearly 4500 vertices and 10000 edges. Our framework saves more than 50% of vehicle distance traveled compared to independently solving for trucks and drones, and computes solutions for all settings within 5 minutes on commodity hardware.


1. Introduction

Figure 1. An illustration of our overall approach. (Left) Stage 1 computes truck paths (solid red and blue) in the vicinity of the shortest road paths for drones (dashed); these truck paths may deviate from the shortest road paths for the trucks (dashed red and blue). (Right) Stage 2 computes the shortest road-and-transit paths for drones, which can use trucks as transit. See Section 4 for further details.

Drones have great potential for transforming urban logistics services. By enabling quick, flexible, and efficient delivery, they can help address the rapidly growing logistics and e-commerce needs of dense urban populations chung2020optimization. They can also reduce our reliance on traditional ground delivery services that contribute to traffic congestion HolguinETAL18. However, operating package delivery services that rely solely on drones may be infeasible due to their limited flight range and carrying capacity SudburyETAL16. To overcome these limitations, we study the problem of operating delivery drones in tandem with ground vehicles by allowing drones to ride on ground vehicles to conserve energy and increase effective flight range choudhury2019dynamic; choudhury2021efficient.

In particular, we focus on routing a team of drones and trucks over a common road network, where drones can use trucks as transit in addition to flying. We frame our setting as a coordinated extension of the Multi-Agent Path Finding (MAPF) problem stern2019multi, which requires us to compute start-goal routes for all agents while aiming to minimize the total path cost incurred by the drones and trucks. The path cost can be different for the two agent types and can encode the energy consumed or the operational expense. As in the classical MAPF formulation, our problem setting requires us to satisfy inter-agent constraints, which in our case bound the maximum number of drones that can simultaneously use a truck. It also requires reasoning about the potential cost savings from coordinating trucks and drones to share trip segments.

The feature of agents temporarily coordinating has not been explored by the MAPF community so far (to the best of our knowledge) and makes our problem much harder than the already difficult classical MAPF yu2013structure; YuLaValle16. Most state-of-the-art MAPF algorithms focus on avoiding inter-agent collisions rather than optimizing coordination felner2017search; stern2019multi. Our recent work developed a MAPF solver for routing drones over transit networks choudhury2021efficient but assumed that the transit vehicles follow fixed and known routes and that drones were not required to fly over the road network. Another work considered MAPF problems with cooperation but only planned for predefined agent pairs to arrive at fixed meeting points greshler2021cooperative, rather than simultaneously traversing a route. The operations research community has looked at optimizing drones with trucks, but their approaches only work for a few agents on small abstract routing graphs agatz2012optimization.

Contributions. We develop an effective algorithmic approach for coordinated MAPF in the context of drones and ground trucks working in tandem. Our key idea is to decouple our overall intractable MAPF problem into two distinct MAPF sub-problems that we can solve in stages (Figure 1). In Stage 1, we compute truck routes that are likely to be useful as transit options for drones. In Stage 2, we fix the truck routes, create a transit network based on them, and overlay this network on the road graph. We then compute drone routes over the composite road-and-transit network, where drones incur no cost on the segments where they use transit (subject to capacity constraints). As a post-processing step, we re-route trucks not used as transit to their shortest road network paths.

We implement two variants of our approach that use different MAPF solvers; the first uses Enhanced Conflict-Based Search barer2014suboptimal and the second uses Prioritized Planning silver2005cooperative for both stages. We evaluate our approach on a range of coordinated MAPF settings over the Manhattan road network, with nearly 4500 vertices and 10000 edges, and with up to 100 agents in total. Our experiments show that coordinating drones and trucks can save more than 50% of total vehicle distance traveled compared to no coordination. We refer to the sum of truck path distance and drone flight distance as vehicle distance traveled, since drones incur no energy cost when riding on a truck. For brevity and convenience, we will use this phrase hereafter, with some abuse of terminology. Our approach plans paths for all settings within at most 5 minutes of computation time on commodity hardware.

We foresee that our approach could serve as a building block in more complex problem settings where we also need to optimally allocate agents to tasks (i.e., package-delivery locations) and decide the order in which to execute them. For instance, in a manner similar to our previous work choudhury2021efficient, we could use a bi-level approach where the upper layer allocates jobs to the agents and a lower-level planner—a coordinated MAPF solver in our case—executes the allocation in a receding-horizon fashion.

Layout. In Section 2 we review prior related fundamental approaches and state-of-the-art applications. In Section 3 we introduce basic notation and definitions and describe our problem setting of coordinated MAPF. In Section 4 we describe our two-stage approach for coordinated MAPF in detail and discuss our extensive experimental results in Section 5. We conclude by summarizing our work and outlining future research directions in Section 6.

2. Related Work

We briefly review three related areas of prior research: the general multi-agent path finding problem, peer-to-peer ridesharing algorithms, and coordinated logistics with drones and trucks.

2.1. Multi-Agent Path Finding

The problem of planning paths for a team of agents subject to domain-specific inter-agent constraints (e.g., collision avoidance) is known as Multi-Agent Path Finding or MAPF yu2013structure. MAPF is a sub-class of the more general Multi-Agent Planning problem torreno2017cooperative. Though the underlying MAPF problem is computationally hard, the research community has developed several effective search-based solvers that work well in practice sharon2012conflict; barer2014suboptimal; felner2017search.

Most MAPF algorithms are evaluated on grid-worlds where agents can move step-wise along the four cardinal directions, rather than on large-scale road networks stern2019multi. They are also designed for avoiding collisions, not enabling agents to temporarily coordinate by actively sharing their locations. There are two relevant exceptions. First, a recent paper developed a bounded-suboptimal MAPF approach that routes drones over a time-dependent ground transit network choudhury2021efficient; however, the ground vehicles are fixed and not controllable. Second, an algorithmic extension to the classical MAPF formulation enables agents to explicitly cooperate greshler2021cooperative. But this approach assigns agents to cooperative tasks in advance and does not require them to simultaneously traverse a shared path.

2.2. Peer-to-Peer Ridesharing

The recent shift towards shared-use mobility services has motivated the study of peer-to-peer ridesharing. This problem involves matching drivers to passengers with similar itineraries, such that the former can share their trips with the latter agatz2012optimization. A taxonomy of ridesharing variants has emerged, based on the type of matching required tafreshian2020frontiers; our problem in this paper can be considered a fixed-role many-to-many matching variant masoud2017decomposition, where the trucks are the ‘drivers’ and drones are the ‘passengers’. Several ridesharing algorithms provide useful foundations for us, such as one that geographically partitions a large-scale road network pelzer2015partition, one that partitions an intermediate trip graph data structure to decompose the problem tafreshian2020trip, and one that incorporates return restrictions on drivers in an integer linear program chen2019ride. However, a common challenge with all of them is that their objective functions only consider the distance traveled by the driver vehicles, and possibly wait time for the passengers, but not the distance that passengers must traverse to get to the drivers. Another challenge is that most approaches frame and solve a mathematical program, which scales poorly to complex multi-agent path finding problems bartak2017modeling.

2.3. Coordinated Drone-Truck Routing

The idea of pairing drones with trucks for last-mile delivery and logistics has been partially explored. The flying-sidekick traveling-salesman problem was formulated to model a single truck-drone pair making a set of deliveries, with the drone leaving and returning to the truck at various points murray2015flying. A range of optimization approaches have been developed for this flying sidekick problem AlenaETAL18, including an extension that considers multiple drones murray2020multiple. Unfortunately, they do not scale well with scenario size and are only applied for a small number of trucks and drones.

Several other works address similar settings: a genetic algorithm optimizing a truck-drone pair in a tandem delivery network ferrandez2016optimization, a geometric approach that relies on Euclidean plane analysis carlsson2018coordinated, a sequential decision-making model that assumes geographical districting ulmer2018same, and a drone scheduling routine for given truck routes boysen2018drone. All of those approaches consider a small number of agents (typically one drone and one truck) and allow the drones to move freely in the plane, which can be unrealistic. Lastly, we mention that control aspects of drone landing on a moving truck have been explored recently Haberfeld.ea.2021.

3. Problem Formulation

We control a centralized fleet of agents comprising trucks and drones. Each agent is assigned an index from the set $\mathcal{A} = \mathcal{T} \cup \mathcal{D}$, where $\mathcal{T}$ and $\mathcal{D}$ denote truck and drone indices, respectively. The agents operate on a shared road network, represented as a directed graph $G = (V, E)$ with two types of edges. An edge $e = (u, v) \in E$, where $u, v \in V$, represents a traversal of a physical road segment.

Each edge $e \in E$ has two cost values $c_T(e)$ and $c_D(e)$ representing the travel cost incurred by a truck and drone, respectively. In both cases, we set $c_T(e)$ and $c_D(e)$ to the physical distance of the corresponding road link, but the two quantities could be different. The drone incurs cost $c_D(e)$ while traversing $e$ only if it is flying along the edge. Our objective, discussed in detail below, is to optimize the total travel cost incurred by all agents. Edges are also annotated with travel times that depend on the type of agents using them, based on an average traversal speed. We consider a discrete time setting in this work, where $t_T(e)$ and $t_D(e)$ denote the integer traversal time for trucks and drones, respectively.

3.1. Truck and drone paths

Each agent $a \in \mathcal{A}$ is assigned start and goal nodes $s_a, g_a \in V$. For simplicity, we assume that all the agents begin their journey at time step $0$. We could account for different start times without losing generality, by including a new zero-cost edge for each agent with traversal time equal to its start time. We must compute a set of paths that move agents from their start to goal nodes over the graph $G$, such that the total travel cost across all agents is minimized.

Drones can use trucks as temporary modes of transit for one or more edges to save on travel cost. They may only board or alight at nodes in the road network graph. Each truck $k \in \mathcal{T}$ has a maximum carrying capacity $C_k$ for drones. In our experiments, all trucks are homogeneous and have the same capacity $C$, though our approach could accommodate varying truck capacities.

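Next, we describe the solution paths for trucks, which encode their traversal over the road network graph. Given a truck $k \in \mathcal{T}$, its solution path $\pi_k = (e^k_1, \ldots, e^k_{n_k})$ is a sequence of edges, for some $n_k \geq 1$, where for every $1 \leq i \leq n_k$ it holds that $e^k_i \in E$, and $\mathrm{orig}(e^k_1) = s_k$, $\mathrm{dest}(e^k_{n_k}) = g_k$, where $\mathrm{orig}(e)$ and $\mathrm{dest}(e)$ denote the origin and destination of a given edge $e \in E$. Additionally, a path must satisfy connectivity constraints between the edges, i.e., $\mathrm{dest}(e^k_i) = \mathrm{orig}(e^k_{i+1})$ for every $1 \leq i < n_k$. In addition to encoding the truck’s location in space, the path also implicitly describes its departure and arrival time over the edges. In particular, $\tau^k_i$ denotes the departure time of edge $e^k_i$, where $1 \leq i \leq n_k$, and is defined as

\[ \tau^k_i = \tau^k_{i-1} + t_T(e^k_{i-1}), \quad 2 \leq i \leq n_k, \]

where $\tau^k_1 = 0$. The path cost incurred by the truck is simply the sum of truck travel costs $c(\pi_k) = \sum_{i=1}^{n_k} c_T(e^k_i)$.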
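As a minimal sketch of the truck path definition above, the following Python snippet checks edge connectivity and computes per-edge departure times and the path cost. The `Edge` tuple, the function names, and the example values are hypothetical, and the start time of 0 is an assumption.

```python
from typing import NamedTuple

class Edge(NamedTuple):
    origin: str   # origin node of the edge
    dest: str     # destination node of the edge
    cost: float   # truck travel cost (here: physical distance)
    time: int     # integer truck traversal time

def is_connected(path):
    """Each edge must start where the previous edge ended."""
    return all(a.dest == b.origin for a, b in zip(path, path[1:]))

def departure_times(path, start_time=0):
    """Departure time of an edge = previous departure + previous traversal time."""
    times, t = [], start_time
    for edge in path:
        times.append(t)
        t += edge.time
    return times

def path_cost(path):
    """Truck path cost: the sum of truck travel costs over its edges."""
    return sum(edge.cost for edge in path)

# Hypothetical three-edge truck path a -> b -> c -> d.
truck_path = [Edge("a", "b", 2.0, 3), Edge("b", "c", 1.5, 2), Edge("c", "d", 4.0, 5)]
print(is_connected(truck_path), departure_times(truck_path), path_cost(truck_path))
```

The departure times depend only on traversal times, while the cost depends only on distances, which mirrors how the two quantities are kept separate in the formulation.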

We define solution paths for drones similarly, except that these paths also describe whether drones use trucks for some trip segments. For drone $d \in \mathcal{D}$, its solution is a path $\pi_d = (e^d_1, \ldots, e^d_{n_d})$ for some $n_d \geq 1$ and an assignment sequence $\alpha_d = (\alpha^d_1, \ldots, \alpha^d_{n_d})$. The start-goal and continuity constraints imposed on $\pi_d$ are the same as those for trucks, as are the arrival and departure times of the drone path edges.

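The assignment sequence is defined as follows. For a given solution segment $1 \leq i \leq n_d$, the value $\alpha^d_i \in \mathcal{T} \cup \{\varnothing\}$ describes whether drone $d$ is riding a truck or flying (in which case $\alpha^d_i = \varnothing$), when traversing the edge $e^d_i$. Assuming that the assignment sequence is valid with respect to the trucks used along the route (defined below), the cost of the drone solution is computed as

\[ c(\pi_d, \alpha_d) = \sum_{i=1}^{n_d} \mathbb{1}_{\varnothing}(\alpha^d_i) \, c_D(e^d_i), \]

where $\mathbb{1}_{\varnothing}(\alpha)$ returns $1$ if $\alpha = \varnothing$, and otherwise returns $0$. That is, the value $c(\pi_d, \alpha_d)$ sums up the cost along edges for which the drone does not ride on a truck.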
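The drone cost above can be sketched in a few lines of Python. This is only an illustration: the function name, the use of `None` to mean "flying", and the example values are assumptions, not part of the original formulation.

```python
def drone_cost(edge_costs, assignment):
    """Sum the drone flight cost over edges where the drone flies.

    edge_costs[i] is the drone cost of the i-th path edge; assignment[i] is the
    truck carrying the drone on that edge, or None if the drone is flying.
    """
    if len(edge_costs) != len(assignment):
        raise ValueError("path and assignment sequence must have equal length")
    return sum(cost for cost, truck in zip(edge_costs, assignment) if truck is None)

# Hypothetical four-edge drone path: fly, ride truck_1 twice, then fly again.
edge_costs = [2.0, 1.5, 4.0, 3.0]
assignment = [None, "truck_1", "truck_1", None]
print(drone_cost(edge_costs, assignment))  # only the first and last edges count
```

Edges ridden on a truck contribute nothing to the drone's cost, so the example pays for 2.0 + 3.0 of flight out of 10.5 of total path length.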

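A global solution to our problem is a collection of solutions over all agents, i.e., $\Pi = \{\pi_k\}_{k \in \mathcal{T}} \cup \{(\pi_d, \alpha_d)\}_{d \in \mathcal{D}}$. The solution is valid if it satisfies the following two conditions on the coordination between trucks and drones. First, fix a drone $d \in \mathcal{D}$ and fix a segment $1 \leq i \leq n_d$ where it is assigned to some truck $k = \alpha^d_i \in \mathcal{T}$. Then the edge $e^d_i$ along the drone’s path must also be part of the truck’s path $\pi_k$, whose departure time must also align with that of the drone. That is, there exists a truck segment $1 \leq j \leq n_k$ such that $e^k_j = e^d_i$ and $\tau^k_j = \tau^d_i$. The second condition requires that a truck’s capacity will not be exceeded. In particular, fix a truck $k \in \mathcal{T}$ and a solution segment $1 \leq j \leq n_k$. Then it must hold that

\[ \left| \{ d \in \mathcal{D} : \exists\, i \text{ such that } \alpha^d_i = k,\ e^d_i = e^k_j,\ \tau^d_i = \tau^k_j \} \right| \leq C. \]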

We are ready to state our problem.

Problem 0 (Coordinated MAPF).

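Given a set of agents $\mathcal{A} = \mathcal{T} \cup \mathcal{D}$, road graph $G = (V, E)$, edge cost functions $c_T, c_D$, edge traversal-time functions $t_T, t_D$, we wish to find a valid global solution $\Pi$ minimizing the total cost

\[ c(\Pi) = \sum_{k \in \mathcal{T}} c(\pi_k) + \sum_{d \in \mathcal{D}} c(\pi_d, \alpha_d). \]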

3.2. Discussion

We discuss some computational aspects of our problem. Integer programming would scale poorly, even for only one truck and one drone murray2015flying. Coordinating multiple drones and trucks increases the nominal decision space by orders of magnitude through two axes: (i) all possible matchings of drones to trucks based on capacity (i.e., all ways to distribute drones across trucks where each truck can have up to $C$ drones); (ii) for each drone-truck pairing, all possible intermediate start and end points of the truck route, and the various routes a truck can take. We also need to account for conflicts arising from violating vehicle capacity constraints. Those observations suggest that our problem is more difficult than the classical MAPF problem, which is NP-hard yu2013structure. But we defer the study of the complexity of Coordinated MAPF for future work.
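To make the capacity conflicts mentioned above concrete, here is a small Python sketch that counts how many drones ride each truck on each edge at each departure time and reports the segments that exceed capacity. The data layout (tuples of edge, departure time, and truck) and all names are hypothetical choices for illustration.

```python
from collections import Counter

def capacity_violations(drone_solutions, capacity):
    """Return {(truck, edge, departure_time): riders} for overloaded truck segments.

    drone_solutions maps a drone id to a list of (edge, departure_time, truck)
    segments, where truck is None when the drone is flying.
    """
    riders = Counter()
    for segments in drone_solutions.values():
        for edge, depart, truck in segments:
            if truck is not None:
                riders[(truck, edge, depart)] += 1
    return {segment: n for segment, n in riders.items() if n > capacity}

# Three hypothetical drones that all ride truck t1 along edge (b, c) at time 2.
drone_solutions = {
    "d1": [(("a", "b"), 0, None), (("b", "c"), 2, "t1")],
    "d2": [(("b", "c"), 2, "t1")],
    "d3": [(("e", "b"), 0, None), (("b", "c"), 2, "t1")],
}
print(capacity_violations(drone_solutions, capacity=2))
```

With a capacity of 2 the shared segment is reported as a conflict, while a capacity of 3 yields no violations.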

We now discuss our modeling assumptions. First, we use a shared road graph for trucks and drones. For urban areas with high-rises and no-fly zones, restricting drones to fly over the road network rather than point-to-point between any two locations is a reasonable design choice. Second, of the two popular MAPF objective functions, sum-of-costs and makespan, we choose to minimize the former rather than the latter (the makespan of a MAPF solution is the maximum path cost for any agent).

In our setting, sum-of-costs better reflects the gains from having drones use trucks as transit; e.g., if the maximum-cost path in a solution was that of some truck whose start and goal were disproportionately far apart, then the makespan of the solution would only depend on the cost of that worst path (unlike the sum-of-costs metric); no further optimizing of the other drones and trucks to save on drone flight cost would be incentivized, even though that would have led to real-world benefits. In any case, we considered makespan in a related earlier work where it was more appropriate choudhury2021efficient. Finally, we only include travel distance in our objective and not the elapsed travel time to avoid arbitrary scaling between the two physical quantities. Vehicle distance traveled is a standard objective for the ridesharing problem tafreshian2020frontiers.
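The argument above can be illustrated with a toy calculation. The path costs below are made up for illustration and are not taken from our experiments.

```python
def sum_of_costs(path_costs):
    """Sum-of-costs objective: total path cost over all agents."""
    return sum(path_costs.values())

def makespan(path_costs):
    """Makespan objective: the maximum path cost of any agent."""
    return max(path_costs.values())

# Hypothetical costs: one truck with a long trip and two drones.
independent = {"truck": 20.0, "drone_1": 6.0, "drone_2": 5.0}
# Same truck path, but both drones ride the truck for part of their trips.
coordinated = {"truck": 20.0, "drone_1": 2.0, "drone_2": 1.0}

print(sum_of_costs(independent), sum_of_costs(coordinated))
print(makespan(independent), makespan(coordinated))
```

The sum-of-costs objective registers the drone savings, whereas the makespan is pinned to the long truck path and is identical in both cases.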

4. Coordinated Drone-Truck MAPF

Most effective multi-agent planning approaches rely on the system being loosely coupled brafman2008one. In the context of MAPF (a sub-class of multi-agent planning), a loosely coupled multi-agent system is one where the optimal path for an individual agent mostly does not interact or coordinate with that of other agents, and when it does, the extent of interaction required is small compared to the overall path length. For example, Conflict-Based Search or CBS sharon2012conflict is an influential MAPF algorithm that works by computing individual paths independently for each agent with a search-based method like A* hart1968formal and resolving any conflicts or inter-path constraint violations in a structured hierarchical manner. However, it only works well on problems where there are not many conflicts, i.e., the underlying multi-agent system is loosely coupled Gordon.ea.21.

Figure 2. The two MAPF sub-problems that we need to solve. (Left) In Stage 1, we take initial drone shortest paths (dashed purple and green) and add drone-annotated weight-discounted copies of nearby road edges (coloured thinner edges; for clarity, we only show a subset of all possible such copies). We then solve a MAPF problem for trucks on this augmented graph. (Right) In Stage 2, we take the truck paths computed in Stage 1 (solid red and blue) and create zero-weight transit edge copies from them. We then solve a MAPF problem for drones on this road-and-transit graph.

In our problem, the ability of drones to coordinate with trucks when convenient vastly increases the amount of coupling in the system and thereby the complexity of the MAPF problem. If drones incur no cost when using trucks as transit, then their optimal paths are highly dependent on the set of truck paths. On the other hand, the set of individual shortest paths for trucks may not yield a useful transit network for the drones. We noted this increased complexity in the previous section, through the orders-of-magnitude larger decision space. But the tighter coupling affects even search-based solvers like CBS, which are less sensitive to dimensionality than integer programming methods.

The tight coupling and large-scale road network in our setting make it intractable to solve optimally or bounded-suboptimally. Therefore, we decouple the overall MAPF problem into two distinct MAPF sub-problems and solve them in stages (Figure 1). In Stage 1, we compute a set of truck paths that drones are likely to use as transit. We do so by generating a MAPF instance for the trucks where their travel cost is discounted when they get close to nominal drone routes, i.e., the shortest drone paths over the road network. In Stage 2, we design another MAPF instance to route the drones. We augment the road graph to keep track of the motion of the trucks along their paths and allow drones to ride on the trucks for some segments of their trips and fly for the others. After both stages, we re-route any trucks that were not used by any drone to their original shortest path on the road network, if their Stage 1 paths deviated from their shortest path.

4.1. Stage 1: MAPF for Trucks

Our problem imposes no system constraints between any two truck paths. If we ignore drones (whose paths will be computed in Stage 2), the solution to the MAPF problem for the trucks alone is trivial: the set of shortest road network paths for each individual truck. However, this set of truck paths may be ill-suited for the drones to use as transit and may yield a poor-quality downstream solution. Therefore, we seek a set of truck paths that drones are likely to benefit from in Stage 2. To inform this search, we first consider a set of one or more start-goal flight paths for each drone over the road network. In our case, we use the shortest flight path for each drone along $G$, but this set could capture other properties like geographical coverage or path diversity bast2013result.

Next, we describe an instance of the MAPF problem whose solution would allocate trucks to routes that could potentially benefit the drones. We describe the graph $G_1 = (V_1, E_1)$ used in this stage. The graph $G_1$ augments the road network graph $G$ to encourage trucks to take paths that are close to the initial drone flight paths. In particular, the vertices $V_1$ are equal to $V$. The edge set $E_1$ generalizes the edge set $E$, i.e., $E \subseteq E_1$. For each drone $d \in \mathcal{D}$, we consider all edges within $K$ hops (or edge traversals) of its flight path $\pi^0_d$. For each such edge $e \in E$, we create a copy $\tilde{e}_{d,h} \in E_1$ annotated with the drone $d$ and the hop $h \leq K$; the same road network edge in $E$ can have multiple copies in $E_1$ if it is close to multiple nominal drone paths. See Figure 2 (Left) for illustration.

The weights of the road edges reflect their truck cost, i.e., $w_1(e) = c_T(e)$ for every $e \in E$. For each copy $\tilde{e}_{d,h}$ we discount the original edge truck cost by a factor $f(h)$ depending on the hop distance $h$ between the edge and the corresponding drone flight path, i.e., $w_1(\tilde{e}_{d,h}) = f(h) \cdot c_T(e)$. We want $f(0) = 1/2$ to reflect that the drone need not deviate from its shortest path at all if the truck takes that edge, thus halving the effective edge cost. The choice of discounting function $f$ is a heuristic. For our experiments in Section 5, we fix one such function. We also run an ablation study with the sigmoid function, i.e., $f(h) = 1/(1 + e^{-h})$, in the extended version of our paper choudhury2021coordinated. Both satisfy our desiderata that $f(0) = 1/2$ and that $f(h)$ approaches $1$ as $h$ increases. The maximum hop distance ($K$) from the drone flight path is the same in all cases; we ran some offline ablations, found negligible change for greater values of $K$, and omit those results in the interest of space.
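The Stage 1 graph augmentation can be sketched as follows. This is a hedged illustration rather than our implementation: the function names and data layout are hypothetical, the discount is the standard sigmoid from the ablation (not necessarily the function used in the main experiments), and the hop distance of an edge is assumed to be 0 on the drone path and otherwise one plus the smaller breadth-first hop count of its endpoints, ignoring edge direction.

```python
import math
from collections import deque

def sigmoid_discount(h):
    """Standard sigmoid: equals 0.5 at h = 0 and approaches 1 as h grows."""
    return 1.0 / (1.0 + math.exp(-h))

def node_hops(edges, sources, max_hops):
    """Breadth-first hop counts from the source nodes, ignoring edge direction."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    hops = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        if hops[u] >= max_hops:
            continue  # do not expand beyond the hop limit
        for v in neighbors.get(u, ()):
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops

def discounted_copies(truck_cost, drone_paths, max_hops, discount=sigmoid_discount):
    """Map (edge, drone, hop) -> discounted truck weight for edges near a drone path."""
    copies = {}
    for drone, nodes in drone_paths.items():
        on_path = set(zip(nodes, nodes[1:]))
        hops = node_hops(truck_cost, nodes, max_hops)
        for (u, v), cost in truck_cost.items():
            if (u, v) in on_path:
                h = 0
            else:
                near = [hops[n] for n in (u, v) if n in hops]
                if not near:
                    continue  # neither endpoint is close to this drone path
                h = 1 + min(near)  # assumed hop distance of an off-path edge
            if h <= max_hops:
                copies[((u, v), drone, h)] = discount(h) * cost
    return copies

# Hypothetical road edges with truck costs, and one nominal drone path a -> b -> c.
truck_cost = {("a", "b"): 1.0, ("b", "c"): 1.0, ("b", "x"): 2.0,
              ("x", "y"): 2.0, ("y", "z"): 2.0}
copies = discounted_copies(truck_cost, {"d1": ["a", "b", "c"]}, max_hops=1)
print(sorted(copies.items()))
```

In this example the two on-path edges get copies at half their cost, the adjacent edge (b, x) gets a milder discount at hop 1, and edges farther away get no copy.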

The road network graph, augmented by discounted weight copies, is the underlying pathfinding graph for the MAPF problem with trucks (Figure 2; left). The shortest paths for trucks (with respect to the weights $w_1$) on this augmented graph $G_1$ may deviate from their original shortest paths (with respect to weights $c_T$) on the graph $G$, in favor of edges with drone-annotated copies and thus lower weights. The truck paths on the augmented road graph do have pairwise constraints, unlike those on the base graph. To reduce the chance that several trucks are assigned to “assist” the same drone, we impose a capacity constraint of one, i.e., $\mathrm{cap}_1(\tilde{e}_{d,h}) = 1$ on every edge copy $\tilde{e}_{d,h}$ (but do not restrict capacities of edges not associated with drones). In the terminology of classical MAPF, two trucks are in conflict if they use the same drone-annotated weight-discounted edge copy $\tilde{e}_{d,h}$. Note that we do not forbid multiple trucks being assigned to copies annotated with the same drone $d$ but different hops $h$ and $h'$. This choice keeps the Stage 1 runtime low but may introduce some known inefficiency, as the drone can only use one truck at a time.

Given graph $G_1$, weights $w_1$, truck capacities $\mathrm{cap}_1$, and start-goal nodes $\{(s_k, g_k)\}_{k \in \mathcal{T}}$, we solve a MAPF problem with the objective of minimizing the total weight incurred by all trucks. The outcome of Stage 1 is a set of truck paths $\{\pi_k\}_{k \in \mathcal{T}}$. Capacity conflicts between truck paths do not reflect any physical constraints, in contrast to most MAPF applications. Rather, they encode our design choice that truck paths be useful for drones downstream. Again, since a particular drone can only use one truck at a time on transit, if two trucks deviate to use the same drone-annotated edge copy, one of those deviations is likely to be wasted in Stage 2.

4.2. Stage 2: MAPF for Drones

Stage 1 yields a set of truck paths over the road network $G$, because a truck using an edge copy $\tilde{e}_{d,h}$ is essentially using $e$. Whether a truck path used drone-annotated discounted-weight edge copies is now irrelevant as the discounting was simply a heuristic to guide trucks closer to the nominal drone flight paths. In Stage 2, we will fix the truck paths, keep track of their locations in time and space, and solve a MAPF problem for drones that can use trucks as transit, subject to capacity constraints. This stage builds upon recent work for routing drones over ground transit choudhury2021efficient, but differs in how it requires drones to operate on the road network rather than fly point-to-point between locations.

For the Stage 2 MAPF problem, we augment the original road network with the transit network derived from truck paths (Figure 2; right panel) to yield the graph $G_2 = (V, E_2)$. Each truck path is a sequence of road network edges. For each road edge $e$ of a given truck path, we add a copy to $E_2$ that we call a transit edge. Every transit edge is annotated with the corresponding truck, has drone capacity $C$, and has zero weight because the drone incurs no distance cost when using transit. We set the weight of non-transit edges to be $w_2(e) = c_D(e)$. Since we omit elapsed time in the objective and assume that drones and trucks can slow down as needed to wait for the coordinating agent, any drone-truck connection can be made in principle.

In contrast to Stage 1, the conflicts here do represent physical constraints, i.e., the maximum drone-carrying capacity of the truck. Drone paths may conflict with each other if more than $C$ drones use the same transit edge. Given the graph $G_2$, weights $w_2$, and the above constraints, we have a well-defined MAPF problem for Stage 2, where we seek to compute a set of drone paths, each with a combination of road and transit edges (although there may be drone paths that do not take transit at all). Our objective is to minimize the sum of drone path costs, given that only road edges incur cost due to the distance of the corresponding link.
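The composite road-and-transit graph of Stage 2 can be sketched for a single drone as follows. This is a simplified illustration under stated assumptions: names and data layout are hypothetical, capacity and timing are ignored (a single drone cannot exceed capacity, and any connection is assumed feasible), and plain Dijkstra search stands in for the single-agent planner.

```python
import heapq

def build_composite(road_cost, truck_paths):
    """Adjacency lists with road edges (flight cost) and zero-weight transit edges."""
    adj = {}
    for (u, v), cost in road_cost.items():
        adj.setdefault(u, []).append((v, cost, None))    # flying along a road edge
    for truck, edges in truck_paths.items():
        for u, v in edges:
            adj.setdefault(u, []).append((v, 0.0, truck))  # riding the truck is free
    return adj

def shortest_path(adj, start, goal):
    """Dijkstra search; returns (cost, [(u, v, truck_or_None), ...])."""
    best, parent, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        dist, u = heapq.heappop(heap)
        if u == goal:
            break
        if dist > best[u]:
            continue  # stale heap entry
        for v, weight, truck in adj.get(u, ()):
            if dist + weight < best.get(v, float("inf")):
                best[v] = dist + weight
                parent[v] = (u, truck)
                heapq.heappush(heap, (dist + weight, v))
    if goal not in best:
        return float("inf"), []
    steps, node = [], goal
    while node != start:
        u, truck = parent[node]
        steps.append((u, node, truck))
        node = u
    return best[goal], steps[::-1]

# Hypothetical road network with drone flight costs, and one fixed truck path.
road_cost = {("s", "a"): 1.0, ("a", "b"): 5.0, ("b", "g"): 1.0, ("s", "g"): 8.0}
adj = build_composite(road_cost, {"t1": [("a", "b")]})
cost, steps = shortest_path(adj, "s", "g")
print(cost, steps)
```

Here the drone flies to node a, rides truck t1 along (a, b) at no cost, and flies the last edge, which is cheaper than the direct road edge from s to g.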

4.3. Solving stage-wise MAPF problems

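Input: MAPF graph G, agents A
procedure ConflictBasedSearch
    Initialize constraint tree node N0 with N0.constraints = {}
    for a in A do                                   (any ordering)
        N0.paths[a] <- A*(G, a, N0.constraints)
    N0.cost <- sum of path costs in N0.paths
    Insert N0 into Open                             (higher-level open list)
    while Open is not empty do
        N <- pop minimum-cost node from Open        (min. cost candidate)
        if N.paths has no conflicts then
            return N.paths                          (best valid solution)
        for all conflicts (a, phi) in N.paths do
            N' <- copy of N with N'.constraints = N.constraints + {(a, phi)}
            N'.paths[a] <- A*(G, a, N'.constraints); update N'.cost
            Insert N' into Open
procedure PrioritizedPlanning
    Initialize constraints Phi = {} and paths P = {}
    for a in A do                                   (priority ordering)
        P[a] <- A*(G, a, Phi)
        Phi <- Phi updated with the constraints induced by P[a]
    return P
Algorithm 1 Pseudocode of two MAPF techniques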

The prior work on drone-transit routing that we build upon choudhury2021efficient used Enhanced CBS or ECBS barer2014suboptimal for the MAPF problem. CBS is a hierarchical algorithm, where the multi-agent level defines inter-path and per-path constraints on the single-agent level. The single-agent level computes optimal paths that satisfy the respective per-path constraints. If two or more single-agent paths conflict with each other, i.e., violate any shared constraints, the multi-agent level imposes more constraints to resolve the conflict, and reruns the single-agent level for the conflicting agents.

ECBS uses bounded-suboptimal Focal Search pearl1982studies instead of optimal A* at both levels. It can be orders of magnitude more efficient than CBS, especially on MAPF problems with many more conflicts, i.e., that reflect a more tightly coupled multi-agent system. We also implement and use ECBS for our MAPF problems in both stages. However, for larger numbers of trucks and drones, ECBS would have far too many conflicts to resolve and would time out before returning a solution (as we shall see in Section 5). Moreover, conflict resolution is particularly expensive in our setting as the truck capacities are greater than one, and resolving capacity conflicts generates a large number of constraints. A conflict generates constraints for all subsets of excess drones, i.e., if a transit edge has capacity $C$, and $n > C$ drones choose to use it, then $n$-choose-$(n - C)$ constraints are generated in the higher-level search tree, where all $(n - C)$-subsets of the drones are restricted from using that transit edge. A similar setting with capacities is discussed in our earlier work choudhury2021efficient. A version of CBS tailored for MAPF with capacity constraints was recently developed surynek2019multi. But developing vanilla MAPF solvers is not the focus of our work and we defer its implementation for future research.
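To give a feel for how quickly these constraints multiply, the following sketch enumerates the subsets of excess drones for one overloaded transit edge. It assumes the reading that each constraint forbids one subset of size n minus capacity; the function name and drone labels are hypothetical.

```python
from itertools import combinations
from math import comb

def capacity_constraints(drones, capacity):
    """Subsets of excess drones that would each be barred from the transit edge."""
    excess = len(drones) - capacity
    if excess <= 0:
        return []  # no conflict: the edge is within capacity
    return list(combinations(drones, excess))

# Hypothetical conflict: 5 drones want a transit edge with capacity 2.
drones = ["d1", "d2", "d3", "d4", "d5"]
constraints = capacity_constraints(drones, capacity=2)
print(len(constraints), comb(len(drones), len(drones) - 2))
```

Each enumerated subset corresponds to one branch in the higher-level search tree, so a single capacity conflict can spawn many child nodes, in contrast to the two children of a classical pairwise collision.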

The challenges of using ECBS for larger problem settings motivate us to also consider Prioritized Planning (PP) silver2005cooperative. Here, we plan paths for agents one-by-one (using A*) based on some priority ordering. After planning each agent’s path, PP analyzes the edges along it and updates any MAPF constraints for subsequent paths. In both our stages, these update rules depend on the conflict criteria between paths. In Stage 1, if a truck path uses any drone-annotated edge copies, then those copies are removed and unusable for subsequent trucks. In Stage 2, each time a drone uses a specific transit edge, its capacity is reduced by one; when a transit edge reaches zero capacity, it can no longer be used by subsequent drones. Once a drone-annotated or transit edge is used up in the respective stage, later agents are not allowed to use it. Note that PP circumvents the conflicts that CBS encounters through the imposed priority ordering.

PP has no solution quality guarantees, unlike ECBS. But it does not need to resolve conflicts and can be more efficient than ECBS in practice. Section 5 will show how PP can solve problems that are intractable for ECBS and be competitive on the tractable problems. The choice of priority ordering can impact the solution quality for PP. For Stage 2, we use the sensible heuristic of prioritizing drones whose shortest paths on the road have higher cost, as they are most likely to benefit from using transit. No such clearly motivated heuristic exists for prioritizing the order of truck paths in Stage 1, so we impose an arbitrary ordering based on truck IDs.
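
The two priority orderings described above could be expressed as in the sketch below. The function names, the dictionary of precomputed road costs, and the tie-breaking by drone name are assumptions made for illustration.

```python
def stage2_priority_order(road_cost):
    """Order drones by decreasing shortest-path cost on the road network.

    road_cost: {drone_name: cost of that drone's shortest road-only path}.
    Ties are broken by name (an assumption, for determinism only).
    """
    return sorted(road_cost, key=lambda drone: (-road_cost[drone], drone))


def stage1_priority_order(truck_ids):
    """Order trucks by ID, i.e., an arbitrary but fixed ordering."""
    return sorted(truck_ids)


# Hypothetical road costs in kilometres.
print(stage2_priority_order({"d1": 3.2, "d2": 7.5, "d3": 5.0}))
```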

Algorithm 1 contains high-level pseudocode for CBS. It maintains a higher-level constraint tree whose root node is initialized with an empty constraint set and the independent shortest paths for each agent. When the lowest-cost constraint tree node is expanded, CBS evaluates its set of paths for any conflicts and generates, for every conflicting agent and corresponding constraint, a child node that recomputes that agent's path. CBS continues until it yields the first conflict-free solution, which is guaranteed to be optimal. The basic structure of ECBS is the same as CBS, with a few extensions to enable more efficient behavior while sacrificing optimality for bounded-suboptimality. We omit those extensions for readability. We also sketch out the pseudocode of Prioritized Planning in Algorithm 1 to highlight its major differences from CBS. PP computes the shortest path of each of the agents one-by-one, following a given priority ordering. After obtaining the path for any agent, it updates the shared set of constraints based on that path.
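
As a complement to the pseudocode, the following is a minimal runnable sketch of the CBS high-level loop on a deliberately simplified problem. It assumes that a conflict is two or more agents using the same unit-capacity edge (time is ignored), that a constraint forbids one agent from one edge, and that the low level is plain Dijkstra; for brevity, each child node recomputes all paths rather than only the constrained agent's. All names and the example graph are hypothetical.

```python
import heapq
from itertools import count


def shortest_path(edges, start, goal, forbidden):
    """Dijkstra over {(u, v): cost}, ignoring edges in `forbidden`."""
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            path = [goal]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1], d
        if d > dist[u]:
            continue  # stale heap entry
        for (a, b), cost in edges.items():
            if a == u and (a, b) not in forbidden and d + cost < dist.get(b, float("inf")):
                dist[b], prev[b] = d + cost, u
                heapq.heappush(heap, (d + cost, b))
    return None, float("inf")


def first_conflict(paths, shared):
    """Return (edge, agents) for a shared edge used by 2+ agents, else None."""
    users = {}
    for agent, path in paths.items():
        for e in zip(path, path[1:]):
            if e in shared:
                users.setdefault(e, []).append(agent)
    for e, agents in sorted(users.items()):
        if len(agents) > 1:
            return e, agents
    return None


def cbs(edges, shared, agents):
    """agents: {name: (start, goal)}; shared: set of unit-capacity edges."""
    tie = count()  # tie-breaker so the heap never compares dictionaries

    def make_node(constraints):
        paths, total = {}, 0.0
        for name, (s, g) in agents.items():
            p, c = shortest_path(edges, s, g, constraints[name])
            if p is None:
                return None
            paths[name], total = p, total + c
        return (total, next(tie), constraints, paths)

    root = make_node({name: frozenset() for name in agents})
    open_list = [root] if root else []
    while open_list:
        total, _, constraints, paths = heapq.heappop(open_list)  # lowest cost
        conflict = first_conflict(paths, shared)
        if conflict is None:
            return paths, total  # first conflict-free node
        edge, conflicting = conflict
        for agent in conflicting:  # one child per conflicting agent
            child_constraints = dict(constraints)
            child_constraints[agent] = constraints[agent] | {edge}
            child = make_node(child_constraints)
            if child:
                heapq.heappush(open_list, child)
    return None, float("inf")


edges = {("A", "B"): 10.0, ("A", "T"): 1.0, ("T", "B"): 1.0,
         ("C", "T"): 1.0, ("C", "B"): 4.0}
paths, total = cbs(edges, {("T", "B")}, {"d1": ("A", "B"), "d2": ("C", "B")})
print(paths, total)
```

In this toy instance both agents initially prefer the shared edge; the child that diverts the agent with the cheaper detour has lower total cost and is returned first.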

5. Experiments and Results

Vehicle Distance Traveled (km) Plan Time (s)
Cap = 5 Cap = 10 Cap = 5 Cap = 10
Trucks Drones Direct ECBS PP ECBS PP Direct ECBS PP ECBS PP
Table 1. We compare the solution quality (vehicle distance traveled) and computational efficiency (plan time) of our framework against the Direct baseline. All quantities are averaged over 20 trials. Note that by vehicle travel distance we mean the sum of truck distances and drone flight distances. In all cases, the standard error of the mean was less than of the mean, and has been omitted. Red indicates that some of the 20 trials timed out, so the comparison is not precisely equivalent. For the larger settings, ECBS times out on the majority of the trials, and is thus omitted.

We implemented our approach (the code is available online) and ran all simulations using the Julia programming language Julia-2017 on a machine with RAM and an Intel Xeon CPU. On various problem settings, we evaluated the quality of solutions as per our optimization objective, i.e., the sum of total path costs over all agents. We also considered the efficiency of our approach by measuring the plan time and observing how it scales with more trucks and drones. For the road graph, we used the street network of Manhattan in New York City, covering an area of nearly . This directed graph has 4426 nodes and 9604 edges; the nodes are annotated with geographical locations and the edges are annotated with the distance of the road link in kilometres. For simplicity, we use total vehicle distance traveled as the path cost metric (i.e., the sum of truck travel distance and drone flight distance), and we defer more sophisticated cost metrics to future work. Our underlying graph is significantly larger than in most MAPF applications stern2019multi.

5.1. Solution Quality and Efficiency

We evaluated two versions of our approach on solution quality and computational efficiency: one with Enhanced Conflict-Based Search (ECBS) for both stages and the other with Prioritized Planning (PP) instead. Since the full problem is intractable to solve jointly, we do not baseline against a Mixed Integer Linear Program approach. As a reference point, we compared against an approach that simply assigns to each truck and drone its shortest path on the road network, with no coordination among them. This baseline (that we call Direct) is much faster than our approach but has much poorer solution quality, especially with more trucks that drones can use as transit to reduce path cost.

Table 1 displays the results of all simulations. We varied the number of trucks and drones and considered two different drone-carrying capacities for trucks, 5 and 10. For each setting, we had 20 different trials, each with different start and goal nodes for each agent. We ran all approaches on the MAPF problems defined by that trial, computed the total solution cost (vehicle distance) in kilometres and the planning time in seconds, and averaged over all trials. The standard error of the mean was less than of the mean in all cases, so we omitted it in the interest of space.
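
For concreteness, the per-setting aggregation can be sketched as below, assuming the usual definition of the standard error of the mean (sample standard deviation divided by the square root of the number of trials). The function name and the sample values are hypothetical and are not results from our experiments.

```python
import math
import statistics


def mean_and_sem(samples):
    """Return (mean, standard error of the mean) for per-trial measurements."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # sample std (n - 1 denominator)
    return mean, sem


# Hypothetical vehicle distances (km) from three trials of one setting.
mean, sem = mean_and_sem([10.0, 12.0, 14.0])
print(mean, sem / mean)  # mean, and SEM as a fraction of the mean
```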

As expected, both ECBS and PP compute much better quality solutions than Direct, and the quality gap increases with more trucks and drones and with higher truck capacity. Also, Direct has much lower planning time than both PP and ECBS. We mentioned earlier in Section 4.2 that ECBS has many expensive conflicts to resolve for bounded-suboptimality. PP does not explicitly need to resolve conflicts but still needs to plan over much larger graphs in both stages than the road network that Direct plans on: Stage 1 adds the drone-annotated weight-discounted edge copies and Stage 2 adds the transit network from truck paths. Even the worst plan time for PP, nearly 5 minutes, is good enough in practice for an operation horizon on the order of hours. For that setting, PP has an absolute savings of more than and a relative savings of more than compared to Direct.
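
The absolute and relative savings quoted above follow the usual definitions, sketched here under the assumption that both are measured against the Direct baseline's total vehicle distance. The function name and the numbers are hypothetical placeholders rather than entries from Table 1.

```python
def distance_savings(direct_km, coordinated_km):
    """Return (absolute savings in km, relative savings as a fraction of Direct)."""
    absolute = direct_km - coordinated_km
    relative = absolute / direct_km
    return absolute, relative


# Hypothetical totals: Direct uses 200 km, the coordinated plan uses 90 km.
absolute, relative = distance_savings(200.0, 90.0)
print(absolute, relative)
```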

Comparing PP to ECBS yields many important insights. The plan times start out as comparable, but PP scales much better than ECBS with increasing trucks and/or drones, as we foreshadowed in Section 4.3. Beyond a certain problem size, the majority of ECBS trials have too many conflicts (our threshold is ) and time out (more than 10 minutes of planning time); those entries in the table remain blank. Before those thresholds, there are settings where a minority of ECBS trials time out, making the table entry not directly comparable to PP as it averages over a subset of trials (we have marked those in red). For example, notice how the plan time for ECBS for trucks and drones shoots up to 157 seconds, i.e., nearly 3 minutes. Problems with truck capacity 5 take longer to solve than those with capacity 10 because they are more resource constrained and hence yield more conflicts to resolve in ECBS and more constraints to update in PP.

In contrast to ECBS, PP does not have to resolve conflicts, has lower plan time for most settings (with a slight exception for 5 trucks, 10 drones, and capacity 5), and is tractable even for our largest problems. We expected it to be more efficient than ECBS, which is why we implemented it in the first place. Even more promising is the modest gap in solution quality between PP and ECBS for the settings where the latter is tractable. The total vehicle distance traveled of the PP solution is typically between and higher than that of the ECBS solution (the gap for trucks and drones is higher though not entirely representative because ECBS times out on some of the problems). A possible reason for the small gap in solution quality between the two approaches is that our setting has a relatively small number of agents compared to the pathfinding graph size, which increases the space of non-conflicting agent paths and provides more flexibility. In problems that require tight coordination between agents such as automated warehouse scenarios wurman2008coordinating, PP may struggle to find a solution.

For both ECBS and PP, Stage 2 accounts for almost all the total plan time (over ). We expect this disparity between the stage plan times for three reasons: the number of drones to plan for is two to four times the number of trucks, constructing the composite road-and-transit network for Stage 2 is itself expensive, and the second stage has more constraints and conflicts than the first.

5.2. Ablation Study on Effect of Stage 1

To disentangle the relative effects of the two stages of our approach on the solution quality (vehicle distance traveled), we ran an ablation study with a modified approach. Here, the first stage does not account for the drones at all and just sets the truck routes to their shortest paths on the road network. The second stage is the same as our original approach, and plans for the drones over the composite network of the road and the truck paths as transit. We expect this modified approach to compute solutions that are better than the Direct baseline. But the nature of these performance gaps will help us understand how much of the advantage from Section 5.1 is due to either stage and what future work should focus on improving.

Cap = 5 Cap = 10
Trucks Drones Direct D-PP PP D-PP PP
Table 2. The ablation with the modified approach Direct-PP (D-PP for short) helps disentangle the relative effects of the two stages on Vehicle Distance Traveled. The values for the PP and Direct columns are copied over from the corresponding Distance columns in Table 1.

For the ablation study, we considered problem settings with a higher ratio of the number of drones to the number of trucks (). In Section 5.1, we had considered smaller ratios as well ( and ) to highlight the trends in solution quality and plan time with an increasing drone/truck ratio, but in practice we would prefer higher ratios as trucks are more expensive to operate than drones and contribute to ground congestion. The modified approach uses direct truck shortest paths for Stage 1 and PP for Stage 2; we call it Direct-PP (D-PP for short). We compare its solution quality against that of PP for both stages over the problem settings in Table 2, averaged over trials. The entries in the PP columns are copied over from the corresponding columns in Table 1.

The performance gap between D-PP and PP appears to be quite modest for most settings, particularly with capacity . Behind the low average differences, however, are specific instances where PP yields a significantly better solution than D-PP, e.g., in a few of the settings with drones, trucks, and capacity , PP’s solution was more than better than that of D-PP; we found a similar gap for a few of the settings with drones, trucks, and capacity . Of course, PP can be arbitrarily sub-optimal depending on the priority ordering (and we choose an arbitrary ordering for Stage 1). Future work could investigate various modifications and improvements to Stage 1: a more sophisticated priority ordering, a different set of nominal drone flight paths to use as the basis for weight-discounted copies of nearby road edges, or even a hybrid of ECBS for Stage 1 (which takes less time) and PP for Stage 2, to yield potentially better overall solution quality without an unacceptably large decrease in computational efficiency.

6. Conclusion

We introduced the problem of coordinated routing for drones and trucks over a common large-scale road network, where the former can use the latter as modes of transit to reduce total vehicle distance traveled. We explained how this problem is significantly more complex than prior work on MAPF. Our comprehensive algorithmic framework elegantly decouples the intractable overall problem into stage-wise multi-agent path finding sub-problems that it solves for trucks and drones respectively. In practice, it yields significant distance savings compared to independently operating trucks and drones (more than 50%), within a reasonable computation time (up to 5 minutes) on the large-scale Manhattan road network.

Several interesting operational extensions emerge for future work, including equipping trucks with charging docks for the drones and allowing drones to drop packages onto moving trucks. In addition, we could further enhance the performance of our framework through an iterative process that improves truck and drone routes repeatedly, replans online in receding-horizon fashion, and considers hybrid methods using different heuristics in each stage. Finally, for practical applications, it would be useful to extend our method to the lifelong MAPF setting ma2017lifelong where agents receive new tasks when they complete their current one, and to consider more complex trajectory-level issues of the drone routes, such as kinematic constraints.

This work was supported in part by the National Science Foundation award no. 1830554, the Toyota Research Institute, the Israeli Ministry of Science and Technology grants no. 3-16079 and 3-17385, and the United States-Israel Binational Science Foundation grants no. 2019703 and 2021643. The authors thank Oren Salzman for insightful comments and suggestions.