## I Introduction

As robotic systems are increasingly used for applications such as drone delivery services, semi-automated warehouses, and autonomous cars, safe and efficient robotic navigation around humans is crucial.
Consider the example in Fig. 1, inspired by a drone delivery scenario, where two quadcopters must plan a safe trajectory around two humans who are walking through the environment. We would like to guarantee that the robots will reach their goals without ever colliding with each other, any of the humans, or the static surroundings.^{1}^{1}1

Note that our laboratory setting uses a motion capture system for sensing and state estimation—robustness with respect to sensor uncertainty is an important component that is beyond the scope of this paper.

This safe motion planning problem faces three main challenges: (1) controlling the nonlinear robot dynamics subject to external disturbances (e.g. wind), (2) planning around multiple humans in real time, and (3) avoiding conflicts with other robots’ plans. Extensive prior work from control theory, motion planning, and cognitive science has enabled computational tools for rigorous safety analysis, faster motion planners for nonlinear systems, and predictive models of human agents. Individually, these problems are difficult—computing robust control policies, coupled robot plans, and joint predictions of multiple human agents are all computationally demanding at best and intractable at worst [Mitchell2005, chen2016general]. Recent work, however, has made progress in provably-safe real-time motion planning [herbert2017fastrack, majumdar2017funnel, singh2018robust], real-time probabilistic prediction of a human agent’s motion [fisac2018probabilistically, ziebart2009planning], and robust sequential trajectory planning for multi-robot systems [bansal2017safe, chen2016multi].

It remains a challenge to synthesize these into a real-time planning system, primarily due to the difficulty of joint planning and prediction for multiple robots and humans. There has been some work combining subsets of this problem [knepper2012pedestrian, trautman2010unfreezing, kruse2013human], but the full setup of real-time and robust multi-robot navigation around multiple humans remains underexplored.

Our main contributions in this paper are tractable approaches to joint planning and prediction, while still ensuring efficient, probabilistically-safe motion planning.

We use the reachability-based FaSTrack framework [herbert2017fastrack] for real-time robust motion planning. To ensure real-time feasibility, robots predict human motion using a simple model neglecting future interaction effects. Because this model will be a simplification of true human motion, we use confidence-aware predictions [fisac2018probabilistically] that become more conservative whenever humans deviate from the assumed model. Finally, groups of robots plan sequentially according to a pre-specified priority ordering [chen2015safe], which serves to reduce the complexity of the joint planning problem while maintaining safety with respect to each other. We demonstrate our framework in hardware, and provide a large-scale simulation to showcase scalability.

## Ii The SCAFFOLD Framework

Fig. 2 illustrates our overall planning framework, called SCAFFOLD. We introduce the components of the framework by incrementally addressing the three main challenges identified above.

We first present the robot planning and control block (Section III), which is instantiated for each robot. Each robot uses a robust controller (e.g. the reachability-based controller of [herbert2017fastrack]) to track motion plans within a precomputed error margin that accounts for modeled dynamics and external disturbances. In order to generate safe motion plans, each robot will ensure that output trajectories are collision-checked with a set of obstacle maps, using the tracking error margin.

These obstacle maps include an *a priori* known set of static obstacles, as well as predictions of the future motion of any humans, which are generated by the human predictions block (Section IV).
By generating these predictions, each robot is able to remain probabilistically safe with respect to the humans.
To ensure tractability for multiple humans, we generate predictions using simplified interaction models, and subsequently adapt them
following a real-time Bayesian approach such as [fisac2018probabilistically].
We leverage the property that individual predictions automatically become more uncertain whenever their accuracy degrades, and use this to enable our tractable predictions to be robust to unmodeled interaction effects.

Finally, to guarantee safety with respect to other robots, we carry out sequential trajectory planning (Section V) by adapting the cooperative multi-agent planning scheme [bansal2017safe] to function in real time with the robust trajectories from the planning and control block. The robots generate plans according to a pre-specified priority ordering. Each robot plans to avoid the most recently generated trajectories from robots of higher priority, i.e. robot must generate a plan that is safe with respect to the planned trajectories from robots . This removes the computational complexity of planning in the joint state space of all robots at once.

## Iii Robot Planning and Control

In this section we begin with the canonical problem of planning through a static environment. Efficient algorithms such as A or rapidly-exploring random trees (RRT) [hart1968astar, karaman2011RRTPRM] excel at this task. These algorithms readily extend to environments with deterministically-moving obstacles by collision-checking in both time and space.

We now introduce robot dynamics and allow the environment to have external disturbances such as wind. Kinematic planners such as A

and RRT do not consider these factors when creating plans. In practice, however, these planners are often used to generate an initial trajectory, which may then be smoothed and tracked using a feedback controller such as a linear quadratic regulator (LQR). During execution, the mismatch between the planning model and the physical system can result in tracking error, which may cause the robot to deviate far enough from its plan to collide with an obstacle. To reduce the chance of collision, one can augment the robot by a heuristic error buffer; this induces a “safety bubble” around the robot used when collision checking. However, heuristically generating this buffer will not guarantee safety.

Several recent approaches address efficient planning while considering model dynamics and maintaining robustness with respect to external disturbances. Majumdar and Tedrake [majumdar2017funnel] use motion primitives with safety funnels, while Raković [rakovic2009set] utilizes robust model-predictive control, and Singh et. al. [Singh2017] leverage contraction theory.

In this paper, we use FaSTrack [herbert2017fastrack, fridovich2018planning], a modular framework that computes a tracking error bound (TEB) via *offline* reachability analysis.
This TEB can be thought of as a rigorous counterpart of the error-buffer concept introduced above.
More concretely, the TEB is the set of states capturing the maximum relative distance (i.e. maximum tracking error) that may occur between the physical robot and the current state of the planned trajectory.
We compute the TEB by formulating the tracking task as a pursuit-evasion game between the planning algorithm and the physical robot. We then solve this differential game using Hamilton-Jacobi reachability analysis. To ensure robustness, we assume (a) worst-case behavior of the planning algorithm (i.e. being as difficult as possible to track), and (b) that the robot is experiencing worst-case, bounded external disturbances.
The computation of the TEB also provides a corresponding error-feedback controller for the robot to always remain inside the TEB.
Thus, FaSTrack wraps efficient motion planners, and adds robustness to modeled system dynamics and external disturbances through the precomputed TEB and error-feedback controller.
Fig. 4 shows a top-down view of a quadcopter using a kinematic planner (A) to navigate around static obstacles. By employing the error-feedback controller, the quadcopter is guaranteed to remain within the TEB (shown in blue) as it traverses the A path.

### Iii-a FaSTrack Block

Requirements: To use FaSTrack, one needs a high-fidelity dynamical model of the system used for reference tracking, and a (potentially simpler) dynamic or kinematic model used by the planning algorithm. Using the relative dynamics between the tracking model and the planning model, the TEB and safety controller may be computed using Hamilton-Jacobi reachability analysis [herbert2017fastrack], sum-of-squares optimization [singh2018robust], or approximate dynamic programming [royo2018classification].

Implementation: Fig. 3 describes the online algorithm for FaSTrack after the offline precomputation of the TEB and safety controller. We initialize the planning block to start within the TEB centered on the robot’s current state. The planner then uses any desired planning algorithm (e.g. A, or model predictive control) to find a trajectory from this initial state to a desired goal state. When collision-checking, the planning algorithm must ensure that the tube defined by the Minkowski sum of the TEB and the planned trajectory does not overlap any obstacles in the obstacle map.

The planning block provides the current planned reference state to the FaSTrack controller, which determines the relative state between the tracking model (robot) and planned reference (motion plan). The controller then applies the corresponding optimal, safe tracking control via an efficient look-up table.

### Iii-B FaSTrack in the SCAFFOLD Framework

In the robot planning and control section of Fig. 2, each robot uses FaSTrack for robust planning and control. FaSTrack guarantees that each robot remains within its TEB-augmented trajectory.

## Iv Human Predictions

Section III introduced methods for the fast and safe navigation of a single robot in an environment with deterministic, moving obstacles. However, moving obstacles—especially human beings—are not always best modeled as deterministic. For such “obstacles,” robots can employ probabilistic predictive models to produce a distribution of states the human may occupy in the future. The quality of these predictions and the methods used to plan around them determine the overall safety of the system. Generating accurate real-time predictions for multiple humans (and, more generally, uncertain agents) is an open problem. Part of the challenge arises from the combinatorial explosion of interaction effects as the number of agents increases. Any simplifying assumptions, such as neglecting interaction effects, will inevitably cause predictions to become inaccurate. Such inaccuracies could threaten the safety of plans that rely on these predictions.

Our goal is to compute real-time motion plans that are based on up-to-date predictions of all humans in the environment, and at the same time maintain safety when these predictions become inaccurate. The confidence-aware prediction approach of [fisac2018probabilistically] provides a convenient mechanism for adapting prediction uncertainty online to reflect the degree to which humans’ actions match an internal model. This automatic uncertainty adjustment allows us to simplify or even neglect interaction effects between humans, because uncertain predictions will automatically result in more conservative plans when the observed behavior departs from internal modeling assumptions.

### Iv-a Human Prediction Block

Requirements:
In order to make any sort of collision-avoidance guarantees, we require a prediction algorithm that produces distributions over future states, and rapidly adjusts those predictions such that the actual trajectories followed by humans lie within the prediction envelope. There are many approaches to probabilistic trajectory prediction in the literature, e.g. [ding2011human, hawkins2013probabilistic, ziebart2009planning, lasota2015analyzing].
These methods could be used to produce a probabilistic prediction of the -th human’s state at future times , conditioned on observations^{2}^{2}2For simplicity, we will later assume complete state observability: .
:

. These observations are random variables and depend upon the full state of all robots and humans

until the current time . However, by default these distributions will not necessarily capture the true trajectories of each human, especially when the models do not explicitly account for interaction effects. Fisac et. al. [fisac2018probabilistically]provide an efficient mechanism for updating the uncertainty (e.g., the variance) of distributions

to satisfy this safety requirement.Implementation: Fig. 5 illustrates the human prediction block. We use a maximum-entropy model as in [fisac2018probabilistically, finn2016guided, ziebart2008maximum], in which the dynamics of the -th human are affected by actions

drawn from a Boltzmann probability distribution. This time-dependent distribution over actions implies a distribution over future states. Given a sensed state

of human at time , we invert the dynamics model to infer the human’s action, . Given this action, we perform a Bayesian update on the distribution of two parameters: , which describes the objective of the human (e.g. the set of candidate goal locations), and , which governs the variance of the predicted action distributions. can be interpreted as a natural indicator of model confidence, quantifying the model’s ability to capture humans’ current behavior [fisac2018probabilistically]. Were we to model actions with a different distribution, e.g. a Gaussian process, the corresponding parameters could be learned from prior data [ziebart2008maximum, ziebart2009planning, finn2016guided], or inferred online [Sadigh2016information, fisac2018probabilistically]using standard inverse optimal control (inverse reinforcement learning) techniques.

Once distributional parameters are updated, we produce a prediction over the future actions of human through the following Boltzmann distribution:

(1) |

This model treats each human as more likely to choose actions with high expected utility as measured by the (state-action) Q-value associated to a certain reward function, . In general, this value function may depend upon the joint state and the human’s own action , as well as the parameters . Finally, combining (1) with a dynamics model, these predicted actions may be used to generate a distribution over future states. In practice, we represent this distribution as a discrete occupancy grid. One such grid is visualized in Fig. 6.

By reasoning about each human’s model confidence as a hidden state [fisac2018probabilistically], our framework dynamically adapts predictions to the evolving accuracy of the models encoded in the set of state-action functions, . Uncertain predictions will force the planner to be more cautious whenever the actions of the humans occur with low probability as measured by (1).

### Iv-B Human Prediction in the SCAFFOLD Framework

The predicted future motion of the humans is generated as a probability mass function,
represented as time-indexed set of occupancy grids.
These distributions are interpreted as an obstacle map by the FaSTrack block.
During planning, a state is considered to be unsafe if the total probability mass contained within the TEB centered at that state exceeds a preset threshold, . As in [fisac2018probabilistically], we consider a trajectory to be unsafe if the maximum *marginal* collision probability at any individual state along it exceeds .

When there are multiple humans, their state at any future time will generally be characterized by a joint probability distribution .

Let be the planned state of a robot at time . We write to denote the overlap of the TEB centered at with the th human at state . Thus, we may formalize the probability of collision with *at least one* human as:

(2) | ||||

Intuitively, (2) states that the probability that the robot is in collision at is one minus the probability that the robot is not in collision. We compute the second term by taking the product over the probability that the robot is not in collision with each human, given that the robot is not in collision with all previously accounted for humans. Unfortunately, it is generally intractable to compute the terms in the product in (2). Fortunately, tractable approximations can be computed by storing only the marginal predicted distribution of each human at every future time step , and assuming independence between humans. This way, each robot need only operate with occupancy grids. The resulting computation is:

(3) |

Here we take the product over the probability that the robot is not in collision with each human (one minus probability of collision), and then again take the complement to compute the probability of collision with any human. Note that when the predictive model neglects future interactions between multiple humans, (2) reduces to (3). If model confidence analysis [fisac2018probabilistically] is used in conjunction with such models, we hypothesize that each marginal distribution will naturally become more uncertain when interaction effects are significant.

Once a collision probability is exactly or approximately computed, the planner can reject plans for which, at any time , the probability of collision from (3) exceeds . Thus, we are able to generate computationally tractable predictions that result in -safe planned trajectories for the physical robot.

## V Sequential Trajectory Planning

Thus far, we have shown how our framework allows a single robot to navigate in real-time through an environment with multiple humans while maintaining safety (at a probability of approximately -safe) and accounting for internal dynamics, external disturbances, and humans. However, in many applications (such as autonomous driving), the environment may also be occupied by other robots.

Finding the optimal set of trajectories for all robots in the environment would require solving the planning problem over the joint state space of all robots. This very quickly becomes computationally intractable with increasing numbers of robots. Approaches for multi-robot trajectory planning often assume that the other vehicles operate with specific control strategies such as those involving induced velocity obstacles [wu2012guaranteed, fiorini1998motion, chasparis2005linear, van2008reciprocal] and involving virtual structures or potential fields to maintain collision [olfati2002distributed, chuang2007multi, zhou2018agile]. These assumptions greatly reduce the dimensionality of the problem, but may not hold in general.

Rather than assuming specific control strategies of other robots, each robot could generate predictions over the future motion of all other robots. Successful results of this type typically assume that the vehicles operate with very simple dynamics, such as single integrator dynamics [Zhou2017], differentially flat systems [lian2002real], linear systems [ahmadzadeh2009multi].

However, when robots can communicate with each other, methods for centralized and/or cooperative multi-agent planning allow for techniques for scalability [lewis2013cooperative, torreno2017cooperative, mylvaganam2017differential]. One such method is sequential trajectory planning (STP) [chen2015safe], which coordinates robust multi-agent planning using a sequential priority ordering. Priority ordering is commonly used in many multi-agent scenarios, particularly for aerospace applications. In this work, we merge STP with FaSTrack to produce real-time planning for multi-agent systems.

### V-a Sequential Trajectory Planning

Requirements: To apply STP, robots must be able to communicate trajectories and TEBs over a network.

Implementation: STP addresses the computational complexity of coupled motion planning by assigning a priority order to the robots and allowing higher-priority robots to ignore the planned trajectories of lower-priority robots. The first-priority robot uses the FaSTrack block to plan a (time-dependent) trajectory through the environment while avoiding the obstacle maps. This trajectory is shared across the network with all lower-priority robots. The -th robot augments the trajectories from robots by their respective TEBs, and treats them as time-varying obstacles. The -th robot determines a safe trajectory that avoids these time-varying tubes as well as the predicted state distributions of humans, and publishes this trajectory for robots . This process continues until all robots have computed their trajectory. Each robot replans as quickly as it is able; in our experiments, this was between – ms.

### V-B Sequential Trajectory Planning in the SCAFFOLD Framework

By combining this method with FaSTrack for fast individual planning, the sequential nature of STP does not significantly affect overall planning time. In our implementation all computations are done on a centralized computer using the Robot Operating System (ROS), however this method can easily be performed in a decentralized manner. Note that STP does depend upon reliable, low-latency communication between the robots. If there are communication delays, techniques such as [desai2017drona] may be used to augment each robot’s TEB by a term relating to time delay.

## Vi Implementation and Experimental Results

We demonstrate SCAFFOLD’s feasibility in hardware with two robots and two humans, and its scalability in simulation with five robots and ten humans.

### Vi-a Hardware Demonstration

We implemented the SCAFFOLD framework in C++ and Python, using Robot Operating System (ROS) [quigley2009ros]. All computations for our hardware demonstration were done on a laptop computer (specs: 31.3 GB of memory, 216.4 GB disk, Intel Core i7 @ 2.70GHz x 8). As shown in Fig. 1, we used Crazyflie 2.0 quadcopters as our robots, and two human volunteers. The position and orientation of robots and humans were measured at roughly 235 Hz by an OptiTrack infrared motion capture system. The humans were instructed to move towards different places in the lab, while the quadcopters planned collision-free trajectories in three dimensions using a time-varying implementation of A. The quadcopters tracked these trajectories using the precomputed FaSTrack controller designed for a 6D near-hover quadcopter model tracking a 3D point [fridovich2018planning]. Human motion was predicted s into the future. Fig. 7 shows several snapshots of this scene over time. Note that the humans must move around each other to reach their goals—this is an unmodeled interaction affect. The predictions become less certain during this interaction, and the quadcopters plan more conservatively, giving the humans a wider berth. The full video of the hardware demonstration can be viewed in our video submission.

### Vi-B SCAFFOLD Framework Simulation Analysis

Due to the relatively small size of our motion capture arena, we demonstrate scalability of the SCAFFOLD framework through a large-scale simulation.
In this simulation, pedestrians are crossing through a region of the UC Berkeley campus.
We simulate the pedestrians’ motion using potential fields [goodrich2002potential], which “pull” each pedestrian toward his or her own goal and “push” them away from other pedestrians and robots.
These interaction effects between humans and robots are not incorporated into the state-action functions , and lead to increased model uncertainty (i.e., higher estimates of ) during such interactions.
The fleet of robots must reach their respective goals while maintaining safety with respect to their internal dynamics, humans, and each other. We ran our simulation on a desktop workstation with an Intel i7 Processor and 12 CPUs operating at 3.3 GHz.^{3}^{3}3The computation appears to be dominated by the prediction step, which we have not yet invested effort in optimizing.
Our simulation took seconds for all robots to reach their respective goals. Predictions over human motion took seconds to compute for each human. This computation can be done in parallel. Each robot took seconds to determine a plan. There was no significant difference in planning time between robots of varying priority.
Robots used time-varying A on a -dimensional grid with m resolution, and collision checks were performed at m along each trajectory segment. The resolution for human predictions was m and human motion was predicted s into the future.

## Vii Discussion & Conclusion

In this paper, we compose several techniques for robust and efficient planning together in a framework designed for fast multi-robot planning in environments with uncertain moving obstacles, such as humans. Each robot generates real-time motion plans while maintaining safety with respect to external disturbances and modeled dynamics via the FaSTrack framework. To maintain safety with respect to humans, robots sense humans’ states and form probabilistic, adaptive predictions over their future trajectories. For efficiency, we model these humans’ motions as independent, and to maintain robustness, we adapt prediction model confidence online. Finally, to remain safe with respect to other robots, we introduce multi-robot cooperation through STP, which relieves the computational complexity of planning in the joint state space of all robots by instead allowing robots to plan sequentially according to a fixed priority ordering.

We demonstrate our framework in hardware with two quadcopters navigating around two humans. We also present a larger simulation of five quadcopters and ten humans.

To further demonstrate our framework’s robustness, we are interested in exploring (a) non-grid based methods of planning and prediction, (b) the incorporation of sensor uncertainty, (c) optimization for timing and communication delays, and (d) recursive feasibility in planning. We are also interested in testing more sophisticated predictive models for humans, and other low-level motion planners.

## Acknowledgments

The authors would like to thank Joe Menke for his assistance in hardware and motion capture systems, and Daniel Hua and Claire Dong for their help early on.

Comments

There are no comments yet.