
Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The BARN Challenge at ICRA 2022

08/22/2022
by Xuesu Xiao, et al.

The BARN (Benchmark Autonomous Robot Navigation) Challenge took place at the 2022 IEEE International Conference on Robotics and Automation (ICRA 2022) in Philadelphia, PA. The aim of the challenge was to evaluate state-of-the-art autonomous ground navigation systems for moving robots through highly constrained environments in a safe and efficient manner. Specifically, the task was to navigate a standardized, differential-drive ground robot from a predefined start location to a goal location as quickly as possible without colliding with any obstacles, both in simulation and in the real world. Five teams from all over the world participated in the qualifying simulation competition, three of which were invited to compete with each other at a set of physical obstacle courses at the conference center in Philadelphia. The competition results suggest that autonomous ground navigation in highly constrained spaces, despite ostensibly seeming simple even to experienced roboticists, is actually far from being a solved problem. In this article, we discuss the challenge, the approaches used by the top three winning teams, and lessons learned to direct future research.



I The BARN Challenge Overview

Designing autonomous robot navigation systems has been a topic of interest to the robotics community for decades [35, 36, 37, 31, 9]. Indeed, there currently exist many such systems that allow robots to move from one point to another in a collision-free manner (e.g., open-source implementations in the Robot Operating System (ROS) [31, 9, 38], with extensions to different vehicle types [39]), which may create the perception that autonomous ground navigation is a solved problem. This perception may be reinforced by the fact that many mobile robot researchers have moved on to orthogonal navigation problems [50] beyond the traditional metric (geometric) formulation that focuses only on path optimality and obstacle avoidance. These orthogonal problems include, among others, learning navigation systems in a data-driven manner [30, 5, 18, 2], navigating in off-road [28, 15, 17] and social contexts [25, 22, 16], and multi-robot navigation [20, 4].

However, autonomous mobile robots still struggle in many ostensibly simple scenarios, especially during real-world deployment [13, 11, 23, 42]. For example, even when the problem is simply formulated as traditional metric navigation, so that the only requirement is to avoid obstacles on the way to the goal, robots still often get stuck or collide with obstacles when trying to navigate in naturally cluttered households [13], in constrained outdoor structures including narrow walkways and ramps [23, 42], and in congested social spaces like classrooms, offices, and cafeterias [11]. In such scenarios, extensive engineering effort is typically required to deploy existing approaches, and this requirement presents a challenge for large-scale, unsupervised, real-world robot deployment. Overcoming this challenge requires systems that can both successfully and efficiently navigate a wide variety of environments with confidence.

The Benchmark Autonomous Robot Navigation (BARN) Challenge [43] was a competition at the 2022 IEEE International Conference on Robotics and Automation (ICRA 2022) in Philadelphia, PA that aimed to evaluate the capability of state-of-the-art navigation systems to solve the above-mentioned challenge, especially in highly constrained environments where robots need to squeeze between obstacles to navigate to the goal. To compete in The BARN Challenge, each participating team needed to develop an entire navigation software stack for a standardized mobile robot provided by the organizers. In particular, the competition provided a Clearpath Jackal [14] with a Hokuyo 2D LiDAR with a 270° field of view for perception and a differential drive system with a 2 m/s maximum speed for actuation. The aim of each team was to develop the navigation software stack needed to autonomously drive the robot from a given start location, through a dense obstacle field, to a given goal, and to accomplish this task without any collisions with obstacles or any human interventions. The team whose system could best accomplish this task in the least amount of time would win the competition. The BARN Challenge had two phases: a qualifying phase evaluated in simulation, and a final phase evaluated in a set of physical obstacle courses. The qualifying phase took place before the ICRA 2022 conference using the BARN dataset [29], which is composed of 300 obstacle courses in Gazebo simulation randomly generated by cellular automata. The top three teams from the simulation phase were then invited to compete in three different physical obstacle courses set up by the organizers at ICRA 2022 in the Philadelphia Convention Center.

In this article, we report on the simulation qualifier and physical finals of The BARN Challenge at ICRA 2022, present the approaches used by the top three teams, and discuss lessons learned from the challenge that point out future research directions to solve the problem of autonomous ground navigation in highly constrained spaces.

II Simulation Qualifier

The BARN Challenge started on March 29th, 2022, two months before the ICRA 2022 conference, with a standardized simulation qualifier. The qualifier used the BARN dataset [29], which consists of 300 obstacle environments randomly generated by cellular automata (see examples in Fig. 1), each with a predefined start and goal. These obstacle environments range from relatively open spaces, where the robot barely needs to turn, to highly dense fields, where the robot needs to squeeze between obstacles with minimal clearance. The BARN environments are open to the public, and were intended to be used by the participating teams to develop their navigation stacks. Another 50 unseen environments, which are not available to the public, were generated to evaluate the teams’ systems. A random BARN environment generator was also provided to teams so that they could generate their own unseen test environments (https://github.com/dperille/jackal-map-creation).

Fig. 1: Four Example BARN Environments in the Gazebo Simulator (ordered by ascending relative difficulty level).

In addition to the 300 BARN environments, six baseline approaches were also provided for the participants’ reference, ranging from classical sampling-based [9] and optimization-based navigation systems [31], to end-to-end machine learning methods [54, 45], and hybrid approaches [53]. All baselines were implementations of different local planners used in conjunction with Dijkstra’s search as the global planner in the ROS move_base navigation stack [34]. To facilitate participation, a training pipeline capable of running the standardized Jackal robot in the Gazebo simulator with ROS Melodic (on Ubuntu 18.04), with the option of being containerized in Docker or Singularity containers for fast and standardized setup and evaluation, was also provided (https://github.com/Daffan/ros_jackal).

II-A Rules

Each participating team was required to submit their developed navigation system as a (collection of) launchable ROS node(s). The challenge utilized a standardized evaluation pipeline (https://github.com/Daffan/nav-competition-icra2022) to run each team’s navigation system and compute a standardized performance metric that considered navigation success rate (a collision or failure to reach the goal counts as failure), actual traversal time, and environment difficulty (measured by optimal traversal time). Specifically, the score for navigating each environment $i$ was computed as

$$s_i = \mathbf{1}^{i}_{\text{success}} \times \frac{OT_i}{\operatorname{clip}(AT_i,\ 4OT_i,\ 8OT_i)},$$

where the indicator function $\mathbf{1}^{i}_{\text{success}}$ evaluates to 1 if the robot reaches the navigation goal without any collisions, and evaluates to 0 otherwise. $AT_i$ denotes the actual traversal time, while $OT_i$ denotes the optimal traversal time, an indicator of the environment difficulty, measured as the shortest traversal time assuming the robot always travels at its maximum speed (2 m/s):

$$OT_i = \frac{\text{Path Length}_i}{\text{Maximal Speed}}.$$

The Path Length is provided by the BARN dataset based on Dijkstra’s search from the given start to the goal. The clip function bounds $AT_i$ between $4OT_i$ and $8OT_i$, so that navigating extremely quickly in easy environments or extremely slowly in difficult ones does not disproportionately scale the score. The overall score of each team is the score averaged over all 50 unseen test BARN environments, with 10 trials in each environment. Higher scores indicate better navigation performance. The six baselines score between 0.1627 and 0.2334 (see Tab. I). The maximum possible score based on our metric is 0.25.
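To make the metric concrete, below is a minimal sketch of the per-environment score computation. It is an illustration of the formula above, not the official evaluation pipeline linked earlier:

```python
import numpy as np

MAX_SPEED = 2.0  # m/s, the Jackal's maximum speed in the challenge

def barn_score(success: bool, actual_time: float, path_length: float) -> float:
    """Per-environment score: 1_success * OT / clip(AT, 4*OT, 8*OT)."""
    ot = path_length / MAX_SPEED                      # optimal traversal time OT
    at = float(np.clip(actual_time, 4 * ot, 8 * ot))  # clipped actual time AT
    return float(success) * ot / at                   # lies in [0, 0.25]

# Example: a 20 m path gives OT = 10 s; finishing in 40 s or less scores
# the maximum 0.25, while an 80 s (or slower) run scores 0.125.
print(barn_score(True, 40.0, 20.0))   # 0.25
print(barn_score(True, 100.0, 20.0))  # 0.125 (AT clipped to 8*OT = 80 s)
print(barn_score(False, 40.0, 20.0))  # 0.0 on collision or failure
```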

II-B Results

The simulation qualifier started on March 29th, 2022 and lasted through May 22nd, 2022. In total, five teams from all over the world submitted their navigation systems. The performance of each submission was evaluated using the standardized evaluation pipeline, and the results are shown in Tab. I.

Rank. Team/Method (University) Score
1 TRAIL (Temple University) 0.2415
2 LfLH (Baseline [45]) 0.2334
3 AMRL (UT Austin) 0.2310
4 AMR (UVA) 0.2200
5 E-Band (Baseline [31]) 0.2053
6 End-to-End (Baseline [30]) 0.2042
7 APPLR-DWA (Baseline [53]) 0.1979
8 Yiyuiii (Nanjing University) 0.1969
9 NavBot (Indian Institute of Science) 0.1733
10 Fast (2 m/s) DWA (Baseline [9]) 0.1709
11 Default (0.5 m/s) DWA (Baseline [9]) 0.1627
TABLE I: Simulation Results

All methods outperformed the baseline Dynamic Window Approach (DWA) [9] at both 2 m/s and 0.5 m/s max speed, the latter of which is the default local planner configuration for the Jackal robot. However, only one approach (from Temple University) outperformed all baselines. The top three teams from the simulation qualifier, i.e., the Temple Robotics and Artificial Intelligence Lab (TRAIL) from Temple University, the Autonomous Mobile Robotics Laboratory from The University of Texas at Austin (AMRL UT Austin), and the Autonomous Mobile Robots Lab from The University of Virginia (AMR UVA), were invited to the physical finals at ICRA 2022.

III Physical Finals

The physical finals took place at ICRA 2022 in the Philadelphia Convention Center on May 25th and May 26th, 2022. Two physical Jackal robots with the same sensors and actuators were provided by the competition sponsor, Clearpath Robotics.

III-A Rules

Physical obstacle courses were set up using approximately 200 cardboard boxes in the convention center (Fig. 2). Because the goal of the challenge was to test a navigation system’s ability to perform local planning, all three physical obstacle courses had an obvious passage connecting the start and goal locations (i.e., the robot should not be confused by global planning at all), but the overall obstacle clearance along this passage was designed to be very tight, e.g., a few centimeters around the robot.

Fig. 2: One (Out of Three) Physical Obstacle Courses during the Finals

While it was the organizers’ original intention to run exactly the same navigation systems submitted by the three top teams and to use the same scoring metric from the simulation qualifier in the physical finals, these systems suffered from (surprisingly) poor navigation performance in the real world: none was able to finish a single trial without collisions. Therefore, the organizers decided to change the rules, giving each team 30 minutes before competing in each of the three physical obstacle courses to fine-tune their navigation systems. After all three teams had this chance to set up for a particular obstacle course, the actual physical finals proceeded as a 30-minute timed session for each team. In each 30-minute session, a team tested their navigation system in the obstacle course and notified the organizers when they were ready to time a competition trial. Each team had the opportunity to run five timed trials (after notifying the organizers). The fastest three out of the five timed trials were counted, and the team that had the most successful trials (reaching the goal without any collision) was the winner. In the case of a tie, the team with the fastest average traversal time would be declared the winner.

III-B Results

The physical finals took place on May 25th and May 26th, 2022 (see the final award ceremony in Fig. 3). The three teams’ navigation performance is shown in Tab. II. Since all navigation systems navigated at roughly the same speed, the final results were determined solely by the success rate of the best three out of five timed trials for each team. Surprisingly, the best system in simulation by Temple University exhibited the lowest success rate, while UT Austin’s system enjoyed the highest rate of success.

Fig. 3: From Left to Right: Competition Sponsor (Clearpath Robotics), Competition Organizers, the Temple, UVA, and UT Austin teams
Rank. Team/Method (University) Success / Total Trials
1 AMRL (UT Austin) 8/9
2 AMR (UVA) 4/9
3 TRAIL (Temple University) 2/9
TABLE II: Physical Results

IV Top Three Teams and Approaches

In this section, we report the approaches used by the three winning teams.

IV-A The University of Texas at Austin

To enable robust, repeatable, and safe navigation in the constrained spaces frequently found in BARN, the UT Austin team from AMRL (https://amrl.cs.utexas.edu/) utilized state-of-the-art classical approaches to handle localization, planning, and control, along with an automated pipeline to visualize and debug continuous integration. To plan feasible paths to the goal location while avoiding obstacles, a medium-horizon kinematic planner from ROS move_base [34] was used, combined with a discrete path rollout greedy planner for local kinodynamic planning from AMRL’s graph navigation stack [3]. This two-stage hierarchical planning generated safe motion plans that allowed the robot to make progress towards the goal while reactively avoiding obstacles along its path using the LiDAR scans. Additionally, since the environments contain tight spaces that are challenging to navigate through, accurate motion estimation of the robot proved crucial to deploying a planning-based navigation controller in an unmapped environment. When executing sharp turns in constrained environments, poor estimates of the robot’s motion negatively interfered with costmap updates in move_base and often prevented the mid-level planner from discovering any feasible path to the goal.

To address this problem, Episodic non-Markov Localization (EnML) [33] was utilized, which fuses LiDAR range scans with wheel odometry through non-Markov Bayesian updates. Combining EnML with the two-stage hierarchical planning proved useful for safely handling constrained spaces. Additionally, the UT Austin team developed custom automated tools to generate visualizations for debugging, which helped identify failure cases easily, perform manual hyperparameter tuning, and accelerate bug fixes during the competition.
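To make the two-stage scheme concrete, the following is a minimal sketch of a rollout-based greedy local planner in the spirit of the one described above. The actual implementation lives in AMRL’s graph navigation stack [3]; all function names, gains, and limits here are illustrative assumptions:

```python
import numpy as np

def rollout_arc(v, w, dt=0.05, horizon=1.0):
    """Forward-simulate a constant (v, w) command into a short arc of (x, y) poses."""
    x, y, th = 0.0, 0.0, 0.0
    poses = []
    for _ in range(int(horizon / dt)):
        x += v * np.cos(th) * dt
        y += v * np.sin(th) * dt
        th += w * dt
        poses.append((x, y))
    return np.array(poses)

def best_command(scan_xy, local_goal, v=1.0, robot_radius=0.3, w_clear=0.5):
    """Greedily pick the curvature whose rollout stays clear of the laser scan
    and makes the most progress towards the local goal."""
    best_cmd, best_score = (0.0, 0.0), -np.inf
    for w in np.linspace(-1.5, 1.5, 31):            # candidate angular velocities
        arc = rollout_arc(v, w)
        # minimum distance from any rollout pose to any scan point
        clearance = np.min(np.linalg.norm(arc[:, None, :] - scan_xy[None], axis=-1))
        if clearance < robot_radius:
            continue                                 # this rollout collides; discard
        progress = -np.linalg.norm(arc[-1] - local_goal)
        score = progress + w_clear * clearance       # clearance weight is tunable
        if score > best_score:
            best_cmd, best_score = (v, w), score
    return best_cmd  # (linear, angular) velocity command
```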

While classical approaches helped solve a majority of the environments in The BARN Challenge, significant challenges remain for navigation in extremely constrained spaces. For example, the two-stage hierarchical planning module does not explore unobserved regions of the environment before committing to a kinematically feasible path, which sometimes leads to suboptimal paths and longer times to reach the goal. We posit that a learnable mid-level planner with the ability to actively explore the environment in order to plan the optimal path may be a promising future direction of research to improve autonomous navigation in constrained spaces.

IV-B University of Virginia

In order to quickly and robustly navigate through the unknown, cluttered BARN Challenge environments, the UVA AMR team (https://www.bezzorobotics.com/) developed a mapless, “follow-the-gap” planning scheme which (a) detects open gaps for the robot to follow to reach a final goal and (b) plans local goals in order to reach these open gaps without colliding with intermediate obstacles. The framework expands upon the UVA AMR lab’s previous work [26]. Fig. 4(a) illustrates the framework, displaying the laser scan point cloud of a world from the BARN dataset along with the detected intermediate gaps $G_1$ and $G_2$, the vehicle position $x_r$, and the final goal position $x_g$. Fig. 4(b) shows the local planner, which provides course corrections so that the robot avoids obstacles while reaching a selected gap goal.

Fig. 4: (UVA Team) Examples of (a) Detected Gaps in a Simulated BARN Environment and (b) Local Planner Obstacle Avoidance

The approach takes advantage of the fact that gaps start or end at discontinuities in the laser scan, and leverages this principle to find intermediate gap goals for navigation [27]. Let $\ell_i$ and $\ell_{i+1}$ be adjacent points in the laser scan and $r_{max}$ denote the maximum sensing range of the LiDAR. The first type of discontinuity, referred to as Type 1, occurs when the distance between adjacent readings is larger than the circumscribed diameter $2R$ of the robot: $\|\ell_{i+1} - \ell_i\| > 2R$. The second, Type 2, occurs when exactly one of the two readings lies outside the LiDAR’s sensing range: $\|\ell_i\| < r_{max} \le \|\ell_{i+1}\|$ (or vice versa). If $\|\ell_{i+1}\| > \|\ell_i\|$, the discontinuity is referred to as rising; otherwise it is falling. Below we describe how to leverage these discontinuities to identify gaps.

The first step in gap detection is to perform a forward pass from $\ell_1$ to $\ell_N$ in the laser point cloud, searching for rising Type 1 and Type 2 discontinuities. Let $g_s$ denote the location of the first rising discontinuity; this point becomes the beginning of the gap. To determine the end of the gap, we find the scan point closest to $g_s$ that still leaves room for the robot, i.e., at least a robot diameter away. That is,

$$g_e = \operatorname*{argmin}_{\ell_j}\, \|\ell_j - g_s\| \quad \text{s.t.} \quad \|\ell_j - g_s\| \ge 2R. \tag{1}$$

The process then continues starting from $g_e$. Once the forward pass is complete, a mirrored backward pass from $\ell_N$ to $\ell_1$ is performed to find gaps via falling discontinuities. Each detected gap, defined as the tuple $G = (g_s, g_e)$ of its start and end points, is added to a quadtree $\mathcal{T}$ which keeps track of where all previously identified and visited gaps are located. If a gap already exists in the tree, it is ignored.
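The following sketch illustrates the forward pass described above; it is illustrative only, and the scan format, constants, and helper names are assumptions:

```python
import numpy as np

R = 0.3       # circumscribed robot radius (m), so 2*R is the robot diameter
R_MAX = 10.0  # assumed LiDAR maximum sensing range (m)

def scan_to_points(ranges, angles):
    """Convert polar laser readings to Cartesian points in the robot frame."""
    return np.stack([ranges * np.cos(angles), ranges * np.sin(angles)], axis=1)

def find_gaps_forward(ranges, angles):
    """Forward pass: collect gaps (g_s, g_e) opened by rising discontinuities."""
    pts = scan_to_points(np.minimum(ranges, R_MAX), angles)
    gaps, i = [], 0
    while i < len(ranges) - 1:
        type1 = np.linalg.norm(pts[i + 1] - pts[i]) > 2 * R   # Type 1
        type2 = ranges[i] < R_MAX <= ranges[i + 1]            # Type 2, rising
        rising = ranges[i + 1] > ranges[i]
        if (type1 and rising) or type2:
            g_s = pts[i]
            # Gap end (Eq. 1): closest point at least a robot diameter from g_s
            d = np.linalg.norm(pts[i + 1:] - g_s, axis=1)
            valid = np.flatnonzero(d >= 2 * R)
            if valid.size:
                j = i + 1 + valid[np.argmin(d[valid])]
                gaps.append((g_s, pts[j]))
                i = j          # continue the pass from the gap end
                continue
        i += 1
    return gaps  # a mirrored backward pass finds gaps at falling discontinuities
```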

Once $\mathcal{T}$ is updated, a gap is selected to be the intermediate goal if the final goal $x_g$ is determined not to be admissible. In this context, admissibility is determined by checking whether a given goal is navigable; that is, from the laser scan data, a path is known to exist from the robot position to the goal. The check uses an algorithm similar to the one discussed by Minguez and Montano [24], which, given any start point $a$ and end point $b$, ensures that no inflated obstacles block the robot along the line connecting the two points.

The process to select the gap goal $G^*$ from $\mathcal{T}$ when $x_g$ is inadmissible is outlined in Algorithm 1. At each iteration, the algorithm finds the gap $G_c$ closest to the final goal $x_g$. If $G_c$ is inadmissible from the robot’s current position, properties of quadtree queries are utilized to find all gaps that must be passed as the robot drives from $x_r$ to $G_c$. The algorithm then iteratively searches for the gap $G_i$ closest to the robot that is admissible from $x_r$ and also admissible to $G_c$; that is, the robot knows that a feasible path exists from $x_r$ to $G_i$ and from $G_i$ to $G_c$. If no $G_i$ satisfies this constraint for the given $G_c$, the process repeats with the next closest gap to $x_g$ as $G_c$, and terminates once an admissible gap is found. For clarity, Fig. 4(a) shows an example of the goal selection process. The final goal $x_g$ is not admissible, nor is the closest gap to it, $G_2$. However, the path from $x_r$ to $G_1$ is admissible, as is the path from $G_1$ to $G_2$. Thus, $G_1$ is selected as the intermediate goal, and the selection process repeats once the robot reaches $G_1$.

1:  Input: quadtree $\mathcal{T}$, robot position $x_r$, final goal $x_g$
2:  Output: gap goal $G^*$
3:  while $G^* = \emptyset$ & $\mathcal{T}$ has unvisited gaps do
4:     $G_c \leftarrow$ closest unvisited gap in $\mathcal{T}$ to $x_g$
5:     mark $G_c$ as visited
6:     # Returns children in descending order of dist. to $x_r$
7:     $\mathcal{C} \leftarrow \mathcal{T}$.query($x_r$, $G_c$)
8:     for $G_i \in \mathcal{C}$ do
9:        if admissible($x_r$, $G_i$) & admissible($G_i$, $G_c$) then
10:           $G^* \leftarrow G_i$
11:        end if
12:     end for
13:  end while
14:  return  $G^*$
Algorithm 1 (UVA Team) Find Gap Goal

Even though the selected gap goal is admissible, a direct path to it may not be collision-free given the configuration of the obstacles in the environment. For example, a robot navigating directly to the gap goal $G^*$ in Fig. 4(b) would collide with the obstacles shown by the laser scan data. To prevent such issues, a local planner is utilized which re-plans the mobile robot’s trajectory at every timestep if a collision is imminent. The direct path to the goal is formulated as a region $\mathcal{R}$, which accounts for the relative heading to the goal and the diameter of the robot. The region is checked against the laser scan points for obstacles; let $\mathcal{O}$ denote the set of obstacle coordinates within $\mathcal{R}$. If no obstacles are in the region, i.e., $\mathcal{O} = \emptyset$, the robot is sent directly to the gap goal $G^*$. If there are obstacles within $\mathcal{R}$, the one closest to the robot is selected; let $d_o$ denote the distance to this obstacle and $\theta_o$ its direction with respect to the robot’s heading. The new desired heading $\theta_d$ is then computed by adding the offset between the gap-goal heading $\theta_g$ and the obstacle direction to the gap-goal heading, steering the robot away from the obstacle, and the local goal is placed at a distance $d_o$ in this desired direction (shown in teal in Fig. 4).

The control inputs to the robot are the angular and linear velocities, determined using saturated proportional controllers of the form

$$\omega = \operatorname{clip}\!\big(k_\omega(\theta_d - \theta),\ -\omega_{max},\ \omega_{max}\big), \qquad v = \operatorname{clip}\!\big(k_v\, d_o - k_\theta\,|\theta_d - \theta|,\ 0,\ v_{max}\big), \tag{2}$$

where $k_\omega$, $k_v$, and $k_\theta$ are constant proportional gains, $\theta$ is the current heading of the robot, and $\omega_{max}$ and $v_{max}$ are the maximum angular and linear velocities, respectively.
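A minimal sketch of such saturated proportional control follows; the gains and limits are illustrative assumptions, not the team’s tuned values:

```python
import numpy as np

# Illustrative gains and limits; the team's tuned values are not published here.
K_W, K_V, K_TH = 2.0, 1.5, 0.8
W_MAX, V_MAX = 1.5, 2.0          # max angular (rad/s) and linear (m/s) velocities

def wrap(angle):
    """Wrap an angle to [-pi, pi] so heading errors stay well-behaved."""
    return (angle + np.pi) % (2 * np.pi) - np.pi

def velocity_commands(theta_d, theta, d_o):
    """Saturated proportional control: turn towards the desired heading,
    and slow down for nearby obstacles and large heading errors."""
    err = wrap(theta_d - theta)
    w = float(np.clip(K_W * err, -W_MAX, W_MAX))
    v = float(np.clip(K_V * d_o - K_TH * abs(err), 0.0, V_MAX))
    return v, w
```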

IV-C Temple University

Fig. 5: (Temple Team) The system architecture of the DRL-VO control policy. Raw sensor data from the ZED camera and Hokuyo LiDAR, as well as the goal point, are fed into a preprocessing module to create intermediate data representations. These low-level intermediate features are fused in a feature extractor network to obtain high-level abstract features. The actor network uses these abstract features to generate steering commands to control the robot, while the critic network outputs the state value for training the policy.

The team at Temple (https://sites.temple.edu/trail/) used a deep reinforcement learning (DRL) based control policy, called DRL-VO [52], originally designed for safe and efficient navigation through crowded dynamic environments. The system architecture of the DRL-VO control policy, shown in Fig. 5, is divided into two modules: preprocessing and the DRL network.

IV-C1 Preprocessing Module

Instead of directly feeding raw sensor data into deep neural networks as in other works [30, 20, 7, 10], the DRL-VO control policy utilizes preprocessed data as the network input. There are three types of inputs that capture different aspects of the scene.

  1. Pedestrians: To track pedestrians, the raw RGB image data and point cloud data from a ZED camera are fed into the YOLOv3 [32] object detector to get pedestrian detections. These detections are passed into a multiple hypothesis tracker (MHT) [56] to estimate the number of pedestrians and their kinematics (i.e., position and velocity). These pedestrian kinematics are encoded into two occupancy grid-style maps.

  2. Scene Geometry: To capture the scene geometry, the past 10 scans (0.5 s) of LiDAR data are collected. Each LiDAR scan is downsampled using a combination of minimum pooling and average pooling, and the downsampled scans are then reshaped and stacked to create a LiDAR map (a pooling sketch is given at the end of this subsection).

  3. Goal Location: To inform the robot where to go, the final goal point and its corresponding nominal path are fed into the pure pursuit algorithm [6] to extract the sub-goal point, which is fed into the DRL-VO network.

This novel preprocessed data representation is one key idea of the DRL-VO control policy, allowing it to bridge the sim-to-real gap and generalize to new scenarios better than other end-to-end policies.
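As one concrete illustration of the scene-geometry preprocessing (item 2 above), the following sketch downsamples a scan with combined minimum and average pooling and stacks the last 10 scans into a map. The window size and the particular way of blending the two poolings are assumptions, not the published DRL-VO configuration:

```python
import numpy as np

def downsample_scan(ranges, pool=9):
    """Downsample one LiDAR scan with a blend of min pooling (conservative:
    keeps the closest return in each window) and average pooling (smooths noise)."""
    n = len(ranges) - len(ranges) % pool               # drop the ragged tail
    windows = np.asarray(ranges[:n]).reshape(-1, pool)
    return (windows.min(axis=1) + windows.mean(axis=1)) / 2.0

def lidar_map(scan_history):
    """Stack the 10 most recent downsampled scans (0.5 s of data) into a 2D map."""
    return np.stack([downsample_scan(s) for s in scan_history[-10:]], axis=0)
```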

IV-C2 DRL Network Module

The DRL-VO control policy uses an early-fusion network architecture to combine the pedestrian and LiDAR data at the input layer in order to obtain high-level abstract feature maps. This early-fusion architecture facilitates the design of small networks with fewer parameters than late-fusion works [40, 12], which is key to deploying them on resource-constrained robots. These high-level feature maps are combined with the sub-goal point and fed into the actor and critic networks to generate suitable control inputs and a state value, respectively.

IV-C3 Training

To ensure that the DRL-VO policy leads the robot to navigate safely and efficiently, the team at Temple developed a new multi-objective reward function that rewards traveling towards the goal, avoiding collisions with static objects, maintaining a smooth path, and actively avoiding future collisions with pedestrians. The final term, which utilizes the concept of velocity obstacles (VO) [8, 47], is key to the success of the DRL-VO control policy. With this reward function, the DRL-VO policy is trained via the proximal policy optimization (PPO) algorithm [41] in a 3D lobby Gazebo simulation environment with 34 moving pedestrians, using a Turtlebot2 robot with a ZED camera and a 2D Hokuyo LiDAR.
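A hedged sketch of such a multi-objective reward is shown below; the term forms and weights are illustrative assumptions, not the published DRL-VO reward function:

```python
def reward(progress, collided, turn_rate, violates_vo,
           w_g=2.5, w_c=-20.0, w_s=-0.1, w_vo=-1.0):
    """Weighted sum of the four objectives described above."""
    r = w_g * progress                 # reward progress towards the goal
    r += w_c * float(collided)         # penalize collisions with static objects
    r += w_s * abs(turn_rate)          # encourage smooth, low-turn-rate paths
    r += w_vo * float(violates_vo)     # penalize velocities inside a velocity obstacle
    return r
```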

IV-C4 Deployment

The Temple team directly deployed the DRL-VO policy trained on a Turtlebot2 in The BARN Challenge without any model fine-tuning. To achieve this, the team had to account for three key differences:

  1. Unknown Map: During development, the DRL-VO policy used a known occupancy grid map of the static environment for localization, which is not available in the BARN challenge. To account for this, the localization module (amcl) was removed from the software stack and replaced with a laser odometry module.

  2. Static Environment: The DRL-VO policy was designed to function in dynamic environments. To account for the fact that the environments in the BARN Challenge were all static and highly constrained, the pedestrian map was set to all zeros.

  3. Different Robot Model: The DRL-VO policy was trained on a Turtlebot2, which has a different maximum speed and footprint than the Jackal platform. In The BARN Challenge, the maximum speed of the robot was therefore modulated based on its proximity to obstacles: the robot moved slowly (0.5 m/s, the same speed as the Turtlebot2) when near obstacles and quickly (up to 2 m/s, the maximum speed of the Jackal) in open areas. A sketch of this speed modulation follows the list.
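A minimal sketch of this proximity-based speed modulation is given below; the distance thresholds and linear interpolation are assumptions, and only the 0.5 m/s and 2 m/s endpoints come from the text above:

```python
def max_speed_from_clearance(min_scan_dist, near=0.7, far=2.5):
    """Interpolate the allowed top speed between the Turtlebot2's 0.5 m/s
    (near obstacles) and the Jackal's 2.0 m/s (open space)."""
    if min_scan_dist <= near:
        return 0.5
    if min_scan_dist >= far:
        return 2.0
    frac = (min_scan_dist - near) / (far - near)   # 0 at `near`, 1 at `far`
    return 0.5 + frac * 1.5
```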

V Discussions

Based on each team’s approach and the navigation performance observed during the competition, we now discuss lessons learned from The BARN Challenge and point out promising future research directions to push the boundaries of efficient mobile robot navigation in highly constrained spaces.

V-A Generalizability of Learning-Based Systems

One surprising discrepancy between the simulation qualifier and the physical finals was the contrasting performance of the DRL-VO approach by Temple University, which outperformed all baselines and other participants in simulation, but suffered from frequent collisions with obstacles in the real world. Despite the fact that the organizers modified the rules during the physical finals to allow the teams to make last-minute modifications to their navigation systems, DRL-VO still did not perform well in any of the three physical obstacle courses. The TRAIL team believes this is due to two gaps between the simulator and the real world: 1) the real-world environments were all highly constrained, whereas the simulated environments contained both constrained and unconstrained settings, and 2) the DRL-VO policy was learned on a Turtlebot2 model, which has a smaller physical footprint than a Jackal. Most of the collisions during the hardware tests were light grazes on the side, so a robot with a smaller footprint may have remained collision-free.

The stark performance contrast between simulation and the real world suggests a generalizability gap for the reinforcement learning approach. It was not practical for the team to re-train a new system on site during the competition, given the massive amount of training data required for reinforcement learning. How to address this generalizability gap and make a navigation policy trained in simulation generalize to the real world and to different robot/sensor configurations remains to be investigated, even for such a simple, static obstacle-avoidance problem.

Another potential way to address such inevitable generalizability gaps is to seek help from a secondary classical planner that identifies out-of-distribution scenarios in the real world and recovers from them through rule-based heuristics. In fact, for the last two physical courses, the Temple team tried to implement just such a “recovery planner” as a backup for DRL-VO: when a potential collision is detected (i.e., the robot faces an obstacle that is too close), the robot rotates in place to head towards empty space, in an attempt to better match the real-world distribution to the one seen in simulation during training. Although this recovery planner did help in some scenarios, it could not cover every difficult scenario and navigate through the entire obstacle course. Indeed, the Temple team spent time during the 30-minute timed sessions fine-tuning the parameters of the recovery planner, but found it difficult to find a single set of parameters that would recover the robot from all out-of-distribution scenarios while not accidentally driving the robot into such scenarios elsewhere on the course. On one hand, the simple nature of a recovery planner designed onsite during the competition contributed to the failure. On the other hand, tuning the parameters of a planner to cover as many scenarios as possible remains a difficult problem, and will be discussed further below.
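As a rough illustration, a rule-based recovery behavior of the kind described above might look like the following sketch; the thresholds and the empty-space heuristic are assumptions, not the Temple team’s actual implementation:

```python
import numpy as np

def recovery_command(ranges, angles, too_close=0.35, turn_speed=0.8):
    """If an obstacle is dangerously close in the frontal cone, stop and rotate
    in place towards the direction with the most free space; otherwise defer
    to the learned DRL-VO policy."""
    front = np.abs(angles) < np.deg2rad(30)        # frontal cone of the scan
    if ranges[front].min() > too_close:
        return None                                # no recovery needed
    best_dir = angles[np.argmax(ranges)]           # heading with most clearance
    w = turn_speed if best_dir >= 0 else -turn_speed
    return (0.0, w)                                # (linear, angular) command
```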

V-B Tunability of Classical Systems

Similar to Temple’s rule-based recovery planner, the UT Austin team’s entire navigation system relies on classical methods: EnML localization, a medium-horizon kinematic planner, and a local rollout-based kinodynamic planner. Inevitably, these classical approaches have numerous tuning parameters, which need to be correctly set to cover as many scenarios as possible. A natural disadvantage of relying on a single set of parameters to cover all of the different difficult scenarios in The BARN Challenge (e.g., dense obstacle fields, narrow curving hallways, relatively open spaces) is the inevitable tradeoff: sacrificing performance in some scenarios in order to succeed in others, or decreasing speed for better safety. Indeed, the UT Austin team’s strategy in the physical finals was to spend the first 20 minutes of the 30-minute timed session fine-tuning the system parameters until a set allowing successful navigation through the entire obstacle course was found, then to complete three successful “safety trials,” and finally to re-tune the system for faster, more aggressive, but riskier navigation behaviors to reduce average traversal time. Although most such “speed trials” failed, luckily for the UT Austin team, the other teams’ inability to safely finish three collision-free trials made them the winner of The BARN Challenge on success rate alone (not faster navigation).

Two orthogonal future research directions can potentially help with the tunability of navigation systems: (1) developing planners free of tunable parameters onsite during deployment, such as end-to-end learning approaches, but, as mentioned above, with significantly better sim-to-real transfer and generalizability; (2) enabling more intelligent parameter tuning of classical systems, rather than laborious manual tuning, for example, through automatic tuning [21] or even dynamic parameter policies [51] learned from teleoperated demonstration [49], corrective interventions [44], evaluative feedback [46], or reinforcement learning [53].

V-C Getting “Unstuck”

Although most of the failed trials during the physical finals were due to collisions with obstacles, many trials also did not succeed because the robot got stuck in densely populated obstacle areas. In those places, the robot kept repeating the same behaviors, e.g., detecting imminent collision with obstacles, rotating in place, backing up, resuming navigation, detecting the same imminent collision again, and so on. Such behavior sometimes led to a collision, sometimes left the robot stuck indefinitely, and only on rare occasions led to success. All three teams experienced such behaviors; the UT Austin and UVA teams were able to fix them by tuning parameters, and the Temple team by changing the threshold for switching between DRL-VO and the recovery planner.

Similarly, in real-world autonomous robot navigation, how to get “unstuck” safely remains a common and challenging problem. No matter how intelligent an autonomous mobile robot is, it may still make mistakes in the real world, e.g., when facing scenarios out of the training distribution, corner cases not considered by the system developer, or situations where the current parameter set is not appropriate. It is very likely that the robot will repeat the same mistake over and over, e.g., getting stuck at the same place, which needs to be avoided. Future research should investigate ways to identify such “stuck” situations, balance the tradeoff between exploitation and exploration (i.e., when to keep trying the previous way vs. when to try out new ways to get unstuck), utilize previous successful exploratory experiences in future similar scenarios to not get stuck again [19], or leverage offline computation to correct such failure cases in the future [55].

V-D Tradeoff between Safety and Speed

While The BARN Challenge was originally designed to test existing navigation systems’ speed in maneuvering through highly constrained obstacle environments under the safety constraint of being collision-free, it ended up being a competition about safety alone. The UT Austin team won the competition simply by safely navigating eight out of nine physical trials, not by doing so with the fastest speed. All of the teams, except the UT Austin team after they figured out an effective set of parameters for each physical obstacle course, struggled with simply reaching the goal without any collision. The challenge organizers also deployed the widely used DWA planner [9] from the ROS move_base navigation stack on the physical obstacle courses, only to find that, despite being relatively safe compared to the participating teams’ methods, it struggled with many narrow spaces and frequently got stuck. This shows that current autonomous mobile robot navigation capability still lags far behind what one may expect.

V-E Latency Compensation for High Speed

Only the UT Austin team attempted higher-speed navigation (2 m/s), doing so after an appropriate parameter set had been found for the particular physical course and three successful “safety trials” had been achieved. However, most “speed trials” ended in collision. One contributing factor to such failures was improper latency compensation at various high speeds. The UT Austin team was the only team that explicitly considered latency compensation in their AMRL stack [3], through a latency parameter. During high-speed maneuvers, the robot inevitably needs to aggressively change its navigation speed to swerve through obstacles and to accelerate in open spaces. System latency caused by sensing, processing, computation, communication, and actuation will likely invalidate previously feasible plans. While simply tuning the latency parameter value can help to a certain extent, a more intelligent and adaptive way to calculate and compensate for system latency is necessary for the robot to take full advantage of its computing power before executing aggressive maneuvers.
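One simple form of latency compensation, sketched below under the assumption of a unicycle model, forward-predicts the robot state through the measured latency (replaying commands still in flight) before planning. This illustrates the general idea, not the AMRL stack’s exact mechanism:

```python
import numpy as np

def forward_predict(x, y, th, pending_cmds, latency=0.15, dt=0.05):
    """Forward-simulate a unicycle-model state through the control latency,
    replaying the (v, w) commands that were issued but not yet executed."""
    steps = int(round(latency / dt))
    for v, w in (pending_cmds[-steps:] if steps > 0 else []):
        x += v * np.cos(th) * dt
        y += v * np.sin(th) * dt
        th += w * dt
    return x, y, th  # plan from this predicted state, not the last observed one
```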

V-F Navigation is More Than Planning

To plan agile navigation maneuvers through highly constrained obstacle environments, the robot first needs to accurately perceive its configuration with respect to the obstacles. Inaccurate localization or odometry during fast maneuvers with significant angular velocity usually produces significant drift, causing previously valid plans to become infeasible. While all three teams’ local planners rely on raw perception to minimize such adverse effects, e.g., using high-frequency laser scans and planning directly with respect to these raw features, their global planners usually depend on the results of localization or odometry. For example, the Temple team used the Dijkstra global planner in move_base. Erroneous localization will cause an erroneous global plan, which in turn degrades the quality of the local plan. Such adverse effects diminish when the navigation speed is low, because localization techniques may recover from drift over time. During high-speed navigation, however, the planner needs to quickly plan actions regardless of whether the drift has been corrected. As mentioned above, latency will also start to play a role, because a good latency-compensation technique depends on an accurate localization and odometry model of the robot, i.e., being able to predict where the robot will be based on where it is and what action will be executed. Techniques for better odometry, localization [33], and kinodynamic models [48, 17, 1] during high-speed navigation will be necessary to allow mobile robots to move both fast and accurately at the same time.

VI Conclusions

The results of The BARN Challenge at ICRA 2022 suggest that, contrary to the perception of many in the field, autonomous metric ground robot navigation cannot yet be considered a solved problem. Indeed, even the competition organizers had initially assumed that obstacle avoidance alone was too simple a goal, and therefore emphasized navigation speed before the physical competition. However, each of the finalist teams experienced difficulty performing collision-free navigation, and this ultimately led the organizers to modify the competition rules to focus more on collision avoidance. This result suggests that state-of-the-art navigation systems still suffer from suboptimal performance due to potentially many aspects of the full navigation system (discussed in Section V). Therefore, while it is worthwhile to extend navigation research in directions orthogonal to metric navigation (e.g., purely vision-based, off-road, and social navigation), the community should also not overlook the problems that still remain in this space, especially as robots are expected to be extensively and reliably deployed in the real world.

References

  • [1] P. Atreya, H. Karnan, K. S. Sikand, X. Xiao, G. Warnell, S. Rabiee, P. Stone, and J. Biswas (2022) High-speed accurate robot control using learned forward kinodynamics and non-linear least squares optimization. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Cited by: §V-F.
  • [2] P. Atreya, H. Karnan, K. S. Sikand, X. Xiao, G. Warnell, S. Rabiee, P. Stone, and J. Biswas (2022) High-speed accurate robot control using learned forward kinodynamics and non-linear least squares optimization. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Cited by: §I.
  • [3] J. Biswas (2013) AMRL autonomy stack. GitHub. Note: https://github.com/ut-amrl/graph_navigation Cited by: §IV-A, §V-E.
  • [4] Y. F. Chen, M. Liu, M. Everett, and J. P. How (2017) Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning. In 2017 IEEE international conference on robotics and automation (ICRA), pp. 285–292. Cited by: §I.
  • [5] H. L. Chiang, A. Faust, M. Fiser, and A. Francis (2019) Learning navigation behaviors end-to-end with autorl. IEEE Robotics and Automation Letters 4 (2), pp. 2007–2014. Cited by: §I.
  • [6] R. C. Coulter (1992) Implementation of the pure pursuit path tracking algorithm. Technical report Carnegie-Mellon UNIV Pittsburgh PA Robotics INST. Cited by: item 3.
  • [7] T. Fan, P. Long, W. Liu, and J. Pan (2020) Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios. The International Journal of Robotics Research 39 (7), pp. 856–892. Cited by: §IV-C1.
  • [8] P. Fiorini and Z. Shiller (1998) Motion planning in dynamic environments using velocity obstacles. The International Journal of Robotics Research 17 (7), pp. 760–772. Cited by: §IV-C3.
  • [9] D. Fox, W. Burgard, and S. Thrun (1997) The dynamic window approach to collision avoidance. IEEE Robotics & Automation Magazine 4 (1), pp. 23–33. Cited by: §I, §II-B, TABLE I, §II, §V-D.
  • [10] R. Guldenring, M. Görner, N. Hendrich, N. J. Jacobsen, and J. Zhang (2020) Learning local planners for human-aware navigation in indoor environments. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6053–6060. Cited by: §IV-C1.
  • [11] Home – Everyday Robots. https://everydayrobots.com/, accessed 2022-07-20. Cited by: §I.
  • [12] X. Huang, H. Deng, W. Zhang, R. Song, and Y. Li (2021) Towards multi-modal perception-based navigation: a deep reinforcement learning method. IEEE Robotics and Automation Letters 6 (3), pp. 4986–4993. Cited by: §IV-C2.
  • [13] iRobot – robot vacuum and mop. https://www.irobot.com/, accessed 2022-07-20. Cited by: §I.
  • [14] Jackal UGV – small weatherproof robot – Clearpath. https://clearpathrobotics.com/jackal-small-unmanned-ground-vehicle/, accessed 2022-07-21. Cited by: §I.
  • [15] G. Kahn, P. Abbeel, and S. Levine (2021) BADGR: an autonomous self-supervised learning-based navigation system. IEEE Robotics and Automation Letters 6 (2), pp. 1312–1319. Cited by: §I.
  • [16] H. Karnan, A. Nair, X. Xiao, G. Warnell, S. Pirk, A. Toshev, J. Hart, J. Biswas, and P. Stone (2022) Socially compliant navigation dataset (SCAND): a large-scale dataset of demonstrations for social navigation. IEEE Robotics and Automation Letters. Cited by: §I.
  • [17] H. Karnan, K. S. Sikand, P. Atreya, S. Rabiee, X. Xiao, G. Warnell, P. Stone, and J. Biswas (2022) VI-IKD: high-speed accurate off-road navigation using learned visual-inertial inverse kinodynamics. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Cited by: §I, §V-F.
  • [18] H. Karnan, G. Warnell, X. Xiao, and P. Stone (2022) VOILA: visual-observation-only imitation learning for autonomous navigation. In 2022 International Conference on Robotics and Automation (ICRA), pp. 2497–2503. Cited by: §I.
  • [19] B. Liu, X. Xiao, and P. Stone (2021) A lifelong learning approach to mobile robot navigation. IEEE Robotics and Automation Letters 6 (2), pp. 1090–1096. Cited by: §V-C.
  • [20] P. Long, T. Fan, X. Liao, W. Liu, H. Zhang, and J. Pan (2018) Towards optimally decentralized multi-robot collision avoidance via deep reinforcement learning. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 6252–6259. Cited by: §I, §IV-C1.
  • [21] H. Ma, J. S. Smith, and P. A. Vela (2021) NavTuner: learning a scene-sensitive family of navigation policies. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 492–499. Cited by: §V-B.
  • [22] C. Mavrogiannis, F. Baldini, A. Wang, D. Zhao, P. Trautman, A. Steinfeld, and J. Oh (2021) Core challenges of social robot navigation: a survey. arXiv preprint arXiv:2103.05668. Cited by: §I.
  • [23] Meet Scout. https://www.aboutamazon.com/news/transportation/meet-scout, accessed 2022-07-20. Cited by: §I.
  • [24] J. Minguez and L. Montano (2004) Nearness diagram (ND) navigation: collision avoidance in troublesome scenarios. IEEE Transactions on Robotics and Automation 20 (1), pp. 45–59. Cited by: §IV-B.
  • [25] R. Mirsky, X. Xiao, J. Hart, and P. Stone (2021) Prevention and resolution of conflicts in social navigation – a survey. arXiv preprint arXiv:2106.12113. Cited by: §I.
  • [26] N. Mohammad and N. Bezzo (2022) A robust and fast occlusion-based frontier method for autonomous navigation in unknown cluttered environments. In 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Cited by: §IV-B.
  • [27] M. Mujahad, D. Fischer, B. Mertsching, and H. Jaddu (2010) Closest gap based (CG) reactive obstacle avoidance navigation for highly cluttered environments. In 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 1805–1812. Cited by: §IV-B.
  • [28] Y. Pan, C. Cheng, K. Saigol, K. Lee, X. Yan, E. A. Theodorou, and B. Boots (2020) Imitation learning for agile autonomous driving. The International Journal of Robotics Research 39 (2-3), pp. 286–302. Cited by: §I.
  • [29] D. Perille, A. Truong, X. Xiao, and P. Stone (2020) Benchmarking metric ground navigation. In 2020 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 116–121. Cited by: §I, §II.
  • [30] M. Pfeiffer, M. Schaeuble, J. Nieto, R. Siegwart, and C. Cadena (2017) From perception to decision: a data-driven approach to end-to-end motion planning for autonomous ground robots. In 2017 ieee international conference on robotics and automation (icra), pp. 1527–1533. Cited by: §I, TABLE I, §IV-C1.
  • [31] S. Quinlan and O. Khatib (1993) Elastic bands: connecting path planning and control. In [1993] Proceedings IEEE International Conference on Robotics and Automation, pp. 802–807. Cited by: §I, TABLE I, §II.
  • [32] J. Redmon and A. Farhadi (2018) Yolov3: an incremental improvement. arXiv preprint arXiv:1804.02767. Cited by: item 1.
  • [33] J. Biswas and M. Veloso (2017) Episodic non-Markov localization. Robotics and Autonomous Systems 87, pp. 162–176. Cited by: §IV-A, §V-F.
  • [34] ROS move_base navigation stack. http://wiki.ros.org/move_base, accessed 2021-09-09. Cited by: §II, §IV-A.
  • [35] C. Rösmann, W. Feiten, T. Wösch, F. Hoffmann, and T. Bertram (2012) Trajectory modification considering dynamic constraints of autonomous robots. In ROBOTIK 2012; 7th German Conference on Robotics, pp. 1–6. Cited by: §I.
  • [36] C. Rösmann, W. Feiten, T. Wösch, F. Hoffmann, and T. Bertram (2013) Efficient trajectory optimization using a sparse model. In 2013 European Conference on Mobile Robots, pp. 138–143. Cited by: §I.
  • [37] C. Rösmann, F. Hoffmann, and T. Bertram (2015) Planning of multiple robot trajectories in distinctive topologies. In 2015 European Conference on Mobile Robots (ECMR), pp. 1–6. Cited by: §I.
  • [38] C. Rösmann, F. Hoffmann, and T. Bertram (2017) Integrated online trajectory planning and optimization in distinctive topologies. Robotics and Autonomous Systems 88, pp. 142–153. Cited by: §I.
  • [39] C. Rösmann, F. Hoffmann, and T. Bertram (2017) Kinodynamic trajectory optimization and control for car-like robots. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5681–5686. Cited by: §I.
  • [40] A. J. Sathyamoorthy, J. Liang, U. Patel, T. Guan, R. Chandra, and D. Manocha (2020) Densecavoid: real-time navigation in dense crowds using anticipatory behaviors. In 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 11345–11352. Cited by: §IV-C2.
  • [41] J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347. Cited by: §IV-C3.
  • [42] Starship. https://www.starship.xyz/, accessed 2022-07-20. Cited by: §I.
  • [43] The BARN Challenge. https://people.cs.gmu.edu/~xxiao2/Research/BARN_Challenge/BARN_Challenge.html, accessed 2022-08-20. Cited by: §I.
  • [44] Z. Wang, X. Xiao, B. Liu, G. Warnell, and P. Stone (2021) Appli: adaptive planner parameter learning from interventions. In 2021 IEEE international conference on robotics and automation (ICRA), pp. 6079–6085. Cited by: §V-B.
  • [45] Z. Wang, X. Xiao, A. J. Nettekoven, K. Umasankar, A. Singh, S. Bommakanti, U. Topcu, and P. Stone (2021) From agile ground to aerial navigation: learning from learned hallucination. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 148–153. Cited by: TABLE I, §II.
  • [46] Z. Wang, X. Xiao, G. Warnell, and P. Stone (2021) APPLE: adaptive planner parameter learning from evaluative feedback. IEEE Robotics and Automation Letters 6 (4), pp. 7744–7749. Cited by: §V-B.
  • [47] D. Wilkie, J. Van Den Berg, and D. Manocha (2009) Generalized velocity obstacles. In 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 5573–5578. Cited by: §IV-C3.
  • [48] X. Xiao, J. Biswas, and P. Stone (2021) Learning inverse kinodynamics for accurate high-speed off-road navigation on unstructured terrain. IEEE Robotics and Automation Letters 6 (3), pp. 6054–6060. Cited by: §V-F.
  • [49] X. Xiao, B. Liu, G. Warnell, J. Fink, and P. Stone (2020) Appld: adaptive planner parameter learning from demonstration. IEEE Robotics and Automation Letters 5 (3), pp. 4541–4547. Cited by: §V-B.
  • [50] X. Xiao, B. Liu, G. Warnell, and P. Stone (2022) Motion planning and control for mobile robot navigation using machine learning: a survey. Autonomous Robots, pp. 1–29. Cited by: §I.
  • [51] X. Xiao, Z. Wang, Z. Xu, B. Liu, G. Warnell, G. Dhamankar, A. Nair, and P. Stone (2022) Appl: adaptive planner parameter learning. Robotics and Autonomous Systems 154, pp. 104132. Cited by: §V-B.
  • [52] Z. Xie and P. Dames (2022) DRL-vo: using velocity obstacles to learn safe navigation policies for crowded dynamic scenes. IEEE Transactions on Robotics. Note: Under review Cited by: §IV-C.
  • [53] Z. Xu, G. Dhamankar, A. Nair, X. Xiao, G. Warnell, B. Liu, Z. Wang, and P. Stone (2021) Applr: adaptive planner parameter learning from reinforcement. In 2021 IEEE international conference on robotics and automation (ICRA), pp. 6086–6092. Cited by: TABLE I, §II, §V-B.
  • [54] Z. Xu, B. Liu, X. Xiao, A. Nair, and P. Stone Benchmarking reinforcement learning techniques for autonomous navigation. Cited by: §II.
  • [55] Z. Xu, A. Nair, X. Xiao, and P. Stone Learning real-world autonomous navigation by self-supervised environment synthesis. Cited by: §V-C.
  • [56] K. Yoon, Y. Song, and M. Jeon (2018) Multiple hypothesis tracking algorithm for multi-target multi-camera tracking with disjoint views. IET Image Processing 12 (7), pp. 1175–1184. Cited by: item 1.