Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill

09/19/2023
by Wenzhe Cai, et al.

Zero-shot object navigation is a challenging task for home-assistance robots. It demands visual grounding, commonsense inference, and locomotion abilities, the first two of which are inherent in foundation models. For locomotion, however, most works still depend on map-based planning approaches. The gap between RGB space and map space makes it difficult to transfer knowledge directly from foundation models to navigation tasks. In this work, we propose a Pixel-guided Navigation skill (PixNav), which bridges the gap between foundation models and the embodied navigation task. Recent foundation models can readily indicate an object by pixels, and with pixels as the goal specification, our method becomes a versatile navigation policy that can head towards many different kinds of objects. In addition, PixNav is a purely RGB-based policy, which can reduce the cost of home-assistance robots. Experiments demonstrate the robustness of PixNav, which achieves an 80+% success rate in the local path-planning task. To perform long-horizon object navigation, we design an LLM-based planner that uses commonsense knowledge about the relationships between objects and rooms to select the best waypoint. Evaluations in both photorealistic indoor simulators and real-world environments validate the effectiveness of the proposed navigation strategy. Code and video demos are available at https://github.com/wzcai99/Pixel-Navigator.
