The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

03/05/2019
by   Chih-Yao Ma, et al.

As deep learning continues to make progress on challenging perception tasks, there is increased interest in combining vision, language, and decision-making. Specifically, the Vision and Language Navigation (VLN) task involves navigating to a goal purely from language instructions and visual information, without explicit knowledge of the goal. Recent approaches have made inroads in achieving good success rates on this task, but they rely on beam search, which exhaustively explores a large number of trajectories and is unrealistic for applications such as robotics. In this paper, inspired by the intuition of viewing the problem as search on a navigation graph, we propose to use a progress monitor developed in prior work as a learnable heuristic for search. We then propose two modules incorporated into an end-to-end architecture: 1) a learned mechanism to perform backtracking, which decides whether to continue moving forward or roll back to a previous state (Regret Module), and 2) a mechanism to help the agent decide which direction to go next by showing directions that are visited and their associated progress estimate (Progress Marker). Combined, the proposed approach significantly outperforms current state-of-the-art methods that use greedy action selection, with a 5% improvement in success rate on the test server and, more importantly, an 8% improvement in success rate normalized by path length. Our code is available at https://github.com/chihyaoma/regretful-agent.
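
The interplay of the two modules can be illustrated with a toy sketch. This is not the authors' implementation: in the paper both modules are learned, whereas here a hand-written rule (roll back whenever the progress estimate drops) stands in for the Regret Module, and a lookup table of per-direction progress values stands in for the Progress Marker. The graph, the `policy` scores, the `progress` values, and the function name `navigate` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def navigate(
    graph: Dict[str, List[str]],
    policy: Dict[Tuple[str, str], float],
    progress: Dict[str, float],
    start: str,
    goal: str,
    max_steps: int = 20,
) -> List[str]:
    """Greedy navigation with rollback on a toy graph (illustrative only)."""
    path = [start]
    # Toy progress marker: (node, direction) -> progress estimate seen there.
    markers: Dict[Tuple[str, str], float] = {}
    current = start
    for _ in range(max_steps):
        if current == goal:
            break
        # Score each direction: a visited direction is scored by its marker,
        # an unvisited one by the (hypothetical) policy score.
        nxt = max(
            graph[current],
            key=lambda n: markers.get((current, n), policy[(current, n)]),
        )
        markers[(current, nxt)] = progress[nxt]
        path.append(nxt)
        if progress[nxt] < progress[current]:
            # Toy regret rule: estimated progress dropped, so roll back.
            path.append(current)
        else:
            current = nxt
    return path


# Hypothetical toy data: the policy initially prefers the wrong branch (B).
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "E"], "D": ["B"], "E": ["C"]}
policy = {
    ("A", "B"): 0.9, ("A", "C"): 0.6,
    ("B", "A"): 0.2, ("B", "D"): 0.7,
    ("C", "A"): 0.1, ("C", "E"): 0.8,
    ("D", "B"): 1.0, ("E", "C"): 1.0,
}
progress = {"A": 0.2, "B": 0.1, "C": 0.5, "D": 0.05, "E": 1.0}

path = navigate(graph, policy, progress, start="A", goal="E")
print(path)
```

In this toy run the agent first steps to B, sees the progress estimate fall, returns to A, and then avoids B because its marker now scores lower than the unvisited direction C. Only a single trajectory is followed, in contrast to beam search, which expands many trajectories in parallel.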

research
01/10/2019

Self-Monitoring Navigation Agent via Auxiliary Progress Estimation

The Vision-and-Language Navigation (VLN) task entails an agent following...
research
11/07/2020

Sim-to-Real Transfer for Vision-and-Language Navigation

We study the challenging problem of releasing a robot in a previously un...
research
11/28/2021

Explore the Potential Performance of Vision-and-Language Navigation Model: a Snapshot Ensemble Method

Vision-and-Language Navigation (VLN) is a challenging task in the field ...
research
10/19/2020

Language and Visual Entity Relationship Graph for Agent Navigation

Vision-and-Language Navigation (VLN) requires an agent to navigate in a ...
research
02/14/2022

One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones

We study the problem of developing autonomous agents that can follow hum...
research
06/07/2021

Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring

Despite recent progress, learning new tasks through language instruction...
research
08/09/2023

Bird's-Eye-View Scene Graph for Vision-Language Navigation

Vision-language navigation (VLN), which entails an agent to navigate 3D ...