Rethinking the Spatial Route Prior in Vision-and-Language Navigation

10/12/2021
by Xinzhe Zhou, et al.

Vision-and-language navigation (VLN) is a trending topic that aims to navigate an intelligent agent to an expected position through natural language instructions. This work addresses VLN from a previously ignored aspect, namely the spatial route prior of the navigation scenes. A critical enabling innovation of this work is to explicitly consider the spatial route prior under several different VLN settings. In the most information-rich case, where the environment map is known and the shortest-path prior is admitted, we observe that given an origin-destination node pair, the internal route can be uniquely determined. Thus, VLN can be effectively formulated as an ordinary classification problem over all possible destination nodes in the scene. Furthermore, we relax this formulation to other, more general VLN settings, proposing a sequential-decision variant (by abandoning the shortest-path route prior) and an explore-and-exploit scheme (for the case where the environment map is unknown) that curates a compact and informative sub-graph to exploit. As reported by [34], the performance of VLN methods has been stuck at a plateau in the past two years. Even with increased model complexity, the state-of-the-art success rate on the R2R validation-unseen set has stayed around 62% and 73% [...]. We conducted evaluations on both R2R and R4R and, surprisingly, found that utilizing the spatial route priors may be the key to breaking the above-mentioned performance ceiling. For example, on the R2R validation-unseen set, when the number of discrete nodes explored is about 40, our single-model success rate reaches 73% and increases to 78% [...] previous state-of-the-art VLN-BERT with 3 models ensembled.
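
The classification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the node names, the hand-written destination scores, and the helper names `shortest_path` and `navigate` are all hypothetical, and an unweighted breadth-first search stands in for whatever shortest-path computation the paper uses. The point is only that once a destination node is chosen, the route from the origin follows from the map.

```python
from collections import deque


def shortest_path(graph, origin, destination):
    """Breadth-first search on an unweighted graph.

    Returns the list of nodes from origin to destination, or None if the
    destination is unreachable. Ties between equally short paths are broken
    by neighbor order, which is an assumption of this sketch.
    """
    parents = {origin: None}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def navigate(graph, origin, destination_scores):
    """Pick the highest-scoring destination, then recover the route to it."""
    destination = max(destination_scores, key=destination_scores.get)
    return destination, shortest_path(graph, origin, destination)


# Hypothetical environment map (adjacency lists) and hypothetical scores that
# a classifier conditioned on the instruction might assign to each node.
toy_graph = {
    "hall": ["kitchen", "stairs"],
    "kitchen": ["hall", "pantry"],
    "stairs": ["hall", "bedroom"],
    "pantry": ["kitchen"],
    "bedroom": ["stairs"],
}
toy_scores = {
    "hall": 0.05,
    "kitchen": 0.10,
    "stairs": 0.15,
    "pantry": 0.10,
    "bedroom": 0.60,
}

destination, route = navigate(toy_graph, "hall", toy_scores)
print(destination, route)
```

In this toy map the highest-scoring node is `bedroom`, so the recovered route is `hall`, `stairs`, `bedroom`. The sequential-decision and explore-and-exploit variants mentioned in the abstract are not covered by this sketch.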

research
09/30/2021

Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments

In the Vision-and-Language Navigation (VLN) task an embodied agent navig...
research
03/31/2020

Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation

In the Vision-and-Language Navigation (VLN) task, an agent with egocentr...
research
01/08/2020

High-Level Plan for Behavioral Robot Navigation with Natural Language Directions and R-NET

When the navigational environment is known, it can be represented as a g...
research
05/04/2018

A Distributed Routing Algorithm for Internet-wide Geocast

Geocast is the concept of sending data packets to nodes in a specified g...
research
05/06/2020

Diagnosing the Environment Bias in Vision-and-Language Navigation

Vision-and-Language Navigation (VLN) requires an agent to follow natural...
research
05/23/2018

Wayfinding through an unfamiliar environment

Strategies for finding one's way through an unfamiliar environment may b...
research
11/28/2021

Explore the Potential Performance of Vision-and-Language Navigation Model: a Snapshot Ensemble Method

Vision-and-Language Navigation (VLN) is a challenging task in the field ...
