ULN: Towards Underspecified Vision-and-Language Navigation

10/18/2022
by Weixi Feng, et al.

Vision-and-Language Navigation (VLN) is the task of guiding an embodied agent to a target position using language instructions. Despite significant performance improvements, the widespread use of fine-grained instructions fails to characterize the more practical linguistic variations found in reality. To fill this gap, we introduce a new setting, namely Underspecified vision-and-Language Navigation (ULN), and associated evaluation datasets. ULN evaluates agents using multi-level underspecified instructions instead of purely fine-grained or coarse-grained ones, which is a more realistic and general setting. As a primary step toward ULN, we propose a VLN framework that consists of a classification module, a navigation agent, and an Exploitation-to-Exploration (E2E) module. Specifically, we propose to learn Granularity Specific Sub-networks (GSS) for the agent to ground multi-level instructions with minimal additional parameters. Our E2E module then estimates grounding uncertainty and conducts multi-step lookahead exploration to further improve the success rate. Experimental results show that existing VLN models remain brittle under multi-level language underspecification, while our framework is more robust and outperforms the baselines on ULN by ~10% relative success rate across all levels.
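
To make the framework's control flow concrete, here is a minimal, hypothetical sketch, not the paper's actual implementation: it assumes an entropy-based uncertainty proxy, a toy lookahead rollout, and invented names such as GranularitySpecificAgent and e2e_step. It only illustrates the idea of lightweight granularity-specific heads (GSS) and switching from exploitation to multi-step lookahead exploration when grounding uncertainty is high.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of the ULN control flow described in the abstract.
# Module names, shapes, and the entropy-based uncertainty proxy are all
# assumptions for illustration; the paper's actual design may differ.

class GranularitySpecificAgent(torch.nn.Module):
    """Shared encoder with small granularity-specific sub-network heads (GSS)."""

    def __init__(self, hidden=512, num_actions=6, num_granularities=3):
        super().__init__()
        self.encoder = torch.nn.Linear(hidden, hidden)  # stand-in for a VLN backbone
        # One lightweight head per instruction granularity level,
        # so multi-level grounding adds minimal extra parameters.
        self.heads = torch.nn.ModuleList(
            torch.nn.Linear(hidden, num_actions) for _ in range(num_granularities)
        )

    def forward(self, state, granularity):
        h = torch.relu(self.encoder(state))
        return self.heads[granularity](h)  # action logits


def e2e_step(agent, state, granularity, entropy_threshold=1.2, lookahead=3):
    """Exploitation-to-Exploration: exploit when grounding looks confident,
    otherwise trigger multi-step lookahead exploration."""
    logits = agent(state, granularity)
    probs = F.softmax(logits, dim=-1)
    # Entropy of the action distribution as a grounding-uncertainty proxy.
    uncertainty = -(probs * probs.clamp_min(1e-9).log()).sum(-1)
    if uncertainty.item() < entropy_threshold:
        return probs.argmax(-1).item(), "exploit"
    # Uncertain grounding: score each candidate first action by a short
    # rollout and keep the best one (a toy stand-in for lookahead search).
    best_action, best_score = None, float("-inf")
    for action in range(probs.size(-1)):
        score = probs[..., action].log().item()
        rollout_state = state
        for _ in range(lookahead - 1):
            rollout_state = torch.tanh(rollout_state)  # placeholder transition
            step_probs = F.softmax(agent(rollout_state, granularity), dim=-1)
            score += step_probs.max(-1).values.log().item()
        if score > best_score:
            best_action, best_score = action, score
    return best_action, "explore"


if __name__ == "__main__":
    torch.manual_seed(0)
    agent = GranularitySpecificAgent()
    state = torch.randn(1, 512)
    # granularity=0 could mean "fine-grained"; in the paper a classification
    # module would predict this level from the instruction.
    action, mode = e2e_step(agent, state, granularity=0)
    print(action, mode)
```

In this sketch the entropy threshold governs the exploitation/exploration trade-off: a lower threshold triggers lookahead more often, trading extra navigation steps for robustness to underspecified instructions.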

Related research

12/08/2021  Contrastive Instruction-Trajectory Learning for Vision-Language Navigation
The vision-language navigation (VLN) task requires an agent to reach a t...

02/23/2022  Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Following language instructions to navigate in unseen environments is a ...

07/15/2020  Active Visual Information Gathering for Vision-Language Navigation
Vision-language navigation (VLN) is the task of entailing an agent to ca...

07/20/2023  Behavioral Analysis of Vision-and-Language Navigation Agents
To be successful, Vision-and-Language Navigation (VLN) agents must be ab...

02/14/2022  One Step at a Time: Long-Horizon Vision-and-Language Navigation with Milestones
We study the problem of developing autonomous agents that can follow hum...

03/02/2023  MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) aims to develop intelligent agents ...

03/07/2023  Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
The main challenge in vision-and-language navigation (VLN) is how to und...
