Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule

09/16/2020
by Shuhei Kurita, et al.

Vision-and-language navigation (VLN) is a task in which an agent is embodied in a realistic 3D environment and follows an instruction to reach a goal node. While most previous studies have built and investigated discriminative approaches, we note that there are in fact two possible approaches to building such a VLN agent: discriminative and generative. In this paper, we design and investigate a generative language-grounded policy, which computes the distribution over all possible instructions given an action and the transition history. In experiments, we show that the proposed generative approach outperforms the discriminative approach on the Room-2-Room (R2R) dataset, especially in unseen environments. We further show that combining the generative and discriminative policies achieves close to state-of-the-art results on the R2R dataset, demonstrating that the two policies capture different aspects of VLN.
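To make the role of Bayes' rule concrete, the following is a minimal sketch, not the authors' implementation. It assumes a generative model has already scored the instruction under each candidate action, giving log p(instruction | action, history), and that the prior over actions is uniform unless one is supplied. The function name `bayes_action_posterior` and the example scores are hypothetical.

```python
import math

def bayes_action_posterior(instr_log_likelihoods, action_log_priors=None):
    """Posterior over candidate actions via Bayes' rule.

    p(action | instruction, history) is proportional to
    p(instruction | action, history) * p(action | history).
    Inputs are log-probabilities, one entry per candidate action.
    """
    n = len(instr_log_likelihoods)
    if action_log_priors is None:
        # Assumption: uniform prior over the candidate actions.
        action_log_priors = [-math.log(n)] * n
    scores = [ll + lp for ll, lp in zip(instr_log_likelihoods, action_log_priors)]
    # Log-sum-exp normalization for numerical stability.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [math.exp(s - log_z) for s in scores]

# Hypothetical instruction log-likelihoods for three candidate actions.
log_liks = [-42.0, -40.0, -45.0]
posterior = bayes_action_posterior(log_liks)
best_action = max(range(len(posterior)), key=posterior.__getitem__)
```

With a uniform prior, the selected action is simply the one under which the instruction is most likely; a non-uniform prior would shift the posterior accordingly.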


research
04/19/2021

Improving Cross-Modal Alignment in Vision Language Navigation via Syntactic Information

Vision language navigation is the task that requires an agent to navigat...
research
11/20/2017

Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments

A robot that can carry out a natural-language instruction has been a dre...
research
08/20/2021

Airbert: In-domain Pretraining for Vision-and-Language Navigation

Vision-and-language navigation (VLN) aims to enable embodied agents to n...
research
03/06/2019

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

We present FAST NAVIGATOR, a general framework for action decoding, whic...
research
09/05/2019

Robust Navigation with Language Pretraining and Stochastic Sampling

Core to the vision-and-language navigation (VLN) challenge is building r...
research
07/03/2019

Chasing Ghosts: Instruction Following as Bayesian State Tracking

A visually-grounded navigation instruction can be interpreted as a seque...
research
05/29/2019

Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation

Advances in learning and representations have reinvigorated work that co...
