Ground then Navigate: Language-guided Navigation in Dynamic Scenes

09/24/2022
by Kanishk Jain, et al.

We investigate the Vision-and-Language Navigation (VLN) problem in the context of autonomous driving in outdoor settings. We solve the problem by explicitly grounding the navigable regions corresponding to the textual command. At each timestamp, the model predicts a segmentation mask corresponding to the intermediate or the final navigable region. Our work contrasts with existing efforts in VLN, which pose this task as a node selection problem, given a discrete connected graph corresponding to the environment. We do not assume the availability of such a discretised map. Our work moves towards continuity in action space, provides interpretability through visual feedback, and allows VLN on commands requiring finer manoeuvres like "park between the two cars". Furthermore, we propose a novel meta-dataset, CARLA-NAV, to allow efficient training and validation. The dataset comprises pre-recorded training sequences and a live environment for validation and testing. We provide extensive qualitative and quantitative empirical results to validate the efficacy of the proposed approach.
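
To make the idea of a per-timestamp navigable-region mask concrete, here is a minimal sketch of how such a mask could be reduced to a single target point. This is an illustration only and not the authors' method: the abstract does not say how a mask is turned into vehicle motion. The function name `mask_to_waypoint`, the 0.5 threshold, and the centroid rule are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Assumption: a predicted mask is a row-major grid of per-pixel
# probabilities in [0, 1]. The source does not specify the format.
Mask = List[List[float]]


def mask_to_waypoint(
    mask: Mask, threshold: float = 0.5
) -> Optional[Tuple[float, float]]:
    """Return the (row, col) centroid of pixels above `threshold`.

    Hypothetical helper: returns None when no pixel is marked navigable.
    """
    row_sum = 0.0
    col_sum = 0.0
    count = 0
    for r, row in enumerate(mask):
        for c, prob in enumerate(row):
            if prob > threshold:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


# Toy 4x4 "prediction" with a 2x2 navigable patch in the lower right.
toy_mask = [
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.0],
    [0.0, 0.0, 0.9, 0.8],
    [0.0, 0.0, 0.7, 0.9],
]
waypoint = mask_to_waypoint(toy_mask)  # centroid of the 2x2 patch
```

A centroid is the simplest possible reduction and would fail for non-convex regions. It is used here only to show why a dense mask allows continuous targets where a graph-based formulation offers only a fixed set of nodes.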


research
12/11/2017

Autonomous UAV Navigation with Domain Adaptation

Unmanned Aerial Vehicle (UAV) autonomous driving gets popular attention i...
research
10/22/2022

DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents

In the real world, autonomous driving agents navigate in highly dynamic ...
research
02/23/2022

Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation

Following language instructions to navigate in unseen environments is a ...
research
05/25/2023

Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving

This paper addresses the problem of 3D referring expression comprehensio...
research
03/14/2022

Grounding Commands for Autonomous Vehicles via Layer Fusion with Region-specific Dynamic Layer Attention

Grounding a command to the visual environment is an essential ingredient...
research
04/06/2020

Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments

We develop a language-guided navigation task set in a continuous 3D envi...
research
03/07/2022

Find a Way Forward: a Language-Guided Semantic Map Navigator

This paper attacks the problem of language-guided navigation in a new pe...
