Towards Navigation by Reasoning over Spatial Configurations

05/14/2021
by Yue Zhang, et al.

We address the navigation problem in which an agent follows natural language instructions while observing the environment. Focusing on language understanding, we show the importance of spatial semantics in grounding navigation instructions into visual perceptions. We propose a neural agent that uses the elements of spatial configurations and investigate their influence on the navigation agent's reasoning ability. Moreover, we model the sequential execution order and align visual objects with spatial configurations in the instruction. Our neural agent improves over strong baselines in the seen environments and shows competitive performance in the unseen environments. Additionally, the experimental results demonstrate that explicit modeling of spatial semantic elements in the instructions can improve the grounding and spatial reasoning of the model.

research
11/29/2018

Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments

We study the problem of jointly reasoning about language and vision thro...
research
01/09/2021

Are We There Yet? Learning to Localize in Embodied Instruction Following

Embodied instruction following is a challenging problem requiring an age...
research
09/26/2022

LOViS: Learning Orientation and Visual Signals for Vision and Language Navigation

Understanding spatial and visual information is essential for a navigati...
research
07/21/2023

CARTIER: Cartographic lAnguage Reasoning Targeted at Instruction Execution for Robots

This work explores the capacity of large language models (LLMs) to addre...
research
07/13/2017

Representation Learning for Grounded Spatial Reasoning

The interpretation of spatial references is highly contextual, requiring...
research
08/26/2021

SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments

This paper presents a novel approach for the Vision-and-Language Navigat...
research
03/07/2023

Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding

The main challenge in vision-and-language navigation (VLN) is how to und...
