MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation

03/21/2021
by Zachary Seymour, et al.

Visual navigation for autonomous agents is a core task in computer vision and robotics. Learning-based methods, such as deep reinforcement learning, have the potential to outperform the classical solutions developed for this task; however, they come at a significantly increased computational cost. In this work, we design a novel approach that aims to perform better than, or comparably to, existing learning-based solutions under a clear time and computational budget. To this end, we propose a method to encode vital scene semantics (such as traversable paths, unexplored areas, and observed scene objects), alongside raw visual streams such as RGB, depth, and semantic segmentation masks, into a semantically informed, top-down egocentric map representation. Further, to enable the effective use of this information, we introduce a novel 2-D map attention mechanism based on successful multi-layer Transformer networks. We conduct experiments on 3-D reconstructed indoor PointGoal visual navigation and demonstrate the effectiveness of our approach. We show that by using our novel attention schema and auxiliary rewards to better utilize scene semantics, we outperform multiple baselines trained with only raw inputs or implicit semantic information while operating with an 80
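The abstract does not spell out how the 2-D map attention is computed, so the following is only a minimal sketch of the general idea, not the authors' architecture. It applies generic single-head scaled dot-product self-attention to the cells of a small top-down egocentric map. The function names, the map size, the three-channel cell layout (for example traversable, unexplored, object present), and the random projection matrices are all assumptions made for illustration; positional encodings and multi-layer stacking are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(s - m) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def map_self_attention(ego_map, w_q, w_k, w_v):
    """Single-head self-attention over the cells of a top-down map.

    ego_map: H x W x C nested lists (hypothetical layout, C channels per cell).
    Returns (attended, weights): attended is H x W x D,
    weights is (H*W) x (H*W) with each row summing to 1.
    """
    height, width = len(ego_map), len(ego_map[0])
    # Flatten the 2-D grid row-major so that every map cell becomes one token.
    tokens = [cell for row in ego_map for cell in row]
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    out = matmul(weights, v)
    # Fold the token sequence back into the H x W grid.
    attended = [out[r * width:(r + 1) * width] for r in range(height)]
    return attended, weights


def rand_matrix(rows, cols):
    """Random projection matrix (stand-in for learned weights)."""
    return [[random.gauss(0.0, 1.0 / math.sqrt(rows)) for _ in range(cols)]
            for _ in range(rows)]


random.seed(0)
H, W, C, D = 4, 4, 3, 8  # assumed sizes, chosen only to keep the demo small
ego_map = [[[random.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]
w_q, w_k, w_v = rand_matrix(C, D), rand_matrix(C, D), rand_matrix(C, D)

attended, weights = map_self_attention(ego_map, w_q, w_k, w_v)
print(len(attended), len(attended[0]), len(attended[0][0]))
```

Each row of `weights` indicates how strongly one map cell attends to every other cell. In a trained model the projections would be learned, so that cells relevant to reaching the goal could receive more weight.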


