Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning

09/01/2020
by   Liqi Yan, et al.

Vision and voice are two vital keys for agents' interaction and learning. In this paper, we present a novel indoor navigation model called Memory Vision-Voice Indoor Navigation (MVV-IN), which receives voice commands and analyzes multimodal information from visual observations to enhance robots' understanding of their environment. We use single RGB images taken by a first-view monocular camera, and we apply a self-attention mechanism to keep the agent focused on key areas. Memory is important for the agent to avoid repeating certain tasks unnecessarily and to adapt adequately to new scenes; therefore, we make use of meta-learning. We have experimented with various functional features extracted from visual observations. Comparative experiments show that our methods outperform state-of-the-art baselines.
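The abstract names a self-attention mechanism but does not describe its form. As a rough illustration only, the sketch below shows generic scaled dot-product self-attention over a grid of image-region features. It is not the MVV-IN architecture: the function name `self_attention`, the 7x7 grid, the feature sizes, and the randomly initialised projection matrices are all assumptions made for this example.

```python
import numpy as np


def self_attention(features, w_q, w_k, w_v):
    """Generic scaled dot-product self-attention (illustrative, not MVV-IN).

    features: (n_regions, d_in) array, one row per image region.
    w_q, w_k, w_v: (d_in, d_k) projection matrices (hypothetical).
    Returns the attended features and the attention weights.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    # Similarity between every pair of regions, scaled by sqrt(d_k).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Numerically stable softmax over each row.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Assumed toy input: a 7x7 grid of 64-dimensional region features.
rng = np.random.default_rng(0)
features = rng.normal(size=(49, 64))
w_q, w_k, w_v = (rng.normal(scale=0.1, size=(64, 32)) for _ in range(3))
attended, weights = self_attention(features, w_q, w_k, w_v)
```

Each row of `weights` sums to one and indicates how strongly one region attends to every other region, which is the sense in which such a mechanism can keep an agent focused on key areas.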


research
12/10/2020

Visual Perception Generalization for Vision-and-Language Navigation via Meta-Learning

Vision-and-language navigation (VLN) is a challenging task that requires...
research
12/03/2018

Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning

Learning is an inherently continuous phenomenon. When humans learn a new...
research
03/30/2021

Diagnosing Vision-and-Language Navigation: What Really Matters

Vision-and-language navigation (VLN) is a multimodal task where an agent...
research
09/21/2008

Evaluation of an Intelligent Assistive Technology for Voice Navigation of Spreadsheets

An integral part of spreadsheet auditing is navigation. For sufferers of...
research
08/01/2017

PROBE: Predictive Robust Estimation for Visual-Inertial Navigation

Navigation in unknown, chaotic environments continues to present a signi...
research
10/02/2022

Unsupervised Vision and Vision-motion Calibration Strategies for PointGoal Navigation in Indoor Environment

PointGoal navigation in indoor environment is a fundamental task for per...
research
03/30/2022

ESNI: Domestic Robots Design for Elderly and Disabled People

Our paper focuses on the research of the possibility for speech recognit...
