Visual Perception Generalization for Vision-and-Language Navigation via Meta-Learning

12/10/2020
by   Ting Wang, et al.

Vision-and-language navigation (VLN) is a challenging task that requires an agent to navigate real-world environments by understanding natural language instructions together with visual information received in real time. Prior works have implemented VLN tasks in continuous environments or on physical robots, all of which use a fixed camera configuration due to the limitations of datasets, such as a 1.5-meter mounting height, a 90-degree horizontal field of view (HFOV), etc. However, real-life robots built for different purposes have diverse camera configurations, and the large gap in visual information makes it difficult to directly transfer a learned navigation model between robots. In this paper, we propose a visual perception generalization strategy based on meta-learning, which enables the agent to adapt quickly to a new camera configuration from only a few examples. In the training phase, we first isolate the generalization problem in the visual perception module, and then compare two meta-learning algorithms for better generalization in seen and unseen environments. One uses the Model-Agnostic Meta-Learning (MAML) algorithm, which requires few-shot adaptation; the other is a metric-based meta-learning method with a feature-wise affine transformation layer. The experimental results show that our strategy successfully adapts the learned navigation model to a new camera configuration, and that the two algorithms show their respective advantages in seen and unseen environments.
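To make the MAML-based few-shot adaptation concrete, the sketch below shows a first-order MAML loop on toy 1-D regression tasks: an inner loop adapts meta-parameters from a small support set (the "few shots", analogous to observations under a new camera configuration), and an outer loop updates the meta-parameters from the query loss. All task distributions, the linear model, and the hyperparameters here are illustrative assumptions, not details from the paper.

```python
import numpy as np

# First-order MAML sketch on toy 1-D linear regression tasks y = a*x + b.
# Each task has its own (a, b); adapting to a new task from a few shots
# mirrors adapting a perception module to a new camera configuration.
rng = np.random.default_rng(0)

def sample_task():
    # Task-specific slope and intercept (illustrative distribution).
    a, b = rng.uniform(-2, 2, size=2)
    return a, b

def sample_shots(task, k):
    a, b = task
    x = rng.uniform(-1, 1, size=k)
    return x, a * x + b

def predict(theta, x):
    w, b = theta
    return w * x + b

def grad(theta, x, y):
    # Gradient of mean squared error with respect to (w, b).
    err = predict(theta, x) - y
    return np.array([np.mean(2 * err * x), np.mean(2 * err)])

def inner_adapt(theta, x, y, lr=0.3, steps=10):
    # Few-shot adaptation: a handful of gradient steps on the support set.
    for _ in range(steps):
        theta = theta - lr * grad(theta, x, y)
    return theta

theta = np.zeros(2)  # meta-parameters
meta_lr = 0.01
for _ in range(2000):
    task = sample_task()
    xs, ys = sample_shots(task, k=5)    # support set (the few shots)
    xq, yq = sample_shots(task, k=10)   # query set
    adapted = inner_adapt(theta, xs, ys)
    # First-order MAML: meta-update with the query gradient at the
    # adapted parameters (second-order terms dropped for simplicity).
    theta = theta - meta_lr * grad(adapted, xq, yq)

# After meta-training, adaptation to an unseen task from 5 shots is fast.
new_task = (1.3, -0.7)
xs, ys = sample_shots(new_task, 5)
adapted = inner_adapt(theta, xs, ys)
xq, yq = sample_shots(new_task, 50)
loss = np.mean((predict(adapted, xq) - yq) ** 2)
```

The full MAML algorithm differentiates through the inner loop; the first-order variant used here is a common simplification that keeps the example short while preserving the support/query structure.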

