Gated-Attention Architectures for Task-Oriented Language Grounding

06/22/2017
by Devendra Singh Chaplot, et al.

To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments that assumes no prior linguistic or perceptual knowledge and requires only raw pixels from the environment and the natural language instruction as input. The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation learning methods. We show the effectiveness of the proposed model on unseen instructions as well as unseen maps, both quantitatively and qualitatively. We also introduce a novel environment based on a 3D game engine to simulate the challenges of task-oriented language grounding over a rich set of instructions and environment states.
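
The abstract does not spell out how the Gated-Attention mechanism fuses the two representations. One plausible reading, sketched below in plain Python, is that the instruction embedding is projected to one sigmoid gate per image feature map, and each feature map is then scaled by its gate. The function name `gated_attention`, the shapes, and the random weights are all illustrative assumptions, not details taken from this page.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_attention(feature_maps, instruction_vec, weights, bias):
    """Scale each image feature map by a gate computed from the instruction.

    feature_maps:    C maps, each an H x W nested list (assumed conv output).
    instruction_vec: D floats (assumed instruction embedding).
    weights, bias:   hypothetical C x D projection and C biases.
    Returns the gated feature maps and the C gate values.
    """
    # One gate per feature map: sigmoid(w . instruction + b).
    gates = [
        sigmoid(sum(w * x for w, x in zip(row, instruction_vec)) + b)
        for row, b in zip(weights, bias)
    ]
    # Broadcast each gate over its H x W map (element-wise product).
    gated = [
        [[gate * value for value in line] for line in fmap]
        for gate, fmap in zip(gates, feature_maps)
    ]
    return gated, gates


# Toy inputs standing in for learned image features and text embedding.
random.seed(0)
C, H, W, D = 4, 3, 3, 5
feature_maps = [
    [[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)
]
instruction_vec = [random.uniform(-1, 1) for _ in range(D)]
weights = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(C)]
bias = [0.0] * C

gated, gates = gated_attention(feature_maps, instruction_vec, weights, bias)
print("gates:", [round(g, 3) for g in gates])
```

In a trained model the gated maps would feed a policy network, so the instruction can suppress or emphasize whole feature channels before an action is chosen.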


research
04/23/2018

Attention Based Natural Language Grounding by Navigating Virtual Environment

In this work, we focus on the problem of grounding language by training ...
research
10/14/2019

Dynamic Attention Networks for Task Oriented Grounding

In order to successfully perform tasks specified by natural language ins...
research
10/27/2019

Task-Oriented Language Grounding for Language Input with Multiple Sub-Goals of Non-Linear Order

In this work, we analyze the performance of general deep reinforcement l...
research
04/08/2022

Grounding Hindsight Instructions in Multi-Goal Reinforcement Learning for Robotics

This paper focuses on robotic reinforcement learning with sparse rewards...
research
01/25/2020

Following Instructions by Imagining and Reaching Visual Goals

While traditional methods for instruction-following typically assume pri...
research
05/22/2018

Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents

Recently there has been a rising interest in training agents, embodied i...
research
11/14/2020

Few-shot Object Grounding and Mapping for Natural Language Robot Instruction Following

We study the problem of learning a robot policy to follow natural langua...
