Grounding Complex Navigational Instructions Using Scene Graphs

06/03/2021
by   Michiel de Jong, et al.

Training a reinforcement learning agent to carry out natural language instructions is limited by the available supervision, i.e., by knowing when an instruction has been carried out. We adapt the CLEVR visual question answering dataset to generate complex natural language navigation instructions and accompanying scene graphs, yielding an environment-agnostic supervised dataset. To demonstrate the use of this dataset, we map the scenes to the VizDoom environment and use the architecture in <cit.> to train an agent to carry out these more complex language instructions.
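The scene graph is what makes the supervision environment-agnostic: success can be checked against the graph itself rather than against a particular simulator. The sketch below illustrates the idea with a hypothetical CLEVR-style scene graph; the class names, attribute set, and `goal_satisfied` helper are illustrative assumptions, not the paper's actual data format.

```python
from dataclasses import dataclass, field

# Hypothetical CLEVR-style scene graph: objects carry attributes,
# and named relations hold between pairs of objects.
@dataclass
class SceneObject:
    color: str
    shape: str
    size: str

@dataclass
class SceneGraph:
    objects: list
    # relations maps a relation name to index pairs (i, j),
    # meaning: object i stands in that relation to object j.
    relations: dict = field(default_factory=dict)

def goal_satisfied(graph: SceneGraph, agent_at: int, target_spec: dict) -> bool:
    """Supervision signal: is the agent at an object matching the
    instruction's target description?"""
    for idx, obj in enumerate(graph.objects):
        if all(getattr(obj, key) == value for key, value in target_spec.items()):
            if agent_at == idx:
                return True
    return False

scene = SceneGraph(
    objects=[
        SceneObject(color="red", shape="cube", size="large"),
        SceneObject(color="blue", shape="sphere", size="small"),
    ],
    relations={"left_of": [(0, 1)]},
)

# Instruction: "go to the small blue sphere" -> target index 1.
print(goal_satisfied(scene, agent_at=1, target_spec={"color": "blue", "shape": "sphere"}))
print(goal_satisfied(scene, agent_at=0, target_spec={"color": "blue", "shape": "sphere"}))
```

Because the check runs on the graph, the same instruction–goal pairs can be replayed in any environment (here, VizDoom) once the scene layout is mapped into it.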

