Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation

05/25/2018
by   Alane Suhr, et al.

We propose a learning approach for mapping context-dependent sequential instructions to actions. We address the problem of discourse and state dependencies with an attention-based model that considers both the history of the interaction and the state of the world. To train from start and goal states without access to demonstrations, we propose SESTRA, a learning algorithm that takes advantage of single-step reward observations and immediate expected reward maximization. We evaluate on the SCONE domains, and show absolute accuracy improvements of 9.8%-25.3% across the domains over approaches that use high-level logical representations.
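The core idea of SESTRA, maximizing the immediate expected reward when a single-step reward observation is available for every action in the current state, can be illustrated with a toy sketch. Everything below is an illustrative assumption: the log-linear policy, the function names, and the update rule are not the paper's attention-based model, only a minimal stand-in for the "immediate expected reward maximization" objective.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over action logits."""
    z = np.exp(logits - logits.max())
    return z / z.sum()

def sestra_step(theta, features, step_reward, lr=0.1):
    """One update maximizing immediate expected reward (toy sketch).

    theta: (A, D) parameters of a hypothetical log-linear policy.
    features: (D,) features of the current state.
    step_reward: (A,) observed single-step reward for EVERY action in
        this state -- the key assumption behind the objective.
    Returns the updated parameters and the expected immediate reward.
    """
    logits = theta @ features
    probs = softmax(logits)
    # Objective: J = sum_a pi(a|s) * r(s, a), the expected immediate reward.
    expected_r = probs @ step_reward
    # Exact softmax gradient of J w.r.t. the logits:
    # dJ/dlogit_k = p_k * (r_k - E[r]); no sampling needed, since the
    # single-step reward of every action is observed.
    grad_logits = probs * (step_reward - expected_r)
    theta = theta + lr * np.outer(grad_logits, features)
    return theta, expected_r
```

Because the per-action rewards are observed, the update computes the exact gradient of the expected immediate reward instead of a sampled policy-gradient estimate; repeated updates shift probability mass onto the highest-reward action.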


Related research

- Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction (11/10/2018)
- Mapping Instructions and Visual Observations to Actions with Reinforcement Learning (04/28/2017)
- Value-based Search in Execution Space for Mapping Instructions to Programs (11/02/2018)
- Geometry of Policy Improvement (04/06/2017)
- Reinforcement Learning of Implicit and Explicit Control Flow in Instructions (02/25/2021)
- Sampling First Order Logical Particles (06/13/2012)
- Noisy Symbolic Abstractions for Deep RL: A case study with Reward Machines (11/20/2022)
