Non-Markovian Control with Gated End-to-End Memory Policy Networks

05/31/2017
by Julien Perez, et al.

Partially observable environments present an important open challenge in sequential control learning with delayed rewards. Despite numerous attempts over the last two decades, most reinforcement learning algorithms and associated approximate models applied in this setting still assume Markovian state transitions. In this paper, we explore the use of a recently proposed attention-based model, the Gated End-to-End Memory Network, for sequential control. We call the resulting model the Gated End-to-End Memory Policy Network. More precisely, we use a model-free value-based algorithm to learn policies for partially observed domains using this memory-enhanced neural network. The model is end-to-end learnable and features unbounded memory. Indeed, because of its attention mechanism and associated non-parametric memory, the proposed model, unlike recurrent models, lets us attend directly over the observation stream. We show encouraging results that illustrate the capability of our attention-based model on the continuous-state, non-stationary control problem of stock trading. We also present an OpenAI Gym environment for a simulated stock exchange and explain its relevance as a benchmark for the field of non-Markovian decision process learning.
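
To make the idea of attending over an observation stream concrete, the following is a minimal NumPy sketch of a gated memory hop used as a value function. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the architecture, so the hop equations follow the commonly described gated end-to-end memory update, and every name here (`gated_memory_hop`, `q_values`, `init_params`, the parameter keys, the dimensions, the choice of the latest observation as the query, and the random untrained weights) is hypothetical.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_memory_hop(u, memory_in, memory_out, W_gate, b_gate):
    """One gated attention hop over stored observation embeddings.

    u          : (d,)   current controller state
    memory_in  : (n, d) embeddings used to score each stored observation
    memory_out : (n, d) embeddings read out from memory
    """
    attention = softmax(memory_in @ u)       # (n,) weights over the history
    o = attention @ memory_out               # (d,) attention-weighted read
    gate = sigmoid(W_gate @ u + b_gate)      # (d,) per-dimension gate
    u_next = gate * o + (1.0 - gate) * u     # gated mix of read and state
    return u_next, attention


def q_values(observations, params, hops=2):
    """Map a full observation history (n, obs_dim) to one value per action.

    Assumption: the latest observation acts as the query. The memory grows
    with n, so no fixed history length is built into the model.
    """
    memory_in = observations @ params["A"]
    memory_out = observations @ params["C"]
    u = observations[-1] @ params["B"]
    attention = None
    for _ in range(hops):
        u, attention = gated_memory_hop(
            u, memory_in, memory_out, params["W_gate"], params["b_gate"]
        )
    return u @ params["W_q"], attention


def init_params(obs_dim, d, n_actions, seed=0):
    # Random, untrained weights; in practice these would be learned
    # end to end with a value-based objective.
    rng = np.random.default_rng(seed)
    return {
        "A": 0.1 * rng.normal(size=(obs_dim, d)),
        "B": 0.1 * rng.normal(size=(obs_dim, d)),
        "C": 0.1 * rng.normal(size=(obs_dim, d)),
        "W_gate": 0.1 * rng.normal(size=(d, d)),
        "b_gate": np.zeros(d),
        "W_q": 0.1 * rng.normal(size=(d, n_actions)),
    }


# Example: 50 past observations of dimension 4, three hypothetical actions.
params = init_params(obs_dim=4, d=8, n_actions=3)
history = np.random.default_rng(1).normal(size=(50, 4))
q, attention = q_values(history, params)
greedy_action = int(np.argmax(q))
```

Because the attention weights are computed over however many observations are stored, the same parameters apply to a history of any length, which is one way to read the abstract's claim of unbounded memory.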


