Learning Markov State Abstractions for Deep Reinforcement Learning

06/08/2021
by Cameron Allen, et al.

The fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov. However, when MDPs have rich observations, agents typically learn by way of an abstract state representation, and such representations are not guaranteed to preserve the Markov property. We introduce a novel set of conditions and prove that they are sufficient for learning a Markov abstract state representation. We then describe a practical training procedure that combines inverse model estimation and temporal contrastive learning to learn an abstraction that approximately satisfies these conditions. Our novel training objective is compatible with both online and offline training: it does not require a reward signal, but agents can capitalize on reward information when available. We empirically evaluate our approach on a visual gridworld domain and a set of continuous control benchmarks. Our approach learns representations that capture the underlying structure of the domain and lead to improved sample efficiency over state-of-the-art deep reinforcement learning with visual features – often matching or exceeding the performance achieved with hand-designed compact state information.
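As a rough illustration of how such a training objective might be assembled, the sketch below pairs an inverse-dynamics loss (predicting the action from consecutive abstract states) with a temporal contrastive loss (distinguishing real consecutive abstract-state pairs from shuffled ones). This is a minimal sketch, not the paper's reference implementation: the network architectures, the discrete action space, the within-batch negative-sampling scheme, and the equal weighting of the two terms are all illustrative assumptions.

# Minimal sketch (assumed details, not the authors' exact formulation):
# learn an abstraction phi by jointly training an inverse model and a
# temporal contrastive discriminator on top of the abstract states.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AbstractionEncoder(nn.Module):
    """Maps a raw observation x to a compact abstract state z = phi(x)."""
    def __init__(self, obs_dim, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
    def forward(self, x):
        return self.net(x)

class InverseModel(nn.Module):
    """Predicts which action was taken between consecutive abstract states."""
    def __init__(self, latent_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )
    def forward(self, z, z_next):
        return self.net(torch.cat([z, z_next], dim=-1))

class ContrastiveHead(nn.Module):
    """Scores whether (z, z') is a real consecutive pair or a shuffled pair."""
    def __init__(self, latent_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, z, z_other):
        return self.net(torch.cat([z, z_other], dim=-1)).squeeze(-1)

def markov_abstraction_loss(encoder, inv_model, contrast, obs, actions, next_obs):
    """Combined loss on a batch of (obs, action, next_obs) transitions."""
    z, z_next = encoder(obs), encoder(next_obs)

    # Inverse-model term: abstract states must retain enough information
    # to identify the action that caused the transition.
    inverse_loss = F.cross_entropy(inv_model(z, z_next), actions)

    # Temporal contrastive term: real consecutive pairs (label 1) versus
    # pairs whose successor is shuffled within the batch (label 0).
    z_neg = z_next[torch.randperm(z_next.shape[0])]
    logits = torch.cat([contrast(z, z_next), contrast(z, z_neg)])
    labels = torch.cat([torch.ones_like(actions, dtype=torch.float32),
                        torch.zeros_like(actions, dtype=torch.float32)])
    contrastive_loss = F.binary_cross_entropy_with_logits(logits, labels)

    # Equal weighting of the two terms is an assumption for this sketch.
    return inverse_loss + contrastive_loss

Note that this objective uses no reward signal, consistent with the abstract's claim that the approach works offline or online and can optionally incorporate reward information when it is available.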


