Learning Latent State Spaces for Planning through Reward Prediction

12/09/2019
by Aaron Havens, et al.
Model-based reinforcement learning methods typically learn models for high-dimensional state spaces by aiming to reconstruct and predict the original observations. Drawing inspiration from model-free reinforcement learning, we instead propose learning a latent dynamics model directly from rewards. In this work, we introduce a model-based planning framework that learns a latent reward prediction model and then plans in the latent state space. The latent representation is learned exclusively from multi-step reward prediction, which we show to be the only information necessary for successful planning. With this framework, we benefit from the concise representation of model-free methods while retaining the data efficiency of model-based algorithms. We demonstrate our framework in multi-pendulum and multi-cheetah environments, in which several pendulums or cheetahs are shown to the agent but only one of them produces rewards. In these environments, the agent must construct a concise latent representation that filters out irrelevant observations. We find that our method successfully learns an accurate latent reward prediction model in the presence of irrelevant information, whereas existing model-based methods fail. Planning in the learned latent state space shows strong performance and high sample efficiency compared with model-free and model-based baselines.
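To make the two ingredients of the abstract concrete (a latent model trained only on multi-step reward prediction, and planning inside that latent space), here is a minimal sketch. It is not the authors' architecture: the linear encoder, linear latent dynamics, linear reward head, the parameter names (`W_enc`, `A`, `B`, `w_z`, `w_a`), and the random-shooting planner are all assumptions chosen for brevity. The abstract does not specify the networks or the planner.

```python
import numpy as np

def rollout_rewards(params, obs0, actions):
    """Predict rewards for action sequences using only the first observation.

    Hypothetical linear model (an assumption, not the paper's networks):
      z_0     = W_enc @ obs0           (encoder)
      r_hat_t = w_z . z_t + w_a . a_t  (reward head)
      z_{t+1} = A @ z_t + B @ a_t      (latent dynamics)
    obs0: (N, obs_dim), actions: (N, H, act_dim) -> rewards: (N, H)
    """
    W_enc, A, B, w_z, w_a = params
    z = obs0 @ W_enc.T
    preds = []
    for t in range(actions.shape[1]):
        a_t = actions[:, t]
        preds.append(z @ w_z + a_t @ w_a)  # reward predicted from latent state and action
        z = z @ A.T + a_t @ B.T            # step forward without looking at observations
    return np.stack(preds, axis=1)

def multi_step_reward_loss(params, obs0, actions, rewards):
    """Mean squared error over the whole horizon; no observation reconstruction term."""
    return np.mean((rollout_rewards(params, obs0, actions) - rewards) ** 2)

def plan_random_shooting(params, obs, horizon, n_candidates, rng):
    """Sample action sequences in [-1, 1], score them in latent space, return the best first action."""
    act_dim = params[4].shape[0]
    candidates = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon, act_dim))
    obs_batch = np.repeat(obs[None, :], n_candidates, axis=0)
    returns = rollout_rewards(params, obs_batch, candidates).sum(axis=1)
    best = int(np.argmax(returns))
    return candidates[best, 0], returns[best]
```

In a training loop, `multi_step_reward_loss` would be minimized over the parameters with any gradient-based optimizer, and `plan_random_shooting` would be called at every environment step in a receding-horizon fashion. Because the loss never asks the latent state to reproduce the observation, dimensions of the observation that do not affect reward (such as the extra pendulums or cheetahs) receive no training signal to be encoded.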

