Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model

07/01/2019
by Alex X. Lee, et al.

Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations. However, these kinds of observation spaces present a number of challenges in practice, since the policy must now solve two problems: a representation learning problem, and a task learning problem. In this paper, we aim to explicitly learn representations that can accelerate reinforcement learning from images. We propose the stochastic latent actor-critic (SLAC) algorithm: a sample-efficient and high-performing RL algorithm for learning policies for complex continuous control tasks directly from high-dimensional image inputs. SLAC learns a compact latent representation space using a stochastic sequential latent variable model, and then learns a critic model within this latent space. By learning a critic within a compact state space, SLAC can learn much more efficiently than standard RL methods. The proposed model improves performance substantially over alternative representations as well, such as variational autoencoders. In fact, our experimental evaluation demonstrates that the sample efficiency of our resulting method is comparable to that of model-based RL methods that directly use a similar type of model for control. Furthermore, our method outperforms both model-free and model-based alternatives in terms of final performance and sample efficiency, on a range of difficult image-based control tasks.
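
To make the two-stage idea in the abstract concrete, here is a minimal sketch, not the authors' implementation, of a stochastic sequential latent step followed by a critic that scores actions in the latent space rather than on pixels. Everything in it is an assumption made for illustration: the names `encode_step`, `rollout_latents`, and `critic`, the tanh mean, the fixed noise scale of 0.1, the linear critic, and the use of short feature vectors in place of images. No training is shown.

```python
import math
import random


def encode_step(prev_z, prev_action, obs_features, weights, rng):
    """Sample z_t from a diagonal Gaussian conditioned on (z_{t-1}, a_{t-1}, x_t).

    Hypothetical stand-in for a learned posterior: one weight row per latent
    dimension, a tanh mean, and a fixed standard deviation of 0.1.
    """
    inputs = prev_z + prev_action + obs_features  # list concatenation
    z = []
    for row in weights:
        mean = math.tanh(sum(w * v for w, v in zip(row, inputs)))
        z.append(mean + 0.1 * rng.gauss(0.0, 1.0))
    return z


def rollout_latents(observations, actions, weights, latent_dim, rng):
    """Run the sequential model over a trajectory and return every z_t."""
    z = [0.0] * latent_dim
    prev_action = [0.0] * len(actions[0])
    latents = []
    for obs, act in zip(observations, actions):
        z = encode_step(z, prev_action, obs, weights, rng)
        latents.append(z)
        prev_action = act
    return latents


def critic(z, action, q_weights):
    """Linear Q(z, a): the critic sees the compact latent, not the image."""
    return sum(w * v for w, v in zip(q_weights, z + action))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy sizes (assumptions): latent_dim=2, action_dim=1, obs feature dim=3.
    weights = [[0.1] * 6, [-0.2] * 6]
    observations = [[1.0, 0.0, 0.5], [0.2, 0.3, 0.1], [0.0, 1.0, 0.0]]
    actions = [[0.5], [-0.5], [0.1]]
    latents = rollout_latents(observations, actions, weights, 2, rng)
    q_weights = [0.5, 0.5, 1.0]
    for z, a in zip(latents, actions):
        print(z, critic(z, a, q_weights))
```

The point of the sketch is the separation of concerns: the sequence model absorbs the representation learning problem, so the critic only has to solve the task learning problem over a low-dimensional input.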
