Implicit Policy for Reinforcement Learning

06/10/2018
by   Yunhao Tang, et al.

We introduce Implicit Policy, a general class of expressive policies that can flexibly represent complex action distributions in reinforcement learning, together with efficient algorithms for computing entropy-regularized policy gradients. We show empirically that, despite its simple implementation, entropy regularization combined with a rich policy class can attain desirable properties displayed under the maximum entropy reinforcement learning framework, such as robustness and multi-modality.
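As a rough illustration of the idea in the abstract, the sketch below assumes that an implicit policy produces an action by passing the state together with random noise through a network, so the action distribution is defined by sampling rather than by a closed-form density. The names `init_params` and `sample_actions`, the layer sizes, and the random weights are hypothetical and not taken from the paper. The sketch covers sampling only and does not implement the entropy-regularized policy gradient.

```python
import numpy as np


def init_params(state_dim, noise_dim, action_dim, hidden=16, seed=0):
    """Random weights for a tiny two-layer network (illustrative sizes only)."""
    rng = np.random.default_rng(seed)
    in_dim = state_dim + noise_dim
    return {
        "W1": rng.normal(0.0, 1.0, (in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0, (hidden, action_dim)),
        "b2": np.zeros(action_dim),
    }


def sample_actions(params, state, n_samples, noise_dim, rng):
    """Draw n_samples actions for one state by pushing noise through the network."""
    # Fresh Gaussian noise for every sample: the only source of randomness.
    eps = rng.normal(size=(n_samples, noise_dim))
    # Repeat the same state once per noise sample.
    states = np.tile(state, (n_samples, 1))
    x = np.concatenate([states, eps], axis=1)
    h = np.tanh(x @ params["W1"] + params["b1"])
    # Squash the outputs into [-1, 1]; no density is ever written down.
    return np.tanh(h @ params["W2"] + params["b2"])


params = init_params(state_dim=3, noise_dim=2, action_dim=1)
rng = np.random.default_rng(1)
actions = sample_actions(params, np.zeros(3), n_samples=5, noise_dim=2, rng=rng)
print(actions.shape)
```

Because the noise enters the network as an input instead of being added to its output, the resulting action distribution need not be unimodal, which is one way a policy class can represent the multi-modality the abstract mentions.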

research
12/02/2019

On-policy Reinforcement Learning with Entropy Regularization

Entropy regularization is an important idea in reinforcement learning, wi...
research
02/22/2021

Action Redundancy in Reinforcement Learning

Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning p...
research
03/18/2021

Maximum Entropy Reinforcement Learning with Mixture Policies

Mixture models are an expressive hypothesis class that can approximate a...
research
06/14/2018

Maximum a Posteriori Policy Optimisation

We introduce a new algorithm for reinforcement learning called Maximum a...
research
11/27/2018

Understanding the impact of entropy on policy optimization

Entropy regularization is commonly used to improve policy optimization i...
research
05/16/2022

Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning

Maximum Tsallis entropy (MTE) framework in reinforcement learning has ga...
research
03/04/2021

Inverse Reinforcement Learning with Explicit Policy Estimates

Various methods for solving the inverse reinforcement learning (IRL) pro...
