Entropy-Augmented Entropy-Regularized Reinforcement Learning and a Continuous Path from Policy Gradient to Q-Learning

05/18/2020
by   Donghoon Lee, et al.

Augmenting the reward with entropy is known to soften the greedy argmax policy into a softmax policy. Reformulating entropy augmentation motivates adding a further entropy term to the objective function, in the form of a KL divergence, to regularize the optimization process. The result is a policy that interpolates between the current policy and the softmax greedy policy. This policy is used to build a continuously parameterized algorithm that optimizes the policy and the Q-function simultaneously and whose extreme limits correspond to policy gradient and Q-learning, respectively. Experiments show that an intermediate algorithm can yield a performance gain.
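
The interpolation described in the abstract can be illustrated with a minimal sketch for a single state with discrete actions. It assumes the per-state objective E_pi[Q] + alpha * H(pi) - beta * KL(pi || pi_old), whose maximizer is proportional to pi_old^(beta / (alpha + beta)) * exp(Q / (alpha + beta)). The symbols alpha and beta, the function names, and the numbers are illustrative assumptions, not the paper's notation or results.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolated_policy(q_values, current_policy, alpha, beta):
    """Maximizer of E_pi[Q] + alpha * H(pi) - beta * KL(pi || current_policy).

    Assumed objective, for illustration only. Requires alpha + beta > 0 and
    strictly positive probabilities in current_policy.
    """
    scale = alpha + beta
    logits = [
        (q + beta * math.log(p)) / scale
        for q, p in zip(q_values, current_policy)
    ]
    return softmax(logits)


# Hypothetical Q-values and current policy for one state with three actions.
q = [1.0, 2.0, 0.5]
pi_old = [0.5, 0.3, 0.2]
alpha = 0.5

# beta = 0 gives the softmax greedy policy; a very large beta stays close
# to the current policy; values in between interpolate geometrically.
for beta in (0.0, 1.0, 1e6):
    pi = interpolated_policy(q, pi_old, alpha, beta)
    print(beta, [round(p, 3) for p in pi])
```

In this sketch, beta = 0 recovers softmax(Q / alpha), and letting beta grow pins the result to the current policy, which matches the two extremes the abstract describes only under the assumed objective.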


