Regularization of Soft Actor-Critic Algorithms with Automatic Temperature Adjustment

05/19/2023
by Ben You, et al.

This work presents a comprehensive analysis that regularizes the Soft Actor-Critic (SAC) algorithm with automatic temperature adjustment. The policy evaluation, the policy improvement, and the temperature adjustment are reformulated, addressing certain modifications and stating the original theory more explicitly.
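For context, the temperature-adjustment step mentioned in the abstract is, in the original SAC formulation of Haarnoja et al., a dual-style update that tunes the entropy coefficient alpha toward a target entropy. The sketch below illustrates that standard update only, not the paper's reformulation; the names (target_entropy, log_alpha, temperature_loss), the batch of log-probabilities, and the PyTorch setting are illustrative assumptions.

```python
# Minimal sketch of SAC's automatic temperature (alpha) adjustment.
# Standard objective: J(alpha) = E[ -alpha * (log pi(a|s) + target_entropy) ].
import torch

action_dim = 2                        # illustrative action dimension
target_entropy = -float(action_dim)   # common heuristic: -|A|

# Optimize log(alpha) so that alpha = exp(log_alpha) stays positive.
log_alpha = torch.zeros(1, requires_grad=True)
alpha_optimizer = torch.optim.Adam([log_alpha], lr=3e-4)

def temperature_loss(policy_log_probs: torch.Tensor) -> torch.Tensor:
    alpha = log_alpha.exp()
    # Gradients flow only into log_alpha; the policy term is detached.
    return -(alpha * (policy_log_probs.detach() + target_entropy)).mean()

# One update with placeholder log-probabilities standing in for an actor batch.
policy_log_probs = torch.randn(256)
loss = temperature_loss(policy_log_probs)
alpha_optimizer.zero_grad()
loss.backward()
alpha_optimizer.step()
```

Optimizing log(alpha) rather than alpha directly is a common implementation choice that keeps the temperature positive without an explicit constraint.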


Related research

07/03/2020 - Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient
  Exploration-exploitation dilemma has long been a crucial issue in reinfo...

12/31/2021 - Actor Loss of Soft Actor Critic Explained
  This technical report is devoted to explaining how the actor loss of sof...

03/01/2023 - The Point to Which Soft Actor-Critic Converges
  Soft actor-critic is a successful successor over soft Q-learning. While ...

07/02/2019 - Modified Actor-Critics
  Robot Learning, from a control point of view, often involves continuous ...

11/28/2021 - Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning
  Maximum Entropy Reinforcement Learning (MaxEnt RL) algorithms such as So...

12/06/2021 - Target Entropy Annealing for Discrete Soft Actor-Critic
  Soft Actor-Critic (SAC) is considered the state-of-the-art algorithm in ...

10/21/2021 - Actor-critic is implicitly biased towards high entropy optimal policies
  We show that the simplest actor-critic method – a linear softmax policy ...
