Actor Loss of Soft Actor Critic Explained

12/31/2021
by Thibault Lahire, et al.

This technical report explains how the actor loss of Soft Actor-Critic (SAC) is obtained, together with the associated gradient estimate. It provides the mathematical background needed to derive all the presented equations, from the theoretical actor loss to the one implemented in practice. This requires comparing the reparameterization trick used in SAC with the nabla-log (score-function) trick, which raises open questions about which method is the most efficient to use.
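For readers who want the key formulas at a glance, below is a standard form of the SAC actor loss and of the two gradient estimators the report compares, written in the notation of Haarnoja et al. (2018); the report's own notation and derivation may differ in the details. The actor loss minimized with respect to the policy parameters \phi is

\[
J_\pi(\phi) = \mathbb{E}_{s_t \sim \mathcal{D}}\left[ \mathbb{E}_{a_t \sim \pi_\phi}\left[ \alpha \log \pi_\phi(a_t \mid s_t) - Q_\theta(s_t, a_t) \right] \right].
\]

The nabla-log (score-function, or REINFORCE) trick rests on the general identity

\[
\nabla_\phi \, \mathbb{E}_{a \sim \pi_\phi}\left[ f(a) \right]
= \mathbb{E}_{a \sim \pi_\phi}\left[ f(a) \, \nabla_\phi \log \pi_\phi(a \mid s) \right],
\]

whereas the reparameterization trick used in SAC writes the action as \(a_t = f_\phi(\epsilon_t; s_t)\) with \(\epsilon_t \sim \mathcal{N}(0, I)\) and differentiates through the sampling path, yielding the estimate

\[
\hat{\nabla}_\phi J_\pi(\phi)
= \nabla_\phi \alpha \log \pi_\phi(a_t \mid s_t)
+ \big( \nabla_{a_t} \alpha \log \pi_\phi(a_t \mid s_t) - \nabla_{a_t} Q_\theta(s_t, a_t) \big) \nabla_\phi f_\phi(\epsilon_t; s_t).
\]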


Related research

The Point to Which Soft Actor-Critic Converges (03/01/2023)
Soft actor-critic is a successful successor over soft Q-learning. While ...

SARC: Soft Actor Retrospective Critic (06/28/2023)
The two-time scale nature of SAC, which is an actor-critic algorithm, is...

An Exceptional Actor System (Functional Pearl) (07/20/2023)
The Glasgow Haskell Compiler is known for its feature-laden runtime syst...

Regularization of Soft Actor-Critic Algorithms with Automatic Temperature Adjustment (05/19/2023)
This work presents a comprehensive analysis to regularize the Soft Actor...

TASAC: Temporally Abstract Soft Actor-Critic for Continuous Control (04/13/2021)
We propose temporally abstract soft actor-critic (TASAC), an off-policy ...

Band-limited Soft Actor Critic Model (06/19/2020)
Soft Actor Critic (SAC) algorithms show remarkable performance in comple...

Regularized Soft Actor-Critic for Behavior Transfer Learning (09/27/2022)
Existing imitation learning methods mainly focus on making an agent effe...
