Soft Actor-Critic Algorithms and Applications

by Tuomas Haarnoja, et al.

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision-making and control tasks. However, these methods typically suffer from two major challenges: high sample complexity and brittleness to hyperparameters. Both of these challenges limit the applicability of such methods to real-world domains. In this paper, we describe Soft Actor-Critic (SAC), our recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework. In this framework, the actor aims to simultaneously maximize expected return and entropy; that is, to succeed at the task while acting as randomly as possible. We extend SAC with a number of modifications that accelerate training and improve stability with respect to the hyperparameters, including a constrained formulation that automatically tunes the temperature hyperparameter. We systematically evaluate SAC on a range of benchmark tasks, as well as challenging real-world tasks such as locomotion for a quadrupedal robot and robotic manipulation with a dexterous hand. With these improvements, SAC achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance. Furthermore, we demonstrate that, in contrast to other off-policy algorithms, our approach is very stable, achieving similar performance across different random seeds. These results suggest that SAC is a promising candidate for learning in real-world robotics tasks.
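The constrained formulation mentioned above adjusts the temperature so that the policy's entropy tracks a fixed target. The following is a minimal, illustrative sketch of that idea, not the authors' implementation: the function name `update_log_alpha` and the scalar-gradient form are assumptions for illustration, and the target entropy is commonly set to the negative action dimensionality.

```python
import math

# Illustrative sketch (not the paper's code) of SAC's automatic
# temperature tuning. The temperature alpha is adjusted by gradient
# descent on the dual objective
#   J(alpha) = E[ -alpha * (log_pi(a|s) + H_bar) ],
# where H_bar is a fixed entropy target (often -dim(action space)).

def update_log_alpha(log_alpha, log_pi_batch, target_entropy, lr=0.01):
    """One gradient step on log(alpha) given sampled action log-probs."""
    alpha = math.exp(log_alpha)
    mean_log_pi = sum(log_pi_batch) / len(log_pi_batch)
    # dJ/d(log alpha) = -alpha * (E[log_pi] + H_bar)
    grad = -alpha * (mean_log_pi + target_entropy)
    return log_alpha - lr * grad

# Behavior: if the policy's entropy is above the target (it acts more
# randomly than desired), alpha shrinks, down-weighting the entropy
# bonus; if entropy falls below the target, alpha grows, pushing the
# actor back toward exploration.
```

Optimizing over log(alpha) rather than alpha itself is a common practical choice that keeps the temperature positive without an explicit constraint.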




Related Papers

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Model-free deep reinforcement learning (RL) algorithms have been demonst...

Learning to Walk via Deep Reinforcement Learning

Deep reinforcement learning suggests the promise of fully automated lear...

Soft Actor-Critic Algorithm with Truly Inequality Constraint

Soft actor-critic (SAC) in reinforcement learning is expected to be one ...

Asymmetric Actor Critic for Image-Based Robot Learning

Deep reinforcement learning (RL) has proven a powerful technique in many...

MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks

Fast and efficient transport protocols are the foundation of an increasi...

Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning

Standard approaches to sequential decision-making exploit an agent's abi...

Evolve To Control: Evolution-based Soft Actor-Critic for Scalable Reinforcement Learning

Advances in Reinforcement Learning (RL) have successfully tackled sample...

Code Repositories


Modified versions of the SAC algorithm from spinningup for discrete action spaces and image observations.



A PyTorch implementation of Soft Actor-Critic (SAC).



A PyTorch implementation of GAIL and AIRL based on PPO.



A PyTorch implementation of Distribution Correction (DisCor) based on Soft Actor-Critic.



A library for reinforcement learning research

