Soft Actor-Critic Algorithms and Applications

12/13/2018
by Tuomas Haarnoja, et al.

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample complexity and brittleness to hyperparameters. Both of these challenges limit the applicability of such methods to real-world domains. In this paper, we describe Soft Actor-Critic (SAC), our recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework. In this framework, the actor aims to simultaneously maximize expected return and entropy; that is, to succeed at the task while acting as randomly as possible. We extend SAC to incorporate a number of modifications that accelerate training and improve stability with respect to the hyperparameters, including a constrained formulation that automatically tunes the temperature hyperparameter. We systematically evaluate SAC on a range of benchmark tasks, as well as challenging real-world tasks such as locomotion for a quadrupedal robot and robotic manipulation with a dexterous hand. With these improvements, SAC achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance. Furthermore, we demonstrate that, in contrast to other off-policy algorithms, our approach is very stable, achieving similar performance across different random seeds. These results suggest that SAC is a promising candidate for learning in real-world robotics tasks.
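
The two ideas named in the abstract, the entropy-augmented objective and the automatically tuned temperature, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a discrete action distribution for readability (SAC itself targets continuous control), and the function names `entropy`, `soft_return`, and `temperature_step`, the learning rate, and the target entropy value are all hypothetical choices made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_return(rewards, policies, alpha, gamma=1.0):
    """Sum over time of gamma^t * (r_t + alpha * H(pi(.|s_t))).

    `policies[t]` is the action distribution used at step t; `alpha` is the
    temperature that trades off reward against entropy.
    """
    return sum(
        (gamma ** t) * (r + alpha * entropy(pi))
        for t, (r, pi) in enumerate(zip(rewards, policies))
    )


def temperature_step(alpha, log_probs, target_entropy, lr=0.01):
    """One gradient step on J(alpha) = mean(-alpha * (log_prob + target_entropy)).

    `log_probs` are log pi(a|s) for sampled actions. If the policy's entropy
    is below the target, alpha grows; if it is above, alpha shrinks.
    Clamping at zero is an assumption of this sketch.
    """
    grad = -sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    return max(alpha - lr * grad, 0.0)


# Hypothetical two-step trajectory: a uniform policy, then a deterministic one.
rewards = [1.0, 1.0]
policies = [[0.5, 0.5], [1.0, 0.0]]
plain = soft_return(rewards, policies, alpha=0.0)  # ordinary return
soft = soft_return(rewards, policies, alpha=1.0)   # return plus entropy bonus
```

With `alpha=0.0` the objective reduces to the ordinary return, so the temperature controls how strongly random behavior is rewarded. Practical implementations often optimize the logarithm of the temperature to keep it positive; the clamp above is a simpler stand-in.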


research
01/04/2018

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Model-free deep reinforcement learning (RL) algorithms have been demonst...
research
12/26/2018

Learning to Walk via Deep Reinforcement Learning

Deep reinforcement learning suggests the promise of fully automated lear...
research
03/08/2023

Soft Actor-Critic Algorithm with Truly Inequality Constraint

Soft actor-critic (SAC) in reinforcement learning is expected to be one ...
research
10/18/2017

Asymmetric Actor Critic for Image-Based Robot Learning

Deep reinforcement learning (RL) has proven a powerful technique in many...
research
02/02/2023

MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks

Fast and efficient transport protocols are the foundation of an increasi...
research
05/05/2023

Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning

Standard approaches to sequential decision-making exploit an agent's abi...
research
07/24/2020

Evolve To Control: Evolution-based Soft Actor-Critic for Scalable Reinforcement Learning

Advances in Reinforcement Learning (RL) have successfully tackled sample...
