Revisiting Discrete Soft Actor-Critic

09/21/2022
by Haibin Zhou, et al.

We study the adaptation of soft actor-critic (SAC) from continuous action spaces to discrete action spaces. We revisit vanilla SAC and provide an in-depth analysis of its Q-value underestimation and performance instability when applied to discrete settings. We then propose entropy-penalty and double average Q-learning with Q-clip to address these issues. Extensive experiments on typical benchmarks with discrete action spaces, including Atari games and a large-scale MOBA game, show the efficacy of our proposed method. Our code is at: https://github.com/coldsummerday/Revisiting-Discrete-SAC.
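The abstract names the two modifications, entropy-penalty and double average Q-learning with Q-clip, without spelling out their update rules. As a rough illustration only, the sketch below shows one way an averaged double-Q target and a clipped critic loss could look for discrete SAC in PyTorch; the function names, the averaging of the two target critics, and the value-clipping form are assumptions for illustration, not the authors' released implementation (see the linked repository for that).

```python
# Hypothetical sketch of a discrete-SAC critic update that averages two target
# Q-networks (instead of taking their element-wise minimum) and bounds the
# regression target with a clip term. All names and the exact clip rule are
# assumptions, not the paper's implementation.
import torch
import torch.nn.functional as F


def soft_q_target(q1_t, q2_t, log_pi, reward, done, alpha=0.2, gamma=0.99):
    """q1_t, q2_t: [B, A] target-network Q-values at the next state.
    log_pi: [B, A] log-probabilities of the current policy at the next state.
    reward, done: [B] tensors."""
    pi = log_pi.exp()
    # Average the two target critics rather than taking the minimum.
    q_avg = 0.5 * (q1_t + q2_t)
    # Soft state value: expectation over actions plus entropy bonus.
    v_next = (pi * (q_avg - alpha * log_pi)).sum(dim=1)
    return reward + gamma * (1.0 - done) * v_next


def clipped_critic_loss(q_pred, target, q_old, clip_eps=0.5):
    """Assumed Q-clip: keep the regression target within clip_eps of the
    previous estimate q_old, similar in spirit to PPO-style value clipping."""
    target_clipped = q_old + (target - q_old).clamp(-clip_eps, clip_eps)
    loss_raw = F.mse_loss(q_pred, target, reduction="none")
    loss_clip = F.mse_loss(q_pred, target_clipped, reduction="none")
    return torch.max(loss_raw, loss_clip).mean()
```

In the abstract's terms, averaging the two target critics is presumably meant to counter the underestimation induced by the usual minimum operator, while the clip term limits how far successive targets can drift, which relates to the reported instability; the precise formulation is in the paper and repository.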

Related research

Soft Actor-Critic for Discrete Action Settings (10/16/2019)
Soft Actor-Critic is a state-of-the-art reinforcement learning algorithm...

Target Entropy Annealing for Discrete Soft Actor-Critic (12/06/2021)
Soft Actor-Critic (SAC) is considered the state-of-the-art algorithm in ...

CGAR: Critic Guided Action Redistribution in Reinforcement Leaning (06/23/2022)
Training a game-playing reinforcement learning agent requires multiple i...

Simultaneous Double Q-learning with Conservative Advantage Learning for Actor-Critic Methods (05/08/2022)
Actor-critic Reinforcement Learning (RL) algorithms have achieved impres...

Using Soft Actor-Critic for Low-Level UAV Control (10/05/2020)
Unmanned Aerial Vehicles (UAVs), or drones, have recently been used in s...

GMAC: A Distributional Perspective on Actor-Critic Framework (05/24/2021)
In this paper, we devise a distributional framework on actor-critic as a...

Seq2Seq Mimic Games: A Signaling Perspective (11/15/2018)
We study the emergence of communication in multiagent adversarial settin...
