CGAR: Critic Guided Action Redistribution in Reinforcement Learning

06/23/2022
by Tairan Huang, et al.

Training a game-playing reinforcement learning agent requires many interactions with the environment, and uninformed random exploration wastes time and resources, so it is essential to reduce such waste. In this paper, under the setting of off-policy actor-critic algorithms, we demonstrate that the critic can yield expected discounted rewards greater than or equal to those of the actor. The Q value predicted by the critic is therefore a better signal for redistributing the action originally sampled from the policy distribution predicted by the actor. Building on this observation, this paper introduces the novel Critic Guided Action Redistribution (CGAR) algorithm and evaluates it on the OpenAI MuJoCo tasks. The experimental results demonstrate that our method improves sample efficiency and achieves state-of-the-art performance. Our code can be found at https://github.com/tairanhuang/CGAR.
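The abstract does not spell out the redistribution rule, so the sketch below is only one plausible reading of the idea: draw several candidate actions from the actor's policy, score them with the critic's Q function, and resample the executed action in proportion to those scores. The function name cgar_select_action, the actor/critic interfaces, num_candidates, and the softmax temperature are all illustrative assumptions, not the paper's implementation.

```python
import torch

def cgar_select_action(actor, critic, state, num_candidates=10, temperature=1.0):
    """Hypothetical sketch of critic-guided action selection.

    Assumes `actor(state)` returns a torch.distributions object over actions
    and `critic(states, actions)` returns Q values of shape (N, 1).
    """
    with torch.no_grad():
        # Sample candidate actions from the actor's policy distribution.
        dist = actor(state)
        candidates = dist.sample((num_candidates,))        # (N, action_dim)

        # Score each candidate with the critic's Q function.
        states = state.expand(num_candidates, -1)           # broadcast the state
        q_values = critic(states, candidates).squeeze(-1)   # (N,)

        # Redistribute: resample the executed action with probabilities
        # given by a softmax over the critic's Q values.
        weights = torch.softmax(q_values / temperature, dim=0)
        idx = torch.multinomial(weights, num_samples=1)
        return candidates[idx].squeeze(0)
```

A lower temperature makes the selection lean more heavily on the critic, while a higher one stays closer to the actor's original sampling.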
