Behavior-Guided Actor-Critic: Improving Exploration via Learning Policy Behavior Representation for Deep Reinforcement Learning

04/09/2021
by Ammar Fayad, et al.

In this work, we propose Behavior-Guided Actor-Critic (BAC), an off-policy actor-critic deep RL algorithm. BAC uses autoencoders to formulate the behavior of the policy mathematically: it estimates how frequently each state-action pair was visited, while accounting for the state dynamics that play a crucial role in determining the trajectories the policy produces. The agent is encouraged to shift its behavior consistently towards less-visited state-action pairs while still maximizing the expected discounted sum of rewards, which yields efficient exploration of the environment and good exploitation of all high-reward regions. A prominent aspect of our approach is that, in contrast to maximum entropy deep reinforcement learning algorithms, it applies to both stochastic and deterministic actors. Results show that BAC performs considerably better than several cutting-edge learning algorithms.
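
The idea of steering an agent towards less-visited state-action pairs can be illustrated with a small sketch. The abstract does not say how the autoencoder's output enters the objective, so the code below assumes, purely for illustration, that the reconstruction error of a state-action pair serves as a proxy for how rarely it was visited, and that this error is added to the reward with a hypothetical weight `beta`. The names `LinearAutoencoder` and `shaped_reward`, the linear architecture, and all hyperparameters are assumptions rather than the authors' implementation, and the sketch ignores the state dynamics that BAC takes into account.

```python
import numpy as np


class LinearAutoencoder:
    """Tiny linear autoencoder trained by gradient descent (illustrative only)."""

    def __init__(self, input_dim, latent_dim, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(scale=0.1, size=(input_dim, latent_dim))
        self.W_dec = rng.normal(scale=0.1, size=(latent_dim, input_dim))
        self.lr = lr

    def reconstruct(self, x):
        return (x @ self.W_enc) @ self.W_dec

    def error(self, x):
        # Squared reconstruction error per sample (last axis is the feature axis).
        diff = self.reconstruct(x) - x
        return np.sum(diff ** 2, axis=-1)

    def update(self, batch):
        # One full-batch gradient step on the mean squared reconstruction error.
        n = batch.shape[0]
        z = batch @ self.W_enc
        diff = z @ self.W_dec - batch
        grad_dec = (2.0 / n) * (z.T @ diff)
        grad_enc = (2.0 / n) * (batch.T @ (diff @ self.W_dec.T))
        self.W_dec -= self.lr * grad_dec
        self.W_enc -= self.lr * grad_enc


def shaped_reward(reward, state, action, autoencoder, beta=0.5):
    # Assumption: a high reconstruction error marks a rarely visited pair,
    # so it is added to the environment reward as an exploration bonus.
    pair = np.concatenate([state, action])
    return reward + beta * autoencoder.error(pair)


# Synthetic "replay buffer": visited (state, action) pairs lie near one line.
rng = np.random.default_rng(1)
direction = np.array([0.5, 0.5, 0.5, 0.5])  # 3 state dims + 1 action dim
visited = rng.normal(size=(256, 1)) * direction
visited = visited + 0.01 * rng.normal(size=(256, 4))

ae = LinearAutoencoder(input_dim=4, latent_dim=1)
for _ in range(2000):
    ae.update(visited)

seen_state, seen_action = np.array([0.5, 0.5, 0.5]), np.array([0.5])
novel_state, novel_action = np.array([0.5, -0.5, 0.5]), np.array([-0.5])

bonus_seen = ae.error(np.concatenate([seen_state, seen_action]))
bonus_new = ae.error(np.concatenate([novel_state, novel_action]))
print("bonus for a frequently visited pair:", bonus_seen)
print("bonus for an unvisited pair:", bonus_new)
```

In this toy setting the pair that resembles the training data is reconstructed well and earns almost no bonus, while the pair far from the visited region is reconstructed poorly and earns a larger one. Because the bonus depends only on the state-action pair, the same shaping could be applied whether the actor is stochastic or deterministic.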

research
02/14/2019

Off-Policy Actor-Critic in an Ensemble: Achieving Maximum General Entropy and Effective Environment Exploration in Deep Reinforcement Learning

We propose a new policy iteration theory as an important extension of so...
research
02/25/2020

Off-Policy Deep Reinforcement Learning with Analogous Disentangled Exploration

Off-policy reinforcement learning (RL) is concerned with learning a rewa...
research
09/09/2019

AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers

The exploration mechanism used by a Deep Reinforcement Learning (RL) age...
research
02/08/2021

Adversarially Guided Actor-Critic

Despite definite success in deep reinforcement learning problems, actor-...
research
02/20/2017

Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning

Reinforcement Learning algorithms can learn complex behavioral patterns ...
research
12/22/2021

Newsvendor Model with Deep Reinforcement Learning

I present a deep reinforcement learning (RL) solution to the mathematica...
research
05/24/2022

Concurrent Credit Assignment for Data-efficient Reinforcement Learning

The capability to widely sample the state and action spaces is a key ing...
