Error Controlled Actor-Critic

09/06/2021
by Xingen Gao, et al.

Approximation error in the value function inevitably causes overestimation and harms the convergence of the algorithms. To mitigate these effects, we propose Error Controlled Actor-Critic, which confines the approximation error in the value function. We first analyze how the approximation error can hinder the optimization process of actor-critic methods. We then derive an upper bound on the approximation error of the Q-function approximator and find that the error can be lowered by restricting the KL-divergence between every two consecutive policies during policy training. Experiments on a range of continuous control tasks demonstrate that the proposed actor-critic algorithm reduces the approximation error and significantly outperforms other model-free RL algorithms.
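
As a rough illustration of restricting the KL-divergence between consecutive policies, the sketch below computes the closed-form KL between two diagonal Gaussian action distributions at a single state and adds it as a penalty to an actor loss. This is not the authors' implementation: the function names, the penalty form, the coefficient `kl_coef`, and the direction KL(old || new) are all assumptions made for this example.

```python
import math
from typing import Sequence


def diag_gaussian_kl(
    mu_old: Sequence[float],
    std_old: Sequence[float],
    mu_new: Sequence[float],
    std_new: Sequence[float],
) -> float:
    """Closed-form KL(pi_old || pi_new) for diagonal Gaussians at one state."""
    kl = 0.0
    for m0, s0, m1, s1 in zip(mu_old, std_old, mu_new, std_new):
        # Per-dimension KL between N(m0, s0^2) and N(m1, s1^2).
        kl += math.log(s1 / s0) + (s0 ** 2 + (m0 - m1) ** 2) / (2.0 * s1 ** 2) - 0.5
    return kl


def kl_penalized_actor_loss(q_value: float, kl: float, kl_coef: float = 1.0) -> float:
    """Hypothetical actor loss: maximize Q while penalizing policy change.

    kl_coef is an assumed hyperparameter, not a value taken from the paper.
    """
    return -q_value + kl_coef * kl


# Example: the new policy shifts the mean of a 1-D action by 1.0.
kl = diag_gaussian_kl([0.0], [1.0], [1.0], [1.0])
loss = kl_penalized_actor_loss(q_value=2.0, kl=kl, kl_coef=0.1)
```

A larger `kl_coef` keeps successive policies closer together, which is the kind of restriction the abstract associates with a lower approximation error. A hard constraint on the KL would be another way to express the same idea.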


