Addressing Function Approximation Error in Actor-Critic Methods

02/26/2018
by Scott Fujimoto, et al.

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies. We show that this problem persists in an actor-critic setting and propose novel mechanisms to minimize its effects on both the actor and critic. Our algorithm takes the minimum value between a pair of critics to restrict overestimation and delays policy updates to reduce per-update error. We evaluate our method on the suite of OpenAI gym tasks, outperforming the state of the art in every environment tested.
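
Below is a minimal sketch of the two mechanisms the abstract describes: a critic target that takes the minimum over a pair of critics to restrict overestimation, and policy updates performed less frequently than critic updates. This is not the authors' reference implementation; the network sizes, learning rates, update delay, replay data, and the `update` function are illustrative assumptions, and pieces of the full algorithm (target policy smoothing, Polyak target-network updates, exploration) are omitted.

```python
# Sketch only: clipped double-Q targets + delayed policy updates (PyTorch assumed).
import torch
import torch.nn as nn

state_dim, action_dim = 3, 1  # assumed toy dimensions

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, out_dim))

actor = mlp(state_dim, action_dim)                 # real actors usually bound actions (e.g. tanh)
critic1 = mlp(state_dim + action_dim, 1)
critic2 = mlp(state_dim + action_dim, 1)
target_actor = mlp(state_dim, action_dim)          # target nets would track the main nets via Polyak averaging
target_critic1 = mlp(state_dim + action_dim, 1)
target_critic2 = mlp(state_dim + action_dim, 1)

critic_opt = torch.optim.Adam(
    list(critic1.parameters()) + list(critic2.parameters()), lr=3e-4)
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
gamma, policy_delay = 0.99, 2                      # assumed hyperparameters

def update(step, state, action, reward, next_state, done):
    # Critic update: the bootstrap target uses the *minimum* of the two
    # target critics, which restricts overestimation from approximation error.
    with torch.no_grad():
        next_action = target_actor(next_state)
        q1 = target_critic1(torch.cat([next_state, next_action], dim=-1))
        q2 = target_critic2(torch.cat([next_state, next_action], dim=-1))
        target_q = reward + gamma * (1.0 - done) * torch.min(q1, q2)
    sa = torch.cat([state, action], dim=-1)
    critic_loss = ((critic1(sa) - target_q) ** 2).mean() \
                + ((critic2(sa) - target_q) ** 2).mean()
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor update: delayed, i.e. run only every `policy_delay` critic updates,
    # so the policy improves against a less noisy value estimate.
    if step % policy_delay == 0:
        actor_loss = -critic1(torch.cat([state, actor(state)], dim=-1)).mean()
        actor_opt.zero_grad()
        actor_loss.backward()
        actor_opt.step()
        # Target networks would also be soft-updated here.

# Toy usage with random transitions, just to show the update runs.
B = 32
s, a = torch.randn(B, state_dim), torch.randn(B, action_dim)
r, s2, d = torch.randn(B, 1), torch.randn(B, state_dim), torch.zeros(B, 1)
for step in range(4):
    update(step, s, a, r, s2, d)
```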

Related research

- 09/22/2021  Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods
  In value-based deep reinforcement learning methods, approximation of val...
- 09/06/2021  Error Controlled Actor-Critic
  An error of the value function inevitably causes an overestimation phenomeno...
- 09/18/2020  GRAC: Self-Guided and Self-Regularized Actor-Critic
  Deep reinforcement learning (DRL) algorithms have successfully been demo...
- 02/07/2021  Deep Reinforcement Learning with Dynamic Optimism
  In recent years, deep off-policy actor-critic algorithms have become a d...
- 09/08/2021  ADER: Adapting between Exploration and Robustness for Actor-Critic Methods
  Combining off-policy reinforcement learning methods with function approx...
- 06/20/2023  Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap
  Warm-start reinforcement learning (RL), aided by a prior policy obtained...
- 03/02/2021  Offline Reinforcement Learning with Pseudometric Learning
  Offline reinforcement learning methods seek to learn a policy from logge...
