Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

09/07/2019
by   Wenjie Shi, et al.

Model-free deep reinforcement learning (RL) algorithms have been widely used for a range of complex control tasks. However, slow convergence and sample inefficiency remain challenging problems in RL, especially when handling continuous and high-dimensional state spaces. To tackle these problems, we propose a general acceleration method for model-free, off-policy deep RL algorithms by drawing on the idea underlying regularized Anderson acceleration (RAA), an effective approach to accelerating the solution of fixed-point problems with perturbations. Specifically, we first explain how policy iteration can be combined directly with Anderson acceleration. Then we extend RAA to the case of deep RL by introducing a regularization term that controls the impact of the perturbation induced by function approximation errors. We further propose two strategies, progressive update and adaptive restart, to enhance the performance. The effectiveness of our method is evaluated on a variety of benchmark tasks, including Atari 2600 and MuJoCo. Experimental results show that our approach substantially improves both the learning speed and final performance of state-of-the-art deep RL algorithms.
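To illustrate the core numerical idea, here is a minimal sketch of regularized Anderson acceleration on a generic fixed-point problem x = g(x). This is not the paper's deep-RL algorithm; the function names, the memory size `m`, and the Tikhonov weight `lam` are illustrative assumptions. The regularization term plays the role described in the abstract: it damps the least-squares combination when residuals are perturbed (e.g., by function approximation error).

```python
import numpy as np

def regularized_anderson(g, x0, m=5, lam=1e-8, tol=1e-10, max_iter=100):
    """Regularized Anderson acceleration for the fixed point x = g(x).

    Illustrative sketch (not the paper's RAA for deep RL): keeps the last
    m iterates, combines them via a Tikhonov-regularized least-squares
    problem over residual differences, where lam controls how strongly
    noisy residuals are damped.
    """
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    gx = g(x)
    X, F = [x], [gx - x]  # past iterates and residuals f_k = g(x_k) - x_k
    for k in range(max_iter):
        if np.linalg.norm(F[-1]) < tol:
            break
        mk = min(m, len(F) - 1)
        if mk == 0:
            x_new = gx  # no history yet: plain fixed-point step
        else:
            # Columns are differences against the latest residual/iterate.
            dF = np.stack([F[-1] - F[-2 - i] for i in range(mk)], axis=1)
            dX = np.stack([X[-1] - X[-2 - i] for i in range(mk)], axis=1)
            # Regularized least squares: (dF^T dF + lam*I) gamma = dF^T f_k
            A = dF.T @ dF + lam * np.eye(mk)
            gamma = np.linalg.solve(A, dF.T @ F[-1])
            # Accelerated step: affine combination of past g(x) evaluations
            x_new = gx - (dX + dF) @ gamma
        x = x_new
        gx = g(x)
        X.append(x)
        F.append(gx - x)
        X, F = X[-(m + 1):], F[-(m + 1):]  # keep a bounded memory window
    return x, k
```

For example, `regularized_anderson(np.cos, 1.0)` converges to the fixed point of cos(x) (about 0.739085) in far fewer iterations than plain fixed-point iteration; in the paper's setting, the role of g is played by the (perturbed) Bellman operator, which is why the regularizer matters.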


Related research

03/02/2016  Continuous Deep Q-Learning with Model-based Acceleration
    Model-free reinforcement learning has been successfully applied to a ran...

10/17/2021  Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization
    Anderson mixing has been heuristically applied to reinforcement learning...

09/25/2018  Anderson Acceleration for Reinforcement Learning
    Anderson acceleration is an old and simple method for accelerating the c...

10/16/2022  Entropy Regularized Reinforcement Learning with Cascading Networks
    Deep Reinforcement Learning (Deep RL) has had incredible achievements on...

04/13/2021  Muesli: Combining Improvements in Policy Optimization
    We propose a novel policy update that combines regularized policy optimi...

05/14/2019  Control Regularization for Reduced Variance Reinforcement Learning
    Dealing with high variance is a significant challenge in model-free rein...

12/13/2020  Reinforcement Learning with Subspaces using Free Energy Paradigm
    In large-scale problems, standard reinforcement learning algorithms suff...
