Control Regularization for Reduced Variance Reinforcement Learning

05/14/2019
by   Richard Cheng, et al.
4

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.

READ FULL TEXT
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

04/22/2020

Stability-Guaranteed Reinforcement Learning for Contact-rich Manipulation

Reinforcement learning (RL) has had its fair share of success in contact...
12/06/2021

Functional Regularization for Reinforcement Learning via Learned Fourier Features

We propose a simple architecture for deep reinforcement learning by embe...
09/07/2019

Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning

Model-free deep reinforcement learning (RL) algorithms have been widely ...
07/01/2019

FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control

In recent years significant progress has been made in dealing with chall...
11/27/2017

Divide-and-Conquer Reinforcement Learning

Standard model-free deep reinforcement learning (RL) algorithms sample a...
07/04/2020

Discount Factor as a Regularizer in Reinforcement Learning

Specifying a Reinforcement Learning (RL) task involves choosing a suitab...
11/01/2018

Temporal Regularization in Markov Decision Process

Several applications of Reinforcement Learning suffer from instability d...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.