Regularization Matters in Policy Optimization

10/21/2019
by   Zhuang Liu, et al.

Deep Reinforcement Learning (Deep RL) has received increasing attention thanks to its encouraging performance on a variety of control tasks. Yet conventional regularization techniques for training neural networks (e.g., L_2 regularization, dropout) have been largely ignored in RL methods, possibly because agents are typically trained and evaluated in the same environment. In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks. Interestingly, we find that conventional regularization techniques applied to the policy networks can often bring large improvements in task performance, and the improvement is typically more significant when the task is more difficult. We also compare with the widely used entropy regularization and find that L_2 regularization is generally better. Our findings are further confirmed to be robust to the choice of training hyperparameters. We also study the effects of regularizing different components and find that regularizing only the policy network is typically enough. We hope our study provides guidance for future practices in regularizing policy optimization algorithms.
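
To make the comparison between L_2 and entropy regularization concrete, here is a minimal sketch of how each term might enter a policy loss. It is not the authors' implementation: the function names, the coefficients `l2_coeff` and `entropy_coeff`, and the assumption of a diagonal Gaussian policy parameterized by log standard deviations are all hypothetical choices made for illustration.

```python
import math


def l2_penalty(weights, coeff):
    """L_2 regularization term: coeff times the sum of squared policy weights."""
    return coeff * sum(w * w for w in weights)


def gaussian_entropy(log_stds):
    """Differential entropy of a diagonal Gaussian policy.

    Each action dimension contributes 0.5 * log(2 * pi * e) + log(sigma).
    """
    per_dim_const = 0.5 * math.log(2.0 * math.pi * math.e)
    return sum(per_dim_const + log_std for log_std in log_stds)


def regularized_loss(policy_loss, weights, log_stds,
                     l2_coeff=0.0, entropy_coeff=0.0):
    """Policy loss plus an L_2 penalty, minus an entropy bonus.

    Both coefficients are hypothetical hyperparameters; setting one to
    zero switches that regularizer off.
    """
    return (policy_loss
            + l2_penalty(weights, l2_coeff)
            - entropy_coeff * gaussian_entropy(log_stds))


# Toy example: a two-weight policy with one action dimension.
loss_l2 = regularized_loss(1.0, [1.0, -2.0], [0.0], l2_coeff=0.5)
loss_entropy = regularized_loss(1.0, [1.0, -2.0], [0.0], entropy_coeff=0.01)
```

The L_2 term penalizes large policy weights directly, while the entropy term rewards a more spread-out action distribution. The two act on different quantities, which is why they are worth comparing as separate regularizers.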

research
10/20/2020

Iterative Amortized Policy Optimization

Policy networks are a central feature of deep reinforcement learning (RL...
research
10/26/2021

EnTRPO: Trust Region Policy Optimization Method with Entropy Regularization

Trust Region Policy Optimization (TRPO) is a popular and empirically suc...
research
03/21/2020

Deep Reinforcement Learning with Smooth Policy

Deep neural networks have been widely adopted in modern reinforcement le...
research
10/05/2018

Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods

Recent analyses of certain gradient descent optimization methods have sh...
research
11/27/2018

Understanding the impact of entropy in policy learning

Entropy regularization is commonly used to improve policy optimization i...
research
11/27/2018

Understanding the impact of entropy on policy optimization

Entropy regularization is commonly used to improve policy optimization i...
research
10/07/2020

Proximal Policy Optimization with Relative Pearson Divergence

Deep reinforcement learning (DRL) is one of the promising approaches for...
