Deep Reinforcement Learning with Smooth Policy

03/21/2020
by   Qianli Shen, et al.

Deep neural networks have been widely adopted in modern reinforcement learning (RL) algorithms, with great empirical success in various domains. However, the large search space involved in training a neural network requires a significant amount of data, which makes current RL algorithms sample-inefficient. Motivated by the fact that many environments with continuous state spaces have smooth transitions, we propose to learn a smooth policy, that is, one whose output changes smoothly with respect to the state. For policies parameterized by linear or reproducing kernel functions, simple regularization techniques suffice to control smoothness; for neural-network-based RL algorithms, by contrast, there is no readily available way to learn a smooth policy. In this paper, we develop a new training framework, Smooth Regularized Reinforcement Learning (SR^2L), in which the policy is trained with smoothness-inducing regularization. Such regularization effectively constrains the search space of the learning algorithm and enforces smoothness in the learned policy. We apply the proposed framework to both an on-policy algorithm (TRPO) and an off-policy algorithm (DDPG). Through extensive experiments, we demonstrate that our method achieves improved sample efficiency.
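To make the idea of smoothness-inducing regularization concrete, here is a minimal sketch, not the paper's implementation. The abstract does not give the form of the regularizer, so everything specific here is an assumption for illustration: the small tanh network `policy`, the squared Euclidean distance between actions, the L-infinity perturbation radius `epsilon`, the weight `lam`, and the use of random perturbations as a cheap stand-in for a worst-case search over nearby states.

```python
import numpy as np

def policy(params, states):
    """Hypothetical deterministic policy: a one-hidden-layer tanh network."""
    w1, b1, w2, b2 = params
    hidden = np.tanh(states @ w1 + b1)
    return np.tanh(hidden @ w2 + b2)

def smoothness_penalty(params, states, epsilon=0.05, n_samples=8, seed=0):
    """Estimate how much the action can change when each state is perturbed
    by at most epsilon per coordinate.

    Assumption: random perturbations stand in for an inner maximization,
    and squared Euclidean distance measures the change in action.
    """
    rng = np.random.default_rng(seed)
    base = policy(params, states)
    worst = np.zeros(states.shape[0])
    for _ in range(n_samples):
        noise = rng.uniform(-epsilon, epsilon, size=states.shape)
        perturbed = policy(params, states + noise)
        gap = np.sum((perturbed - base) ** 2, axis=1)
        worst = np.maximum(worst, gap)  # keep the largest gap seen per state
    return float(worst.mean())

def regularized_loss(rl_loss, params, states, lam=0.1, **kwargs):
    """Add the smoothness penalty, weighted by lam, to an existing RL loss value."""
    return rl_loss + lam * smoothness_penalty(params, states, **kwargs)

# Toy usage with random weights and states (all values are illustrative).
rng = np.random.default_rng(1)
params = (rng.normal(size=(3, 16)), np.zeros(16),
          rng.normal(size=(16, 2)), np.zeros(2))
states = rng.normal(size=(32, 3))
penalty = smoothness_penalty(params, states)
total = regularized_loss(1.0, params, states, lam=0.5)
```

In a real training loop the penalty would be computed inside an automatic-differentiation framework so that its gradient flows into the policy parameters; the NumPy version above only shows what quantity is being penalized.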

research
11/03/2021

Smooth Imitation Learning via Smooth Costs and Smooth Policies

Imitation learning (IL) is a popular approach in the continuous control ...
research
02/15/2022

L2C2: Locally Lipschitz Continuous Constraint towards Stable and Smooth Reinforcement Learning

This paper proposes a new regularization technique for reinforcement lea...
research
11/30/2020

Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp

Although deep reinforcement learning (RL) has been successfully applied ...
research
10/21/2019

Regularization Matters in Policy Optimization

Deep Reinforcement Learning (Deep RL) has been receiving increasingly mo...
research
04/20/2023

Efficient Deep Reinforcement Learning Requires Regulating Overfitting

Deep reinforcement learning algorithms that learn policies by trial-and-...
research
08/27/2020

Controlling Level of Unconsciousness by Titrating Propofol with Deep Reinforcement Learning

Reinforcement Learning (RL) can be used to fit a mapping from patient st...
research
03/13/2023

Path Planning using Reinforcement Learning: A Policy Iteration Approach

With the impact of real-time processing being realized in the recent pas...
