Parameter Space Noise for Exploration

06/06/2017
by Matthias Plappert, et al.

Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. An alternative is to add noise directly to the agent's parameters, which can lead to more consistent exploration and a richer set of behaviors. Methods such as evolutionary strategies use parameter perturbations, but discard all temporal structure in the process and require significantly more samples. Combining parameter noise with traditional RL methods offers the best of both worlds. We demonstrate that both off- and on-policy methods benefit from this approach through experimental comparison of DQN, DDPG, and TRPO on high-dimensional discrete action environments as well as continuous control tasks. Our results show that RL with parameter noise learns more efficiently than either traditional RL with action space noise or evolutionary strategies alone.
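
The contrast between the two kinds of noise can be illustrated with a minimal sketch. This is not the authors' implementation: the linear policy, the function names, the noise scale, and the choice to resample the parameter perturbation once per episode are all assumptions made for illustration.

```python
import random

def linear_policy(weights, state):
    """Hypothetical deterministic policy: action = dot(weights, state)."""
    return sum(w * s for w, s in zip(weights, state))

def act_with_action_noise(weights, state, sigma, rng):
    """Action-space noise: fresh Gaussian noise is added to every action."""
    return linear_policy(weights, state) + rng.gauss(0.0, sigma)

def perturb_parameters(weights, sigma, rng):
    """Parameter-space noise: perturb the weights once (assumed: per episode)."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

rng = random.Random(0)
weights = [0.5, -0.2]   # illustrative values, not taken from the paper
state = [1.0, 2.0]

# Action noise: the same state generally yields a different action each call.
noisy_actions = [act_with_action_noise(weights, state, 0.1, rng) for _ in range(3)]

# Parameter noise: the perturbed policy is held fixed, so the same state
# always maps to the same action until the weights are perturbed again.
perturbed = perturb_parameters(weights, 0.1, rng)
consistent_actions = [linear_policy(perturbed, state) for _ in range(3)]
```

In this sketch the perturbed policy remains a deterministic function of the state, which is one way to read the abstract's "more consistent exploration": the agent's behavior changes between perturbations rather than at every step.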


research
05/23/2019

Combine PPO with NES to Improve Exploration

We introduce two approaches for combining neural evolution strategy (NES...
research
02/22/2022

A Comparative Study of Deep Reinforcement Learning-based Transferable Energy Management Strategies for Hybrid Electric Vehicles

The deep reinforcement learning-based energy management strategies (EMS)...
research
01/02/2022

Toward Causal-Aware RL: State-Wise Action-Refined Temporal Difference

Although it is well known that exploration plays a key role in Reinforce...
research
05/30/2021

Shaped Policy Search for Evolutionary Strategies using Waypoints

In this paper, we try to improve exploration in Blackbox methods, partic...
research
06/08/2022

Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Many deep reinforcement learning algorithms rely on simple forms of expl...
research
02/23/2023

Reinforcement Learning for Combining Search Methods in the Calibration of Economic ABMs

Calibrating agent-based models (ABMs) in economics and finance typically...
research
09/18/2018

Switching Isotropic and Directional Exploration with Parameter Space Noise in Deep Reinforcement Learning

This paper proposes an exploration method for deep reinforcement learnin...
