NROWAN-DQN: A Stable Noisy Network with Noise Reduction and Online Weight Adjustment for Exploration

06/19/2020
by Shuai Han, et al.

Deep reinforcement learning is now widely applied, especially to complex control tasks. Effective exploration with noisy networks is one of the most important issues in deep reinforcement learning. Noisy networks tend to produce stable outputs for agents, but this tendency is not always enough to find a stable policy, which reduces efficiency and stability during learning. Building on NoisyNets, this paper proposes NROWAN-DQN (Noise Reduction and Online Weight Adjustment NoisyNet-DQN). First, we develop a novel noise reduction method for NoisyNet-DQN so that the agent performs stable actions. Second, we design an online weight adjustment strategy for the noise reduction, which improves stability and yields higher scores for the agent. Finally, we evaluate the algorithm in four standard domains and analyze the properties of its hyper-parameters. Our results show that NROWAN-DQN outperforms prior algorithms in all four domains and is also more stable: the variance of its score is significantly reduced, especially in action-sensitive environments. In environments that require high stability, NROWAN-DQN is therefore more appropriate than NoisyNet-DQN.
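The abstract names two ingredients, noise reduction and an online weight for it, without giving formulas. The sketch below is one plausible reading and not the authors' implementation. It assumes a noisy linear layer with learned means `mu` and noise scales `sigma`, a noise-level term equal to the mean absolute `sigma`, and a penalty weight `k` that grows linearly with a normalized reward. The function names and the parameters `r_min`, `r_max`, and `k_final` are hypothetical.

```python
import random


def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """Noisy layer: each weight and bias is mu + sigma * eps, eps ~ N(0, 1)."""
    out = []
    for j in range(len(mu_b)):
        acc = mu_b[j] + sigma_b[j] * rng.gauss(0.0, 1.0)
        for i in range(len(x)):
            noisy_weight = mu_w[j][i] + sigma_w[j][i] * rng.gauss(0.0, 1.0)
            acc += noisy_weight * x[i]
        out.append(acc)
    return out


def noise_level(sigma_w, sigma_b):
    """Assumed noise measure: mean absolute sigma over the layer."""
    values = [abs(s) for row in sigma_w for s in row]
    values += [abs(s) for s in sigma_b]
    return sum(values) / len(values)


def penalty_weight(reward, r_min, r_max, k_final):
    """Assumed online rule: k rises from 0 to k_final as reward improves."""
    if r_max <= r_min:
        return 0.0
    frac = (reward - r_min) / (r_max - r_min)
    return k_final * min(1.0, max(0.0, frac))


def augmented_loss(td_loss, sigma_w, sigma_b, k):
    """Assumed objective: TD loss plus k times the noise level."""
    return td_loss + k * noise_level(sigma_w, sigma_b)


if __name__ == "__main__":
    rng = random.Random(0)
    mu_w, sigma_w = [[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.1], [0.1, 0.1]]
    mu_b, sigma_b = [0.5, -0.5], [0.1, 0.1]
    q_values = noisy_linear([1.0, 2.0], mu_w, sigma_w, mu_b, sigma_b, rng)
    k = penalty_weight(reward=5.0, r_min=0.0, r_max=10.0, k_final=4.0)
    print(q_values, augmented_loss(1.0, sigma_w, sigma_b, k))
```

Under these assumptions the penalty is small while rewards are low, so exploration noise is preserved early on, and it grows as the agent improves, which pushes `sigma` toward zero and makes action selection more deterministic.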

