Gradient Monitored Reinforcement Learning

This paper presents a novel neural network training approach for faster convergence and better generalization abilities in deep reinforcement learning. Particularly, we focus on the enhancement of training and evaluation performance in reinforcement learning algorithms by systematically reducing gradient's variance and thereby providing a more targeted learning process. The proposed method which we term as Gradient Monitoring(GM), is an approach to steer the learning in the weight parameters of a neural network based on the dynamic development and feedback from the training process itself. We propose different variants of the GM methodology which have been proven to increase the underlying performance of the model. The one of the proposed variant, Momentum with Gradient Monitoring (M-WGM), allows for a continuous adjustment of the quantum of back-propagated gradients in the network based on certain learning parameters. We further enhance the method with Adaptive Momentum with Gradient Monitoring (AM-WGM) method which allows for automatic adjustment between focused learning of certain weights versus a more dispersed learning depending on the feedback from the rewards collected. As a by-product, it also allows for automatic derivation of the required deep network sizes during training as the algorithm automatically freezes trained weights. The approach is applied to two discrete (Multi-Robot Co-ordination problem and Atari games) and one continuous control task (MuJoCo) using Advantage Actor-Critic (A2C) and Proximal Policy Optimization (PPO) respectively. The results obtained particularly underline the applicability and performance improvements of the methods in terms of generalization capability.

READ FULL TEXT

page 1

page 7

page 14

research
02/04/2016

Asynchronous Methods for Deep Reinforcement Learning

We propose a conceptually simple and lightweight framework for deep rein...
research
10/28/2019

Neural Architecture Evolution in Deep Reinforcement Learning for Continuous Control

Current Deep Reinforcement Learning algorithms still heavily rely on han...
research
10/02/2019

Stabilizing Off-Policy Reinforcement Learning with Conservative Policy Gradients

In recent years, advances in deep learning have enabled the application ...
research
09/10/2015

Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies

This paper proposes GProp, a deep reinforcement learning algorithm for c...
research
03/20/2018

Natural Gradient Deep Q-learning

This paper presents findings for training a Q-learning reinforcement lea...
research
07/04/2010

A Reinforcement Learning Model Using Neural Networks for Music Sight Reading Learning Problem

Music Sight Reading is a complex process in which when it is occurred in...
research
09/21/2017

Neural Optimizer Search with Reinforcement Learning

We present an approach to automate the process of discovering optimizati...

Please sign up or login with your details

Forgot password? Click here to reset