Spectral Normalisation for Deep Reinforcement Learning: an Optimisation Perspective

05/11/2021
by Florin Gogianu, et al.
Most of the recent deep reinforcement learning advances take an RL-centric perspective and focus on refinements of the training objective. We diverge from this view and show we can recover the performance of these developments not by changing the objective, but by regularising the value-function estimator. Constraining the Lipschitz constant of a single layer using spectral normalisation is sufficient to elevate the performance of a Categorical-DQN agent to that of a more elaborate agent on the challenging Atari domain. We conduct ablation studies to disentangle the various effects normalisation has on the learning dynamics and show that it is sufficient to modulate the parameter updates to recover most of the performance of spectral normalisation. These findings hint towards the need to also focus on the neural component and its learning dynamics to tackle the peculiarities of Deep Reinforcement Learning.
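
To make the central operation concrete: a linear layer with weight matrix W is Lipschitz in the L2 norm with constant equal to the largest singular value of W, so dividing W by that value bounds the layer's Lipschitz constant at 1. The sketch below estimates the largest singular value by power iteration and rescales the weights. It is a minimal illustration, not the authors' implementation; the function names (`spectral_norm`, `spectrally_normalise`), the iteration count, and the list-of-rows matrix layout are assumptions made for this example.

```python
import math
import random


def matvec(W, v):
    """Return W @ v for a matrix stored as a list of rows."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, v)) for row in W]


def transpose(W):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*W)]


def l2_normalise(v, eps=1e-12):
    """Scale v to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=100, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    Wt = transpose(W)
    # Random starting vector in the output space (one entry per row of W).
    u = l2_normalise([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalise(matvec(Wt, u))  # right singular vector estimate
        u = l2_normalise(matvec(W, v))   # left singular vector estimate
    # sigma is approximately u^T W v once u and v have converged.
    return sum(u_i * wv_i for u_i, wv_i in zip(u, matvec(W, v)))


def spectrally_normalise(W, n_iters=100):
    """Return W divided by its estimated largest singular value."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


# Hypothetical 2x2 weight matrix with singular values 3 and 1.
W = [[3.0, 0.0], [0.0, 1.0]]
W_sn = spectrally_normalise(W)
```

In a training loop, a common variant keeps the vector `u` between steps and runs a single power-iteration update per gradient step instead of iterating to convergence; whether the paper does exactly this is not stated in the abstract.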

research
03/06/2017

Neural Episodic Control

Deep reinforcement learning methods attain super-human performance in a ...
research
01/06/2018

Faster Deep Q-learning using Neural Episodic Control

The Research on deep reinforcement learning to estimate Q-value by deep ...
research
10/06/2017

Rainbow: Combining Improvements in Deep Reinforcement Learning

The deep reinforcement learning community has made several independent i...
research
02/10/2022

Abstraction for Deep Reinforcement Learning

We characterise the problem of abstraction in the context of deep reinfo...
research
12/10/2021

Deep Q-Network with Proximal Iteration

We employ Proximal Iteration for value-function optimization in reinforc...
research
02/02/2023

Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function

Probabilistic dynamics model ensemble is widely used in existing model-b...
research
04/25/2019

Ray Interference: a Source of Plateaus in Deep Reinforcement Learning

Rather than proposing a new method, this paper investigates an issue pre...
