Bridging the Gap Between Target Networks and Functional Regularization

10/21/2022
by   Alexandre Piché, et al.

Bootstrapping is behind much of the success of Deep Reinforcement Learning. However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values. Target Networks are employed to stabilize training by using an additional set of lagging parameters to estimate the target values. Despite the popularity of Target Networks, their effect on the optimization is still poorly understood. In this work, we show that they act as an implicit regularizer. This regularizer has disadvantages: it is inflexible and non-convex. To overcome these issues, we propose an explicit Functional Regularization that is a convex regularizer in function space and can easily be tuned. We analyze the convergence of our method theoretically, and empirically demonstrate that replacing Target Networks with the more theoretically grounded Functional Regularization approach leads to better sample efficiency and performance improvements.
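To make the contrast concrete, here is a minimal sketch of the two stabilization schemes on a linear Q-function, Q(s, a) = phi(s, a) @ theta. This is an illustration under assumed notation, not the paper's implementation: the names `kappa`, `theta_lag`, `theta_prev`, and `phi` are all hypothetical, and the functional-regularization variant simply adds a convex penalty in function space pulling Q towards a slowly-updated prior network, in place of a lagging bootstrap target.

```python
import numpy as np

def td_target_network(theta, theta_lag, phi_sa, phi_next, r, gamma, lr):
    """One semi-gradient TD step where the bootstrap target is computed
    with lagging parameters theta_lag (the Target Network scheme)."""
    target = r + gamma * phi_next @ theta_lag   # target from lagging params
    td_err = phi_sa @ theta - target
    return theta - lr * td_err * phi_sa         # semi-gradient: target fixed

def td_functional_regularization(theta, theta_prev, phi_sa, phi_next,
                                 r, gamma, lr, kappa):
    """One semi-gradient TD step with an online bootstrap target plus an
    explicit convex penalty (weight kappa, illustrative) that pulls the
    current Q-values towards those of a prior network theta_prev."""
    target = r + gamma * phi_next @ theta       # online bootstrap target
    td_err = phi_sa @ theta - target
    reg = kappa * (phi_sa @ theta - phi_sa @ theta_prev)  # functional penalty
    return theta - lr * (td_err + reg) * phi_sa
```

In the first scheme the stabilizing prior is baked implicitly into the target; in the second it appears as a separate, tunable term, so its strength `kappa` can be adjusted independently of the bootstrap.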

Related research

Beyond Target Networks: Improving Deep Q-learning with Functional Regularization (06/04/2021)
  Target networks are at the core of recent success in Reinforcement Learn...

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization (12/09/2021)
  Despite overparameterization, deep networks trained via supervised learn...

Discount Factor as a Regularizer in Reinforcement Learning (07/04/2020)
  Specifying a Reinforcement Learning (RL) task involves choosing a suitab...

Inverse Scale Space Iterations for Non-Convex Variational Problems Using Functional Lifting (05/06/2021)
  Non-linear filtering approaches allow to obtain decompositions of images...

Self-Paced Learning: an Implicit Regularization Perspective (06/01/2016)
  Self-paced learning (SPL) mimics the cognitive mechanism of humans and a...

Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning (11/07/2016)
  Instability and variability of Deep Reinforcement Learning (DRL) algorit...

Pseudo-task Regularization for ConvNet Transfer Learning (08/16/2019)
  This paper is about regularizing deep convolutional networks (ConvNets) ...
