Deep Q-Network with Proximal Iteration

by Kavosh Asadi, et al.

We employ Proximal Iteration for value-function optimization in reinforcement learning. Proximal Iteration is a computationally efficient technique that enables us to bias the optimization procedure towards more desirable solutions. As a concrete application of Proximal Iteration in deep reinforcement learning, we endow the objective function of the Deep Q-Network (DQN) agent with a proximal term to ensure that the online-network component of DQN remains in the vicinity of the target network. The resultant agent, which we call DQN with Proximal Iteration, or DQNPro, exhibits significant improvements over the original DQN on the Atari benchmark. Our results accentuate the power of employing sound optimization techniques for deep reinforcement learning.
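The idea described above can be sketched as a single parameter update: DQN minimizes a TD loss, and DQNPro adds a proximal penalty that pulls the online parameters toward the target-network parameters. The function below is a minimal illustrative sketch, not the paper's implementation; the names (`dqn_pro_update`, `grad_td`) and the values of `lr` and `c` are assumptions.

```python
import numpy as np

def dqn_pro_update(theta, theta_target, grad_td, lr=1e-3, c=0.1):
    """One gradient step on a DQNPro-style objective (illustrative sketch).

    Plain DQN steps along grad_td, the gradient of the TD loss L(theta).
    Adding the proximal term (c/2) * ||theta - theta_target||^2 contributes
    an extra gradient c * (theta - theta_target), biasing the online
    parameters theta to stay near the target parameters theta_target.
    The coefficient c here is a hypothetical choice.
    """
    grad = grad_td + c * (theta - theta_target)
    return theta - lr * grad
```

For example, comparing one proximal step against one plain TD step from the same point shows the proximal update lands (slightly) closer to the target parameters, which is the intended regularizing effect.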



Controlling an Autonomous Vehicle with Deep Reinforcement Learning

We present a control approach for autonomous vehicles based on deep rein...

Proximal Deterministic Policy Gradient

This paper introduces two simple techniques to improve off-policy Reinfo...

Natural Gradient Deep Q-learning

This paper presents findings for training a Q-learning reinforcement lea...

Computational complexity of Inexact Proximal Point Algorithm for Convex Optimization under Hölderian Growth

Several decades ago the Proximal Point Algorithm (PPA) started to gain a ...

Optimization and passive flow control using single-step deep reinforcement learning

This research gauges the ability of deep reinforcement learning (DRL) te...

Proximal Mapping for Deep Regularization

Underpinning the success of deep learning is effective regularizations t...

Spectral Normalisation for Deep Reinforcement Learning: an Optimisation Perspective

Most of the recent deep reinforcement learning advances take an RL-centr...