Deep Quality-Value (DQV) Learning

09/30/2018
by Matthia Sabatelli, et al.

We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-Value (DQV) Learning. Similar to Advantage Actor-Critic methods, DQV uses a Value neural network to estimate the temporal-difference errors, which a second Quality network then uses to learn the state-action values directly. We first test DQV's update rules with Multilayer Perceptrons as function approximators on two classic RL problems, and then extend DQV with Deep Convolutional Neural Networks, 'Experience Replay' and 'Target Neural Networks' to tackle four games of the Atari Arcade Learning Environment. Our results show that DQV learns significantly faster and better than Deep Q-Learning and Double Deep Q-Learning, suggesting that our algorithm can potentially be a better-performing synchronous temporal-difference algorithm than those currently available in DRL.
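To make the interplay between the two estimators concrete, the following is a minimal tabular sketch of one DQV-style update, not the deep version described above. It assumes that both the state-value table `V` and the state-action table `Q` are moved toward the same bootstrapped target `r + gamma * V[s_next]`; the abstract does not spell out the update rules, so this form is an assumption. The function name `dqv_update`, its parameters, and the table layout are hypothetical and chosen only for illustration.

```python
import numpy as np

def dqv_update(V, Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular DQV-style update (illustrative sketch, not the authors' code).

    V: 1-D array of state values, Q: 2-D array of state-action values.
    Both tables are modified in place.
    """
    # Bootstrapped target built from the state-value estimate only
    # (assumption: terminal transitions do not bootstrap).
    target = r + (0.0 if done else gamma * V[s_next])

    # Temporal-difference errors for both estimators, computed
    # before either table is changed.
    td_v = target - V[s]
    td_q = target - Q[s, a]

    # Move each estimate a step of size alpha toward the shared target.
    V[s] += alpha * td_v
    Q[s, a] += alpha * td_q
    return td_v, td_q

# Usage on a toy problem with 3 states and 2 actions.
V = np.zeros(3)
Q = np.zeros((3, 2))
dqv_update(V, Q, s=0, a=1, r=1.0, s_next=1, done=False, alpha=0.5, gamma=0.9)
```

In this sketch the Quality table never bootstraps from itself, which is the main structural difference from a Q-Learning update that uses a maximum over `Q[s_next]`.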


