Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms

09/01/2019
by Matthia Sabatelli, et al.

This paper takes a step towards characterizing a new family of model-free Deep Reinforcement Learning (DRL) algorithms. These algorithms aim to jointly learn an approximation of the state-value function (V) and an approximation of the state-action value function (Q). Our analysis starts with a thorough study of the Deep Quality-Value Learning (DQV) algorithm, a DRL algorithm which has been shown to outperform popular techniques such as Deep-Q-Learning (DQN) and Double-Deep-Q-Learning (DDQN) [sabatelli2018deep]. To investigate why DQV's learning dynamics allow this algorithm to perform so well, we formulate a set of research questions which help us characterize a new family of DRL algorithms. Among our results, we present specific cases in which DQV's performance can be harmed and introduce a novel off-policy DRL algorithm, called DQV-Max, which can outperform DQV. We then study the behavior of the V and Q functions learned by DQV and DQV-Max and show that both algorithms may perform well on several DRL test-beds because they are less prone to the overestimation bias of the Q function.
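
The abstract does not spell out the update rules, so the following is only a minimal sketch of how jointly learning V and Q could look. It assumes the temporal-difference targets commonly associated with DQV (both V and Q bootstrap from V at the next state) and DQV-Max (V bootstraps from the maximum of Q at the next state, Q bootstraps from V). The function names `dqv_targets` and `dqv_max_targets`, the scalar inputs, and the default discount factor are hypothetical choices for illustration; neural networks, target networks, and replay memory are left out.

```python
import numpy as np


def dqv_targets(r, v_next, gamma=0.99, done=False):
    """Assumed DQV-style targets: V and Q regress towards r + gamma * V(s')."""
    # No bootstrapping from the next state when the episode has ended.
    mask = 0.0 if done else 1.0
    target = r + gamma * mask * v_next
    return target, target  # (target for V, target for Q)


def dqv_max_targets(r, v_next, q_next, gamma=0.99, done=False):
    """Assumed DQV-Max-style targets.

    V bootstraps from max_a Q(s', a), which makes its update off-policy;
    Q bootstraps from V(s').
    """
    mask = 0.0 if done else 1.0
    v_target = r + gamma * mask * float(np.max(q_next))
    q_target = r + gamma * mask * v_next
    return v_target, q_target


# Example transition: reward 1.0, V(s') = 2.0, Q(s', .) = [1.0, 4.0].
v_t, q_t = dqv_targets(1.0, 2.0, gamma=0.5)
vmax_t, qmax_t = dqv_max_targets(1.0, 2.0, np.array([1.0, 4.0]), gamma=0.5)
```

In this sketch neither Q target takes a maximum over Q itself, which is one way to read the abstract's claim that both algorithms are less prone to the overestimation bias of the Q function.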

research
12/22/2020

QVMix and QVMix-Max: Extending the Deep Quality-Value Family of Algorithms to Cooperative Multi-Agent Reinforcement Learning

This paper introduces four new algorithms that can be used for tackling ...
research
06/09/2023

Value function estimation using conditional diffusion models for control

A fairly reliable trend in deep reinforcement learning is that the perfo...
research
09/30/2018

Deep Quality-Value (DQV) Learning

We introduce a novel Deep Reinforcement Learning (DRL) algorithm called ...
research
11/07/2016

Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning

Instability and variability of Deep Reinforcement Learning (DRL) algorit...
research
03/20/2020

Deep Reinforcement Learning with Weighted Q-Learning

Overestimation of the maximum action-value is a well-known problem that ...
research
09/10/2021

Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA

This paper explores a Deep Reinforcement Learning (DRL) approach for des...
research
05/21/2017

Shallow Updates for Deep Reinforcement Learning

Deep reinforcement learning (DRL) methods such as the Deep Q-Network (DQ...
