GAN Q-learning

05/13/2018
by Thang Doan, et al.

Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can leverage the distributional approach to reinforcement learning. In this paper, we propose GAN Q-learning, a novel distributional RL method based on generative adversarial networks (GANs), and analyze its performance in simple tabular environments as well as OpenAI Gym environments. We empirically show that our algorithm leverages the flexibility and black-box nature of deep learning models while providing a viable alternative to other state-of-the-art methods.
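The abstract does not spell out the mechanics, but the core idea of a GAN-based distributional RL method can be sketched: a generator models the return distribution Z(s, a) by sampling, and a discriminator is trained to tell its samples apart from bootstrapped Bellman targets r + γZ(s', a*). The toy one-dimensional form below (a linear generator, a logistic discriminator, hand-derived gradients, and all learning rates) is an illustrative assumption for a single state-action pair, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.99

# Hypothetical toy model: linear generator G(z) = w*z + b produces return
# samples; logistic discriminator D(x) = sigmoid(a*x + c) scores realism.
w, b = 1.0, 0.0   # generator parameters
a, c = 0.5, 0.0   # discriminator parameters

def G(z, w, b):
    return w * z + b

def D(x, a, c):
    return 1.0 / (1.0 + np.exp(-(a * x + c)))

def gan_q_step(reward, w, b, a, c, lr=0.01, n=64):
    """One adversarial update (sketch). 'Real' samples are bootstrapped
    Bellman targets r + gamma * Z(s', a*); 'fake' samples come from G."""
    z = rng.standard_normal(n)
    next_returns = G(rng.standard_normal(n), w, b)   # samples of Z(s', a*)
    real = reward + gamma * next_returns             # Bellman target samples
    fake = G(z, w, b)

    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)).
    d_real, d_fake = D(real, a, c), D(fake, a, c)
    grad_a = np.mean((1 - d_real) * real) - np.mean(d_fake * fake)
    grad_c = np.mean(1 - d_real) - np.mean(d_fake)
    a, c = a + lr * grad_a, c + lr * grad_c

    # Generator: gradient ascent on log D(G(z)) (non-saturating GAN loss).
    d_fake = D(G(z, w, b), a, c)
    grad_w = np.mean((1 - d_fake) * a * z)
    grad_b = np.mean((1 - d_fake) * a)
    w, b = w + lr * grad_w, b + lr * grad_b

    d_loss = -np.mean(np.log(d_real)) - np.mean(np.log(1 - D(fake, a, c)))
    return w, b, a, c, d_loss

for _ in range(100):
    w, b, a, c, d_loss = gan_q_step(reward=1.0, w=w, b=b, a=a, c=c)
```

In the full algorithm the generator would additionally be conditioned on (s, a) via a neural network and a* chosen greedily from the generator's estimated mean return, but the adversarial structure of each update matches the sketch above.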


Related research

- Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN (08/06/2018): The recently proposed distributional approach to reinforcement learning ...
- A Comparative Analysis of Expected and Distributional Reinforcement Learning (01/30/2019): Since their introduction a year ago, distributional approaches to reinfo...
- A Tutorial Introduction to Reinforcement Learning (04/03/2023): In this paper, we present a brief survey of Reinforcement Learning (RL),...
- Distributional Reinforcement Learning with Ensembles (03/24/2020): It is well-known that ensemble methods often provide enhanced performanc...
- Nonlinear Distributional Gradient Temporal-Difference Learning (05/20/2018): We devise a distributional variant of gradient temporal-difference (TD) ...
- The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning (05/25/2023): While distributional reinforcement learning (RL) has demonstrated empiri...
- Exploring the Robustness of Distributional Reinforcement Learning against Noisy State Observations (09/17/2021): In real scenarios, state observations that an agent observes may contain...
