The Potential of the Return Distribution for Exploration in RL

06/11/2018
by   Thomas M. Moerland, et al.
0

This paper studies the potential of the return distribution for exploration in deterministic environments. We study network losses and propagation mechanisms for Gaussian, Categorical and Mixture of Gaussian distributions. Combined with exploration policies that leverage this return distribution, we solve, for example, a randomized Chain task of length 100, which has not been reported before when learning with neural networks.

READ FULL TEXT
research
11/29/2017

Efficient exploration with Double Uncertain Value Networks

This paper studies directed exploration for reinforcement learning agent...
research
08/03/2023

Bag of Policies for Distributional Deep Exploration

Efficient exploration in complex environments remains a major challenge ...
research
03/20/2021

Bayesian Distributional Policy Gradients

Distributional Reinforcement Learning (RL) maintains the entire probabil...
research
07/24/2020

Distributional Reinforcement Learning with Maximum Mean Discrepancy

Distributional reinforcement learning (RL) has achieved state-of-the-art...
research
03/29/2022

When to Go, and When to Explore: The Benefit of Post-Exploration in Intrinsic Motivation

Go-Explore achieved breakthrough performance on challenging reinforcemen...
research
07/04/2023

A Scalable Reinforcement Learning-based System Using On-Chain Data for Cryptocurrency Portfolio Management

On-chain data (metrics) of blockchain networks, akin to company fundamen...
research
02/01/2022

A Statistical Model of Serve Return Impact Patterns in Professional Tennis

The spread in the use of tracking systems in sport has made fine-grained...

Please sign up or login with your details

Forgot password? Click here to reset