Distributional Hamilton-Jacobi-Bellman Equations for Continuous-Time Reinforcement Learning

05/24/2022
by Harley Wiltzer, et al.

Continuous-time reinforcement learning offers an appealing formalism for describing control problems in which the passage of time is not naturally divided into discrete increments. Here we consider the problem of predicting the distribution of returns obtained by an agent interacting in a continuous-time, stochastic environment. Accurate return predictions have proven useful for determining optimal policies for risk-sensitive control, learning state representations, multiagent coordination, and more. We begin by establishing the distributional analogue of the Hamilton-Jacobi-Bellman (HJB) equation for Itô diffusions and the broader class of Feller-Dynkin processes. We then specialize this equation to the setting in which the return distribution is approximated by N uniformly-weighted particles, a common design choice in distributional algorithms. Our derivation highlights additional terms due to statistical diffusivity, which arise from the proper handling of distributions in the continuous-time setting. Based on this, we propose a tractable algorithm for approximately solving the distributional HJB via a JKO scheme, which can be implemented in an online control algorithm. We demonstrate the effectiveness of such an algorithm on a synthetic control problem.
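
As a point of reference for the N-particle representation mentioned in the abstract, the sketch below shows one simple way to approximate a continuous-time return distribution with N uniformly-weighted quantile particles. It is not the paper's distributional HJB solver or JKO scheme: it assumes a toy 1-D Ornstein-Uhlenbeck state process, a quadratic running reward, and exponential discounting (all illustrative choices), simulates returns by Euler-Maruyama, and fits the particles by Monte Carlo quantile regression.

```python
import numpy as np

# Minimal illustrative sketch (not the paper's method): approximate the
# return distribution of a continuous-time diffusion with N uniformly-
# weighted quantile particles, fitted to Monte Carlo returns via the
# pinball (quantile-regression) loss. Dynamics and reward are assumed.

def simulate_return(x0, horizon=10.0, dt=1e-2, beta=0.5, rng=None):
    """Euler-Maruyama rollout; returns the discounted return from state x0."""
    rng = np.random.default_rng() if rng is None else rng
    x, t, ret = x0, 0.0, 0.0
    while t < horizon:
        reward = -x ** 2                     # illustrative running reward
        ret += np.exp(-beta * t) * reward * dt
        drift = -x                           # Ornstein-Uhlenbeck drift
        x += drift * dt + 0.3 * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return ret

def fit_quantile_particles(returns, n_particles=32, lr=0.05, epochs=200):
    """Fit N uniformly-weighted particles (quantile midpoints) to samples."""
    taus = (np.arange(n_particles) + 0.5) / n_particles
    particles = np.quantile(returns, 0.5) * np.ones(n_particles)
    for _ in range(epochs):
        for g in np.random.permutation(returns):
            # Subgradient of the pinball loss (tau - 1{g < theta})(g - theta).
            grad = np.where(g < particles, 1.0 - taus, -taus)
            particles -= lr * grad
    return np.sort(particles)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    returns = np.array([simulate_return(1.0, rng=rng) for _ in range(500)])
    particles = fit_quantile_particles(returns)
    print("estimated return quantiles:", particles[::8])
```

The particles here play the same role as the uniformly-weighted particles in the abstract (each carries probability 1/N and tracks one quantile of the return distribution); the paper's contribution is to update such a representation by a distributional HJB/JKO step rather than by Monte Carlo regression as above.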

