Contractivity of Bellman Operator in Risk Averse Dynamic Programming with Infinite Horizon

08/03/2022
by Martin Šmíd, et al.

The paper deals with a risk-averse dynamic programming problem with infinite horizon. First, the assumptions required for the problem to be well defined are formulated. Then the Bellman equation is derived; it may also be viewed as a standalone reinforcement learning problem. It is proved that the Bellman operator is a contraction, which guarantees the convergence of various solution algorithms used for dynamic programming as well as reinforcement learning, as demonstrated on the value iteration algorithm.
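To make the convergence guarantee concrete, here is a minimal value-iteration sketch. The transition kernel, cost array, and the particular risk mapping (mean-upper-semideviation, a common coherent choice) are illustrative assumptions, since the abstract does not specify the paper's exact risk measure; the point is only that iterating a contractive Bellman operator converges geometrically in the sup norm.

```python
import numpy as np

def risk_measure(values, probs, beta=0.5):
    """Mean-upper-semideviation of successor costs (illustrative risk mapping)."""
    mean = probs @ values
    upper_semidev = probs @ np.maximum(values - mean, 0.0)
    return mean + beta * upper_semidev

def bellman_operator(v, P, c, gamma):
    """One application of a risk-averse Bellman operator (cost minimization).

    P[a, s, :] -- transition probabilities under action a in state s,
    c[a, s]    -- immediate cost; the risk mapping replaces the expectation.
    """
    n_actions, n_states = c.shape
    q = np.empty((n_actions, n_states))
    for a in range(n_actions):
        for s in range(n_states):
            q[a, s] = c[a, s] + gamma * risk_measure(v, P[a, s])
    return q.min(axis=0)

def value_iteration(P, c, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Iterate the operator; a modulus-gamma contraction makes this converge."""
    v = np.zeros(c.shape[1])
    for _ in range(max_iter):
        v_next = bellman_operator(v, P, c, gamma)
        if np.max(np.abs(v_next - v)) < tol:  # sup-norm stopping criterion
            return v_next
        v = v_next
    return v

if __name__ == "__main__":
    # Toy two-state, two-action problem (made-up numbers, for illustration only).
    P = np.array([[[0.8, 0.2], [0.3, 0.7]],
                  [[0.5, 0.5], [0.9, 0.1]]])
    c = np.array([[1.0, 2.0],
                  [1.5, 0.5]])
    print(value_iteration(P, c))
```

Because the assumed risk mapping is monotone and translation equivariant, the resulting operator contracts with the same modulus gamma as its risk-neutral counterpart, which is what justifies the sup-norm stopping criterion above.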
