The Value Function Polytope in Reinforcement Learning

01/31/2019
by Robert Dadashi, et al.

We establish geometric and topological properties of the space of value functions in finite state-action Markov decision processes. Our main contribution is the characterization of the nature of its shape: a general polytope (Aigner et al., 2010). To demonstrate this result, we exhibit several properties of the structural relationship between policies and value functions including the line theorem, which shows that the value functions of policies constrained on all but one state describe a line segment. Finally, we use this novel perspective to introduce visualizations to enhance the understanding of the dynamics of reinforcement learning algorithms.
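The line theorem described above can be checked numerically on a toy example: fix a policy everywhere except one state, sweep the action distribution at that state, and observe that the resulting value functions are collinear. The sketch below uses a randomly generated 2-state, 2-action MDP (a hypothetical instance, not one from the paper) and solves the Bellman equation exactly via a linear system.

```python
import numpy as np

np.random.seed(0)
gamma = 0.9
n_states, n_actions = 2, 2

# Random MDP: P[a, s, s'] are transition probabilities, r[a, s] rewards.
P = np.random.dirichlet(np.ones(n_states), size=(n_actions, n_states))
r = np.random.uniform(-1.0, 1.0, size=(n_actions, n_states))

def value_function(pi):
    """Exact V^pi from the Bellman equation V = r_pi + gamma * P_pi V.

    pi[s, a] is the probability of action a in state s.
    """
    P_pi = np.einsum('sa,ast->st', pi, P)   # state-to-state transitions under pi
    r_pi = np.einsum('sa,as->s', pi, r)     # expected immediate reward under pi
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)

# Policies agreeing at state 1, interpolated between two actions at state 0.
pi_fixed = np.array([0.3, 0.7])
def policy(t):
    return np.array([[1.0 - t, t], pi_fixed])

Vs = np.array([value_function(policy(t)) for t in np.linspace(0.0, 1.0, 11)])

# Collinearity check: every V lies on the segment between the two endpoints.
e = Vs[-1] - Vs[0]
d = Vs - Vs[0]
max_dev = float(np.abs(d[:, 0] * e[1] - d[:, 1] * e[0]).max())
```

Note that while the value functions trace a line segment in value space, the map t → V is a rational (not affine) function of the mixture coefficient, so the points are collinear but not evenly spaced along the segment.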


Related research

10/21/2022: On the connection between Bregman divergence and value in regularized Markov decision processes
In this short note we derive a relationship between the Bregman divergen...

09/23/2020: CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq
Reinforcement learning algorithms solve sequential decision-making probl...

12/28/2018: Differential Temporal Difference Learning
Value functions derived from Markov decision processes arise as a centra...

01/30/2022: The Geometry of Robust Value Functions
The space of value functions is a fundamental concept in reinforcement l...

01/31/2019: A Geometric Perspective on Optimal Representations for Reinforcement Learning
This paper proposes a new approach to representation learning based on g...

07/04/2010: Computational Model of Music Sight Reading: A Reinforcement Learning Approach
Although the Music Sight Reading process has been studied from the cogni...

01/17/2022: Detecting danger in gridworlds using Gromov's Link Condition
Gridworlds have been long-utilised in AI research, particularly in reinf...
