Metrics and continuity in reinforcement learning

02/02/2021
by   Charline Le Lan, et al.

In most practical applications of reinforcement learning, it is untenable to maintain direct estimates for individual states; in continuous-state systems, it is impossible. Instead, researchers often leverage state similarity (whether explicitly or implicitly) to build models that can generalize well from a limited set of samples. The notion of state similarity used, and the neighbourhoods and topologies it induces, is thus of crucial importance, as it directly affects the performance of the algorithms. Indeed, a number of recent works introduce algorithms assuming the existence of "well-behaved" neighbourhoods, but leave the full specification of such topologies for future work. In this paper we introduce a unified formalism for defining these topologies through the lens of metrics. We establish a hierarchy amongst these metrics and demonstrate their theoretical implications for the Markov Decision Process specifying the reinforcement learning problem. We complement our theoretical results with empirical evaluations showcasing the differences between the metrics considered.
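To make the role of the metric concrete, here is a minimal Python sketch of how a choice of state metric determines which sampled states count as neighbours of a query state, and therefore how value estimates generalize. It is an illustration only and is not taken from the paper: the function names, the two metrics (absolute difference and the discrete metric), the sample values, and the radius `eps` are all assumptions chosen for the example.

```python
from typing import Callable, Dict, Iterable, List, Optional

# A state metric maps a pair of states to a non-negative distance.
Metric = Callable[[float, float], float]


def neighbourhood(s: float, states: Iterable[float], d: Metric, eps: float) -> List[float]:
    """Return the states lying in the open eps-ball around s under metric d."""
    return [t for t in states if d(s, t) < eps]


def aggregated_value(
    s: float, samples: Dict[float, float], d: Metric, eps: float
) -> Optional[float]:
    """Average the sampled values of states within eps of s.

    Returns None when no sampled state falls inside the ball, i.e. when
    the metric offers no generalization to s from the available samples.
    """
    near = [v for t, v in samples.items() if d(s, t) < eps]
    return sum(near) / len(near) if near else None


def euclidean(s: float, t: float) -> float:
    """Hypothetical metric: absolute difference on a 1-D state space."""
    return abs(s - t)


def discrete(s: float, t: float) -> float:
    """Hypothetical metric: every pair of distinct states is at distance 1."""
    return 0.0 if s == t else 1.0


# Hypothetical value samples at three visited states.
samples = {0.0: 1.0, 0.1: 2.0, 0.9: 5.0}
query = 0.05  # an unvisited state

print(aggregated_value(query, samples, euclidean, 0.2))  # averages the two nearby samples
print(aggregated_value(query, samples, discrete, 0.2))   # no neighbours, so None
```

Under the absolute-difference metric the unvisited state borrows from its two nearby samples, whereas under the discrete metric every state is isolated and nothing can be inferred from the samples. Which neighbourhoods are appropriate for a given problem is the question the abstract raises.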


