Extreme State Aggregation Beyond MDPs

07/12/2014
by   Marcus Hutter, et al.

We consider a reinforcement learning setup in which an agent interacts with an environment in observation-reward-action cycles, without any assumptions on the environment (in particular, no MDP assumption). State aggregation, and more generally feature reinforcement learning, is concerned with mapping histories or raw states to reduced/aggregated states. The idea behind both is that the resulting reduced process (approximately) forms a small stationary finite-state MDP, which can then be solved or learnt efficiently. We considerably generalize existing aggregation results by showing that even if the reduced process is not an MDP, the (Q-)value functions and (optimal) policies of an associated MDP with the same state-space size solve the original problem, as long as the solution can be approximately represented as a function of the reduced states. This implies an upper bound on the required state-space size that holds uniformly over all RL problems. It may also explain why RL algorithms designed for MDPs sometimes perform well beyond MDPs.
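To make the aggregation idea concrete, here is a minimal sketch (not the paper's algorithm): a hypothetical feature map sends raw histories to a small set of aggregated states, and the induced finite MDP over those states is then solved by standard Q-value iteration. The transition tensor, rewards, and all names below are illustrative assumptions.

```python
import numpy as np

def q_value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Solve a finite MDP given transitions P[s, a, s'] and rewards R[s, a]."""
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    while True:
        V = Q.max(axis=1)          # greedy state values
        Q_new = R + gamma * P @ V  # Bellman optimality backup over s'
        if np.abs(Q_new - Q).max() < tol:
            return Q_new
        Q = Q_new

# Toy aggregated process: 2 aggregated states, 2 actions (made-up numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

Q = q_value_iteration(P, R)
policy = Q.argmax(axis=1)  # optimal policy of the aggregated MDP
```

The point of the paper is that a policy obtained this way can remain (near-)optimal for the original history-based process even when the aggregated process is not itself an MDP, provided the optimal (Q-)value function is approximately representable over the aggregated states.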


Related research

- Exact Reduction of Huge Action Spaces in General Reinforcement Learning (12/18/2020)
- Learning Compact Models for Planning with Exogenous Processes (09/30/2019)
- Reducing Planning Complexity of General Reinforcement Learning with Non-Markovian Abstractions (12/26/2021)
- The Concept of Criticality in Reinforcement Learning (10/16/2018)
- A note on the article "On Exploiting Spectral Properties for Solving MDP with Large State Space" (07/18/2021)
- Reinforcement Learning in a Birth and Death Process: Breaking the Dependence on the State Space (02/21/2023)
- Learn Dynamic-Aware State Embedding for Transfer Learning (01/06/2021)
