A Simple Approach for State-Action Abstraction using a Learned MDP Homomorphism

Animals are able to rapidly infer from limited experience when sets of state-action pairs have equivalent reward and transition dynamics. Modern reinforcement learning systems, by contrast, must painstakingly learn through trial and error that sets of state-action pairs are value equivalent, which often requires a prohibitively large number of samples from their environment. MDP homomorphisms have been proposed that reduce the observed MDP of an environment to an abstract MDP, which can enable more sample-efficient policy learning. Consequently, impressive improvements in sample efficiency have been achieved when a suitable MDP homomorphism can be constructed a priori, usually by exploiting a practitioner's knowledge of environment symmetries. We propose a novel approach to constructing a homomorphism in discrete action spaces, which uses a partial model of environment dynamics to infer which state-action pairs lead to the same state, reducing the size of the state-action space by a factor equal to the cardinality of the action space. We call this method equivalent effect abstraction. In a gridworld setting, we demonstrate empirically that equivalent effect abstraction can improve sample efficiency in a model-free setting and planning efficiency for model-based approaches. Furthermore, we show on cartpole that our approach outperforms an existing method for learning homomorphisms while using 33x less training data.
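
The reduction described in the abstract can be illustrated with a small sketch. This is one possible reading of the idea, not the authors' implementation: it assumes a hypothetical deterministic 5x5 wrap-around gridworld, hand-coded `forward` and `backward` functions standing in for the learned partial dynamics model, and an arbitrarily chosen canonical action (`CANONICAL`). Each state-action pair is mapped to the state from which the canonical action would reach the same next state, so pairs with an equivalent effect share one abstract key.

```python
from itertools import product

# Hypothetical 4-action gridworld on a wrap-around (torus) grid, so every move is valid.
# All names below are illustrative assumptions, not taken from the paper.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
CANONICAL = "right"  # assumed choice of canonical action
SIZE = 5


def forward(state, action):
    """Hand-coded stand-in for a forward model: next state after taking `action`."""
    x, y = state
    dx, dy = ACTIONS[action]
    return ((x + dx) % SIZE, (y + dy) % SIZE)


def backward(next_state, action):
    """Hand-coded stand-in for a backward model: the state from which `action` reaches `next_state`."""
    x, y = next_state
    dx, dy = ACTIONS[action]
    return ((x - dx) % SIZE, (y - dy) % SIZE)


def abstract(state, action):
    """Map (state, action) to the state from which the canonical action has the same effect."""
    return backward(forward(state, action), CANONICAL)


states = list(product(range(SIZE), range(SIZE)))
pairs = [(s, a) for s in states for a in ACTIONS]
abstract_keys = {abstract(s, a) for s, a in pairs}

# Number of state-action pairs versus number of distinct abstract keys.
print(len(pairs), len(abstract_keys))
```

In this toy setting a value table indexed by the abstract key needs one entry per state rather than one per state-action pair, which matches the reduction by a factor of the action-space cardinality stated above. In a real environment the two models would be learned and imperfect, and boundaries or obstacles would need separate handling.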


