Expert-Guided Symmetry Detection in Markov Decision Processes

11/19/2021
by Giorgio Angelotti, et al.

Learning a Markov Decision Process (MDP) from a fixed batch of trajectories is a non-trivial task: the quality of the learned model depends on both the amount and the diversity of the sampled regions of the state-action space. Yet many MDPs have reward and transition functions that are invariant under some transformations of the current state and action. Detecting and exploiting these structures could benefit not only the learning of the MDP but also the computation of its subsequent optimal control policy. In this work we propose a paradigm, based on Density Estimation methods, that aims to detect whether the MDP dynamics are invariant under transformations of the state-action space that have been hypothesized in advance. We tested the proposed approach in a discrete toroidal grid environment and in two well-known environments of OpenAI's Gym Learning Suite. The results demonstrate that the model distributional shift is reduced when the dataset is augmented with the data obtained by using the detected symmetries, allowing for a more thorough and data-efficient learning of the transition functions.
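
The idea of checking a hypothesized symmetry against a batch of transitions can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a small deterministic toroidal grid, replaces the density estimator with plain empirical frequencies, and uses hypothetical names (`N`, `step`, `estimate_transitions`, `symmetry_score`, `translate`, `flip_action_only`) together with an arbitrarily chosen threshold of 0.9. A candidate transformation is applied to every transition in the batch, and the score is the mean estimated probability of the transformed transitions.

```python
from collections import Counter, defaultdict

N = 4  # side of the toroidal grid (hypothetical size)
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def step(state, action):
    """Deterministic move on the torus: coordinates wrap around modulo N."""
    x, y = state
    dx, dy = MOVES[action]
    return ((x + dx) % N, (y + dy) % N)


def estimate_transitions(batch):
    """Empirical frequencies of s' given (s, a); a stand-in for a density estimator."""
    counts = defaultdict(Counter)
    for s, a, s_next in batch:
        counts[(s, a)][s_next] += 1
    return {
        sa: {s2: c / sum(ctr.values()) for s2, c in ctr.items()}
        for sa, ctr in counts.items()
    }


def symmetry_score(batch, t_hat, transform):
    """Mean estimated probability of the transformed transitions.

    Transformed state-action pairs never seen in the batch are skipped,
    because the frequency estimate says nothing about them.
    """
    probs = []
    for s, a, s_next in batch:
        ks, ka, ks_next = transform(s, a, s_next)
        if (ks, ka) in t_hat:
            probs.append(t_hat[(ks, ka)].get(ks_next, 0.0))
    return sum(probs) / len(probs) if probs else float("nan")


def translate(s, a, s_next):
    """Candidate symmetry: shift both states one cell along x, keep the action."""
    def shift(p):
        return ((p[0] + 1) % N, p[1])
    return shift(s), a, shift(s_next)


def flip_action_only(s, a, s_next):
    """Candidate that is NOT a symmetry: reverse the action, leave the states alone."""
    swap = {"up": "down", "down": "up", "left": "right", "right": "left"}
    return s, swap[a], s_next


# Batch with full coverage of the state-action space (an idealised assumption).
batch = [((x, y), a, step((x, y), a))
         for x in range(N) for y in range(N) for a in MOVES]
t_hat = estimate_transitions(batch)

THRESHOLD = 0.9  # assumed acceptance threshold, chosen only for illustration
augmented = list(batch)
if symmetry_score(batch, t_hat, translate) >= THRESHOLD:
    # Accepted symmetry: add the transformed transitions to the dataset.
    augmented += [translate(*t) for t in batch]
```

With full coverage the translation scores 1.0 and the action flip scores 0.0, so only the translation is used for augmentation. On a real, partially covered batch the scores would be noisy, and the choice of estimator and threshold would matter much more than in this toy setting.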


