A general Markov decision process formalism for action-state entropy-regularized reward maximization

02/02/2023
by Dmytro Grytskyy, et al.

Previous work has separately addressed different forms of action, state and action-state entropy regularization, pure exploration and space occupation. These problems have become highly relevant for regularization, generalization, speeding up learning and providing robust solutions. However, the methods used to solve them are heterogeneous, ranging from convex to non-convex optimization and from unconstrained to constrained optimization. Here we provide a general dual function formalism that transforms the constrained optimization problem into an unconstrained convex one for any mixture of action and state entropies. The cases of pure action entropy and pure state entropy are understood as limits of the mixture.
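To make the pure action entropy limit concrete, the sketch below runs standard soft (log-sum-exp) value iteration on a tiny tabular MDP. This is a well-known textbook technique for action-entropy-regularized reward maximization, not the dual function formalism of the paper, and it does not cover state entropy or the mixture. The function name `soft_value_iteration`, the temperature `alpha`, the discount `gamma`, and the two-state MDP are all assumptions made for illustration.

```python
import math

def soft_value_iteration(P, R, alpha=1.0, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Soft value iteration for an action-entropy-regularized tabular MDP.

    P[s][a] is a list of next-state probabilities, R[s][a] a scalar reward.
    alpha (entropy weight) and gamma (discount) are illustrative assumptions.
    """
    n_states = len(P)

    def q_values(V):
        # Q(s, a) = r(s, a) + gamma * E[V(s') | s, a]
        return [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
                 for a in range(len(P[s]))]
                for s in range(n_states)]

    def soft_max(qs):
        # alpha * log sum_a exp(Q / alpha), computed stably
        m = max(qs)
        return m + alpha * math.log(sum(math.exp((q - m) / alpha) for q in qs))

    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [soft_max(qs) for qs in q_values(V)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # The regularized optimal policy is a softmax over Q / alpha.
    policy = []
    for qs in q_values(V):
        m = max(qs)
        weights = [math.exp((q - m) / alpha) for q in qs]
        total = sum(weights)
        policy.append([w / total for w in weights])
    return V, policy

# Hypothetical toy MDP: action 0 moves to state 0, action 1 moves to state 1;
# only choosing action 1 while in state 1 is rewarded.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
R = [
    [0.0, 0.0],
    [0.0, 1.0],
]
V, policy = soft_value_iteration(P, R, alpha=0.5)
```

With a larger `alpha` the policy moves toward uniform action selection, and with a smaller `alpha` it concentrates on the reward-maximizing action.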
