On the Convergence of Policy Gradient in Robust MDPs

12/20/2022
by   Qiuhao Wang, et al.
0

Robust Markov decision processes (RMDPs) are promising models that provide reliable policies under ambiguities in model parameters. As opposed to nominal Markov decision processes (MDPs), however, the state-of-the-art solution methods for RMDPs are limited to value-based methods, such as value iteration and policy iteration. This paper proposes Double-Loop Robust Policy Gradient (DRPG), the first generic policy gradient method for RMDPs with a global convergence guarantee in tabular problems. Unlike value-based methods, DRPG does not rely on dynamic programming techniques. In particular, the inner-loop robust policy evaluation problem is solved via projected gradient descent. Finally, our experimental results demonstrate the performance of our algorithm and verify our theoretical guarantees.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/31/2023

Policy Gradient for s-Rectangular Robust Markov Decision Processes

We present a novel robust policy gradient method (RPG) for s-rectangular...
research
05/30/2023

Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets

We propose a policy gradient algorithm for robust infinite-horizon Marko...
research
08/01/2019

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Policy gradient methods are among the most effective methods in challeng...
research
10/13/2022

Reinforcement Learning with Unbiased Policy Evaluation and Linear Function Approximation

We provide performance guarantees for a variant of simulation-based poli...
research
05/08/2012

Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds

Approximate dynamic programming is a popular method for solving large Ma...
research
07/12/2022

Compactly Restrictable Metric Policy Optimization Problems

We study policy optimization problems for deterministic Markov decision ...
research
05/30/2023

Solving Robust MDPs through No-Regret Dynamics

Reinforcement Learning is a powerful framework for training agents to na...

Please sign up or login with your details

Forgot password? Click here to reset