Policy Gradient for s-Rectangular Robust Markov Decision Processes

01/31/2023
by Navdeep Kumar, et al.

We present a novel robust policy gradient (RPG) method for s-rectangular robust Markov Decision Processes (MDPs). We are the first to derive the adversarial kernel in closed form, and we show that it is a rank-one perturbation of the nominal kernel. This lets us derive an RPG update that mirrors the one used in non-robust MDPs, except that it uses the robust Q-value function and an additional correction term. Both the robust Q-values and the correction term are efficiently computable, so the time complexity of our method matches that of non-robust MDPs and is significantly lower than that of existing black-box methods.
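
To make the structure of the method concrete, below is a minimal sketch of a robust policy-gradient step on a toy tabular MDP. It is an illustration, not the paper's algorithm: the uncertainty set is a small finite collection of kernels and the adversarial kernel is found by brute-force enumeration, which is exactly the kind of black-box inner step the paper eliminates; the closed-form rank-one adversarial kernel and the correction term from the abstract are not reproduced here. All names (`robust_pg_step`, `evaluate`, the softmax parameterization, the toy kernels) are assumptions made for the sketch.

```python
import numpy as np

def softmax(theta):
    """Row-wise softmax policy: theta has shape (S, A)."""
    z = theta - theta.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def evaluate(pi, P, r, gamma):
    """Exact policy evaluation under a single kernel P of shape (S, A, S)."""
    S, _ = r.shape
    P_pi = np.einsum("sa,sax->sx", pi, P)   # state-to-state kernel under pi
    r_pi = (pi * r).sum(axis=1)
    v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    q = r + gamma * np.einsum("sax,x->sa", P, v)
    return v, q, P_pi

def robust_pg_step(theta, kernels, r, mu, gamma, lr):
    """One robust policy-gradient step against a finite uncertainty set.

    The adversarial kernel is found by enumeration here; the paper instead
    gives it in closed form for s-rectangular sets (a rank-one perturbation
    of the nominal kernel), which removes this inner loop entirely.
    """
    pi = softmax(theta)
    evals = [evaluate(pi, P, r, gamma) for P in kernels]
    worst = min(range(len(kernels)), key=lambda i: mu @ evals[i][0])
    v, q, P_pi = evals[worst]               # values at the worst-case kernel
    # Discounted state occupancy under the worst-case kernel.
    S = len(mu)
    d = np.linalg.solve(np.eye(S) - gamma * P_pi.T, (1 - gamma) * mu)
    # Softmax policy gradient with the (robust) Q-values:
    # grad[s, a] = d(s) * pi(a|s) * (Q(s, a) - V(s)) / (1 - gamma).
    grad = d[:, None] * pi * (q - v[:, None]) / (1 - gamma)
    return theta + lr * grad, mu @ v        # ascent on the worst-case return

# Toy usage: a handful of random candidate kernels as the uncertainty set.
rng = np.random.default_rng(0)
S, A = 4, 2
r = rng.random((S, A))
kernels = [rng.dirichlet(np.ones(S), size=(S, A)) for _ in range(4)]
theta, mu = np.zeros((S, A)), np.ones(S) / S
for _ in range(200):
    theta, J = robust_pg_step(theta, kernels, r, mu, gamma=0.9, lr=0.05)
print("worst-case return:", J)
```

For s-rectangular sets, the paper's closed-form adversarial kernel would replace the enumeration over `kernels`, and the correction term would be added to `grad`; since both are efficiently computable, the per-step cost matches that of non-robust policy gradient.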


Related research

An Efficient Solution to s-Rectangular Robust Markov Decision Processes (01/31/2023)
We present an efficient robust value iteration for s-rectangular robust M...

On the Convergence of Policy Gradient in Robust MDPs (12/20/2022)
Robust Markov decision processes (RMDPs) are promising models that provi...

A Gauss-Newton Method for Markov Decision Processes (07/29/2015)
Approximate Newton methods are a standard optimization tool which aim to...

Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets (05/30/2023)
We propose a policy gradient algorithm for robust infinite-horizon Marko...

Avoiding Model Estimation in Robust Markov Decision Processes with a Generative Model (02/02/2023)
Robust Markov Decision Processes (MDPs) are getting more attention for l...

Solving Robust MDPs through No-Regret Dynamics (05/30/2023)
Reinforcement Learning is a powerful framework for training agents to na...

Twice regularized MDPs and the equivalence between robustness and regularization (10/12/2021)
Robust Markov decision processes (MDPs) aim to handle changing or partia...
