Likelihood ratio-based policy gradient methods for distorted risk measures: A non-asymptotic analysis

07/09/2021
by Nithia Vijayan, et al.

We propose policy gradient algorithms for solving the control problem in a risk-sensitive reinforcement learning (RL) context. The objective of our algorithms is to maximize the distorted risk measure (DRM) of the cumulative reward in an episodic Markov decision process (MDP). We derive a variant of the policy gradient theorem that caters to the DRM objective. Using this theorem in conjunction with a likelihood ratio (LR) based gradient estimation scheme, we propose policy gradient algorithms for optimizing the DRM in both on-policy and off-policy RL settings. We derive non-asymptotic bounds that establish the convergence of our algorithms to an approximate stationary point of the DRM objective.
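To make the DRM objective concrete, the sketch below estimates a distortion-type risk measure from sampled episode returns using the standard order-statistic form of the Choquet integral, with the empirical distribution standing in for the true one. It is an illustration only and not the estimator or algorithm from the paper: the function names, the example distortion g(u) = u^2, and the sample returns are all assumptions made for this example.

```python
def empirical_drm(returns, distortion):
    """Estimate a distortion risk measure from sampled cumulative rewards.

    Hypothetical helper: applies the distortion function g to the
    empirical survival probabilities and weights the sorted returns.
    Assumes g(0) = 0, g(1) = 1 and g is non-decreasing.
    """
    x = sorted(float(r) for r in returns)  # ascending order statistics
    n = len(x)
    total = 0.0
    for i, value in enumerate(x, start=1):
        # Weight on the i-th smallest return: g((n-i+1)/n) - g((n-i)/n).
        weight = distortion((n - i + 1) / n) - distortion((n - i) / n)
        total += weight * value
    return total


def identity(u):
    # g(u) = u gives equal weights 1/n, i.e. the ordinary sample mean.
    return u


def convex_distortion(u):
    # Assumed example distortion g(u) = u**2; it puts more weight on
    # the smallest returns, giving a pessimistic (risk-averse) value.
    return u ** 2


sample_returns = [1.0, 2.0, 3.0, 4.0]  # made-up returns for illustration
mean_value = empirical_drm(sample_returns, identity)
averse_value = empirical_drm(sample_returns, convex_distortion)
```

With the identity distortion the estimate reduces to the sample mean, which is a quick sanity check; other choices of the distortion function reweight the ordered returns and so encode different risk attitudes.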


