Regret Bounds for Risk-Sensitive Reinforcement Learning

10/11/2022
by O. Bastani et al.

In safety-critical applications of reinforcement learning, such as healthcare and robotics, it is often desirable to optimize risk-sensitive objectives that account for tail outcomes rather than expected reward. We prove the first regret bounds for reinforcement learning under a general class of risk-sensitive objectives, including the popular conditional value at risk (CVaR) objective. Our theory is based on a novel characterization of the CVaR objective as well as a novel optimistic Markov decision process (MDP) construction.
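
To make the CVaR objective concrete: by the standard Rockafellar-Uryasev representation, CVaR_alpha(Z) = max_tau { tau - E[(tau - Z)^+] / alpha }, which equals the mean of the worst alpha-fraction of outcomes of Z. The sketch below is an illustrative plug-in estimator of CVaR from sampled returns, not the paper's algorithm; the function name `empirical_cvar` and the Gaussian test data are assumptions made for the example.

```python
import numpy as np

def empirical_cvar(returns, alpha=0.1):
    """Plug-in estimate of CVaR_alpha for a reward distribution.

    Uses the Rockafellar-Uryasev form
        CVaR_alpha(Z) = max_tau { tau - E[(tau - Z)^+] / alpha },
    whose maximizer is the alpha-quantile (VaR), so the estimate is
    the empirical quantile minus the mean shortfall below it.
    """
    z = np.asarray(returns, dtype=float)
    tau = np.quantile(z, alpha)            # empirical VaR_alpha
    shortfall = np.maximum(tau - z, 0.0)   # (tau - Z)^+
    return tau - shortfall.mean() / alpha

# Toy check (assumed setup): for returns ~ N(1, 1), CVaR_0.05 lies far
# below the mean of 1.0, since it averages only the worst 5% of outcomes.
rng = np.random.default_rng(0)
print(empirical_cvar(rng.normal(1.0, 1.0, 100_000), alpha=0.05))
```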

Related Research

06/22/2020
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
We study risk-sensitive reinforcement learning in episodic Markov decisi...

07/02/2023
Is Risk-Sensitive Reinforcement Learning Properly Resolved?
Due to the nature of risk management in learning applicable policies, ri...

06/06/2022
Risk-Sensitive Reinforcement Learning: Iterated CVaR and the Worst Path
In this paper, we study a novel episodic risk-sensitive Reinforcement Le...

11/28/2017
Risk-sensitive Inverse Reinforcement Learning via Semi- and Non-Parametric Methods
The literature on Inverse Reinforcement Learning (IRL) typically assumes...

11/06/2021
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning
We study risk-sensitive reinforcement learning (RL) based on the entropi...

08/19/2022
A Risk-Sensitive Approach to Policy Optimization
Standard deep reinforcement learning (DRL) aims to maximize expected rew...

05/26/2017
Risk-Sensitive Cooperative Games for Human-Machine Systems
Autonomous systems can substantially enhance a human's efficiency and ef...
