Risk-Sensitive Reinforcement Learning: Iterated CVaR and the Worst Path

06/06/2022
by   Yihan Du, et al.

In this paper, we study a novel episodic risk-sensitive Reinforcement Learning (RL) problem, named Iterated CVaR RL, where the objective is to maximize the tail of the reward-to-go at each step. Unlike existing risk-aware RL formulations, Iterated CVaR RL focuses on safety at all times by enabling the agent to tightly control the risk of getting into catastrophic situations at each stage. It is applicable to important risk-sensitive tasks that demand strong safety guarantees throughout the decision process, such as autonomous driving, clinical treatment planning and robotics. We investigate Iterated CVaR RL with two performance metrics, namely Regret Minimization and Best Policy Identification. For these two metrics, we design efficient algorithms, ICVaR-RM and ICVaR-BPI respectively, and provide matching upper and lower bounds with respect to the number of episodes K. We also investigate an interesting limiting case of Iterated CVaR RL, called Worst Path RL, where the objective becomes maximizing the minimum possible cumulative reward, and propose an efficient algorithm with constant upper and lower bounds. Finally, the techniques we develop for bounding the change of CVaR due to the value function shift and for decomposing the regret via a distorted visitation distribution are novel and can find applications in other risk-sensitive online learning problems.
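To make the objective concrete, here is a minimal sketch of an iterated CVaR backup on a toy MDP with known transitions. It is an illustration only, not the ICVaR-RM or ICVaR-BPI algorithms, which learn from episodes. It assumes the standard lower-tail definition of CVaR (the mean of the worst alpha-fraction of outcomes) and assumes that "the tail of the reward-to-go at each step" means a Bellman-style recursion in which each action is scored by its reward plus the CVaR of the next-step value; the abstract does not state the formula. The names cvar_lower and iterated_cvar_values and the two-state MDP are hypothetical.

```python
import numpy as np


def cvar_lower(values, probs, alpha):
    """Mean of the worst alpha-fraction (lower tail) of a discrete distribution."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(values)              # worst (smallest) outcomes first
    v, p = values[order], probs[order]
    # Probability mass already consumed before each sorted outcome.
    cum_before = np.concatenate(([0.0], np.cumsum(p)[:-1]))
    # Fill the tail budget alpha starting from the worst outcome.
    taken = np.clip(alpha - cum_before, 0.0, p)
    return float(np.dot(taken, v) / alpha)


def iterated_cvar_values(P, R, H, alpha):
    """Backward recursion (assumed form): V_h(s) = max_a R[s,a] + CVaR_alpha(V_{h+1}).

    P[s, a, s2] are known transition probabilities, R[s, a] are rewards,
    H is the episode length. Returns an array V of shape (H + 1, S).
    """
    S, A, _ = P.shape
    V = np.zeros((H + 1, S))
    for h in range(H - 1, -1, -1):
        for s in range(S):
            V[h, s] = max(
                R[s, a] + cvar_lower(V[h + 1], P[s, a], alpha) for a in range(A)
            )
    return V


# Hypothetical toy MDP: state 0 is "operating", state 1 is an absorbing "crash".
# Action 0 (safe):  reward 0.5, always stays in state 0.
# Action 1 (risky): reward 1.0, crashes with probability 0.1.
P = np.zeros((2, 2, 2))
P[0, 0] = [1.0, 0.0]
P[0, 1] = [0.9, 0.1]
P[1, :, 1] = 1.0
R = np.array([[0.5, 1.0],
              [0.0, 0.0]])

V_neutral = iterated_cvar_values(P, R, H=3, alpha=1.0)    # alpha = 1 is plain expectation
V_cautious = iterated_cvar_values(P, R, H=3, alpha=0.05)  # alpha below the crash probability
```

With alpha = 1 the recursion reduces to ordinary expected-value dynamic programming. Once alpha drops below the smallest positive transition probability, the CVaR term equals the minimum next-step value over reachable states, which mirrors the Worst Path limit described in the abstract.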


