Risk-sensitive Actor-free Policy via Convex Optimization

06/30/2023
by Ruoqi Zhang, et al.

Traditional reinforcement learning methods optimize agents without considering safety, potentially resulting in unintended consequences. In this paper, we propose an optimal actor-free policy that optimizes a risk-sensitive criterion based on the conditional value at risk. The risk-sensitive objective function is modeled using an input-convex neural network, which ensures convexity with respect to the actions and enables the identification of globally optimal actions through simple gradient-following methods. Experimental results demonstrate the efficacy of our approach in maintaining effective risk control.
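
The two ingredients named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the names `empirical_cvar`, `TinyICNN`, and `best_action` are hypothetical, the weights are random rather than trained, and the quadratic term and the action box [-1, 1] are assumptions added so that the toy problem has a unique minimizer. The sketch shows a sample-based conditional value at risk, and a one-hidden-layer model that is convex in the action (softplus units combined with non-negative output weights), so that projected gradient descent reaches the same action from different starting points.

```python
import numpy as np


def empirical_cvar(losses, alpha):
    """Average of the worst ceil((1 - alpha) * n) sampled losses."""
    losses = np.sort(np.asarray(losses, dtype=float))
    k = max(1, int(np.ceil((1.0 - alpha) * losses.size)))
    return losses[-k:].mean()


class TinyICNN:
    """Toy scalar risk model f(s, a) that is convex in the action a."""

    def __init__(self, state_dim, action_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W_s = rng.normal(size=(hidden, state_dim))   # state path, unconstrained
        self.W_a = rng.normal(size=(hidden, action_dim))  # action path, affine in a
        self.b = rng.normal(size=hidden)
        self.w_z = np.abs(rng.normal(size=hidden))  # non-negative: preserves convexity
        self.w_a = rng.normal(size=action_dim)      # direct linear term in a
        self.reg = 1.0  # assumption: quadratic term that makes the minimizer unique

    def _pre(self, s, a):
        return self.W_s @ s + self.W_a @ a + self.b

    def value(self, s, a):
        z = np.logaddexp(0.0, self._pre(s, a))  # softplus: convex and non-decreasing
        return self.w_z @ z + self.w_a @ a + 0.5 * self.reg * (a @ a)

    def grad_a(self, s, a):
        sig = 1.0 / (1.0 + np.exp(-self._pre(s, a)))  # derivative of softplus
        return self.W_a.T @ (self.w_z * sig) + self.w_a + self.reg * a

    def smoothness(self):
        # Upper bound on the Hessian norm in a; softplus'' is at most 0.25.
        return 0.25 * self.w_z.max() * np.linalg.norm(self.W_a, 2) ** 2 + self.reg


def best_action(model, s, a0, steps=2000):
    """Projected gradient descent on the action, clipped to the box [-1, 1]."""
    step = 1.0 / model.smoothness()
    a = np.array(a0, dtype=float)
    for _ in range(steps):
        a = np.clip(a - step * model.grad_a(s, a), -1.0, 1.0)
    return a


model = TinyICNN(state_dim=3, action_dim=2)
state = np.array([0.5, -0.2, 0.1])
a_from_low = best_action(model, state, [-1.0, -1.0])
a_from_high = best_action(model, state, [1.0, 1.0])
```

Because the modeled objective is convex in the action and the feasible set is a box, the two runs above should agree up to numerical tolerance, which is the property that removes the need for a separate actor network in this kind of setup.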

research
12/26/2021

Reinforcement Learning with Dynamic Convex Risk Measures

We develop an approach for solving time-consistent risk-sensitive stocha...
research
06/12/2014

Algorithms for CVaR Optimization in MDPs

In many sequential decision-making problems we may want to manage risk b...
research
06/20/2020

Entropic Risk Constrained Soft-Robust Policy Optimization

Having a perfect model to compute the optimal policy is often infeasible...
research
07/02/2023

Is Risk-Sensitive Reinforcement Learning Properly Resolved?

Due to the nature of risk management in learning applicable policies, ri...
research
11/09/2019

Worst Cases Policy Gradients

Recent advances in deep reinforcement learning have demonstrated the cap...
research
01/24/2022

TOPS: Transition-based VOlatility-controlled Policy Search and its Global Convergence

Risk-averse problems receive far less attention than risk-neutral contro...
research
01/24/2019

Fairness risk measures

Ensuring that classifiers are non-discriminatory or fair with respect to...