Entropic Risk Constrained Soft-Robust Policy Optimization

06/20/2020
by Reazul Hasan Russel, et al.

Having a perfect model with which to compute the optimal policy is often infeasible in reinforcement learning. In high-stakes domains it is important to quantify and manage the risk induced by model uncertainty. The entropic risk measure is an exponential utility-based convex risk measure that satisfies many reasonable properties. In this paper, we propose entropic risk constrained policy gradient and actor-critic algorithms that are risk-averse to model uncertainty. We demonstrate the usefulness of our algorithms on several problem domains.
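
The abstract does not spell out the definition, but the entropic risk measure it builds on is standardly given, for a return X and risk-aversion parameter beta > 0, by ERM_beta[X] = -(1/beta) log E[exp(-beta X)], which recovers the expected return as beta -> 0 and increasingly penalizes low-return outcomes as beta grows. Below is a minimal NumPy sketch of estimating this quantity from returns sampled under different plausible models of the environment; the function name, sample values, and beta settings are illustrative and not taken from the paper.

import numpy as np

def entropic_risk(returns, beta):
    """Entropic risk of a batch of sampled returns.

    Computes -(1/beta) * log( mean( exp(-beta * returns) ) ),
    the standard entropic risk measure for rewards to be maximized.
    """
    x = -beta * np.asarray(returns, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    m = x.max()
    return -(m + np.log(np.exp(x - m).mean())) / beta

# Illustrative returns of one policy under several plausible models.
returns_under_models = np.array([10.0, 9.5, 2.0, 11.0, 8.7])
for beta in (0.1, 1.0, 5.0):
    print(f"beta={beta}: entropic risk = {entropic_risk(returns_under_models, beta):.3f}")

As beta increases, the value moves from roughly the average return toward the worst-case return, which is the sense in which a constraint on this measure makes an algorithm risk-averse to model uncertainty.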

Related research

Soft-Robust Actor-Critic Policy-Gradient (03/11/2018)
Robust Reinforcement Learning aims to derive an optimal behavior that ac...

Risk-Sensitive Reinforcement Learning with Exponential Criteria (12/18/2022)
While risk-neutral reinforcement learning has shown experimental success...

Lyapunov Robust Constrained-MDPs: Soft-Constrained Robustly Stable Policy Optimization under Model Uncertainty (08/05/2021)
Safety and robustness are two desired properties for any reinforcement l...

Robust Reinforcement Learning with Wasserstein Constraint (06/01/2020)
Robust Reinforcement Learning aims to find the optimal policy with some ...

TOPS: Transition-based VOlatility-controlled Policy Search and its Global Convergence (01/24/2022)
Risk-averse problems receive far less attention than risk-neutral contro...

Risk-sensitive Actor-free Policy via Convex Optimization (06/30/2023)
Traditional reinforcement learning methods optimize agents without consi...

Efficient Risk-Averse Reinforcement Learning (05/10/2022)
In risk-averse reinforcement learning (RL), the goal is to optimize some...
