Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint

10/22/2018
by Prashanth L. A., et al.

The classic objective in a reinforcement learning (RL) problem is to find a policy that minimizes, in expectation, a long-run objective such as the infinite-horizon discounted or long-run average cost. In many practical applications, optimizing the expected value alone is not sufficient, and it may be necessary to include a risk measure in the optimization process, either as the objective or as a constraint. Various risk measures have been proposed in the literature, including the mean-variance tradeoff, exponential utility, percentile performance, value at risk, conditional value at risk, and prospect theory together with its later enhancement, cumulative prospect theory. In this article, we focus on the combination of risk criteria and reinforcement learning in a constrained optimization framework, i.e., a setting where the goal is to find a policy that optimizes the usual objective of infinite-horizon discounted/average cost, while ensuring that an explicit risk constraint is satisfied. We introduce the risk-constrained RL framework, cover popular risk measures based on variance, conditional value at risk, and cumulative prospect theory, and present a template for a risk-sensitive RL algorithm. We survey some of our recent work on this topic, covering discounted cost, average cost, and stochastic shortest path settings, together with the aforementioned risk measures in a constrained framework. This non-exhaustive survey aims to give a flavor of the challenges involved in solving a risk-sensitive RL problem and to outline some potential future research directions.
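
As a rough illustration of the constrained viewpoint described above, the sketch below estimates the conditional value at risk of a batch of sampled costs and forms a Lagrangian that adds a penalty when the risk exceeds a threshold. This is a minimal sketch, not the algorithm from the article: the function names `empirical_cvar` and `lagrangian`, the threshold `kappa`, the multiplier `lam`, and the simple sample-average estimator are all assumptions made for illustration.

```python
import numpy as np


def empirical_cvar(costs, alpha=0.95):
    """Return (VaR, CVaR) of sampled costs at level alpha.

    Assumes higher cost is worse. VaR is the empirical alpha-quantile and
    CVaR is the mean of the samples at or above it (a simple estimator).
    """
    costs = np.asarray(costs, dtype=float)
    var = np.quantile(costs, alpha)
    tail = costs[costs >= var]
    return float(var), float(tail.mean())


def lagrangian(costs, lam, kappa, alpha=0.95):
    """Expected cost plus lam times the constraint violation (CVaR - kappa).

    Hypothetical relaxation of: minimize E[cost] subject to CVaR <= kappa.
    """
    _, cvar = empirical_cvar(costs, alpha)
    return float(np.mean(costs) + lam * (cvar - kappa))


if __name__ == "__main__":
    sample_costs = np.arange(1, 11)  # illustrative costs 1, 2, ..., 10
    print(empirical_cvar(sample_costs, alpha=0.8))
    print(lagrangian(sample_costs, lam=2.0, kappa=9.0, alpha=0.8))
```

In a scheme of this kind, the policy would typically be adjusted to decrease the Lagrangian while the multiplier is increased whenever the risk constraint is violated; how the gradient of the risk term is estimated depends on the risk measure and is not shown here.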


