Policy Gradient with Expected Quadratic Utility Maximization: A New Mean-Variance Approach in Reinforcement Learning

by   Masahiro Kato, et al.

In real-world decision-making problems, risk management is critical. Among various risk management approaches, the mean-variance criterion is one of the most widely used in practice. In this paper, we suggest expected quadratic utility maximization (EQUM) as a new framework for policy gradient style reinforcement learning (RL) algorithms with mean-variance control. The quadratic utility function is a common objective of risk management in finance and economics. The proposed EQUM framework has several interpretations, such as reward-constrained variance minimization and regularization, as well as agent utility maximization. In addition, the computation of the EQUM framework is easier than that of existing mean-variance RL methods, which require double sampling. In experiments, we demonstrate the effectiveness of the proposed framework in the benchmarks of RL and financial data.



There are no comments yet.


page 1

page 2

page 3

page 4


Policy Gradients with Variance Related Risk Criteria

Managing risk in dynamic decision problems is of cardinal importance in ...

Likelihood ratio-based policy gradient methods for distorted risk measures: A non-asymptotic analysis

We propose policy-gradient algorithms for solving the problem of control...

Risk-Averse Trust Region Optimization for Reward-Volatility Reduction

In real-world decision-making problems, for instance in the fields of fi...

Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint

The classic objective in a reinforcement learning (RL) problem is to fin...

Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient

Improving the resilience of a network protects the system from natural d...

Robust Risk-Aware Reinforcement Learning

We present a reinforcement learning (RL) approach for robust optimisatio...

A Block Coordinate Ascent Algorithm for Mean-Variance Optimization

Risk management in dynamic decision problems is a primary concern in man...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.