Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework

04/25/2019
by Haoran Wang, et al.
We approach the continuous-time mean-variance (MV) portfolio selection problem with reinforcement learning (RL). The problem is to achieve the best tradeoff between exploration and exploitation, and is formulated as an entropy-regularized, relaxed stochastic control problem. We prove that the optimal feedback policy for this problem must be Gaussian, with time-decaying variance. We then establish connections between the entropy-regularized MV problem and the classical MV problem, including the solvability equivalence and the convergence as the exploration weighting parameter decays to zero. Finally, we prove a policy improvement theorem, based on which we devise an implementable RL algorithm. In our simulations, this algorithm outperforms both an adaptive-control-based method and a deep-neural-network-based algorithm by a large margin.
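
To make the phrase "Gaussian, with time-decaying variance" concrete, the following Python sketch samples an allocation from such a policy. The abstract states only the qualitative shape of the policy; the specific mean and variance expressions below, along with the names `lam` (exploration weight), `rho`, `sigma`, `w` (a target-related level) and `horizon`, are assumptions chosen for illustration and should not be read as the paper's exact result.

```python
import math
import random


def policy_variance(t, horizon, lam, sigma, rho):
    """Exploration variance at time t.

    Assumed illustrative form: proportional to the exploration weight `lam`
    and shrinking exponentially as t approaches the horizon.
    """
    return lam / (2.0 * sigma ** 2) * math.exp(rho ** 2 * (horizon - t))


def policy_mean(x, w, sigma, rho):
    """Mean allocation, assumed linear in the gap between wealth x and level w."""
    return -(rho / sigma) * (x - w)


def sample_allocation(t, x, w, horizon, lam, sigma, rho, rng):
    """Draw one allocation from the Gaussian feedback policy at (t, x)."""
    mean = policy_mean(x, w, sigma, rho)
    std = math.sqrt(policy_variance(t, horizon, lam, sigma, rho))
    return rng.gauss(mean, std)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameter values, chosen only to exercise the functions.
    params = dict(horizon=1.0, lam=2.0, sigma=0.2, rho=0.5)
    for t in (0.0, 0.5, 1.0):
        var = policy_variance(t, **params)
        u = sample_allocation(t, x=1.0, w=1.5, rng=rng, **params)
        print(f"t={t:.1f}  variance={var:.4f}  sampled allocation={u:.4f}")
```

Under these assumed forms, exploration is widest at the start of the horizon and narrowest at the end. Setting `lam` to zero collapses the policy onto its mean, which mirrors the convergence to the classical MV solution as the exploration weight decays to zero.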

