Large scale continuous-time mean-variance portfolio allocation via reinforcement learning

07/26/2019
by   Haoran Wang, et al.

We propose to solve the large-scale Markowitz mean-variance (MV) portfolio allocation problem using reinforcement learning (RL). By adopting the recently developed continuous-time exploratory control framework, we formulate the exploratory MV problem in high dimensions. We further show the optimality of a multivariate Gaussian feedback policy, with time-decaying variance, in trading off exploration and exploitation. Based on a provable policy improvement theorem, we devise a scalable and data-efficient RL algorithm and conduct large-scale empirical tests using data from the S&P 500 stocks. We find that our method consistently achieves over 10% annualized returns and outperforms econometric methods and the deep RL method by large margins, for both long and medium terms of investment with monthly and daily trading.
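To make the idea of a Gaussian feedback policy with time-decaying variance concrete, here is a minimal sketch, not the paper's algorithm. It assumes a diagonal covariance and an exponential decay in the remaining horizon; the function names, the parameters `base_var` and `decay_rate`, and the decay form itself are illustrative assumptions that the abstract does not specify.

```python
import math
import random


def exploration_variance(t, horizon, base_var, decay_rate):
    """Assumed exploration variance: largest at t = 0 and shrinking to
    base_var as t approaches the horizon (hypothetical exponential form)."""
    return base_var * math.exp(decay_rate * (horizon - t))


def sample_allocation(mean, t, horizon, base_var=0.01, decay_rate=1.0, rng=random):
    """Draw one allocation vector from a Gaussian policy centred on `mean`.

    `mean` stands in for the feedback (state-dependent) part of the policy;
    how it is computed from wealth is left out of this sketch.
    The covariance is assumed diagonal, so each asset is sampled independently.
    """
    std = math.sqrt(exploration_variance(t, horizon, base_var, decay_rate))
    return [rng.gauss(m, std) for m in mean]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that runs are repeatable
    mean = [0.10, -0.20, 0.30]  # hypothetical mean allocations for three assets
    for t in (0.0, 0.5, 1.0):
        print(t, sample_allocation(mean, t, horizon=1.0, rng=rng))
```

Under these assumptions, samples drawn early in the horizon scatter more widely around the mean (more exploration) than samples drawn near the end (more exploitation).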

research
04/25/2019

Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework

We approach the continuous-time mean-variance (MV) portfolio selection w...
research
04/25/2019

Continuous-Time Mean-Variance Portfolio Optimization via Reinforcement Learning

We consider continuous-time Mean-variance (MV) portfolio optimization pr...
research
08/17/2022

Choquet regularization for reinforcement learning

We propose Choquet regularizers to measure and manage the level of explo...
research
12/04/2018

Exploration versus exploitation in reinforcement learning: a stochastic control approach

We consider reinforcement learning (RL) in continuous time and study the...
research
07/02/2022

q-Learning in Continuous Time

We study the continuous-time counterpart of Q-learning for reinforcement...
research
10/05/2018

PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation

Proximal Policy Optimization (PPO) is a highly popular model-free reinfo...
research
03/05/2021

Automatic Exploration Process Adjustment for Safe Reinforcement Learning with Joint Chance Constraint Satisfaction

In reinforcement learning (RL) algorithms, exploratory control inputs ar...