Learning the Trading Algorithm in Simulated Markets with Non-stationary Continuum Bandits

08/04/2022
by Bingde Liu, et al.

The basic Multi-Armed Bandits (MABs) problem is to maximize the total reward obtained from a bandit whose arms have different, unknown payoff distributions, given that only a finite number of pulls can be made. When studying trading algorithms in a market, we face one of the most complex variants of the MABs problem, namely the Non-stationary Continuum Bandits (NCBs) problem. The Bristol Stock Exchange (BSE) is a simple simulation of an electronic financial exchange based on a continuous double auction running via a limit order book. The market can be populated by automated trader agents with different trading algorithms. Among them, the PRSH algorithm embodies some basic ideas for solving NCBs problems. However, its hyperparameters are difficult to tune, and it struggles to adapt to changes in complex market conditions. We propose a new algorithm called PRB, which addresses the Continuum Bandits problem with Bayesian optimization and the Non-stationary Bandits problem with a novel "bandit-over-bandit" framework. With BSE, we use as many kinds of trader agents as possible to simulate a real market environment under two different market dynamics. We then examine the optimal hyperparameters of the PRSH and PRB algorithms under each market dynamic. Finally, by having trader agents that use each algorithm trade in the market at the same time, we demonstrate that PRB outperforms PRSH under both market dynamics. We also perform rigorous hypothesis testing on all experimental results to ensure their correctness.
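To make the NCBs setting concrete, the following is a minimal sketch of a bandit whose arms lie on a continuum and whose best arm drifts over time. It is not PRSH or PRB: it swaps in a much simpler textbook technique (epsilon-greedy over a discretised arm space with constant step-size value updates). The reward function `drifting_reward`, the function name `tracking_bandit`, and all parameter values are assumptions made purely for illustration.

```python
import random


def drifting_reward(arm, t, horizon, rng):
    """Hypothetical payoff: highest near a peak that drifts from 0.2 to 0.8."""
    peak = 0.2 + 0.6 * (t / horizon)
    return 1.0 - abs(arm - peak) + rng.gauss(0.0, 0.05)


def tracking_bandit(horizon=2000, n_bins=11, epsilon=0.1, alpha=0.3, seed=0):
    """Epsilon-greedy on a discretised continuum with constant step size.

    The constant step size alpha weights recent rewards more heavily,
    which lets the value estimates follow a non-stationary payoff.
    """
    rng = random.Random(seed)
    arms = [i / (n_bins - 1) for i in range(n_bins)]  # grid over [0, 1]
    q = [None] * n_bins                               # value estimate per arm
    total = 0.0
    choices = []
    for t in range(horizon):
        untried = [i for i, v in enumerate(q) if v is None]
        if untried:
            i = untried[0]                    # pull every arm once first
        elif rng.random() < epsilon:
            i = rng.randrange(n_bins)         # explore a random arm
        else:
            i = max(range(n_bins), key=lambda k: q[k])  # exploit best estimate
        r = drifting_reward(arms[i], t, horizon, rng)
        if q[i] is None:
            q[i] = r
        else:
            q[i] += alpha * (r - q[i])        # exponential recency weighting
        total += r
        choices.append(arms[i])
    return total, choices


if __name__ == "__main__":
    total, choices = tracking_bandit()
    early = sum(choices[:300]) / 300
    late = sum(choices[-300:]) / 300
    print(f"total reward: {total:.1f}")
    print(f"mean arm, first 300 pulls: {early:.2f}; last 300 pulls: {late:.2f}")
```

In this toy setting the chosen arm should move upward over the run as the peak drifts. Discretising the arm space and fixing `epsilon` and `alpha` by hand are exactly the kind of manual choices that become hard to tune when conditions change, which is the difficulty the abstract attributes to hyperparameter adjustment.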

