DSLOB: A Synthetic Limit Order Book Dataset for Benchmarking Forecasting Algorithms under Distributional Shift

11/17/2022
by   Defu Cao, et al.

In electronic trading markets, limit order books (LOBs) provide information about pending buy/sell orders at various price levels for a given security. Recently, there has been growing interest in using LOB data for downstream machine learning tasks (e.g., forecasting). However, dealing with out-of-distribution (OOD) LOB data is challenging because distributional shifts are unlabeled in currently available public LOB datasets. It is therefore critical to build a synthetic LOB dataset with labeled OOD samples that can serve as a testbed for developing models that generalize well to unseen scenarios. In this work, we use a multi-agent market simulator to build a synthetic LOB dataset, named DSLOB, with and without market stress scenarios, which allows for the design of controlled distributional shift benchmarks. Using the proposed synthetic dataset, we provide a holistic analysis of the forecasting performance of three state-of-the-art forecasting methods. Our results reflect the need for increased research effort to develop algorithms that are robust to distributional shifts in high-frequency time series data.
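To make the benchmarking idea concrete, the following is a minimal sketch of how forecasting error could be reported separately for normal and stressed samples once each LOB snapshot carries a shift label. It is an illustration only: the `LOBSnapshot` class, its `stressed` flag, the `persistence_forecast` baseline, and `evaluate_by_regime` are hypothetical names, not the DSLOB schema or any of the three methods evaluated in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class LOBSnapshot:
    """One LOB state. Field names are hypothetical, not the DSLOB schema."""
    bids: List[Tuple[float, int]]  # (price, size), best (highest) bid first
    asks: List[Tuple[float, int]]  # (price, size), best (lowest) ask first
    stressed: bool = False         # assumed label: True under market stress

    def mid_price(self) -> float:
        # Midpoint between the best bid and the best ask.
        return (self.bids[0][0] + self.asks[0][0]) / 2.0


def persistence_forecast(snapshot: LOBSnapshot) -> float:
    # Naive baseline: predict that the next mid-price equals the current one.
    return snapshot.mid_price()


def evaluate_by_regime(
    snapshots: List[LOBSnapshot],
    forecast: Callable[[LOBSnapshot], float],
) -> Dict[str, Optional[float]]:
    """Mean absolute error of one-step mid-price forecasts, split by label."""
    errors: Dict[str, List[float]] = {"in_distribution": [], "ood": []}
    for current, nxt in zip(snapshots, snapshots[1:]):
        # The error is attributed to the regime of the snapshot being predicted.
        key = "ood" if nxt.stressed else "in_distribution"
        errors[key].append(abs(forecast(current) - nxt.mid_price()))
    return {k: (sum(v) / len(v) if v else None) for k, v in errors.items()}
```

Calling `evaluate_by_regime(snapshots, persistence_forecast)` returns one error value per regime, so a gap between the `in_distribution` and `ood` entries gives a rough indication of how much a forecaster degrades under shift. Labeled stress scenarios are what make this split possible.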

