Bridging the Gap Between Theory and Practice on Insertion-Intensive Database

03/02/2020
by Sepanta Zeighami, et al.

With the prevalence of online platforms, data is now generated and accessed by users at a very high rate. Moreover, applications such as stock trading and high-frequency trading require guaranteed low delays for every database operation. It is therefore important to design databases that sustain insertions and queries at a consistently high rate without introducing long delays during insertion. In this paper, we propose Nested B-trees (NB-trees), an index that achieves a consistently high insertion rate on large volumes of data while providing asymptotically optimal query performance that is also very efficient in practice. NB-trees support insertions at rates higher than LSM-trees, the state-of-the-art index for insertion-intensive workloads, while avoiding their long insertion delays and improving on their query performance. When complemented with Bloom filters, they approach the query performance of B-trees. In our experiments, NB-trees had worst-case delays up to 1000 times smaller than those of LevelDB, RocksDB and bLSM, three commonly used LSM-tree data stores; they performed queries more than 4 times faster than LevelDB and 1.5 times faster than bLSM and RocksDB, while also outperforming all three in average insertion rate.
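The abstract notes that Bloom filters bring query performance close to that of B-trees. As a rough illustration of why a filter helps when a lookup may have to consult several sorted components, here is a minimal Python sketch. It is not the NB-tree implementation: `BloomFilter`, `FilteredComponent`, `lookup`, and the sizes chosen (1024 bits, 3 hashes) are all hypothetical names and parameters picked for the example, and a plain dictionary stands in for an on-disk sorted structure.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: never a false negative, occasionally a false positive."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


class FilteredComponent:
    """Hypothetical index component (a dict stands in for a sorted on-disk structure)."""

    def __init__(self, items: dict):
        self.items = dict(items)
        self.filter = BloomFilter()
        for key in self.items:
            self.filter.add(key)
        self.searches = 0  # counts how often the (expensive) structure is searched

    def get(self, key: str):
        # Skip the search entirely when the filter rules the key out.
        if not self.filter.might_contain(key):
            return None
        self.searches += 1
        return self.items.get(key)


def lookup(components, key: str):
    """Check components in order and return the first value found, else None."""
    for component in components:
        value = component.get(key)
        if value is not None:
            return value
    return None
```

In this sketch, a query for a key that lives in the last component only pays for a real search in components whose filter reports a possible match, which is the general reason filters narrow the gap between a multi-component index and a single B-tree. The false-positive rate depends on the bit-array size and the number of hashes, which would need tuning for a real workload.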


