Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads

06/23/2020
by Jialin Ding, et al.

Filtering data based on predicates is one of the most fundamental operations for any modern data warehouse. Techniques to accelerate the execution of filter expressions include clustered indexes, specialized sort orders (e.g., Z-order), multi-dimensional indexes, and, for high-selectivity queries, secondary indexes. However, these schemes are hard to tune and their performance is inconsistent. Recent work on learned multi-dimensional indexes has introduced the idea of automatically optimizing an index for a particular dataset and workload. The performance of that work suffers, though, in the presence of correlated data and skewed query workloads, both of which are common in real applications. In this paper, we introduce Tsunami, which addresses these limitations to achieve up to 6X faster query performance and up to 8X smaller index size than existing learned multi-dimensional indexes, in addition to up to 11X faster query performance and 170X smaller index size than optimally tuned traditional indexes.

