Learned Indexes for Dynamic Workloads

02/02/2019
by Chuzhe Tang, et al.

The recent proposal of learned index structures opens up a new perspective on how traditional range indexes can be optimized. However, current learned indexes assume that the data distribution is relatively static and the access pattern is uniform, while real-world workloads have skewed query distributions and evolving data. In this paper, we demonstrate that failing to account for access patterns and dynamic data distributions notably hinders the applicability of learned indexes. To this end, we propose solutions for learned indexes for dynamic workloads (called Doraemon). To improve the latency of skewed queries, Doraemon augments the training data with access frequencies. To address slow model re-training when the data distribution shifts, Doraemon caches previously trained models and incrementally fine-tunes them for similar access patterns and data distributions. Our preliminary results show that Doraemon improves query latency by 45.1% and reduces model re-training time to 1/20.
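The idea of augmenting training data with access frequencies can be illustrated with a small sketch. It is not the paper's implementation: it assumes a single linear model over sorted, unique integer keys, and it treats "augmenting" as weighting each key by one plus its access count in a least-squares fit. All names here (`fit_linear`, `train`, `predict`, `lookup`, `mean_abs_error`) and the synthetic data are hypothetical.

```python
import bisect

def fit_linear(keys, weights):
    """Weighted least-squares fit of position ~ slope * key + intercept."""
    total = sum(weights)
    mean_k = sum(w * k for w, k in zip(weights, keys)) / total
    mean_p = sum(w * p for p, w in enumerate(weights)) / total
    cov = sum(w * (k - mean_k) * (p - mean_p)
              for p, (w, k) in enumerate(zip(weights, keys)))
    var = sum(w * (k - mean_k) ** 2 for w, k in zip(weights, keys))
    slope = cov / var if var else 0.0
    return slope, mean_p - slope * mean_k

def train(keys, access_counts=None):
    """Assumption: a key's weight is 1 + its access count (0 if unseen)."""
    counts = access_counts or {}
    weights = [1 + counts.get(k, 0) for k in keys]
    return fit_linear(keys, weights)

def predict(model, key, n):
    """Predicted slot for key, clamped to the valid range [0, n - 1]."""
    slope, intercept = model
    return min(max(round(slope * key + intercept), 0), n - 1)

def lookup(keys, model, key):
    """Gallop outward from the predicted slot, then binary-search the
    bracketed range, so the cost tracks the local prediction error."""
    n = len(keys)
    guess = predict(model, key, n)
    lo, hi, step = guess, guess + 1, 1
    while lo > 0 and keys[lo] > key:          # widen to the left
        lo, step = max(lo - step, 0), step * 2
    step = 1
    while hi < n and keys[hi - 1] < key:      # widen to the right
        hi, step = min(hi + step, n), step * 2
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < n and keys[i] == key else None

def mean_abs_error(keys, model, positions):
    """Average distance between predicted and true slots."""
    n = len(keys)
    return sum(abs(predict(model, keys[p], n) - p) for p in positions) / len(positions)

# Synthetic, non-linear key distribution; the first 100 keys are "hot".
keys = [i * i for i in range(1000)]
hot = range(100)
counts = {keys[p]: 1000 for p in hot}

uniform = train(keys)            # ignores the access pattern
weighted = train(keys, counts)   # frequency-weighted training

hot_err_uniform = mean_abs_error(keys, uniform, hot)
hot_err_weighted = mean_abs_error(keys, weighted, hot)
print(hot_err_uniform, hot_err_weighted)
```

On this synthetic data the weighted model predicts hot keys more accurately than the uniform one, so `lookup` probes fewer slots for them. The trade-off is that the error on rarely accessed keys can grow.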


