CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm

03/01/2021
by Jiaoyi Zhang, et al.

Learned indexes, which use machine learning models to replace traditional index structures, have shown promising results in recent studies. However, our understanding of this new type of index structure is still at an early stage, and many details need to be carefully examined and improved. In this paper, we propose a cache-aware learned index (CARMI) design that improves the efficiency of the Recursive Model Index (RMI) framework proposed by Kraska et al., together with a cost-based construction algorithm that builds optimal indexes for a wide variety of application scenarios. We formulate the search for the optimal design of a learned index as an optimization problem, and propose a dynamic programming algorithm to solve it along with a partial greedy step to speed it up. Experiments show that our construction strategy produces indexes with significantly better performance than the baselines under various data distributions and workload requirements. In particular, CARMI achieves an average speedup of 2.52X over B-tree while using only about 0.56X of the B-tree's memory space on average.
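
To make the idea of a learned index concrete, the following is a minimal sketch of a generic two-stage RMI-style lookup. It is not the CARMI design and does not include the cost-based construction algorithm: a root linear model routes a key to a leaf, and the leaf's linear model predicts a position that is corrected by a bounded binary search. The names `fit_linear`, `TwoStageRMI`, and `num_leaves` are hypothetical, and the sketch assumes distinct numeric keys and a non-empty key set.

```python
import bisect


def fit_linear(xs, ys):
    """Least-squares line y = a*x + b; returns a constant model for degenerate input."""
    n = len(xs)
    if n == 0:
        return 0.0, 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        return 0.0, mean_y
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var
    return a, mean_y - a * mean_x


class TwoStageRMI:
    """Illustrative two-stage index (hypothetical, not the paper's structure)."""

    def __init__(self, keys, num_leaves=4):
        self.keys = sorted(keys)  # assumed distinct and non-empty
        self.num_leaves = num_leaves
        n = len(self.keys)
        # Root model: key -> leaf id, fitted on positions scaled to [0, num_leaves).
        self.root = fit_linear(self.keys, [p * num_leaves / n for p in range(n)])
        # Partition the keys by the leaf that the root model assigns them to.
        buckets = [[] for _ in range(num_leaves)]
        for pos, key in enumerate(self.keys):
            buckets[self._leaf_id(key)].append((key, pos))
        # Leaf models: key -> position, plus the maximum prediction error seen.
        self.leaves = []
        for bucket in buckets:
            a, b = fit_linear([k for k, _ in bucket], [p for _, p in bucket])
            err = max((abs(p - round(a * k + b)) for k, p in bucket), default=0)
            self.leaves.append((a, b, err))

    def _leaf_id(self, key):
        a, b = self.root
        return min(self.num_leaves - 1, max(0, int(a * key + b)))

    def lookup(self, key):
        """Return the position of key in the sorted key array, or None if absent."""
        a, b, err = self.leaves[self._leaf_id(key)]
        guess = round(a * key + b)
        lo = max(0, guess - err)
        hi = min(len(self.keys), guess + err + 1)
        # Binary search only inside the error window around the prediction.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


keys = [x * x for x in range(1, 200)]
index = TwoStageRMI(keys, num_leaves=8)
found = [index.lookup(k) for k in keys]
```

Because each leaf stores its own maximum error, every stored key is guaranteed to fall inside the window that `lookup` searches. How many leaves to use and which node types to pick is the kind of design decision the paper treats as an optimization problem; here `num_leaves` is simply fixed by hand.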

research
10/14/2019

The PGM-index: a multicriteria, compressed and learned approach to data indexing

The recent introduction of learned indexes has shaken the foundations of...
research
08/01/2020

The Price of Tailoring the Index to Your Data: Poisoning Attacks on Learned Index Structures

The concept of learned index structures relies on the idea that the inpu...
research
06/05/2023

A Simple Yet High-Performing On-disk Learned Index: Can We Have Our Cake and Eat it Too?

While in-memory learned indexes have shown promising performance as comp...
research
11/16/2018

The Potential of Learned Index Structures for Index Compression

Inverted indexes are vital in providing fast key-word-based search. For ...
research
04/12/2021

Updatable Learned Index with Precise Positions

Index plays an essential role in modern database engines to accelerate t...
research
01/04/2021

A Pluggable Learned Index Method via Sampling and Gap Insertion

Database indexes facilitate data retrieval and benefit broad application...
research
06/26/2023

AirIndex: Versatile Index Tuning Through Data and Storage

The end-to-end lookup latency of a hierarchical index – such as a B-tree...
