A Simple Yet High-Performing On-disk Learned Index: Can We Have Our Cake and Eat it Too?

06/05/2023
by Hai Lan, et al.

While in-memory learned indexes have shown promising performance compared to the B+-tree, most widely used databases in real applications still rely on disk-based operations. Based on our experiments, we observe that directly applying existing learned indexes on disk suffers from several drawbacks and cannot outperform a standard B+-tree in most cases. Therefore, in this work we make the first attempt to show how the idea of the learned index can benefit on-disk indexing by proposing AULID, a fully on-disk updatable learned index that can achieve state-of-the-art performance across multiple workload types. AULID combines the benefits of traditional indexing techniques and learned indexes to reduce I/O cost, the main overhead in the disk setting. Specifically, three aspects are taken into consideration in reducing I/O cost: (1) reducing the overhead of updating the index structure; (2) producing shorter paths from the root to the leaf nodes; (3) achieving better locality to minimize the number of block reads required to complete a scan. Five principles are proposed to guide the design of AULID, which shows remarkable performance gains while remaining easy to implement. Our evaluation shows that AULID has storage costs comparable to a B+-tree and is much smaller than other learned indexes, and that AULID is up to 2.11x, 8.63x, 1.72x, 5.51x, and 8.02x more efficient than FITing-tree, PGM, B+-tree, ALEX, and LIPP, respectively.
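To make the link between a learned model and I/O cost concrete, the following is a minimal sketch of a generic learned-index lookup over block-organized sorted keys. It is not AULID's design, which the abstract does not specify. The names `BLOCK_SIZE`, `fit_linear`, `max_error`, and `lookup` are hypothetical, and the single linear model with a global error bound is an assumption chosen for brevity. The sketch only illustrates that the model's prediction error determines how many blocks a lookup has to read.

```python
import bisect

BLOCK_SIZE = 4  # keys per disk block (hypothetical, kept tiny for illustration)


def fit_linear(keys):
    """Least-squares fit of position ~ slope * key + intercept."""
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = (n - 1) / 2
    var = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
    slope = cov / var if var else 0.0
    return slope, mean_p - slope * mean_k


def max_error(keys, slope, intercept):
    """Largest gap between predicted and true position over the stored keys."""
    return max(abs(round(slope * k + intercept) - i) for i, k in enumerate(keys))


def lookup(keys, key, slope, intercept, err):
    """Return (position or None, number of blocks touched by the search window)."""
    pred = round(slope * key + intercept)
    lo = max(0, pred - err)
    hi = min(len(keys) - 1, pred + err)
    if lo > hi:  # prediction window falls entirely outside the stored range
        return None, 0
    blocks_read = hi // BLOCK_SIZE - lo // BLOCK_SIZE + 1
    idx = bisect.bisect_left(keys, key, lo, hi + 1)
    found = idx <= hi and keys[idx] == key
    return (idx if found else None), blocks_read


# Perfectly linear keys: the model is exact, so one block read suffices.
linear_keys = [2 * i for i in range(32)]
lin_model = fit_linear(linear_keys)
lin_err = max_error(linear_keys, *lin_model)

# Skewed keys: the error bound grows, and so does the search window.
skewed_keys = [i * i for i in range(32)]
skw_model = fit_linear(skewed_keys)
skw_err = max_error(skewed_keys, *skw_model)
```

In this sketch a poorly fitting model widens the search window and therefore the number of blocks read per lookup, which is one way to see why I/O cost, rather than model evaluation time, dominates in the disk setting.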


research
05/02/2023

Updatable Learned Indexes Meet Disk-Resident DBMS – From Evaluations to Design Choices

Although many updatable learned indexes have been proposed in recent yea...
research
06/19/2023

COLE: A Column-based Learned Storage for Blockchain Systems

Blockchain systems suffer from high storage costs as every node needs to...
research
11/26/2019

Cracking In-Memory Database Index A Case Study for Adaptive Radix Tree Index

Indexes provide a method to access data in databases quickly. It can imp...
research
03/01/2021

CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm

Learned indexes, which use machine learning models to replace traditiona...
research
05/11/2022

LSI: A Learned Secondary Index Structure

Learned index structures have been shown to achieve favorable lookup per...
research
02/08/2022

OSM-tree: A Sortedness-Aware Index

Indexes facilitate efficient querying when the selection predicate is on...
research
05/03/2021

APEX: A High-Performance Learned Index on Persistent Memory

The recently released persistent memory (PM) has been gaining popularity...
