On Distribution Dependent Sub-Logarithmic Query Time of Learned Indexing

06/19/2023
by Sepanta Zeighami, et al.

A fundamental problem in data management is to find the elements in an array that match a query. Learned indexes have recently been used extensively to solve this problem: they learn a model that predicts the location of the items in the array. They have been shown empirically to outperform non-learned methods (e.g., B-trees or binary search, which answer queries in O(log n) time) by orders of magnitude. However, the success of learned indexes has not been theoretically justified. The only existing attempt shows the same O(log n) query time, with a constant-factor improvement in space complexity over non-learned methods, under some assumptions on the data distribution. In this paper, we significantly strengthen this result, showing that under mild assumptions on the data distribution, and with the same space complexity as non-learned methods, learned indexes can answer queries in O(log log n) expected query time. We also show that, allowing for a slightly larger but still near-linear space overhead, a learned index can achieve O(1) expected query time. Our results prove that learned indexes are orders of magnitude faster than non-learned methods, providing a theoretical grounding for their empirical success.
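To make the idea of "a model that predicts the location of the items" concrete, here is a minimal sketch of a learned lookup over a sorted array. It is not the construction analyzed in the paper: the single least-squares line, the exponential search used to correct the prediction, and the names `fit_linear` and `learned_lookup` are all assumptions chosen for illustration.

```python
import bisect


def fit_linear(keys):
    """Fit a least-squares line mapping key -> position in the sorted array.

    Illustrative model only; real learned indexes typically use richer models.
    """
    n = len(keys)
    mean_x = sum(keys) / n
    mean_y = (n - 1) / 2
    var = sum((x - mean_x) ** 2 for x in keys)
    if var == 0:
        return 0.0, mean_y
    cov = sum((x - mean_x) * (i - mean_y) for i, x in enumerate(keys))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def learned_lookup(keys, model, query):
    """Return the index of the first element >= query (same as bisect_left)."""
    n = len(keys)
    if n == 0:
        return 0
    slope, intercept = model
    # Step 1: the model predicts a position; clamp it into the array.
    guess = min(max(int(slope * query + intercept), 0), n - 1)

    # Step 2: exponential search outward from the guess to bracket the answer,
    # so the cost depends on the prediction error rather than on n.
    step = 1
    if keys[guess] < query:
        lo = hi = guess + 1
        while hi < n and keys[hi] < query:
            lo = hi + 1
            hi += step
            step *= 2
        hi = min(hi, n)
    else:
        lo = hi = guess
        while lo > 0 and keys[lo - 1] >= query:
            hi = lo - 1
            lo -= step
            step *= 2
        lo = max(lo, 0)

    # Step 3: binary search inside the bracket [lo, hi].
    return bisect.bisect_left(keys, query, lo, hi)


if __name__ == "__main__":
    keys = [3 * i + (i % 5) for i in range(1000)]  # sorted, roughly linear keys
    model = fit_linear(keys)
    print(learned_lookup(keys, model, keys[417]))
```

In this sketch the work done after the prediction grows with the logarithm of the model's error, not of the array size, which is the intuition behind hoping for sub-logarithmic query time when the data distribution is easy to model.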


research
05/25/2021

Minmax-optimal list searching with O(log_2log_2 n) average cost

We find a searching method on ordered lists that surprisingly outperform...
research
07/16/2020

Planar Distance Oracles with Better Time-Space Tradeoffs

In a recent breakthrough, Charalampopoulos, Gawrychowski, Mozes, and Wei...
research
03/01/2019

Superseding traditional indexes by orchestrating learning and geometry

We design the first learned index that solves the dictionary problem wit...
research
02/02/2019

Learned Indexes for Dynamic Workloads

The recent proposal of learned index structures opens up a new perspecti...
research
10/14/2019

The PGM-index: a multicriteria, compressed and learned approach to data indexing

The recent introduction of learned indexes has shaken the foundations of...
research
04/06/2021

Sorted Range Reporting

In sorted range selection problem, the aim is to preprocess a given arra...
research
02/06/2023

Using Learned Indexes to Improve Time Series Indexing Performance on Embedded Sensor Devices

Efficiently querying data on embedded sensor and IoT devices is challeng...
