
Superseding traditional indexes by orchestrating learning and geometry
We design the first learned index that solves the dictionary problem wit...

CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm
Learned indexes, which use machine learning models to replace traditiona...

GraCT: A Grammar-based Compressed Index for Trajectory Data
We introduce a compressed data structure for the storage of free traject...

ALEX: An Updatable Adaptive Learned Index
Recent work on "learned indexes" has revolutionized the way we look at t...

Dynamic Interleaving of Content and Structure for Robust Indexing of Semi-Structured Hierarchical Data (Extended Version)
We propose a robust index for semi-structured hierarchical data that sup...

Spatial Indexing for System-Level Evaluation of 5G Heterogeneous Cellular Networks
System level simulations of large 5G networks are essential to evaluate ...

PLEX: Towards Practical Learned Indexing
Latest research proposes to replace existing index structures with learn...

The PGM-index: a multicriteria, compressed and learned approach to data indexing
The recent introduction of learned indexes has shaken the foundations of the decades-old field of indexing data structures. Combining, or even replacing, classic design elements such as B-tree nodes with machine learning models has proven to give outstanding improvements in the space footprint and time efficiency of data systems. However, these novel approaches are based on heuristics, and thus lack guarantees on both their time and space requirements. We propose the Piecewise Geometric Model index (PGM-index for short), which achieves guaranteed I/O-optimality in query operations, learns an optimal number of linear models, and has a peculiar recursive construction that makes it a purely learned data structure, rather than a hybrid of traditional and learned indexes (such as RMI and FITing-tree). We show that the PGM-index improves the space of the FITing-tree by 63.3% and of the B-tree by more than four orders of magnitude, while achieving the same or even better query time efficiency. We complement this result by proposing three variants of the PGM-index. First, we design a compressed PGM-index that further reduces its space footprint by exploiting the repetitiveness at the level of the learned linear models it is composed of. Second, we design a PGM-index that adapts itself to the distribution of the queries, thus resulting in the first known distribution-aware learned index to date. Finally, given its flexibility in the offered space-time trade-offs, we propose the multicriteria PGM-index, which efficiently auto-tunes itself in a few seconds over hundreds of millions of keys to the possibly evolving space-time constraints imposed by the application of use. We remark to the reader that this paper is an extended and improved version of our previous paper titled "Superseding traditional indexes by orchestrating learning and geometry" (arXiv:1903.00507).
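
The abstract does not spell out how linear models can stand in for index nodes, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes sorted, distinct integer keys and an error bound `eps` (an assumed parameter name), and it uses a simple greedy "shrinking cone" segmentation rather than the optimal piecewise linear model the abstract refers to. It is also single-level, with no recursive construction. The names `build_segments`, `LearnedIndex` and `find` are hypothetical.

```python
from bisect import bisect_left, bisect_right

INF = float("inf")


def _pick(lo, hi):
    # Any slope in [lo, hi] keeps the error within eps; take the midpoint,
    # or fall back to lo when the cone is still unbounded (one-point segment).
    return lo if hi == INF else (lo + hi) / 2


def build_segments(keys, eps):
    """Greedily cover sorted, distinct keys with lines that err by at most eps.

    Each segment is (first_key, slope, first_pos) and predicts
    pos ~= first_pos + slope * (key - first_key).
    """
    segments = []
    start, lo, hi = 0, 0.0, INF
    for i in range(1, len(keys)):
        dx = keys[i] - keys[start]
        dy = i - start
        # Slopes that keep point i within eps of its true position.
        new_lo = max(lo, (dy - eps) / dx)
        new_hi = min(hi, (dy + eps) / dx)
        if new_lo > new_hi:
            # No single slope fits any more: close the segment, start anew at i.
            segments.append((keys[start], _pick(lo, hi), start))
            start, lo, hi = i, 0.0, INF
        else:
            lo, hi = new_lo, new_hi
    if keys:
        segments.append((keys[start], _pick(lo, hi), start))
    return segments


class LearnedIndex:
    def __init__(self, keys, eps=4):
        self.keys = keys
        self.eps = eps
        self.segments = build_segments(keys, eps)
        self.starts = [s[0] for s in self.segments]

    def find(self, key):
        """Return the position of key in keys, or None if it is absent."""
        n = len(self.keys)
        if n == 0:
            return None
        # Locate the segment whose first key is the largest one <= key.
        j = max(bisect_right(self.starts, key) - 1, 0)
        first_key, slope, first_pos = self.segments[j]
        pred = round(first_pos + slope * (key - first_key))
        # Search only a small window around the prediction; the extra slot
        # on each side is slack for floating-point rounding.
        lo = min(max(pred - self.eps - 1, 0), n)
        hi = max(lo, min(pred + self.eps + 2, n))
        i = bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None
```

In this sketch, `LearnedIndex(keys, eps=8).find(k)` touches only about `2 * eps` neighbouring slots of the sorted array, and the stored state is three numbers per segment. A smaller `eps` narrows the search window but tends to produce more segments, which is the kind of space-time trade-off the abstract alludes to.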