Enhancing In-Memory Spatial Indexing with Learned Search

09/12/2023
by Varun Pandey, et al.

Spatial data is ubiquitous. Massive amounts of data are generated every day from a plethora of sources such as billions of GPS-enabled devices (e.g., cell phones, cars, and sensors), consumer-based applications (e.g., Uber and Strava), and social media platforms (e.g., location-tagged posts on Facebook, Twitter, and Instagram). This exponential growth in spatial data has led the research community to build systems and applications for efficient spatial data processing. In this study, we apply a recently developed machine-learned search technique for single-dimensional sorted data to spatial indexing. Specifically, we partition spatial data using six traditional spatial partitioning techniques and employ machine-learned search within each partition to support point, range, distance, and spatial join queries. Adhering to the latest research trends, we tune the partitioning techniques to be instance-optimized. By tuning each partitioning technique for optimal performance, we demonstrate that: (i) grid-based index structures outperform tree-based index structures (from 1.23× to 2.47×), (ii) learning-enhanced variants of commonly used spatial index structures outperform their original counterparts (from 1.44× to 53.34× faster), (iii) machine-learned search within a partition is faster than binary search by 11.79% to 39.51% when filtering on one dimension, (iv) the benefit of machine-learned search diminishes in the presence of other compute-intensive operations (e.g., scan costs in higher-selectivity queries, Haversine distance computation, and point-in-polygon tests), and (v) index lookup is the bottleneck for tree-based structures, which could potentially be reduced by linearizing the indexed partitions.
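To make the idea of learned search inside a partition concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed-size grid as the partitioning scheme, sorts each partition on the x coordinate, and uses a single linear fit with a recorded maximum error as a stand-in for the learned model. The names LearnedPartition, LearnedGrid, and cell_size are hypothetical and chosen only for illustration.

```python
import bisect
from collections import defaultdict


class LearnedPartition:
    """Points of one partition, sorted by x, with a linear model x -> position."""

    def __init__(self, points):
        self.points = sorted(points)            # sort by x, ties broken by y
        self.xs = [p[0] for p in self.points]
        n = len(self.xs)
        span = self.xs[-1] - self.xs[0]
        self.base = self.xs[0]
        # Assumption: a straight line through the first and last key is the "model".
        self.slope = (n - 1) / span if span > 0 else 0.0
        # Largest distance between a predicted and an actual position.
        self.max_err = max(abs(self._predict(x) - i) for i, x in enumerate(self.xs))

    def _predict(self, x):
        return int(self.slope * (x - self.base))

    def lower_bound(self, x):
        """Index of the first point whose x coordinate is >= x."""
        n = len(self.xs)
        guess = self._predict(x)
        # Binary search only inside the error window around the prediction.
        lo = min(max(guess - self.max_err - 1, 0), n)
        hi = max(min(guess + self.max_err + 2, n), lo)
        i = bisect.bisect_left(self.xs, x, lo, hi)
        ok_left = i == 0 or self.xs[i - 1] < x
        ok_right = i == n or self.xs[i] >= x
        if not (ok_left and ok_right):
            # Safety net: fall back to a full binary search if the window missed.
            i = bisect.bisect_left(self.xs, x)
        return i

    def range_query(self, x_lo, x_hi, y_lo, y_hi):
        out = []
        i = self.lower_bound(x_lo)              # learned search on one dimension
        while i < len(self.points) and self.xs[i] <= x_hi:
            if y_lo <= self.points[i][1] <= y_hi:   # scan filters the other dimension
                out.append(self.points[i])
            i += 1
        return out


class LearnedGrid:
    """Fixed grid whose non-empty cells are LearnedPartition objects."""

    def __init__(self, points, cell_size):
        self.cell_size = cell_size
        buckets = defaultdict(list)
        for p in points:
            buckets[self._cell(p[0], p[1])].append(p)
        self.cells = {c: LearnedPartition(ps) for c, ps in buckets.items()}

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def range_query(self, x_lo, x_hi, y_lo, y_hi):
        cx_lo, cy_lo = self._cell(x_lo, y_lo)
        cx_hi, cy_hi = self._cell(x_hi, y_hi)
        out = []
        for cx in range(cx_lo, cx_hi + 1):
            for cy in range(cy_lo, cy_hi + 1):
                part = self.cells.get((cx, cy))
                if part is not None:
                    out.extend(part.range_query(x_lo, x_hi, y_lo, y_hi))
        return out
```

In this sketch the scan after the lower-bound lookup grows with the number of qualifying points, which is one way to see why the benefit of a faster search step would shrink for higher-selectivity queries.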

research
08/24/2020

The Case for Learned Spatial Indexes

Spatial data is ubiquitous. Massive amounts of data are generated every ...
research
07/22/2020

R*-Grove: Balanced Spatial Partitioning for Large-scale Datasets

The rapid growth of big spatial data urged the research community to dev...
research
07/18/2023

Two-layer Space-oriented Partitioning for Non-point Data

Non-point spatial objects (e.g., polygons, linestrings, etc.) are ubiqui...
research
05/18/2020

A Two-level Spatial In-Memory Index

Very large volumes of spatial data increasingly become available and dem...
research
06/08/2023

Learned spatial data partitioning

Due to the significant increase in the size of spatial data, it is essen...
research
04/22/2021

HINT: A Hierarchical Index for Intervals in Main Memory

Indexing intervals is a fundamental problem, finding a wide range of app...
research
06/29/2020

Hands-off Model Integration in Spatial Index Structures

Spatial indexes are crucial for the analysis of the increasing amounts o...
