HINT: A Hierarchical Index for Intervals in Main Memory

04/22/2021
by George Christodoulou, et al.

Indexing intervals is a fundamental problem with a wide range of applications. Recent work on managing large collections of intervals in main memory has focused on overlap joins and temporal aggregation. In this paper, we propose novel and efficient in-memory indexing techniques for intervals, with a focus on interval range queries, which are a basic component of many search and analysis tasks. First, we propose an optimized version of a single-level (flat) domain-partitioning approach, which may have large space requirements due to excessive replication. Then, we propose a hierarchical partitioning approach, which assigns each interval to at most two partitions per level and has controlled space requirements. Novel elements of our techniques include dividing the intervals in each partition into groups based on whether they begin inside or before the partition boundaries, reducing the information stored in each partition to what is strictly necessary, and handling data sparsity and skew effectively. Experimental results on real and synthetic interval sets with different characteristics show that our approaches are typically one order of magnitude faster than the state of the art.
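
To make the hierarchical idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes closed intervals with integer endpoints in a domain `[0, 2^m - 1]`, uses `2^l` partitions at level `l`, and splits each partition into "originals" (the interval begins inside the partition) and "replicas" (it begins before). The class name `HierarchicalIntervalIndex`, its methods, and the dictionary-based storage are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


class HierarchicalIntervalIndex:
    """Illustrative sketch: level l splits [0, 2^m - 1] into 2^l partitions."""

    def __init__(self, m):
        self.m = m
        self.size = 1 << m
        # One mapping per level: partition number -> list of interval ids.
        self.originals = [defaultdict(list) for _ in range(m + 1)]
        self.replicas = [defaultdict(list) for _ in range(m + 1)]

    def insert(self, iid, start, end):
        if not (0 <= start <= end < self.size):
            raise ValueError("interval must lie inside the domain")
        a, b, level = start, end, self.m
        # Walk from the finest level upward; each level takes at most
        # two partitions (one at the left end, one at the right end).
        while level >= 0 and a <= b:
            if a % 2 == 1:  # a is a right child: cannot merge upward
                self._assign(level, a, iid, start)
                a += 1
            if b % 2 == 0:  # b is a left child: cannot merge upward
                self._assign(level, b, iid, start)
                b -= 1
            a //= 2
            b //= 2
            level -= 1

    def _assign(self, level, part, iid, start):
        # Original if the interval begins inside this partition, else replica.
        if (start >> (self.m - level)) == part:
            self.originals[level][part].append(iid)
        else:
            self.replicas[level][part].append(iid)

    def query(self, qstart, qend):
        """Return ids of intervals overlapping the closed range [qstart, qend]."""
        qstart, qend = max(qstart, 0), min(qend, self.size - 1)
        if qstart > qend:
            return []
        results = []
        for level in range(self.m + 1):
            shift = self.m - level
            first, last = qstart >> shift, qend >> shift
            # Replicas are read only in the first relevant partition,
            # so no interval is reported twice.
            results.extend(self.replicas[level].get(first, ()))
            for part in range(first, last + 1):
                results.extend(self.originals[level].get(part, ()))
        return results


index = HierarchicalIntervalIndex(m=4)  # domain [0, 15]
index.insert("a", 2, 5)
index.insert("b", 7, 7)
index.insert("c", 0, 15)
print(sorted(index.query(4, 6)))  # ['a', 'c']
```

Because endpoints coincide with partition boundaries at the finest level in this sketch, the query needs no endpoint comparisons. A coarser finest level would require comparisons in the first and last partition of each level, and a practical index would use arrays rather than dictionaries and skip empty partitions instead of scanning every partition number.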

