HINT: A Hierarchical Index for Intervals in Main Memory

by George Christodoulou, et al.
Fedora Summer University
University of Mainz

Indexing intervals is a fundamental problem with a wide range of applications. Recent work on managing large collections of intervals in main memory has focused on overlap joins and temporal aggregation. In this paper, we propose novel and efficient in-memory indexing techniques for intervals, with a focus on interval range queries, which are a basic component of many search and analysis tasks. First, we propose an optimized version of a single-level (flat) domain-partitioning approach, which may have large space requirements due to excessive replication. Then, we propose a hierarchical partitioning approach, which assigns each interval to at most two partitions per level and has controlled space requirements. Novel elements of our techniques include dividing the intervals in each partition into groups based on whether they begin inside or before the partition boundaries, reducing the information stored in each partition to the bare minimum, and effectively handling data sparsity and skew. Experimental results on real and synthetic interval sets with different characteristics show that our approaches are typically one order of magnitude faster than the state of the art.
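
To make the hierarchical idea in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes integer endpoints in a domain [0, 2^m - 1], a hierarchy with levels 0..m in which level l has 2^l equal-width partitions, and a bottom-up assignment that picks at most two partitions per level. The names HintSketch, insert, query, originals, and replicas are chosen here for illustration only. Intervals that begin inside a partition are stored as "originals" and those that begin before it as "replicas"; a range query reads replicas only from the first relevant partition at each level, which is what keeps results free of duplicates in this sketch. Every candidate is checked with a plain overlap test, so none of the paper's comparison-saving optimizations are reproduced.

```python
from collections import defaultdict


class HintSketch:
    """Illustrative hierarchical interval index (assumed design, not the paper's code)."""

    def __init__(self, m):
        self.m = m  # the domain is [0, 2**m - 1]; levels run from 0 (root) to m (finest)
        # (level, partition index) -> intervals that begin inside that partition
        self.originals = defaultdict(list)
        # (level, partition index) -> intervals that begin before that partition
        self.replicas = defaultdict(list)

    def _add(self, level, part, start, end):
        # At `level`, partition `part` holds the values v with v >> (m - level) == part.
        begins_inside = (start >> (self.m - level)) == part
        target = self.originals if begins_inside else self.replicas
        target[(level, part)].append((start, end))

    def insert(self, start, end):
        if not (0 <= start <= end < (1 << self.m)):
            raise ValueError("interval must lie inside the domain")
        a, b, level = start, end, self.m
        # Walk from the finest level upwards; at most two partitions are used per level.
        while level >= 0 and a <= b:
            if a & 1:            # a is a right child: its parent would start too early
                self._add(level, a, start, end)
                a += 1
            if not (b & 1):      # b is a left child: its parent would end too late
                self._add(level, b, start, end)
                b -= 1
            a >>= 1
            b >>= 1
            level -= 1

    def query(self, qs, qe):
        """Return every stored interval that overlaps [qs, qe], each exactly once."""
        results = []
        for level in range(self.m + 1):
            shift = self.m - level
            first, last = qs >> shift, qe >> shift
            for part in range(first, last + 1):
                candidates = list(self.originals.get((level, part), ()))
                if part == first:
                    # Replicas in later partitions were already seen elsewhere.
                    candidates += self.replicas.get((level, part), ())
                for s, e in candidates:
                    if s <= qe and qs <= e:
                        results.append((s, e))
        return results


index = HintSketch(4)            # domain [0, 15]
for interval in [(5, 9), (0, 3), (12, 15)]:
    index.insert(*interval)
print(index.query(4, 8))         # [(5, 9)]
```

In this sketch the interval (5, 9) ends up in three partitions: one at the finest level covering the value 5, and two at the level above covering 6-7 and 8-9. It is an original only in the first of these, since that is the only one containing its start point.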

