String Indexing with Compressed Patterns

09/26/2019
by Philip Bille, et al.

Given a string S of length n, the classic string indexing problem is to preprocess S into a compact data structure that supports efficient subsequent pattern queries. In this paper, we consider the basic variant where the pattern is given in compressed form and the goal is to achieve a query time that is fast in terms of the compressed size of the pattern. This captures the common client-server scenario, where a client submits a query and communicates it in compressed form to a server. Instead of having the server decompress the query before processing it, we consider how to process the compressed query directly and efficiently. Our main result is a novel linear-space data structure that achieves near-optimal query time for patterns compressed with the classic Lempel-Ziv (LZ77) compression scheme. Along the way, we develop several data-structural techniques of independent interest, including a novel data structure that compactly encodes all LZ77-compressed suffixes of a string in linear space, and a general decomposition of tries that reduces the search time from logarithmic in the size of the trie to logarithmic in the length of the pattern.
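
To make the setting concrete, the following is a minimal sketch of the naive baseline that the abstract contrasts against: the client sends an LZ77-style parse of the pattern, and the server decompresses it before searching. It is not the data structure from the paper. The phrase format (a literal character or a `(start, length)` copy) and the function names `lz77_parse`, `lz77_decode`, and `naive_query` are assumptions chosen for illustration, and the quadratic-time parser favors clarity over speed.

```python
from typing import List, Tuple, Union

# Assumed phrase format (illustrative only): a literal character,
# or a (source start, length) copy from earlier in the same text.
Phrase = Union[str, Tuple[int, int]]


def lz77_parse(p: str) -> List[Phrase]:
    """Greedy LZ77-style parse of p.

    Each phrase is the longest prefix of the remaining text that also
    starts at an earlier position (overlap with the phrase itself is
    allowed), or a single literal character if no such prefix exists.
    """
    phrases: List[Phrase] = []
    i = 0
    while i < len(p):
        best_start, best_len = -1, 0
        for j in range(i):  # try every earlier starting position
            k = 0
            while i + k < len(p) and p[j + k] == p[i + k]:
                k += 1
            if k > best_len:
                best_start, best_len = j, k
        if best_len == 0:
            phrases.append(p[i])  # first occurrence of this character
            i += 1
        else:
            phrases.append((best_start, best_len))
            i += best_len
    return phrases


def lz77_decode(phrases: List[Phrase]) -> str:
    """Rebuild the text; copying one character at a time handles overlaps."""
    out: List[str] = []
    for ph in phrases:
        if isinstance(ph, str):
            out.append(ph)
        else:
            start, length = ph
            for k in range(length):
                out.append(out[start + k])
    return "".join(out)


def naive_query(s: str, phrases: List[Phrase]) -> List[int]:
    """Baseline server: decompress the pattern, then scan s for occurrences."""
    pattern = lz77_decode(phrases)
    return [i for i in range(len(s) - len(pattern) + 1)
            if s.startswith(pattern, i)]
```

For the pattern `abababab`, this parser produces three phrases, `'a'`, `'b'`, and `(0, 6)`, so the client would send 3 phrases instead of 8 characters. The baseline server still spends time proportional to the decompressed pattern length, which is the cost the abstract's approach aims to avoid by working in terms of the compressed size.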


research
11/22/2017

Compressed Indexing with Signature Grammars

The compressed indexing problem is to preprocess a string S of length n ...
research
09/18/2017

Compressed Representations of Conjunctive Query Results

Relational queries, and in particular join queries, often generate large...
research
04/16/2019

Compressed Indexes for Fast Search of Semantic Data

The sheer increase in volume of RDF data demands efficient solutions for...
research
02/12/2019

Compressed Range Minimum Queries

Given a string S of n integers in [0,σ), a range minimum query RMQ(i, j)...
research
11/19/2020

Subpath Queries on Compressed Graphs: a Survey

Text indexing is a classical algorithmic problem that has been studied f...
research
12/09/2020

Compressed Bounding Volume Hierarchies for Collision Detection Proximity Query

We present a novel representation of compressed data structure for simul...
research
02/04/2021

Gapped Indexing for Consecutive Occurrences

The classic string indexing problem is to preprocess a string S into a c...
