String Indexing for Top-k Close Consecutive Occurrences

07/08/2020
by Philip Bille, et al.

The classic string indexing problem is to preprocess a string S into a compact data structure that supports efficient subsequent pattern matching queries, that is, given a pattern string P, report all occurrences of P within S. In this paper, we study a basic and natural extension of string indexing called the string indexing for top-k close consecutive occurrences problem (SITCCO). Here, a consecutive occurrence is a pair (i,j), i < j, such that P occurs at positions i and j in S and there is no occurrence of P between i and j; the distance of such a pair is defined as j-i. Given a pattern P and a parameter k, the goal is to report the k consecutive occurrences of P in S of minimal distance. The challenge is to compactly represent S while supporting queries in time close to the length of P plus k. We give two new time-space trade-offs for the problem. Our first result achieves near-linear space and optimal query time, and our second result achieves linear space and near-optimal query time. Along the way, we develop several techniques of independent interest, including a new translation of the problem into a line segment intersection problem and a new recursive clustering technique for trees.
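To make the problem definition concrete, the following is a minimal brute-force sketch in Python. It is not the data structure from the paper: it scans S at query time rather than preprocessing it, so it only illustrates what a query should return. The function names `occurrences` and `top_k_close_consecutive`, the use of 0-based positions, and the tie-breaking rule (earlier pairs first) are assumptions made for this illustration.

```python
import heapq


def occurrences(s, p):
    """Return all 0-based start positions of p in s, including overlapping ones."""
    positions = []
    i = s.find(p)
    while i != -1:
        positions.append(i)
        i = s.find(p, i + 1)  # advance by one so overlapping matches are found
    return positions


def top_k_close_consecutive(s, p, k):
    """Return up to k consecutive occurrences (i, j) of p in s with smallest j - i.

    Brute-force baseline (an illustrative assumption, not the paper's index):
    neighbouring entries of the sorted occurrence list are exactly the
    consecutive occurrences, since no occurrence lies between them.
    """
    occ = occurrences(s, p)
    pairs = zip(occ, occ[1:])
    # nsmallest keeps the earlier pair first when two distances are equal
    return heapq.nsmallest(k, pairs, key=lambda ij: ij[1] - ij[0])


if __name__ == "__main__":
    # "ab" occurs at positions 0, 2, 4 and 7, giving distances 2, 2 and 3
    print(top_k_close_consecutive("abababcab", "ab", 2))  # [(0, 2), (2, 4)]
```

This baseline takes time proportional to the length of S per query, which is the cost the indexing results in the abstract aim to avoid.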

research
02/04/2021

Gapped Indexing for Consecutive Occurrences

The classic string indexing problem is to preprocess a string S into a c...
research
11/30/2022

Gapped String Indexing in Subquadratic Space and Sublinear Query Time

In Gapped String Indexing, the goal is to compactly represent a string S...
research
04/03/2023

Compressed Indexing for Consecutive Occurrences

The fundamental question considered in algorithms on strings is that of ...
research
01/23/2023

Sliding Window String Indexing in Streams

Given a string S over an alphabet Σ, the 'string indexing problem' is to...
research
01/24/2021

Longest segment of balanced parentheses – an exercise in program inversion in a segment problem (Functional Pearl)

Given a string of parentheses, the task is to find a longest consecutive...
research
06/28/2020

Random Access in Persistent Strings

We consider compact representations of collections of similar strings th...
research
01/29/2019

Simulating the DNA String Graph in Succinct Space

Converting a set of sequencing reads into a lossless compact data struct...
