Compressed Indexing for Consecutive Occurrences

04/03/2023
by   Paweł Gawrychowski, et al.

A fundamental problem in algorithms on strings is indexing, that is, preprocessing a given string so that specific queries can be answered efficiently. By now, a number of efficient solutions are known for the case where queries ask for exact occurrences of a given pattern P. Practical applications, however, call for more complex queries, for example ones concerning nearby occurrences of two patterns. Recently, Bille et al. [CPM 2021] introduced a variant of such queries, called gapped consecutive occurrences, in which a query consists of two patterns P_1 and P_2 and a range [a,b], and one must find all consecutive occurrences (q_1,q_2) of P_1 and P_2 such that q_2-q_1 ∈ [a,b]. Their results show that we cannot hope for a very efficient indexing structure for such queries, even if a=0 is fixed (although they also provided a non-trivial upper bound). Motivated by this, we focus on a text given as a straight-line program (SLP) and design an index that takes space polynomial in the size of the grammar and answers such queries in time optimal up to polylog factors.
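To make the query concrete, here is a minimal brute-force sketch in Python. It is not the index described in the abstract: it scans the uncompressed text on every query and only illustrates what a gapped consecutive occurrences query returns. It assumes the usual definition from the cited line of work, namely that (q_1,q_2) is a consecutive occurrence when q_1 is an occurrence of P_1, q_2 is an occurrence of P_2, q_1 < q_2, and no occurrence of either pattern starts strictly between them. Positions are 0-based, and the function names are hypothetical.

```python
def find_occurrences(text, pattern):
    """Return all 0-based start positions of pattern in text (overlaps included)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def gapped_consecutive_occurrences(text, p1, p2, a, b):
    """Brute-force answer to a gapped consecutive occurrences query.

    Assumed definition: (q1, q2) is reported when q1 is an occurrence of p1,
    q2 is an occurrence of p2, q1 < q2, no occurrence of p1 or p2 starts
    strictly between q1 and q2, and a <= q2 - q1 <= b.
    """
    occ1 = set(find_occurrences(text, p1))
    occ2 = set(find_occurrences(text, p2))
    # Merge both occurrence sets; neighbours in this sorted list are exactly
    # the pairs with no other occurrence of either pattern between them.
    events = sorted(occ1 | occ2)
    result = []
    for q1, q2 in zip(events, events[1:]):
        if q1 in occ1 and q2 in occ2 and a <= q2 - q1 <= b:
            result.append((q1, q2))
    return result


if __name__ == "__main__":
    text = "a--b-a-a----b"
    # Occurrences of "a": 0, 5, 7; occurrences of "b": 3, 12.
    print(gapped_consecutive_occurrences(text, "a", "b", 0, 10))
    print(gapped_consecutive_occurrences(text, "a", "b", 4, 10))
```

In the example text, the occurrence of "a" at position 5 is not paired with any "b", because another "a" starts at position 7 before the next "b" at position 12. The sketch takes time linear in the text length (times the pattern length) per query, which is exactly the cost an index over a grammar-compressed text aims to avoid.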


research
07/08/2020

String Indexing for Top-k Close Consecutive Occurrences

The classic string indexing problem is to preprocess a string S into a c...
research
02/04/2021

Gapped Indexing for Consecutive Occurrences

The classic string indexing problem is to preprocess a string S into a c...
research
11/30/2022

Gapped String Indexing in Subquadratic Space and Sublinear Query Time

In Gapped String Indexing, the goal is to compactly represent a string S...
research
06/28/2022

Extending Shinohara's Algorithm for Computing Descriptive (Angluin-Style) Patterns to Subsequence Patterns

The introduction of pattern languages in the seminal work [Angluin, “Fin...
research
06/13/2019

On Longest Common Property Preserved Substring Queries

We revisit the problem of longest common property preserving substring q...
research
09/20/2017

ProbeSim: Scalable Single-Source and Top-k SimRank Computations on Dynamic Graphs

Single-source and top-k SimRank queries are two important types of simil...
research
11/19/2020

Subpath Queries on Compressed Graphs: a Survey

Text indexing is a classical algorithmic problem that has been studied f...
