Internal Dictionary Matching

We introduce data structures answering queries concerning the occurrences of patterns from a given dictionary D in fragments of a given string T of length n. The dictionary is internal in the sense that each pattern in D is given as a fragment of T. This way, D takes space proportional to the number of patterns d=|D| rather than their total length, which could be Θ(n·d). In particular, we consider the following types of queries: reporting and counting all occurrences of patterns from D in a fragment T[i..j] and reporting distinct patterns from D that occur in T[i..j]. We show how to construct, in O((n+d) log^O(1) n) time, a data structure that answers each of these queries in time O(log^O(1) n + |output|). The case of counting patterns is much more involved and needs a combination of a locally consistent parsing with orthogonal range searching. Reporting distinct patterns, on the other hand, uses the structure of maximal repetitions in strings. Finally, we provide upper and lower bounds for the case of a dynamic dictionary that are tight up to subpolynomial factors.
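To make the three query types concrete, the following is a minimal brute-force sketch of their semantics. It is not the data structure described above and achieves none of the stated time bounds; it only spells out what each query returns. The function names, the 0-based inclusive indexing, and the representation of each pattern as a `(start, end)` pair into T are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: a dictionary pattern is an inclusive, 0-based (start, end)
# pair into T, so the dictionary needs O(d) space in total.
Fragment = Tuple[int, int]


def report_occurrences(T: str, D: List[Fragment], i: int, j: int) -> List[Tuple[int, int]]:
    """Return (pattern_index, start) for every occurrence of a pattern
    from D that lies entirely inside T[i..j] (inclusive bounds)."""
    out = []
    for k, (a, b) in enumerate(D):
        pattern = T[a:b + 1]
        m = len(pattern)
        # An occurrence starting at s must end at s + m - 1 <= j.
        for s in range(i, j - m + 2):
            if T[s:s + m] == pattern:
                out.append((k, s))
    return out


def count_occurrences(T: str, D: List[Fragment], i: int, j: int) -> int:
    """Count all occurrences of patterns from D in T[i..j]."""
    return len(report_occurrences(T, D, i, j))


def report_distinct(T: str, D: List[Fragment], i: int, j: int) -> List[str]:
    """Return the distinct patterns from D that occur in T[i..j], sorted."""
    seen = set()
    for k, _ in report_occurrences(T, D, i, j):
        a, b = D[k]
        seen.add(T[a:b + 1])
    return sorted(seen)


if __name__ == "__main__":
    T = "abaab"
    D = [(0, 0), (0, 1), (1, 2), (2, 3)]  # "a", "ab", "ba", "aa"
    print(report_occurrences(T, D, 1, 4))
    print(count_occurrences(T, D, 1, 4))
    print(report_distinct(T, D, 1, 4))
```

Each query here scans the whole fragment once per pattern, so it costs time proportional to d times the fragment length (times the pattern length for the comparisons), which is the cost the polylogarithmic-time structure is meant to avoid.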


