Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching

08/11/2023
by Kento Iseri, et al.

A parameterized string (p-string) is a string over an alphabet (Σ_s ∪ Σ_p), where Σ_s and Σ_p are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively. Two p-strings x and y are said to parameterized match (p-match) if and only if x can be transformed into y by applying a bijection on Σ_p to every occurrence of p-symbols in x. The indexing problem for p-matching is to preprocess a p-string T of length n so that we can efficiently find the occurrences of substrings of T that p-match with a given pattern. Extending the Burrows-Wheeler Transform (BWT) based index for exact string pattern matching, Ganguly et al. [SODA 2017] proposed the first compact index (named pBWT) for p-matching, and posed an open problem on how to construct it in compact space, i.e., in O(n log |Σ_s ∪ Σ_p|) bits of space. Hashimoto et al. [SPIRE 2022] partially solved this problem by showing how to construct some components of pBWTs for T in O(n |Σ_p| log n / log log n) time in an online manner while reading the symbols of T from right to left. In this paper, we improve the time complexity to O(n log |Σ_p| log n / log log n). We remark that removing the multiplicative factor of |Σ_p| from the complexity is of great interest because it has not been achieved for over a decade in the construction of related data structures like parameterized suffix arrays, even in the offline setting. We also show that our data structure can support backward search, a core procedure of BWT-based indexes, at any stage of the online construction, making it the first compact index for p-matching that can be constructed in compact space and even in an online manner.
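To make the definition of p-matching concrete, here is a minimal Python sketch. It is not the pBWT or the construction from the paper. It relies on the standard prev-encoding of p-strings (each p-symbol is replaced by the distance to its previous occurrence, or 0 if there is none), under which two p-strings p-match exactly when their encodings are equal. The function names `prev_encode`, `p_match`, and `p_occurrences`, and the representation of symbols as single characters, are assumptions made for illustration.

```python
def prev_encode(s, static_symbols):
    """Prev-encode a p-string: s-symbols are kept as-is, and each p-symbol
    becomes the distance to its previous occurrence (0 if it is the first)."""
    last_seen = {}  # p-symbol -> index of its most recent occurrence
    encoded = []
    for i, c in enumerate(s):
        if c in static_symbols:
            encoded.append(c)
        else:
            encoded.append(i - last_seen[c] if c in last_seen else 0)
            last_seen[c] = i
    return encoded


def p_match(x, y, static_symbols):
    """Return True if x and y p-match, i.e. a bijection on the p-symbols
    turns x into y. Checked by comparing prev-encodings."""
    return (len(x) == len(y)
            and prev_encode(x, static_symbols) == prev_encode(y, static_symbols))


def p_occurrences(text, pattern, static_symbols):
    """Naive baseline (hypothetical helper): start positions of substrings of
    text that p-match pattern. Each window is re-encoded from scratch, so this
    takes O(n * m) time and is only meant to illustrate the problem."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if p_match(text[i:i + m], pattern, static_symbols)]


if __name__ == "__main__":
    static = {"a", "b"}                              # Σ_s; everything else is in Σ_p
    print(p_match("xayx", "yaxy", static))           # bijection x->y, y->x
    print(p_match("xayx", "xaxx", static))           # no bijection exists
    print(p_occurrences("zaxzayy", "xay", static))   # windows "zax" and "zay"
```

An index for p-matching answers the same query as `p_occurrences` without scanning every window of T; the sketch only fixes what the query means.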


research
08/03/2018

Right-to-left online construction of parameterized position heaps

Two strings of equal length are said to parameterized match if there is ...
research
12/18/2020

The Parameterized Suffix Tray

Let Σ and Π be disjoint alphabets, respectively called the static alphab...
research
06/30/2022

Computing the Parameterized Burrows–Wheeler Transform Online

Parameterized strings are a generalization of strings in that their char...
research
02/17/2020

DAWGs for parameterized matching: online construction and related indexing structures

Two strings x and y over Σ∪Π of equal length are said to parameterized m...
research
01/11/2023

Linear Time Online Algorithms for Constructing Linear-size Suffix Trie

The suffix trees are fundamental data structures for various kinds of st...
research
07/17/2022

On the Practical Power of Automata in Pattern Matching

The classical pattern matching paradigm is that of seeking occurrences o...
research
08/21/2022

Teaching the Burrows-Wheeler Transform via the Positional Burrows-Wheeler Transform

The Burrows-Wheeler Transform (BWT) is often taught in undergraduate cou...
