Construction of Sparse Suffix Trees and LCE Indexes in Optimal Time and Space

05/08/2021
by   Dmitry Kosolobov, et al.

The notions of synchronizing and partitioning sets are recently introduced variants of locally consistent parsings with great potential in problem-solving. In this paper we propose a deterministic algorithm that, for a given read-only string of length n over the alphabet {0,1,…,n^𝒪(1)}, constructs a version of a τ-partitioning set of size 𝒪(b) with τ = n/b, using 𝒪(b) space and 𝒪((1/ϵ)n) time, provided b ≥ n^ϵ for ϵ > 0. As a corollary, for b ≥ n^ϵ and constant ϵ > 0, we obtain linear-time construction algorithms with 𝒪(b) space on top of the string for two major small-space indexes: a sparse suffix tree, which is a compacted trie built on b chosen suffixes of the string, and a longest common extension (LCE) index, which occupies 𝒪(b) space and allows us to compute the longest common prefix for any pair of substrings in 𝒪(n/b) time. For both, the 𝒪(b) construction space is asymptotically optimal, since the tree itself takes 𝒪(b) space and any LCE index with 𝒪(n/b) query time must occupy Ω(b) space by a known trade-off (at least for b ≥ Ω(n / log n)). For arbitrary b ≥ Ω(log^2 n), we present construction algorithms for the partitioning set, sparse suffix tree, and LCE index with 𝒪(n log_b n) running time and 𝒪(b) space, thus also improving the state of the art.
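
To make the two indexed objects concrete, here is a minimal brute-force sketch in Python of what an LCE query returns and of the suffix order that a sparse suffix tree encodes. It is not the paper's algorithm and has none of its time or space guarantees; the function names `naive_lce` and `sparse_suffix_order` are hypothetical and chosen only for illustration.

```python
def naive_lce(s: str, i: int, j: int) -> int:
    """Length of the longest common prefix of s[i:] and s[j:].

    Brute-force baseline: compares characters one by one, so a query
    can take time linear in the string length and uses no index at all.
    """
    n = len(s)
    k = 0
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k


def sparse_suffix_order(s: str, positions: list[int]) -> list[int]:
    """Return the chosen suffix start positions in lexicographic order.

    A sparse suffix tree is a compacted trie over exactly these suffixes;
    this illustration only sorts them and copies each suffix, so it is
    far from the O(b)-space constructions described in the abstract.
    """
    return sorted(positions, key=lambda p: s[p:])


if __name__ == "__main__":
    text = "abracadabra"
    print(naive_lce(text, 0, 7))                   # suffixes "abracadabra" and "abra"
    print(sparse_suffix_order("banana", [0, 2, 4]))
```

An LCE index in the sense of the abstract answers the same query as `naive_lce`, but in 𝒪(n/b) time using 𝒪(b) words on top of the read-only string.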


research
06/03/2021

Position Heaps for Cartesian-tree Matching on Strings and Tries

The Cartesian-tree pattern matching is a recently introduced scheme of p...
research
01/29/2019

Online Algorithms for Constructing Linear-size Suffix Trie

The suffix trees are fundamental data structures for various kinds of st...
research
05/30/2020

Longest Square Subsequence Problem Revisited

The longest square subsequence (LSS) problem consists of computing a lon...
research
06/01/2022

Near-Optimal Search Time in δ-Optimal Space

Two recent lower bounds on the compressibility of repetitive sequences, δ...
research
12/02/2018

Locally Consistent Parsing for Text Indexing in Small Space

We consider two closely related problems of text indexing in a sub-linea...
research
07/01/2019

On Slicing Sorted Integer Sequences

Representing sorted integer sequences in small space is a central proble...
research
11/30/2018

Faster Attractor-Based Indexes

String attractors are a novel combinatorial object encompassing most kno...
