Pointer-Machine Algorithms for Fully-Online Construction of Suffix Trees and DAWGs on Multiple Strings

05/02/2020
by Shunsuke Inenaga, et al.

We consider the problem of maintaining the suffix tree index for a fully-online collection of multiple strings, where a new character can be prepended to any string in the collection at any time. The only previously known algorithm for this problem, recently proposed by Takagi et al. [Algorithmica 82(5): 1346-1377 (2020)], runs in O(N log σ) time and O(N) space on the word RAM model, where N denotes the total length of the strings and σ denotes the alphabet size. Their algorithm makes heavy use of the nearest marked ancestor (NMA) data structure on semi-dynamic trees, which answers queries and supports node insertions in O(1) amortized time on the word RAM model. In this paper, we present a simpler fully-online right-to-left algorithm that builds the suffix tree for a given string collection in O(N (log σ + log d)) time and O(N) space, where d is the maximum number of incoming Weiner links to a node of the suffix tree. We note that d is bounded by the height of the suffix tree, which in turn is bounded by the length of the longest string in the collection. The advantage of the new algorithm is that it works on the pointer machine model: it does not use the complicated NMA data structures, which involve table look-ups. As a byproduct, we also obtain a pointer-machine algorithm for building the directed acyclic word graph (DAWG) for a fully-online left-to-right collection of multiple strings, which runs in O(N (log σ + log d)) time and O(N) space, again without the aid of the NMA data structures.
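To make the fully-online right-to-left setting concrete, the following is a minimal sketch of a naive baseline, not the algorithm of the paper. It maintains an uncompacted suffix trie rather than a suffix tree, and it relies on the observation that prepending a character c to a string S adds exactly one new suffix, namely cS itself. The class name `NaiveSuffixTrie` and its methods are hypothetical and chosen only for illustration. Each update costs time proportional to the current length of the updated string, so this is far from the O(N (log σ + log d)) bound, and Python dictionaries are hash tables, so the sketch does not respect the pointer machine model either.

```python
class NaiveSuffixTrie:
    """Uncompacted suffix trie for a collection of strings that grow by prepending.

    Illustrative baseline only: no Weiner links, no path compaction.
    """

    def __init__(self, num_strings):
        self.root = {}                      # each trie node is a dict: char -> child node
        self.strings = [""] * num_strings   # current contents of every string

    def prepend(self, i, c):
        """Prepend character c to string i and insert the single new suffix."""
        self.strings[i] = c + self.strings[i]
        node = self.root
        # The only new suffix of the collection is the whole updated string.
        for ch in self.strings[i]:
            node = node.setdefault(ch, {})

    def contains(self, pattern):
        """Return True if pattern is a substring of some string in the collection."""
        node = self.root
        for ch in pattern:
            if ch not in node:
                return False
            node = node[ch]
        return True


trie = NaiveSuffixTrie(2)
# Updates may be interleaved across the strings in any order.
trie.prepend(0, "b")   # string 0 = "b"
trie.prepend(1, "a")   # string 1 = "a"
trie.prepend(0, "a")   # string 0 = "ab"
trie.prepend(1, "c")   # string 1 = "ca"
```

After these four updates, `trie.contains("ab")` and `trie.contains("ca")` hold, while `trie.contains("ba")` does not, since "ba" occurs in neither string.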


