Suffix sorting via matching statistics

07/03/2022
by Zsuzsanna Lipták, et al.

We introduce a new algorithm for constructing the generalized suffix array of a collection of highly similar strings. As a first step, we construct a compressed representation of the matching statistics of the collection with respect to a reference string. We then use this data structure to distribute suffixes into a partial order, and subsequently to speed up suffix comparisons to complete the generalized suffix array. Experiments with a prototype implementation (a tool we call sacamats) show that, on collections of highly similar strings, we can construct the suffix array in time competitive with or faster than the fastest available methods. Along the way, we describe a heuristic for fast computation of the matching statistics of two strings, which may be of independent interest.
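
To make the two central objects concrete, here is a minimal Python sketch of matching statistics and of a generalized suffix array. It is not the sacamats algorithm and not the heuristic mentioned in the abstract: both functions are naive illustrations of the definitions, and the names `matching_statistics` and `generalized_suffix_array` are hypothetical. The sketch assumes that equal suffixes from different strings are ordered by string index, which may differ from the convention used in the paper.

```python
def matching_statistics(s, ref):
    """For each position i of s, return the length of the longest
    prefix of s[i:] that occurs somewhere in ref (naive version)."""
    ms = []
    length = 0
    for i in range(len(s)):
        # If s[i-1:] matched ref for `length` characters, then s[i:]
        # matches for at least length - 1, so we resume from there.
        length = max(length - 1, 0)
        while i + length < len(s) and s[i:i + length + 1] in ref:
            length += 1
        ms.append(length)
    return ms


def generalized_suffix_array(strings):
    """Return all (string index, offset) pairs sorted by the suffix
    they denote. Assumption: ties between equal suffixes are broken
    by string index. Naive sorting, for illustration only."""
    suffixes = [(d, i) for d, t in enumerate(strings) for i in range(len(t))]
    return sorted(suffixes, key=lambda p: (strings[p[0]][p[1]:], p[0]))


if __name__ == "__main__":
    print(matching_statistics("banana", "ananas"))
    print(generalized_suffix_array(["ab", "b"]))
```

Intuitively, a long matching-statistics value at position i means that the suffix starting at i shares a long prefix with some substring of the reference, which is the kind of information the abstract says is used to group suffixes and shorten comparisons.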

research
01/13/2023

Computing matching statistics on Wheeler DFAs

Matching statistics were introduced to solve the approximate string matc...
research
12/21/2018

A Simple Algorithm for Computing the Document Array

We present a simple algorithm for computing the document array given the...
research
07/19/2018

The colored longest common prefix array computed via sequential scans

Due to the increased availability of large datasets of biological sequen...
research
05/04/2023

Prefix Sorting DFAs: a Recursive Algorithm

In the past thirty years, numerous algorithms for building the suffix ar...
research
10/31/2021

Computing Matching Statistics on Repetitive Texts

Computing the matching statistics of a string P[1..m] with respect to a ...
research
06/28/2020

Random Access in Persistent Strings

We consider compact representations of collections of similar strings th...
research
04/18/2022

Practical KMP/BM Style Pattern-Matching on Indeterminate Strings

In this paper we describe two simple, fast, space-efficient algorithms f...
