Faster Repetition-Aware Compressed Suffix Trees based on Block Trees

02/08/2019
by   Manuel Cáceres, et al.

Suffix trees are a fundamental data structure in stringology, but their space usage, though linear, is an important problem for their applications. We design and implement a new compressed suffix tree targeted to highly repetitive texts, such as large genomic collections of the same species. Our suffix tree builds on Block Trees, a recent Lempel-Ziv-bounded data structure that captures the repetitiveness of its input. We use Block Trees to compress the topology of the suffix tree, and augment the Block Tree nodes with data that speeds up suffix tree navigation. Our compressed suffix tree is slightly larger than previous repetition-aware suffix trees based on grammars, but outperforms them in time, often by orders of magnitude. The component that represents the tree topology achieves a speed comparable to that of general-purpose compressed trees while using 2.3 to 10 times less space, and might be of interest in other scenarios.
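To illustrate why a tree topology can be compressible by a repetition-aware structure, here is a minimal Python sketch. It assumes the topology is written as a balanced-parentheses string, which the abstract does not state; the function names `bp_encode` and `subtree_size` are hypothetical, and the code is a plain linear-scan illustration, not a Block Tree and not the authors' implementation. It only shows that identical subtrees produce identical substrings, which is the kind of repetition a Lempel-Ziv-bounded structure can exploit.

```python
from typing import Dict, List


def bp_encode(children: Dict[int, List[int]], root: int) -> str:
    """Encode a rooted ordered tree as balanced parentheses (iterative DFS).

    Assumption: this encoding is chosen for illustration only.
    """
    out = []
    stack = [(root, False)]  # (node, is_closing_marker)
    while stack:
        node, closing = stack.pop()
        if closing:
            out.append(")")
            continue
        out.append("(")
        stack.append((node, True))
        # Push children in reverse so they are visited left to right.
        for child in reversed(children.get(node, [])):
            stack.append((child, False))
    return "".join(out)


def subtree_size(bp: str, i: int) -> int:
    """Count the nodes in the subtree whose opening parenthesis is at index i.

    Uses a naive linear scan; a compressed structure would answer this
    from precomputed data instead.
    """
    if bp[i] != "(":
        raise ValueError("index must point at an opening parenthesis")
    depth = 0
    for j in range(i, len(bp)):
        depth += 1 if bp[j] == "(" else -1
        if depth == 0:
            return (j - i + 1) // 2
    raise ValueError("unbalanced parentheses")


# Hypothetical example tree: node 0 has two structurally identical subtrees.
tree = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
bp = bp_encode(tree, 0)
```

In this example the two child subtrees of the root map to the same substring of `bp`, so a structure that replaces repeated substrings with pointers to an earlier occurrence needs to store that shape only once.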


research
10/21/2022

Splay Top Trees

The top tree data structure is an important and fundamental tool in dyna...
research
03/04/2018

Two-Dimensional Block Trees

The Block Tree (BT) is a novel compact data structure designed to compre...
research
06/09/2015

Self Organizing Maps Whose Topologies Can Be Learned With Adaptive Binary Search Trees Using Conditional Rotations

Numerous variants of Self-Organizing Maps (SOMs) have been proposed in t...
research
06/15/2023

Modules and PQ-trees in Robinson spaces

A Robinson space is a dissimilarity space (X,d) on n points for which th...
research
08/16/2016

Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees

Efficient methods for storing and querying are critical for scaling high...
research
09/05/2018

Randomized Incremental Construction of Net-Trees

Net-trees are a general purpose data structure for metric data that have...
research
06/17/2016

Adding Context to Concept Trees

Concept Trees are a type of database that can organise arbitrary textual...
