Grammar Compressed Sequences with Rank/Select Support

by Alberto Ordóñez, et al.

Sequence representations supporting not only direct access to their symbols, but also rank/select operations, are a fundamental building block in many compressed data structures. Several recent applications need to represent highly repetitive sequences, and classical statistical compression proves ineffective. We introduce, instead, grammar-based representations for repetitive sequences, which use up to 6% of the space needed by statistically compressed representations, and support direct access and rank/select operations within tens of microseconds. We demonstrate the impact of our structures in text indexing applications.
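For readers unfamiliar with the three operations named in the abstract, the following is a minimal, uncompressed sketch of their usual definitions. It is not the grammar-based structure the paper introduces: the function names, 0-based positions, and linear-time scans are assumptions chosen only for illustration.

```python
# Naive reference definitions of access / rank / select on a plain sequence.
# Illustrative only: linear time, no compression, hypothetical names.

def access(seq, i):
    """Return the symbol at 0-based position i."""
    return seq[i]

def rank(seq, c, i):
    """Return the number of occurrences of symbol c in seq[0:i]."""
    return sum(1 for x in seq[:i] if x == c)

def select(seq, c, j):
    """Return the 0-based position of the j-th (1-based) occurrence of c, or -1."""
    count = 0
    for pos, x in enumerate(seq):
        if x == c:
            count += 1
            if count == j:
                return pos
    return -1
```

A compressed representation aims to answer the same queries without storing or scanning the sequence explicitly.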

