Compression by Contracting Straight-Line Programs

07/01/2021
by Moses Ganardi, et al.

In grammar-based compression, a string is represented by a context-free grammar, also called a straight-line program (SLP), that generates only that string. We refine a recent balancing result stating that one can transform an SLP of size g in linear time into an equivalent SLP of size O(g) so that the height of the unique derivation tree is O(log N), where N is the length of the represented string (FOCS 2019). We introduce a new class of balanced SLPs, called contracting SLPs, where for every rule A → β_1 … β_k the string length of every variable β_i on the right-hand side is smaller by a constant factor than the string length of A. In particular, the derivation tree of a contracting SLP has the property that every subtree has height logarithmic in its leaf size. We show that a given SLP of size g can be transformed in linear time into an equivalent contracting SLP of size O(g) with rules of constant length. We present an application to the navigation problem in compressed unranked trees, represented by forest straight-line programs (FSLPs). We extend a linear-space data structure by Reh and Sieber (2020) with the operation of moving to the i-th child in time O(log d), where d is the degree of the current node. Contracting SLPs are also applied to the finger search problem over SLP-compressed strings, where one wants to access positions near a pre-specified finger position, ideally in O(log d) time, where d is the distance between the accessed position and the finger. We give a linear-space solution where one can access symbols or move the finger in time O(log d + log^(t) N) for any constant t, where log^(t) N is the t-fold logarithm of N. This improves a previous solution by Bille, Christiansen, Cording, and Gørtz (2018) with access/move time O(log d + loglog N).
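To make the contracting condition concrete, the following is a minimal Python sketch, not the paper's construction. It assumes a hypothetical encoding of an SLP as a dict mapping each variable to its right-hand side (any symbol that is not a key is a terminal), and it assumes the constant factor is 1/2; the abstract only says "a constant factor". The helper names `lengths`, `is_contracting`, and `access` are illustrative. The sketch checks the condition on two small grammars and performs random access by descending the derivation tree.

```python
def lengths(rules):
    """Length of the string derived from each variable (terminals count as 1)."""
    memo = {}

    def length(sym):
        if sym not in rules:          # terminal symbol
            return 1
        if sym not in memo:
            memo[sym] = sum(length(s) for s in rules[sym])
        return memo[sym]

    for var in rules:
        length(var)
    return memo


def is_contracting(rules, factor=0.5):
    """Check |beta_i| <= factor * |A| for every variable beta_i in every rule.

    The value factor=0.5 is an assumption made for this sketch.
    Terminals on a right-hand side are not constrained.
    """
    n = lengths(rules)
    return all(
        n[b] <= factor * n[a]
        for a, rhs in rules.items()
        for b in rhs
        if b in rules
    )


def access(rules, start, i):
    """Return the i-th symbol (0-based) of the string derived from `start`."""
    n = lengths(rules)
    if not 0 <= i < n.get(start, 1):
        raise IndexError(i)
    sym = start
    while sym in rules:
        # Walk the right-hand side until the child containing position i is found.
        for child in rules[sym]:
            size = n.get(child, 1)
            if i < size:
                sym = child
                break
            i -= size
    return sym


# "abababab": every variable is at most half as long as its parent.
balanced = {"A": ["a", "b"], "B": ["A", "A"], "C": ["B", "B"]}

# "aaaa" built as a chain: X1 has length 2 but its parent X2 only has length 3.
chain = {"X1": ["a", "a"], "X2": ["X1", "a"], "X3": ["X2", "a"]}

print(is_contracting(balanced), is_contracting(chain))
print("".join(access(balanced, "C", i) for i in range(8)))
```

In a grammar that passes this check with factor 1/2, each step of `access` from a variable to a child variable at least halves the remaining length, which is the intuition behind the logarithmic subtree height stated above.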


