Bidirectional Text Compression in External Memory

07/07/2019
by Patrick Dinklage, et al.

Bidirectional compression algorithms work by replacing repeated substrings with references that, unlike in the well-known LZ77 scheme, can point in either direction. We present such an algorithm that is particularly well suited to an external memory implementation. We evaluate it experimentally on large data sets of up to 128 GiB (using only 16 GiB of RAM) and show that it is significantly faster than all known LZ77 compressors, while producing a roughly similar number of factors. We also introduce an external memory decompressor for texts compressed with any uni- or bidirectional compression scheme.
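To make the idea of references that point in either direction concrete, the following is a minimal in-memory sketch of decoding a bidirectional factorization. It is not the algorithm or the external memory decompressor from the paper. The factor encoding (a literal string, or a pair of source position and length) and the function name `decode_bidirectional` are assumptions chosen for illustration only.

```python
from typing import List, Tuple, Union

# Hypothetical factor encoding (an assumption, not the paper's format):
# a factor is either a literal string or a reference (source_position, length)
# into the decoded text. A reference may point left or right of itself.
Factor = Union[str, Tuple[int, int]]


def decode_bidirectional(factors: List[Factor]) -> str:
    """Decode a list of factors by resolving every reference chain."""
    chars: List[Union[str, None]] = []  # resolved character, or None if pending
    src: List[int] = []                 # text position copied from, or -1

    # Pass 1: lay out the text and record where each position copies from.
    for factor in factors:
        if isinstance(factor, str):
            for c in factor:
                chars.append(c)
                src.append(-1)
        else:
            start, length = factor
            for k in range(length):
                chars.append(None)
                src.append(start + k)

    n = len(chars)

    # Pass 2: follow each pending position's chain until a literal is reached.
    for i in range(n):
        if chars[i] is not None:
            continue
        path = []
        j = i
        while chars[j] is None:
            path.append(j)
            if len(path) > n:
                # More steps than positions means the chain revisits a position.
                raise ValueError("cyclic reference, text is not decodable")
            j = src[j]
            if not 0 <= j < n:
                raise ValueError("reference points outside the text")
        # Every position on the chain carries the same character.
        for p in path:
            chars[p] = chars[j]

    return "".join(chars)  # all entries are resolved at this point


if __name__ == "__main__":
    # The first factor copies from positions 3..4, which lie to its right.
    print(decode_bidirectional([(3, 2), "c", "ab"]))
```

A left-pointing, self-overlapping reference as in LZ77 is handled by the same loop, for example `decode_bidirectional(["a", (0, 3)])`. The sketch keeps the whole text in RAM, so it only illustrates the semantics of the references, not how to process inputs larger than main memory.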


research
04/12/2022

Efficient Construction of the BWT for Repetitive Text Using String Compression

We present a new semi-external algorithm that builds the Burrows-Wheeler...
research
06/13/2018

O(n log n)-time text compression by LZ-style longest first substitution

Mauer et al. [A Lempel-Ziv-style Compression Method for Repetitive Texts...
research
01/13/2022

Optimal alphabet for single text compression

A text can be viewed via different representations, i.e. as a sequence o...
research
08/14/2019

Re-Pair In-Place

Re-Pair is a grammar compression scheme with favorably good compression ...
research
03/04/2020

Approximating Optimal Bidirectional Macro Schemes

Lempel-Ziv is an easy-to-compute member of a wide family of so-called ma...
research
02/24/2021

Preserved central model for faster bidirectional compression in distributed settings

We develop a new approach to tackle communication constraints in a distr...
research
04/14/2023

M2T: Masking Transformers Twice for Faster Decoding

We show how bidirectional transformers trained for masked token predicti...
