Substring Complexities on Run-length Compressed Strings

05/25/2022
by Akiyoshi Kawamoto, et al.

Let S_T(k) denote the set of distinct substrings of length k in a string T; the k-th substring complexity of T is then defined as its cardinality |S_T(k)|. Recently, δ = max{ |S_T(k)| / k : k ≥ 1 } has been shown to be a good compressibility measure for highly repetitive strings. In this paper, given T of length n in run-length compressed form of size r, we show that δ can be computed in C_sort(r, n) time and O(r) space, where C_sort(r, n) = O(min(r lg lg r, r lg_r n)) is the time complexity of sorting r O(lg n)-bit integers in O(r) space in the Word-RAM model with word size Ω(lg n).
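
To make the definitions concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper: it works on the uncompressed string and takes time polynomial in n, whereas the paper works on the run-length encoding in O(r) space. The function names `substring_complexity`, `delta`, and `run_length_encode` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import groupby


def substring_complexity(text: str, k: int) -> int:
    """Return |S_T(k)|, the number of distinct length-k substrings of text."""
    return len({text[i:i + k] for i in range(len(text) - k + 1)})


def delta(text: str) -> Fraction:
    """Return max{ |S_T(k)| / k : 1 <= k <= n } as an exact fraction.

    Brute force for illustration only; assumes text is non-empty.
    Values of k above n contribute 0, so they can be skipped.
    """
    if not text:
        raise ValueError("text must be non-empty")
    return max(
        Fraction(substring_complexity(text, k), k)
        for k in range(1, len(text) + 1)
    )


def run_length_encode(text: str) -> list[tuple[str, int]]:
    """Return the run-length encoding of text as (character, run length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


if __name__ == "__main__":
    t = "aaababbbaa"
    runs = run_length_encode(t)
    print("runs:", runs, "r =", len(runs))
    print("delta:", delta(t))
```

For T = aaababbbaa, all eight length-3 substrings are distinct, so the maximum is attained at k = 3 and δ = 8/3, while the run-length encoding has r = 5 runs.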

research
02/28/2022

Minimal Absent Words on Run-Length Encoded Strings

A string w is called a minimal absent word (MAW) for another string T if...
research
11/20/2017

A Separation Between Run-Length SLPs and LZ77

In this paper we give an infinite family of strings for which the length...
research
01/18/2022

Computing Longest (Common) Lyndon Subsequences

Given a string T with length n whose characters are drawn from an ordere...
research
04/18/2018

On Abelian Longest Common Factor with and without RLE

We consider the Abelian longest common factor problem in two scenarios: ...
research
02/14/2020

On Extensions of Maximal Repeats in Compressed Strings

This paper provides an upper bound for several subsets of maximal repeat...
research
07/09/2022

Online algorithms for finding distinct substrings with length and multiple prefix and suffix conditions

Let two static sequences of strings P and S, representing prefix and suf...
research
02/16/2018

Online LZ77 Parsing and Matching Statistics with RLBWTs

Lempel-Ziv 1977 (LZ77) parsing, matching statistics and the Burrows-Whee...
