String Attractors and Infinite Words

06/01/2022
by Antonio Restivo, et al.

The notion of string attractor was introduced in [Kempa and Prezza, 2018] in the context of data compression. It is a set of positions of a finite word at which all of its factors can be "attracted", that is, every factor has at least one occurrence crossing a position in the set. The smallest size γ^* of a string attractor for a finite word is a lower bound for several repetitiveness measures associated with the most common compression schemes, including BWT-based and LZ-based compressors. The combinatorial properties of the measure γ^* have been studied in [Mantaci et al., 2021]. Very recently, a complexity measure for infinite words, called the string attractor profile function, has been introduced by evaluating γ^* on each prefix. This measure has been studied for automatic sequences and linearly recurrent infinite words [Schaeffer and Shallit, 2021]. In this paper, we study the relationship between this complexity measure and other well-known combinatorial notions related to repetitiveness in the context of infinite words, such as factor complexity and recurrence. Furthermore, we introduce new string attractor-based complexity measures that take into account the structure and the distribution of the positions in a string attractor of the prefixes of an infinite word. We show that these measures provide a finer classification of some infinite families of words.
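
The definitions in the abstract can be illustrated with a small brute-force sketch. It is not taken from the paper: the function names are hypothetical, positions are 0-based, and the exhaustive search over position sets is exponential, so it is only meant for very short words. It assumes the standard definition, in which a set of positions is a string attractor if every factor has at least one occurrence crossing a position of the set.

```python
from itertools import combinations


def is_string_attractor(word, positions):
    """Check that every factor of `word` has an occurrence that
    crosses at least one of the given 0-based positions."""
    n = len(word)
    pos = set(positions)
    for length in range(1, n + 1):
        all_factors = set()
        attracted = set()
        for start in range(n - length + 1):
            factor = word[start:start + length]
            all_factors.add(factor)
            # This occurrence spans positions start .. start + length - 1.
            if any(start <= p < start + length for p in pos):
                attracted.add(factor)
        if attracted != all_factors:
            return False
    return True


def smallest_attractor_size(word):
    """Brute-force gamma^*: try position sets of increasing size."""
    n = len(word)
    for k in range(n + 1):
        for candidate in combinations(range(n), k):
            if is_string_attractor(word, candidate):
                return k
    return n


def attractor_profile(prefix):
    """Values of gamma^* on the successive prefixes of `prefix`
    (a finite prefix of an infinite word)."""
    return [smallest_attractor_size(prefix[:i])
            for i in range(1, len(prefix) + 1)]


if __name__ == "__main__":
    print(smallest_attractor_size("abab"))
    print(attractor_profile("abaab"))
```

For instance, under these assumptions `smallest_attractor_size("abab")` returns 2: one position is needed for each of the two letters, and positions 0 and 1 already attract every factor.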

research
02/27/2023

String attractors of fixed points of k-bonacci-like morphisms

Firstly studied by Kempa and Prezza in 2018 as the cement of text compre...
research
07/10/2019

String Attractors and Combinatorics on Words

The notion of string attractor has recently been introduced in [Prezza, ...
research
07/05/2023

Compressibility measures for two-dimensional data

In this paper we extend to two-dimensional data two recently introduced ...
research
08/11/2021

Automatic Sequences of Rank Two

Given a right-infinite word x over a finite alphabet A, the rank of x is...
research
06/03/2022

L-systems for Measuring Repetitiveness*

An L-system (for lossless compression) is a CPD0L-system extended with t...
research
04/19/2022

Compressed Empirical Measures (in finite dimensions)

We study approaches for compressing the empirical measure in the context...
research
08/24/2019

When a Dollar Makes a BWT

The Burrows-Wheeler-Transform (BWT) is a reversible string transformatio...
