Substring Query Complexity of String Reconstruction

11/13/2020
by Gabriele Fici, et al.

Suppose an oracle knows a string S that is unknown to us and that we want to determine. The oracle can answer queries of the form "Is s a substring of S?". The Substring Query Complexity of a string S, denoted χ(S), is the minimum number of adaptive substring queries needed to exactly reconstruct (or learn) S. It was introduced in 1995 by Skiena and Sundaram, who showed that χ(S) ≥ σn/4 - O(n) in the worst case, where σ is the size of the alphabet of S and n is its length, and gave an algorithm that spends (σ-1)n + O(σ√n) queries to reconstruct S. We show that for any binary string S, χ(S) is asymptotically equal to the Kolmogorov complexity of S and therefore lower bounds any other measure of compressibility. However, since this result does not yield an efficient algorithm for the reconstruction, we present new algorithms to compute a set of substring queries whose size grows as a function of other known measures of complexity, e.g., the number rle of runs in S, the size g of the smallest grammar producing (only) S, or the size z_no of the non-overlapping LZ77 factorization of S. We first show that any string of length n over an integer alphabet of size σ with rle runs can be reconstructed with q = O(rle (σ + log(n/rle))) substring queries in linear time and space. We then present an algorithm that spends q ∈ O(σ g log n) ⊆ O(σ z_no log(n/z_no) log n) substring queries and runs in O(n(log n + log σ) + q) time using linear space. This algorithm actually reconstructs the suffix tree of the string using a dynamic approach based on the centroid decomposition.
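To make the query model concrete, here is a minimal Python sketch of a substring oracle together with a naive reconstruction baseline. It is not one of the algorithms from the paper: the names `SubstringOracle` and `reconstruct_naive` are hypothetical, and the sketch assumes the alphabet is known in advance. The baseline greedily extends a known substring to the right until it becomes a suffix of S, then extends it to the left until no character fits, which costs at most σ(n + 2) queries.

```python
class SubstringOracle:
    """Hypothetical oracle: holds a secret string and counts queries."""

    def __init__(self, secret):
        self._secret = secret
        self.queries = 0

    def is_substring(self, s):
        # Answers the query "Is s a substring of S?".
        self.queries += 1
        return s in self._secret


def reconstruct_naive(oracle, alphabet):
    """Naive baseline (an illustration, not the paper's method)."""
    w = ""

    # Phase 1: extend to the right while some w + c is still a substring.
    # When no character extends w, every occurrence of w ends S,
    # so w is a suffix of S.
    extended = True
    while extended:
        extended = False
        for c in alphabet:
            if oracle.is_substring(w + c):
                w += c
                extended = True
                break

    # Phase 2: extend to the left. w remains a suffix throughout,
    # so when no character can precede it, w equals S.
    extended = True
    while extended:
        extended = False
        for c in alphabet:
            if oracle.is_substring(c + w):
                w = c + w
                extended = True
                break

    return w


if __name__ == "__main__":
    oracle = SubstringOracle("abracadabra")
    print(reconstruct_naive(oracle, "abcdr"), oracle.queries)
```

Each successful extension costs at most σ queries and each phase ends with σ failed queries, which gives the σ(n + 2) bound. The results in the abstract improve on this kind of per-character cost by making the query count depend on compressibility measures such as rle, g, or z_no.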

research
07/17/2020

Adaptive Exact Learning in a Mixed-Up World: Dealing with Periodicity, Errors and Jumbled-Index Queries in String Reconstruction

We study the query complexity of exactly reconstructing a string from ad...
research
08/02/2018

Reconstructing Strings from Substrings: Optimal Randomized and Average-Case Algorithms

The problem called "String reconstruction from substrings" is a mathemat...
research
07/19/2021

Sensitivity of string compressors and repetitiveness measures

The sensitivity of a string compression algorithm C asks how much the ou...
research
02/10/2020

Palindromic k-Factorization in Pure Linear Time

Given a string s of length n over a general alphabet and an integer k, t...
research
11/03/2018

Optimal Rank and Select Queries on Dictionary-Compressed Text

Let γ be the size of a string attractor for a string S of length n over ...
research
07/16/2020

Substring Complexity in Sublinear Space

Shannon's entropy is a definitive lower bound for statistical compressio...
research
12/20/2018

The Query Complexity of a Permutation-Based Variant of Mastermind

We study the query complexity of a permutation-based variant of the gues...