Limitations of Mean-Based Algorithms for Trace Reconstruction at Small Distance

11/27/2020
by Elena Grigorescu, et al.

Trace reconstruction considers the task of recovering an unknown string x ∈ {0,1}^n given a number of independent "traces", i.e., subsequences of x obtained by deleting each symbol of x independently with some probability p. The information-theoretic limit on the number of traces needed to recover a string of length n is still unknown. This limit is essentially the same as the number of traces needed to determine, given strings x and y and traces of one of them, which string is the source. The most studied class of algorithms for the worst-case version of the problem is the class of "mean-based" algorithms, a restricted class of distinguishers that use only the mean value of each coordinate over the given samples. In this work we study limitations of mean-based algorithms on strings at small Hamming or edit distance. On the one hand, we show that distinguishing strings that are nearby in Hamming distance is "easy" for such distinguishers. On the other hand, we show that distinguishing strings that are nearby in edit distance is "hard" for mean-based algorithms. Along the way we also describe a connection to the famous Prouhet-Tarry-Escott (PTE) problem, which shows a barrier to finding explicit hard-to-distinguish strings: such strings would imply explicit short solutions to the PTE problem, a well-known difficult problem in number theory. Our techniques rely on complex analysis arguments that involve careful trigonometric estimates, and on algebraic techniques that include applications of Descartes' rule of signs for polynomials over the reals.
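To make the notion of a mean-based statistic concrete, the following is a minimal sketch of the deletion channel and of the per-coordinate mean of its traces. It is an illustration under stated assumptions, not the authors' method: the function names (`sample_trace`, `empirical_mean_trace`, `expected_mean_trace`, `mean_trace_gap`) are hypothetical, traces are assumed to be padded with zeros to length n, and the L1 distance is just one convenient way to compare two mean traces.

```python
import random
from math import comb


def sample_trace(x, p, rng):
    """Pass the bit list x through a deletion channel: each bit is
    deleted independently with probability p, and the rest keep their order."""
    return [bit for bit in x if rng.random() >= p]


def empirical_mean_trace(x, p, num_traces, rng):
    """Average each coordinate over num_traces sampled traces.
    Assumption: traces shorter than len(x) are padded with zeros."""
    sums = [0] * len(x)
    for _ in range(num_traces):
        for j, bit in enumerate(sample_trace(x, p, rng)):
            sums[j] += bit
    return [s / num_traces for s in sums]


def expected_mean_trace(x, p):
    """Exact expectation of each (zero-padded) trace coordinate.
    Bit i of x lands at position j of the trace when it survives and
    exactly j of the i bits before it survive (0-indexed)."""
    n, q = len(x), 1 - p
    return [
        sum(x[i] * comb(i, j) * q ** (j + 1) * p ** (i - j) for i in range(j, n))
        for j in range(n)
    ]


def mean_trace_gap(x, y, p):
    """L1 distance between the expected mean traces of x and y
    (assumed to have equal length); an illustrative choice of norm."""
    ex, ey = expected_mean_trace(x, p), expected_mean_trace(y, p)
    return sum(abs(a - b) for a, b in zip(ex, ey))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1, 0, 1, 1, 0, 1]
    y = [1, 0, 1, 0, 1, 1]  # differs from x in two positions
    print(empirical_mean_trace(x, 0.3, 20000, rng))
    print(expected_mean_trace(x, 0.3))
    print(mean_trace_gap(x, y, 0.3))
```

A mean-based distinguisher sees only the kind of vector returned by `empirical_mean_trace`, so, informally, the smaller the gap between the expected mean traces of x and y, the more traces it needs to tell them apart.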


research
12/12/2020

Approximate Trace Reconstruction

In the usual trace reconstruction problem, the goal is to exactly recons...
research
02/10/2021

Trace Reconstruction with Bounded Edit Distance

The trace reconstruction problem studies the number of noisy samples nee...
research
07/12/2019

Efficient average-case population recovery in the presence of insertions and deletions

Several recent works have considered the trace reconstruction problem, i...
research
08/29/2023

On k-Mer-Based and Maximum Likelihood Estimation Algorithms for Trace Reconstruction

The goal of the trace reconstruction problem is to recover a string x∈{0...
research
07/24/2021

Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces

In the standard trace reconstruction problem, the goal is to exactly rec...
research
11/22/2017

Lightweight Fingerprints for Fast Approximate Keyword Matching Using Bitwise Operations

We aim to speed up approximate keyword matching by storing a lightweight...
research
02/04/2020

Faster Binary Mean Computation Under Dynamic Time Warping

Many consensus string problems are based on Hamming distance. We replace...
