Efficient average-case population recovery in the presence of insertions and deletions

07/12/2019
by Frank Ban, et al.

Several recent works have considered the trace reconstruction problem, in which an unknown source string x ∈ {0,1}^n is transmitted through a probabilistic channel which may randomly delete coordinates or insert random bits, resulting in a trace of x. The goal is to reconstruct the original string x from independent traces of x. While the best algorithms known for worst-case strings use exp(O(n^{1/3})) traces [DOS17, NazarovPeres17], highly efficient algorithms are known [PZ17, HPP18] for the average-case version, in which x is uniformly random. We consider a generalization of this average-case trace reconstruction problem, which we call average-case population recovery in the presence of insertions and deletions. In this problem, there is an unknown distribution D over s unknown source strings x^1,...,x^s ∈ {0,1}^n, and each sample is independently generated by drawing some x^i from D and returning an independent trace of x^i. Building on [PZ17] and [HPP18], we give an efficient algorithm for this problem. For any support size s ≤ exp(Θ(n^{1/3})), for a 1-o(1) fraction of all s-element support sets {x^1,...,x^s} ⊂ {0,1}^n, for every distribution D supported on {x^1,...,x^s}, our algorithm efficiently recovers D up to total variation distance ϵ with high probability, given access to independent traces of independent draws from D. The algorithm runs in time poly(n, s, 1/ϵ) and its sample complexity is poly(s, 1/ϵ, exp(log^{1/3} n)). This polynomial dependence on the support size s is in sharp contrast with the worst-case version (when x^1,...,x^s may be any strings in {0,1}^n), in which the sample complexity of the most efficient known algorithm [BCFSS19] is doubly exponential in s.
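The sampling model described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, only a simulation of how one sample might be generated. The channel details are assumptions made for illustration: before each source bit, uniform random bits are inserted while a coin with bias `insert_prob` comes up heads, and each source bit is then deleted independently with probability `delete_prob`. The function names `trace`, `sample`, and `total_variation` are hypothetical.

```python
import random


def trace(x, delete_prob, insert_prob, rng):
    """Return one trace of the bit list x under an ASSUMED channel model."""
    out = []
    for bit in x:
        # Assumed insertion rule: insert uniform random bits while a coin
        # with bias insert_prob keeps coming up heads.
        while rng.random() < insert_prob:
            out.append(rng.randrange(2))
        # Each source bit is deleted independently with prob. delete_prob.
        if rng.random() >= delete_prob:
            out.append(bit)
    return out


def sample(support, weights, delete_prob, insert_prob, rng):
    """Draw some x^i from D, then return an independent trace of x^i."""
    x = rng.choices(support, weights=weights, k=1)[0]
    return trace(x, delete_prob, insert_prob, rng)


def total_variation(p, q):
    """Total variation distance between two distributions on one support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    n, s = 16, 3
    # Average-case setting: the s source strings are uniformly random.
    support = [[rng.randrange(2) for _ in range(n)] for _ in range(s)]
    weights = [0.5, 0.3, 0.2]  # the unknown distribution D (illustrative)
    for _ in range(3):
        print(sample(support, weights, 0.1, 0.1, rng))
```

A recovery algorithm sees only the outputs of `sample` and must output weights whose `total_variation` distance from the true `weights` is at most ϵ.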


