Efficient average-case population recovery in the presence of insertions and deletions

by Frank Ban, et al.
Columbia University
berkeley college

Several recent works have considered the trace reconstruction problem, in which an unknown source string x ∈ {0,1}^n is transmitted through a probabilistic channel which may randomly delete coordinates or insert random bits, resulting in a trace of x. The goal is to reconstruct the original string x from independent traces of x. While the best algorithms known for worst-case strings use exp(O(n^{1/3})) traces [DOS17, NazarovPeres17], highly efficient algorithms are known [PZ17, HPP18] for the average-case version, in which x is uniformly random. We consider a generalization of this average-case trace reconstruction problem, which we call average-case population recovery in the presence of insertions and deletions. In this problem, there is an unknown distribution D over s unknown source strings x^1,...,x^s ∈ {0,1}^n, and each sample is independently generated by drawing some x^i from D and returning an independent trace of x^i. Building on [PZ17] and [HPP18], we give an efficient algorithm for this problem. For any support size s ≤ exp(Θ(n^{1/3})), for a 1-o(1) fraction of all s-element support sets {x^1,...,x^s} ⊂ {0,1}^n, for every distribution D supported on {x^1,...,x^s}, our algorithm efficiently recovers D up to total variation distance ϵ with high probability, given access to independent traces of independent draws from D. The algorithm runs in time poly(n, s, 1/ϵ) and its sample complexity is poly(s, 1/ϵ, exp(log^{1/3} n)). This polynomial dependence on the support size s is in sharp contrast with the worst-case version (when x^1,...,x^s may be any strings in {0,1}^n), in which the sample complexity of the most efficient known algorithm [BCFSS19] is doubly exponential in s.
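
To make the sampling model concrete, here is a minimal Python sketch of how one sample might be generated. It is an illustration and not the paper's algorithm. The abstract does not fix the channel's details, so the channel below is an assumption: before each coordinate, a geometric number of uniformly random bits is inserted (parameter `insert_prob`), and each coordinate is then deleted independently with probability `delete_prob`. The names `trace`, `sample_population`, and `total_variation` are hypothetical helpers introduced only for this sketch.

```python
import random


def trace(x, delete_prob, insert_prob, rng):
    """Return one trace of the bit list x under an ASSUMED insertion/deletion channel."""
    out = []
    for bit in x:
        # Assumption: insert a geometric number of uniformly random bits
        # before each coordinate of the source string.
        while rng.random() < insert_prob:
            out.append(rng.randrange(2))
        # Keep the coordinate unless it is deleted.
        if rng.random() >= delete_prob:
            out.append(bit)
    return out


def sample_population(strings, weights, delete_prob, insert_prob, rng):
    """Draw some x^i from D (given by weights) and return an independent trace of it."""
    x = rng.choices(strings, weights=weights, k=1)[0]
    return trace(x, delete_prob, insert_prob, rng)


def total_variation(p, q):
    """Total variation distance between two distributions on the same finite support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    n, s = 20, 3
    # Average-case setting: the support strings are uniformly random.
    strings = [[rng.randrange(2) for _ in range(n)] for _ in range(s)]
    weights = [0.5, 0.3, 0.2]  # the unknown distribution D
    for _ in range(3):
        print(sample_population(strings, weights, 0.1, 0.1, rng))
```

A recovery algorithm sees only the output of `sample_population`, never which x^i was drawn. Its estimate of D is judged by `total_variation` against the true weights, with a target of at most ϵ.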



Polynomial-time trace reconstruction in the smoothed complexity model

In the trace reconstruction problem, an unknown source string x ∈{0,1}^n...

Polynomial-time trace reconstruction in the low deletion rate regime

In the trace reconstruction problem, an unknown source string x ∈{0,1}^n...

Approximate Trace Reconstruction from a Single Trace

The well-known trace reconstruction problem is the problem of inferring ...

Limitations of Mean-Based Algorithms for Trace Reconstruction at Small Distance

Trace reconstruction considers the task of recovering an unknown string ...

Beyond trace reconstruction: Population recovery from the deletion channel

Population recovery is the problem of learning an unknown distribution o...

Trace Reconstruction: Generalized and Parameterized

In the beautifully simple-to-state problem of trace reconstruction, the ...

Mean-Based Trace Reconstruction over Practically any Replication-Insertion Channel

Mean-based reconstruction is a fundamental, natural approach to worst-ca...