
Efficient average-case population recovery in the presence of insertions and deletions

07/12/2019
by   Frank Ban, et al.
Columbia University
berkeley college

Several recent works have considered the trace reconstruction problem, in which an unknown source string x ∈ {0,1}^n is transmitted through a probabilistic channel which may randomly delete coordinates or insert random bits, resulting in a trace of x. The goal is to reconstruct the original string x from independent traces of x. While the best algorithms known for worst-case strings use exp(O(n^{1/3})) traces [DOS17, NazarovPeres17], highly efficient algorithms are known [PZ17, HPP18] for the average-case version, in which x is uniformly random. We consider a generalization of this average-case trace reconstruction problem, which we call average-case population recovery in the presence of insertions and deletions. In this problem, there is an unknown distribution D over s unknown source strings x^1,...,x^s ∈ {0,1}^n, and each sample is independently generated by drawing some x^i from D and returning an independent trace of x^i. Building on [PZ17] and [HPP18], we give an efficient algorithm for this problem. For any support size s ≤ exp(Θ(n^{1/3})), for a 1-o(1) fraction of all s-element support sets {x^1,...,x^s} ⊂ {0,1}^n, for every distribution D supported on {x^1,...,x^s}, our algorithm efficiently recovers D up to total variation distance ϵ with high probability, given access to independent traces of independent draws from D. The algorithm runs in time poly(n,s,1/ϵ) and its sample complexity is poly(s,1/ϵ,exp(log^{1/3} n)). This polynomial dependence on the support size s is in sharp contrast with the worst-case version (when x^1,...,x^s may be any strings in {0,1}^n), in which the sample complexity of the most efficient known algorithm [BCFSS19] is doubly exponential in s.
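To make the sampling model in the abstract concrete, here is a minimal Python sketch of how one sample could be generated: draw a source string from D, then pass it through an insertion/deletion channel. The abstract does not spell out the channel, so the details are assumptions for illustration only: each source bit is deleted independently with probability `delete_p`, and before each source bit uniform random bits are inserted while a coin with bias `insert_p` keeps coming up heads. The names `trace`, `sample`, `total_variation`, `delete_p`, and `insert_p` are hypothetical, and this is not the paper's recovery algorithm, only the data-generating process and the error measure.

```python
import random


def trace(x, delete_p, insert_p, rng):
    """Return one trace of the bit list x under an assumed indel channel."""
    out = []
    for bit in x:
        # Assumed insertion model: insert uniform random bits before this
        # source bit while a coin with bias insert_p comes up heads.
        # Requires insert_p < 1, otherwise the loop never terminates.
        while rng.random() < insert_p:
            out.append(rng.randrange(2))
        # Assumed deletion model: drop the source bit with probability delete_p.
        if rng.random() >= delete_p:
            out.append(bit)
    return out


def sample(support, weights, delete_p, insert_p, rng):
    """Draw x^i from D (given by support and weights), return a trace of it."""
    x = rng.choices(support, weights=weights, k=1)[0]
    return trace(x, delete_p, insert_p, rng)


def total_variation(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    rng = random.Random(0)
    n, s = 16, 3
    # Average-case setting: the support strings are uniformly random.
    support = [[rng.randrange(2) for _ in range(n)] for _ in range(s)]
    weights = [0.5, 0.3, 0.2]
    for _ in range(3):
        print(sample(support, weights, delete_p=0.2, insert_p=0.1, rng=rng))
```

A recovery algorithm in this setting sees only the output of `sample` and must output an estimate of `weights` (together with the support) whose `total_variation` distance from the truth is at most ϵ.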


08/27/2020

Polynomial-time trace reconstruction in the smoothed complexity model

In the trace reconstruction problem, an unknown source string x ∈{0,1}^n...
12/04/2020

Polynomial-time trace reconstruction in the low deletion rate regime

In the trace reconstruction problem, an unknown source string x ∈{0,1}^n...
11/07/2022

Approximate Trace Reconstruction from a Single Trace

The well-known trace reconstruction problem is the problem of inferring ...
11/27/2020

Limitations of Mean-Based Algorithms for Trace Reconstruction at Small Distance

Trace reconstruction considers the task of recovering an unknown string ...
04/11/2019

Beyond trace reconstruction: Population recovery from the deletion channel

Population recovery is the problem of learning an unknown distribution o...
04/21/2019

Trace Reconstruction: Generalized and Parameterized

In the beautifully simple-to-state problem of trace reconstruction, the ...
02/18/2021

Mean-Based Trace Reconstruction over Practically any Replication-Insertion Channel

Mean-based reconstruction is a fundamental, natural approach to worst-ca...