Approximate Trace Reconstruction via Median String (in Average-Case)

07/20/2021
by Diptarka Chakraborty, et al.

We consider an approximate version of the trace reconstruction problem, where the goal is to recover an unknown string s∈{0,1}^n from m traces; each trace is generated independently by passing s through a probabilistic insertion-deletion channel with rate p. We present a deterministic algorithm for the average-case model, where s is random, that uses only three traces. It runs in near-linear time Õ(n) and, with high probability, reports a string within edit distance O(ϵpn) from s for ϵ=Õ(p), which significantly improves over the straightforward bound of O(pn). Technically, our algorithm computes a (1+ϵ)-approximate median of the three input traces. To prove its correctness, our probabilistic analysis shows that an approximate median is indeed close to the unknown s. To achieve the near-linear time bound, we have to bypass the well-known dynamic programming algorithm that computes an optimal median in time O(n^3).
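
To make the objects in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It shows one possible insertion-deletion channel, the standard quadratic-time edit distance, and the median objective (the sum of edit distances from a candidate string to the traces) that a median string minimizes. The function names `noisy_trace`, `edit_distance`, and `median_objective` are hypothetical, and the channel (each bit deleted with probability p, a uniformly random bit inserted before it with probability p) is an assumption; the abstract does not spell out the exact channel.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance via the textbook two-row dynamic program, O(|a|*|b|)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def noisy_trace(s, p, rng):
    """Pass s through an ASSUMED insertion-deletion channel with rate p.

    Before each bit, a uniformly random bit is inserted with probability p;
    the bit itself is then deleted with probability p.
    """
    out = []
    for bit in s:
        if rng.random() < p:
            out.append(rng.choice("01"))  # insertion
        if rng.random() >= p:
            out.append(bit)               # bit survives
    return "".join(out)


def median_objective(candidate, traces):
    """Sum of edit distances from candidate to every trace."""
    return sum(edit_distance(candidate, t) for t in traces)


if __name__ == "__main__":
    rng = random.Random(0)
    s = "".join(rng.choice("01") for _ in range(200))  # average-case: random s
    traces = [noisy_trace(s, 0.05, rng) for _ in range(3)]
    print("objective at s:", median_objective(s, traces))
    print("distance from s to first trace:", edit_distance(s, traces[0]))
```

The sketch only evaluates the objective; it does not search for a median. Finding an optimal median of three strings is the step the abstract says takes O(n^3) time with dynamic programming, and which the paper replaces with a near-linear-time (1+ϵ)-approximation.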
