# Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces

In the standard trace reconstruction problem, the goal is to exactly reconstruct an unknown source string $x \in \{0,1\}^n$ from independent "traces", which are copies of $x$ corrupted by a $\delta$-deletion channel that independently deletes each bit of $x$ with probability $\delta$ and concatenates the surviving bits. We study the approximate trace reconstruction problem, in which the goal is only to obtain a high-accuracy approximation of $x$ rather than an exact reconstruction. We give an efficient algorithm, and a near-matching lower bound, for approximate reconstruction of a random source string $x \in \{0,1\}^n$ from few traces. Our main algorithmic result is a polynomial-time algorithm with the following property: for any deletion rate $0 < \delta < 1$ (which may depend on $n$), for almost every source string $x \in \{0,1\}^n$, given any number $M \le \Theta(1/\delta)$ of traces from $\mathrm{Del}_\delta(x)$, the algorithm constructs a hypothesis string $\widehat{x}$ that has edit distance at most $n \cdot (\delta M)^{\Omega(M)}$ from $x$. We also prove a near-matching information-theoretic lower bound showing that, given $M \le \Theta(1/\delta)$ traces from $\mathrm{Del}_\delta(x)$ for a random $n$-bit string $x$, the smallest possible expected edit distance that any algorithm can achieve, regardless of its running time, is $n \cdot (\delta M)^{O(M)}$.
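To make the setup concrete, the following is a minimal Python sketch of the two objects the abstract refers to: drawing traces from a δ-deletion channel and measuring edit distance between two bit strings. It does not implement the reconstruction algorithm. The function names `deletion_channel` and `edit_distance`, the seed, and the values of `n`, `delta`, and `M` are assumptions chosen only for illustration.

```python
import random


def deletion_channel(x, delta, rng):
    """Return one trace of x: each bit is deleted independently with
    probability delta, and the surviving bits are concatenated in order."""
    return [bit for bit in x if rng.random() >= delta]


def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions),
    computed row by row with the textbook dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


# Illustrative parameters (assumed, not taken from the paper).
rng = random.Random(0)
n, delta, M = 200, 0.05, 4

x = [rng.randint(0, 1) for _ in range(n)]                      # random source string
traces = [deletion_channel(x, delta, rng) for _ in range(M)]   # M independent traces

for t in traces:
    # A single trace is a subsequence of x, so its edit distance to x
    # is exactly the number of deleted bits.
    print(len(t), edit_distance(x, t))
```

Used on its own, a single trace is already within edit distance about δn of the source in expectation. The results above quantify how much better than this baseline a hypothesis built from M traces can be.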

