# Polynomial-time trace reconstruction in the smoothed complexity model

In the trace reconstruction problem, an unknown source string x ∈ {0,1}^n is sent through a probabilistic deletion channel which independently deletes each bit with probability δ and concatenates the surviving bits, yielding a trace of x. The problem is to reconstruct x given independent traces. This problem has received much attention in recent years both in the worst-case setting where x may be an arbitrary string in {0,1}^n <cit.> and in the average-case setting where x is drawn uniformly at random from {0,1}^n <cit.>. This paper studies trace reconstruction in the smoothed analysis setting, in which a “worst-case” string x^worst is chosen arbitrarily from {0,1}^n, and then a perturbed version x of x^worst is formed by independently replacing each coordinate by a uniform random bit with probability σ. The problem is to reconstruct x given independent traces from it. Our main result is an algorithm which, for any constant perturbation rate 0 < σ < 1 and any constant deletion rate 0 < δ < 1, uses poly(n) running time and poly(n) traces and succeeds with high probability in reconstructing the string x. This stands in contrast with the worst-case version of the problem, for which exp(O(n^{1/3})) is the best known time and sample complexity <cit.>. Our approach is based on reconstructing x from the multiset of its short subwords and is quite different from previous algorithms for either the worst-case or average-case versions of the problem. The heart of our work is a new poly(n)-time procedure for reconstructing the multiset of all O(log n)-length subwords of any source string x ∈ {0,1}^n given access to traces of x.
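To make the model concrete, the following is a minimal Python sketch of the three objects the abstract describes: the σ-perturbation of a worst-case string, the deletion channel with deletion rate δ, and the multiset of length-k subwords. It only simulates the problem setup; it is not the paper's reconstruction algorithm. The function names (`perturb`, `deletion_channel`, `subword_multiset`) are hypothetical, and the sketch assumes that "subword" means a contiguous substring.

```python
import random
from collections import Counter


def perturb(x_worst, sigma, rng):
    """Smoothed model: independently replace each coordinate by a
    uniform random bit with probability sigma."""
    return [rng.randrange(2) if rng.random() < sigma else b for b in x_worst]


def deletion_channel(x, delta, rng):
    """Delete each bit independently with probability delta and
    concatenate the survivors, yielding one trace of x."""
    return [b for b in x if rng.random() >= delta]


def subword_multiset(x, k):
    """Multiset of all length-k subwords of x.
    Assumption: a subword is a contiguous substring."""
    return Counter(tuple(x[i:i + k]) for i in range(len(x) - k + 1))


if __name__ == "__main__":
    rng = random.Random(0)      # fixed seed so runs are repeatable
    n, sigma, delta = 32, 0.25, 0.5
    x_worst = [0] * n           # an arbitrary "worst-case" choice
    x = perturb(x_worst, sigma, rng)
    traces = [deletion_channel(x, delta, rng) for _ in range(5)]
    print("perturbed string:", x)
    print("trace lengths:", [len(t) for t in traces])
    print("distinct 3-subwords:", len(subword_multiset(x, 3)))
```

A reconstruction algorithm sees only the output of `deletion_channel`, never `x` itself; the subword multiset is the intermediate object the abstract says is recovered from traces before x is rebuilt.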
