Approximating Edit Distance in the Fully Dynamic Model
The edit distance is a fundamental measure of sequence similarity, defined as the minimum number of character insertions, deletions, and substitutions needed to transform one string into the other. Given two strings of length at most n, simple dynamic programming computes their edit distance exactly in O(n^2) time, which is also the best possible (up to subpolynomial factors) assuming the Strong Exponential Time Hypothesis (SETH). The last few decades have seen tremendous progress in edit distance approximation, where the runtime has been brought down to subquadratic, near-linear, and even sublinear at the cost of approximation. In this paper, we study the dynamic edit distance problem, where the strings change dynamically as the characters are substituted, inserted, or deleted over time. Each change may happen at any location of either of the two strings. The goal is to maintain the (exact or approximate) edit distance of such dynamic strings while minimizing the update time. The exact edit distance can be maintained in Õ(n) time per update (Charalampopoulos, Kociumaka, Mozes; 2020), which is again tight assuming SETH. Unfortunately, even with the unprecedented progress in edit distance approximation in the static setting, strikingly little is known regarding dynamic edit distance approximation. Utilizing the off-the-shelf tools, it is possible to achieve an O(n^c)-approximation in n^0.5-c+o(1) update time for any constant c∈ [0,1/6]. Improving upon this trade-off remains open. The contribution of this work is a dynamic n^o(1)-approximation algorithm with amortized expected update time of n^o(1). In other words, we bring the approximation-ratio and update-time product down to n^o(1). Our solution utilizes an elegant framework of precision sampling tree for edit distance approximation (Andoni, Krauthgamer, Onak; 2010).
READ FULL TEXT