Almost-Optimal Sublinear-Time Edit Distance in the Low Distance Regime

by   Karl Bringmann, et al.

We revisit the task of computing the edit distance in sublinear time. In the (k,K)-gap edit distance problem the task is to distinguish whether the edit distance of two strings is at most k or at least K. It has been established by Goldenberg, Krauthgamer and Saha (FOCS '19), with improvements by Kociumaka and Saha (FOCS '20), that the (k,k^2)-gap problem can be solved in time O(n/k+poly(k)). One of the most natural questions in this line of research is whether the (k,k^2)-gap is best-possible for the running time O(n/k+poly(k)). In this work we answer this question by significantly improving the gap. Specifically, we show that in time O(n/k+poly(k)) we can even solve the (k,k^1+o(1))-gap problem. This is the first algorithm that breaks the (k,k^2)-gap in this running time. Our algorithm is almost optimal in the following sense: In the low distance regime (k≤ n^0.19) our running time becomes O(n/k), which matches a known n/k^1+o(1) lower bound for the (k,k^1+o(1))-gap problem up to lower order factors. Our result also reveals a surprising similarity of Hamming distance and edit distance in the low distance regime: For both, the (k,k^1+o(1))-gap problem has time complexity n/k^1± o(1) for small k. In contrast to previous work, which employed a subsampled variant of the Landau-Vishkin algorithm, we instead build upon the algorithm of Andoni, Krauthgamer and Onak (FOCS '10). We first simplify their approach and then show how to to effectively prune their computation tree in order to obtain a sublinear-time algorithm in the given time bound. Towards that, we use a variety of structural insights on the (local and global) patterns that can emerge during this process and design appropriate property testers to effectively detect these patterns.


Gap Edit Distance via Non-Adaptive Queries: Simple and Optimal

We study the problem of approximating edit distance in sublinear time. T...

Sublinear-Time Algorithms for Computing Embedding Gap Edit Distance

In this paper, we design new sublinear-time algorithms for solving the g...

A Gap-ETH-Tight Approximation Scheme for Euclidean TSP

We revisit the classic task of finding the shortest tour of n points in ...

A Simple Sublinear Algorithm for Gap Edit Distance

We study the problem of estimating the edit distance between two n-chara...

Polyline Simplification has Cubic Complexity

In the classic polyline simplification problem we want to replace a give...

LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain

We study k-SVD that is to obtain the first k singular vectors of a matri...

Approximation Algorithms for LCS and LIS with Truly Improved Running Times

Longest common subsequence (𝖫𝖢𝖲) is a classic and central problem in com...