Optimal Algorithms for Bounded Weighted Edit Distance

05/11/2023
by   Alejandro Cassis, et al.

The edit distance of two strings is the minimum number of insertions, deletions, and substitutions of characters needed to transform one string into the other. The textbook dynamic-programming algorithm computes the edit distance of two length-n strings in O(n^2) time, which is optimal up to subpolynomial factors under SETH. An established way of circumventing this hardness is to consider the bounded setting, where the running time is parameterized by the edit distance k. A celebrated algorithm by Landau and Vishkin (JCSS '88) achieves time O(n + k^2), which is optimal as a function of n and k. Most practical applications rely on a more general weighted edit distance, where each edit has a weight depending on its type and the involved characters from the alphabet Σ. This is formalized through a weight function w : (Σ∪{ε}) × (Σ∪{ε}) → ℝ normalized so that w(a,a) = 0 and w(a,b) ≥ 1 for all a,b ∈ Σ∪{ε} with a ≠ b; the goal is to find an alignment of the two strings minimizing the total weight of edits. The O(n^2)-time algorithm supports this setting seamlessly, but only very recently, Das, Gilbert, Hajiaghayi, Kociumaka, and Saha (STOC '23) gave the first non-trivial algorithm for the bounded version, achieving time O(n + k^5). While this running time is linear for k ≤ n^{1/5}, it is still very far from the bound O(n + k^2) achievable in the unweighted setting. In this paper, we essentially close this gap by showing both an improved Õ(n + √(nk^3))-time algorithm and, more surprisingly, a matching lower bound: Conditioned on the All-Pairs Shortest Paths (APSP) hypothesis, our running time is optimal for √n ≤ k ≤ n (up to subpolynomial factors). This is the first separation between the complexity of the weighted and unweighted edit distance problems.
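As an illustration of the baseline mentioned in the abstract, the textbook O(n^2)-time dynamic program for weighted edit distance can be sketched as follows. This is only a minimal sketch of the quadratic algorithm, not the bounded algorithm of the paper. The names `weighted_edit_distance`, `unit_weight`, and `EPS` are hypothetical, and representing ε by Python's `None` is an assumption made for this example.

```python
from typing import Callable, Optional

# Assumption for this sketch: the empty string ε is represented by None.
EPS = None

Weight = Callable[[Optional[str], Optional[str]], float]


def weighted_edit_distance(x: str, y: str, w: Weight) -> float:
    """Textbook quadratic DP: minimum total weight of an alignment of x and y.

    w(a, b) is the weight of turning a into b, where a or b may be EPS
    (w(a, EPS) is a deletion, w(EPS, b) is an insertion). The sketch assumes
    the normalization from the abstract: w(a, a) = 0 and w(a, b) >= 1 otherwise.
    """
    n, m = len(x), len(y)
    # dp[i][j] = minimum weight of aligning the prefixes x[:i] and y[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + w(x[i - 1], EPS)  # delete x[i-1]
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + w(EPS, y[j - 1])  # insert y[j-1]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j] + w(x[i - 1], EPS),           # deletion
                dp[i][j - 1] + w(EPS, y[j - 1]),           # insertion
                dp[i - 1][j - 1] + w(x[i - 1], y[j - 1]),  # match or substitution
            )
    return dp[n][m]


def unit_weight(a: Optional[str], b: Optional[str]) -> float:
    """Unweighted special case: every edit costs 1, matches cost 0."""
    return 0.0 if a == b else 1.0
```

With `unit_weight`, the function computes the ordinary (unweighted) edit distance; passing a different `w` changes only the costs, not the recurrence, which is why the quadratic algorithm extends to the weighted setting without modification.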


research
02/08/2023

Weighted Edit Distance Computation: Strings, Trees and Dyck

Given two strings of length n over alphabet Σ, and an upper bound k on t...
research
11/13/2022

Bounds and Estimates on the Average Edit Distance

The edit distance is a metric of dissimilarity between strings, widely a...
research
12/06/2021

On Complexity of 1-Center in Various Metrics

We consider the classic 1-center problem: Given a set P of n points in a...
research
05/03/2019

RLE edit distance in near optimal time

We show that the edit distance between two run-length encoded strings of...
research
10/02/2018

Sketching, Streaming, and Fine-Grained Complexity of (Weighted) LCS

We study sketching and streaming algorithms for the Longest Common Subse...
research
10/25/2020

An Improved Sketching Bound for Edit Distance

We provide improved upper bounds for the simultaneous sketching complexi...
research
07/06/2020

Near-Linear Time Edit Distance for Indel Channels

We consider the following model for sampling pairs of strings: s_1 is a ...
