Does Preprocessing help in Fast Sequence Comparisons?

08/20/2021
by Elazar Goldenberg, et al.

We study edit distance computation with preprocessing: the preprocessing algorithm acts on each string separately, and then the query algorithm takes as input the two preprocessed strings. This model is inspired by scenarios where we would like to compute edit distance between many pairs in the same pool of strings. Our results include:

Permutation-LCS: If the LCS between two permutations has length n-k, we can compute it exactly with O(n log(n)) preprocessing and O(k log(n)) query time.

Small edit distance: For general strings, if their edit distance is at most k, we can compute it exactly with O(n log(n)) preprocessing and O(k^2 log(n)) query time.

Approximate edit distance: For the most general input, we can approximate the edit distance to within factor (7+o(1)) with preprocessing time Õ(n^2) and query time Õ(n^(1.5+o(1))).

All of these results significantly improve over the state of the art in edit distance computation without preprocessing. Interestingly, by combining ideas from our algorithms with preprocessing, we provide new improved results for approximating edit distance without preprocessing in subquadratic time.
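For orientation, the two exact problems named in the abstract can be illustrated with textbook baselines. The sketch below is not the paper's method and does not achieve its query bounds. It assumes the standard reduction of permutation LCS to longest increasing subsequence, with an O(n log n) query, and a standard banded dynamic program for edit distance at most k, with O(nk) time and no preprocessing. The function names `preprocess`, `permutation_lcs`, and `bounded_edit_distance` are hypothetical and chosen only for this illustration.

```python
from bisect import bisect_left


def preprocess(perm):
    """Per-string preprocessing (baseline): map each symbol to its index."""
    return {sym: i for i, sym in enumerate(perm)}


def permutation_lcs(pos_a, b):
    """LCS length of two permutations of the same symbols.

    Rewrites b in terms of positions in a, then finds the longest
    strictly increasing subsequence with patience sorting.
    """
    tails = []  # tails[j] = smallest tail of an increasing run of length j + 1
    for sym in b:
        p = pos_a[sym]
        j = bisect_left(tails, p)
        if j == len(tails):
            tails.append(p)
        else:
            tails[j] = p
    return len(tails)


def bounded_edit_distance(x, y, k):
    """Return the edit distance of x and y if it is at most k, else None.

    Banded DP: only cells with |i - j| <= k are filled, since any
    alignment of cost at most k stays inside that band.
    """
    n, m = len(x), len(y)
    if abs(n - m) > k:
        return None
    inf = k + 1  # every value above k is capped to this sentinel
    prev = {j: j for j in range(min(m, k) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - k), min(m, i + k) + 1):
            if j == 0:
                cur[j] = i
                continue
            best = prev.get(j - 1, inf) + (x[i - 1] != y[j - 1])  # match/substitute
            best = min(best, prev.get(j, inf) + 1)     # delete x[i - 1]
            best = min(best, cur.get(j - 1, inf) + 1)  # insert y[j - 1]
            cur[j] = min(best, inf)
        prev = cur
    d = prev.get(m, inf)
    return d if d <= k else None
```

In the preprocessing model, `preprocess` would run once per string in the pool and `permutation_lcs` once per queried pair. The paper's contribution is a query that is much faster than these baselines when k is small.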

research
04/29/2022

Improved Sublinear-Time Edit Distance for Preprocessed Strings

We study the problem of approximating the edit distance of two strings i...
research
03/04/2021

An Almost Optimal Edit Distance Oracle

We consider the problem of preprocessing two strings S and T, of lengths...
research
07/02/2019

Approximate Similarity Search Under Edit Distance Using Locality-Sensitive Hashing

Edit distance similarity search, also called approximate pattern matchin...
research
05/07/2019

Kendall Tau Sequence Distance: Extending Kendall Tau from Ranks to Sequences

An edit distance is a measure of the minimum cost sequence of edit opera...
research
11/13/2022

Bounds and Estimates on the Average Edit Distance

The edit distance is a metric of dissimilarity between strings, widely a...
research
11/10/2018

Efficiently Approximating Edit Distance Between Pseudorandom Strings

We present an algorithm for approximating the edit distance ed(x, y) bet...
research
10/25/2020

An Improved Sketching Bound for Edit Distance

We provide improved upper bounds for the simultaneous sketching complexi...
