The Dynamic k-Mismatch Problem

05/13/2021
by Raphael Clifford, et al.

The text-to-pattern Hamming distances problem asks to compute the Hamming distances between a given pattern of length m and all length-m substrings of a given text of length n ≥ m. We focus on the k-mismatch version of the problem, where a distance needs to be returned only if it does not exceed a threshold k. We assume n ≤ 2m (in general, one can partition the text into overlapping blocks). In this work, we show data structures for the dynamic version of this problem supporting two operations: an update performs a single-letter substitution in the pattern or the text, and a query, given an index i, returns the Hamming distance between the pattern and the text substring starting at position i, or reports that it exceeds k. First, we show a data structure with Õ(1) update and Õ(k) query time. Then we show that Õ(k) update and Õ(1) query time is also possible. These two provide an optimal trade-off for the dynamic k-mismatch problem with k ≤ √n: we prove that, conditioned on the strong 3SUM conjecture, one cannot simultaneously achieve k^(1-Ω(1)) time for all operations. For k ≥ √n, we give another lower bound, conditioned on the Online Matrix-Vector conjecture, that excludes algorithms taking n^(1/2-Ω(1)) time per operation. This is tight for constant-sized alphabets: Clifford et al. (STACS 2018) achieved Õ(√n) time per operation in that case, but with Õ(n^(3/4)) time per operation for large alphabets. We improve and extend this result with an algorithm that, given 1 ≤ x ≤ k, achieves update time Õ(n/k + √(nk/x)) and query time Õ(x). In particular, for k ≥ √n, choosing x = (nk)^(1/3) balances the two bounds and yields Õ((nk)^(1/3)) time per operation, which is Õ(n^(2/3)) when no threshold k is provided (that is, when k = n).
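To make the update/query interface concrete, here is a minimal Python sketch of a naive baseline for the dynamic k-mismatch problem. It is not any of the data structures from the paper: it stores both strings as arrays, so an update takes O(1) time and a query takes O(m) time, stopping early once more than k mismatches are seen. The class name `NaiveDynamicKMismatch` and its method names are hypothetical, chosen only for illustration.

```python
from typing import List, Optional


class NaiveDynamicKMismatch:
    """Naive baseline (an assumption for illustration, not the paper's structure).

    Updates are O(1); a query scans the alignment in O(m) time and gives up
    as soon as the mismatch count exceeds the threshold k.
    """

    def __init__(self, pattern: str, text: str, k: int) -> None:
        if len(text) < len(pattern):
            raise ValueError("text must be at least as long as the pattern")
        self.pattern: List[str] = list(pattern)
        self.text: List[str] = list(text)
        self.k = k

    def update_pattern(self, j: int, ch: str) -> None:
        """Substitute a single letter of the pattern at position j."""
        self.pattern[j] = ch

    def update_text(self, j: int, ch: str) -> None:
        """Substitute a single letter of the text at position j."""
        self.text[j] = ch

    def query(self, i: int) -> Optional[int]:
        """Return the Hamming distance between the pattern and the text
        substring starting at position i, or None if it exceeds k."""
        m = len(self.pattern)
        if not 0 <= i <= len(self.text) - m:
            raise IndexError("alignment out of range")
        mismatches = 0
        for j in range(m):
            if self.text[i + j] != self.pattern[j]:
                mismatches += 1
                if mismatches > self.k:
                    return None  # distance exceeds the threshold k
        return mismatches


ds = NaiveDynamicKMismatch(pattern="abab", text="abbbabab", k=1)
first = ds.query(0)       # "abbb" vs "abab": one mismatch
second = ds.query(1)      # "bbba" vs "abab": more than k mismatches
ds.update_text(2, "a")    # text becomes "abababab"
third = ds.query(0)       # now an exact match
```

The data structures described in the abstract replace this linear scan with Õ(k)-time or Õ(1)-time queries, at the cost of more work per update.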


research
02/09/2020

Approximating Text-to-Pattern Distance via Dimensionality Reduction

Text-to-pattern distance is a fundamental problem in string matching, wh...
research
04/17/2020

Faster Approximate Pattern Matching: A Unified Approach

Approximate pattern matching is a natural and well-studied problem on st...
research
02/19/2018

Upper and lower bounds for dynamic data structures on strings

We consider a range of simply stated dynamic data structure problems on ...
research
09/07/2018

Streaming dictionary matching with mismatches

In the k-mismatch problem we are given a pattern of length m and a text ...
research
07/09/2019

L_p Pattern Matching in a Stream

We consider the problem of computing distance between a pattern of lengt...
research
11/10/2017

Hamming distance completeness and sparse matrix multiplication

We investigate relations between (+,) vector products for binary integer...
research
07/03/2019

Circular Pattern Matching with k Mismatches

The k-mismatch problem consists in computing the Hamming distance betwee...
