Approximating Text-to-Pattern Distance via Dimensionality Reduction
Text-to-pattern distance is a fundamental problem in string matching: given a pattern of length m and a text of length n, both over an integer alphabet, compute the distance between the pattern and the text at every location. The distance function can be, e.g., the Hamming distance or the ℓ_p distance for some parameter p > 0. Almost all state-of-the-art exact and approximate algorithms developed in the past ∼40 years have used the FFT as a black box. In this work we present O(n/ε^2)-time algorithms for (1±ε)-approximation of ℓ_2 distances, and an O(n/ε^3)-time algorithm for approximation of Hamming and ℓ_1 distances, all without the use of FFT. This work is independent of the very recent development by Chan et al. [STOC 2020], where an O(n/ε^2)-time algorithm for Hamming distances not using FFT was presented. Although their algorithm is much more "combinatorial", our techniques apply to norms other than Hamming.
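To make the problem statement concrete, the following is a minimal sketch of a naive O(nm)-time baseline that computes the exact distance at every alignment. It illustrates only the definition of the problem and is not the algorithm presented in this work; the function name `text_to_pattern_distances` and the convention that `p=None` selects Hamming distance are assumptions made for illustration.

```python
def text_to_pattern_distances(text, pattern, p=None):
    """Exact distance between `pattern` and every length-m window of `text`.

    Naive O(n*m) reference implementation (hypothetical helper, not the
    paper's method). With p=None the Hamming distance is used; otherwise
    the l_p distance for the given p > 0.
    """
    n, m = len(text), len(pattern)
    distances = []
    for i in range(n - m + 1):
        window = text[i:i + m]
        if p is None:
            # Hamming distance: number of mismatching positions.
            d = sum(1 for a, b in zip(window, pattern) if a != b)
        else:
            # l_p distance: (sum of |a - b|^p)^(1/p).
            d = sum(abs(a - b) ** p for a, b in zip(window, pattern)) ** (1.0 / p)
        distances.append(d)
    return distances


text = [1, 2, 3, 4, 5]
pattern = [2, 3, 5]
hamming = text_to_pattern_distances(text, pattern)
l1 = text_to_pattern_distances(text, pattern, p=1)
l2 = text_to_pattern_distances(text, pattern, p=2)
```

A (1±ε)-approximation algorithm would be allowed to return, at each location, any value within a factor of (1±ε) of the corresponding entry produced by this baseline.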