Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings
Dynamic Time Warping (DTW) is a widely used similarity measure for comparing strings that encode time series data, with applications to areas including bioinformatics, signature verification, and speech recognition. The standard dynamic-programming algorithm for DTW takes O(n^2) time, and there are conditional lower bounds showing that no algorithm can do substantially better. In many applications, however, the strings x and y may contain long runs of repeated letters, meaning that they can be compressed using run-length encoding. A natural question is whether the DTW-distance between these compressed strings can be computed efficiently in terms of the lengths k and ℓ of the compressed strings. Recent work has shown how to achieve O(kℓ^2 + ℓ k^2) time, leaving open the question of whether a near-quadratic Õ(kℓ)-time algorithm might exist. We show that, if a small approximation loss is permitted, then a near-quadratic time algorithm is indeed possible: our algorithm computes a (1 + ϵ)-approximation for DTW(x, y) in Õ(kℓ / ϵ^3) time, where k and ℓ are the number of runs in x and y. Our algorithm allows for DTW to be computed over any metric space (Σ, δ) in which distances are O(log(n))-bit integers. Surprisingly, the algorithm also works even if δ does not induce a metric space on Σ (e.g., δ need not satisfy the triangle inequality).
READ FULL TEXT