Streaming dictionary matching with mismatches

09/07/2018

∙

In the k-mismatch problem we are given a pattern of length m and a text and must find all locations where the Hamming distance between the pattern and the text is at most k. A series of recent breakthroughs has resulted in an ultra-efficient streaming algorithm for this problem that requires only O(k m/k) space [Clifford, Kociumaka, Porat, 2017]. In this work we consider a strictly harder problem called dictionary matching with k mismatches, where we are given a dictionary of d patterns of lengths < m and must find all their k-mismatch occurrences in the text, and show the first streaming algorithm for it. The algorithm uses O(k d ^k d polylog m) space and processes each position of the text in O(k ^k+1 d polylog m + occ) time, where occ is the number of k-mismatch occurrences of the patterns that end at this position.

READ FULL TEXT

Streaming dictionary matching with mismatches

Sign in with Google

Consider DeepAI Pro