Periodicity in Data Streams with Wildcards
We investigate the problem of detecting periodic trends within a string S of length n, arriving in the streaming model, containing at most k wildcard characters, where k=o(n). A wildcard character is a special character that can be assigned any other character. We say S has wildcard-period p if there exists an assignment to each of the wildcard characters so that in the resulting stream the length n-p prefix equals the length n-p suffix. We present a two-pass streaming algorithm that computes wildcard-periods of S using O(k^3 polylog n) bits of space, while we also show that this problem cannot be solved in sublinear space in one pass. We then give a one-pass randomized streaming algorithm that computes all wildcard-periods p of S with p<n/2 and no wildcard characters appearing in the last p symbols of S, using O(k^3polylog n) space.
READ FULL TEXT