Improved Circular k-Mismatch Sketches

06/24/2020
∙
by   Shay Golan, et al.

The shift distance sh(S_1,S_2) between two strings S_1 and S_2 of the same length is defined as the minimum Hamming distance between S_1 and any rotation (cyclic shift) of S_2. We study the problem of sketching the shift distance, which is the following communication complexity problem: Strings S_1 and S_2 of length n are given to two identical players (encoders), who independently compute sketches (summaries) sk(S_1) and sk(S_2), respectively, so that upon receiving the two sketches, a third player (decoder) is able to compute (or approximate) sh(S_1,S_2) with high probability. This paper primarily focuses on the more general k-mismatch version of the problem, where the decoder is allowed to declare a failure if sh(S_1,S_2)>k, where k is a parameter known to all parties. Andoni et al. (STOC'13) introduced exact circular k-mismatch sketches of size O(k+D(n)), where D(n) is the number of divisors of n. Andoni et al. also showed that their sketch size is optimal in the class of linear homomorphic sketches. We circumvent this lower bound by designing a (non-linear) exact circular k-mismatch sketch of size O(k); this size matches communication-complexity lower bounds. We also design a (1±ε)-approximate circular k-mismatch sketch of size O(min(ε^-2√(k), ε^-1.5√(n))), which improves upon an O(ε^-2√(n))-size sketch of Crouch and McGregor (APPROX'11).
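To make the definition concrete, the following is a minimal brute-force sketch of the shift distance and its k-mismatch variant. It is only an illustration of the quantity being sketched, not the sketching scheme from the paper: it reads both strings in full and takes O(n^2) time. The function names (`hamming`, `shift_distance`, `shift_distance_k`) and the use of `None` to signal a declared failure are assumptions made for this example.

```python
from typing import Optional


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def shift_distance(s1: str, s2: str) -> int:
    """Minimum Hamming distance between s1 and any rotation of s2."""
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    n = len(s1)
    if n == 0:
        return 0
    # Try every cyclic shift of s2 and keep the best match against s1.
    return min(hamming(s1, s2[i:] + s2[:i]) for i in range(n))


def shift_distance_k(s1: str, s2: str, k: int) -> Optional[int]:
    """k-mismatch variant: return the shift distance if it is at most k,
    otherwise None (a hypothetical stand-in for 'declare a failure')."""
    d = shift_distance(s1, s2)
    return d if d <= k else None
```

For example, `shift_distance("abcd", "cdab")` is 0 because "cdab" is a rotation of "abcd", while `shift_distance_k("abcd", "cdax", 0)` reports a failure because the best rotation still has one mismatch.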


research
∙ 07/03/2019

Circular Pattern Matching with k Mismatches

The k-mismatch problem consists in computing the Hamming distance betwee...
research
∙ 08/18/2022

Approximate Circular Pattern Matching

We consider approximate circular pattern matching (CPM, in short) under ...
research
∙ 12/26/2021

The Sketching and Communication Complexity of Subsequence Detection

We study the sketching and communication complexity of deciding whether ...
research
∙ 01/31/2019

Quasi-Linear-Time Algorithm for Longest Common Circular Factor

We introduce the Longest Common Circular Factor (LCCF) problem in which,...
research
∙ 09/02/2020

Circular Trace Reconstruction

Trace Reconstruction is the problem of learning an unknown string x from...
research
∙ 04/27/2020

The Streaming k-Mismatch Problem: Tradeoffs between Space and Total Time

We revisit the k-mismatch problem in the streaming model on a pattern of...
research
∙ 03/05/2021

Compressed Communication Complexity of Hamming Distance

We consider the communication complexity of the Hamming distance of two ...
