A Faster Algorithm for Finding Closest Pairs in Hamming Metric

02/04/2021
by   Andre Esser, et al.

We study the Closest Pair Problem in the Hamming metric, which asks for the pair with the smallest Hamming distance in a collection of binary vectors. We give a new randomized algorithm for the problem on uniformly random input that outperforms previous approaches whenever the dimension of the input points is small compared to the dataset size. For moderate to large dimensions, our algorithm matches the time complexity of the previously best-known algorithms based on locality-sensitive hashing. Technically, our algorithm follows design principles similar to those of Dubiner (IEEE Trans. Inf. Theory 2010) and May-Ozerov (Eurocrypt 2015). Besides improving the time complexity in the small-dimension regime, we significantly simplify the analysis of these previous works. We give a modular analysis, which also allows us to investigate the performance of the algorithm on non-uniform input distributions. Furthermore, we give a proof-of-concept implementation of our algorithm, which performs well in comparison to a quadratic search baseline. This is a first step towards answering an open question raised by May and Ozerov regarding the practicability of algorithms following these design principles.
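
For orientation, the quadratic search baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of exhaustive pairwise comparison, not the authors' implementation or their new algorithm. It assumes each binary vector is packed into a Python integer, and the names `hamming_distance` and `closest_pair_quadratic` are hypothetical.

```python
from itertools import combinations


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which the packed vectors a and b differ."""
    return bin(a ^ b).count("1")


def closest_pair_quadratic(vectors):
    """Return (distance, i, j) for a pair of indices i < j at minimum distance.

    Checks all n*(n-1)/2 pairs, so the running time is quadratic in the
    number of vectors. Ties are resolved in favor of the first pair found.
    """
    if len(vectors) < 2:
        raise ValueError("need at least two vectors")
    best = None
    for (i, a), (j, b) in combinations(enumerate(vectors), 2):
        d = hamming_distance(a, b)
        if best is None or d < best[0]:
            best = (d, i, j)
    return best


if __name__ == "__main__":
    # Assumed toy input: three 4-bit vectors packed into integers.
    print(closest_pair_quadratic([0b0000, 0b1111, 0b0001]))
```

Any sub-quadratic method, including the one studied in the paper, has to return the same minimum distance as this exhaustive search on the same input, which makes the sketch usable as a correctness reference on small instances.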

research
02/25/2018

A New Algorithm for Finding Closest Pair of Vectors

Given n vectors x_0, x_1, ..., x_{n-1} in {0,1}^m, how to find two vectors...
research
10/15/2018

An Illuminating Algorithm for the Light Bulb Problem

The Light Bulb Problem is one of the most basic problems in data analysi...
research
07/31/2020

MSPP: A Highly Efficient and Scalable Algorithm for Mining Similar Pairs of Points

The closest pair of points problem or closest pair problem (CPP) is an i...
research
11/05/2019

On the Quantum Complexity of Closest Pair and Related Problems

The closest pair problem is a fundamental problem of computational geome...
research
10/03/2019

A Fast Exponential Time Algorithm for Max Hamming Distance X3SAT

X3SAT is the problem of whether one can satisfy a given set of clauses w...
research
02/25/2020

2-Dimensional Palindromes with k Mismatches

This paper extends the problem of 2-dimensional palindrome search into t...
research
09/03/2019

A Note on the Probability of Rectangles for Correlated Binary Strings

Consider two sequences of n independent and identically distributed fair...
