
MSPP: A Highly Efficient and Scalable Algorithm for Mining Similar Pairs of Points

by Subrata Saha, et al.
University of Connecticut

The closest pair of points problem, or closest pair problem (CPP), is an important problem in computational geometry: given a set of points in a metric space, find the pair with the smallest distance between them. This problem arises in a number of applications, including clustering, graph partitioning, image processing, pattern identification, and intrusion detection. For example, in air-traffic control, we must monitor aircraft that come too close together, since this may indicate a possible collision. Numerous algorithms have been presented for solving the CPP. The algorithms that are employed in practice have a worst-case quadratic run-time complexity. In this article we present an elegant approximation algorithm for the CPP called MSPP: Mining Similar Pairs of Points. It is faster than the best currently known algorithms while maintaining very good accuracy. The proposed algorithm also detects a set of closely similar pairs of points in Euclidean and Pearson metric spaces and can be adapted to numerous real-world applications, such as clustering, dimension reduction, and constructing and analyzing gene/transcript co-expression networks, among others.
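
To make the problem statement concrete, here is a minimal Python sketch of the exhaustive quadratic baseline for the CPP. It is not the MSPP algorithm, whose details are not given in this abstract. The function names are hypothetical, and the Pearson distance is assumed to be 1 - r (one common convention); the paper may define it differently.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.dist(p, q)


def pearson_distance(p, q):
    """Assumed convention: 1 - r, where r is the Pearson correlation."""
    n = len(p)
    mean_p = sum(p) / n
    mean_q = sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    norm_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    norm_q = math.sqrt(sum((b - mean_q) ** 2 for b in q))
    if norm_p == 0 or norm_q == 0:
        raise ValueError("Pearson correlation is undefined for a constant vector")
    return 1.0 - cov / (norm_p * norm_q)


def closest_pair_bruteforce(points, dist=euclidean):
    """Check every pair of points: O(n^2) distance evaluations.

    Returns (i, j, d), the indices of the closest pair and their distance.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    best = None
    for i, j in combinations(range(len(points)), 2):
        d = dist(points[i], points[j])
        if best is None or d < best[2]:
            best = (i, j, d)
    return best
```

For example, `closest_pair_bruteforce([(0, 0), (5, 5), (1, 0), (10, 10)])` compares all six pairs and selects the points at indices 0 and 2. Passing `dist=pearson_distance` runs the same search under the assumed Pearson distance.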




Algorithms for metric learning via contrastive embeddings

We study the problem of supervised learning a metric space under discrim...

A Faster Algorithm for Finding Closest Pairs in Hamming Metric

We study the Closest Pair Problem in Hamming metric, which asks to find ...

On Closest Pair in Euclidean Metric: Monochromatic is as Hard as Bichromatic

Given a set of n points in R^d, the (monochromatic) Closest Pair proble...

Computing Euclidean k-Center over Sliding Windows

In the Euclidean k-center problem in sliding window model, input points ...

GriT-DBSCAN: A Spatial Clustering Algorithm for Very Large Databases

DBSCAN is a fundamental spatial clustering algorithm with numerous pract...

Towards Optimal Coreset Construction for (k,z)-Clustering: Breaking the Quadratic Dependency on k

Constructing small-sized coresets for various clustering problems has at...