Fast Density-Peaks Clustering: Multicore-based Parallelization Approach

07/11/2022
by Daichi Amagata, et al.

Clustering multi-dimensional points is a fundamental task in many fields, and density-based clustering supports many applications because it can discover clusters of arbitrary shapes. This paper addresses the problem of Density-Peaks Clustering (DPC), a recently proposed density-based clustering framework. Although DPC already has many applications, its straightforward implementation takes time quadratic in the number of points in a given dataset and therefore does not scale to large datasets. To enable DPC on large datasets, we propose efficient algorithms for DPC. Specifically, we propose an exact algorithm, Ex-DPC, and two approximation algorithms, Approx-DPC and S-Approx-DPC. Under a reasonable assumption about a DPC parameter, our algorithms are sub-quadratic, i.e., they break the quadratic barrier. In addition, Approx-DPC does not require any additional parameters and can return the same cluster centers as Ex-DPC, yielding an accurate clustering result. S-Approx-DPC requires an approximation parameter but can further improve computational efficiency. We further show that these algorithms can be accelerated by leveraging multicore processing. We conduct extensive experiments on synthetic and real datasets, and the results demonstrate that our algorithms are efficient, scalable, and accurate.
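To make the quadratic cost concrete, the following is a minimal sketch of a straightforward DPC implementation. It is not Ex-DPC, Approx-DPC, or S-Approx-DPC. It assumes the commonly used DPC formulation, which the abstract does not spell out: the local density of a point is the number of other points within a cutoff distance, and its dependent distance is the distance to the nearest point of higher density. The function name `naive_dpc` and the parameter names `d_cut`, `rho_min`, and `delta_min` are illustrative assumptions.

```python
import math


def naive_dpc(points, d_cut, rho_min, delta_min):
    """Quadratic-time DPC sketch: every point is compared with every other point.

    Assumed (hypothetical) parameters:
      d_cut     - cutoff distance used to count neighbors
      rho_min   - minimum local density for a non-noise point
      delta_min - minimum dependent distance for a cluster center
    Returns (rho, delta, labels); label -1 marks noise.
    """
    n = len(points)
    # All pairwise distances: this is the O(n^2) step.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points strictly within d_cut.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_cut)
           for i in range(n)]

    # Rank points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {i: r for r, i in enumerate(order)}

    # Dependent distance: distance to the nearest higher-ranked point.
    delta = [math.inf] * n
    parent = [-1] * n
    for i in range(n):
        for j in range(n):
            if rank[j] < rank[i] and dist[i][j] < delta[i]:
                delta[i] = dist[i][j]
                parent[i] = j

    # Visit points densest first, so a parent is labeled before its children.
    labels = [-1] * n
    next_label = 0
    for i in order:
        if rho[i] < rho_min:
            continue  # sparse point: treated as noise
        if delta[i] >= delta_min:
            labels[i] = next_label  # density peak: new cluster center
            next_label += 1
        else:
            labels[i] = labels[parent[i]]  # inherit label of denser neighbor
    return rho, delta, labels
```

Both the density computation and the dependent-distance computation scan all pairs of points, which is the quadratic behavior the abstract refers to. For example, calling `naive_dpc` on two well-separated groups of points plus one isolated point, with suitable thresholds, should yield two clusters and mark the isolated point as noise.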


