An Efficient Density-based Clustering Algorithm for Higher-Dimensional Data

01/22/2018
by   Thapana Boonchoo, et al.

DBSCAN is a widely used clustering algorithm because it can find arbitrarily shaped clusters and is robust to outliers. Its worst-case complexity is O(n^2), and in practice the cost grows further as the dimensionality increases. Grid-based DBSCAN is a recent variant that aims to improve efficiency. However, it still suffers from two problems, neighbour explosion and redundancies in merging, which make it infeasible in high-dimensional space. In this paper, we propose a novel algorithm named GDPAM that extends grid-based DBSCAN to higher-dimensional data. In GDPAM, a bitmap index is used to manage non-empty grids so that neighbour grid queries can be performed efficiently. Furthermore, we adopt an efficient union-find algorithm to maintain the clustering information and reduce redundancies in merging. Experimental results on both real-world and synthetic datasets demonstrate that the proposed algorithm outperforms state-of-the-art exact and approximate DBSCAN algorithms and suggest that it scales well.
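To illustrate why a union-find structure can cut redundant work when merging grids, here is a minimal Python sketch. It is not the GDPAM implementation: the abstract does not specify which union-find variant is used, so this assumes a standard disjoint-set forest with path compression and union by rank. The names `UnionFind` and `merge_grids`, and the representation of grids as integer IDs with candidate neighbour pairs, are hypothetical and chosen only for illustration.

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank (assumed variant)."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        # First pass: locate the root of x's set.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets of a and b; return False if they were already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra          # attach the shallower tree under the deeper one
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


def merge_grids(num_grids, neighbour_pairs):
    """Hypothetical helper: merge grids given candidate pairs of grid IDs.

    Returns one cluster label per grid and the number of merges actually
    performed. Pairs whose grids already share a set are skipped.
    """
    uf = UnionFind(num_grids)
    merges = 0
    for a, b in neighbour_pairs:
        if uf.union(a, b):
            merges += 1
    labels = [uf.find(g) for g in range(num_grids)]
    return labels, merges
```

With five grids and candidate pairs `[(0, 1), (1, 2), (0, 2), (3, 4)]`, the pair `(0, 2)` is skipped because grids 0 and 2 are already connected through grid 1. In a real grid-based DBSCAN, each merge check between two grids involves point-level distance computations, so testing set membership first is where the saving would come from.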

research
10/14/2022

GriT-DBSCAN: A Spatial Clustering Algorithm for Very Large Databases

DBSCAN is a fundamental spatial clustering algorithm with numerous pract...
research
11/27/2018

Adaptive Wavelet Clustering for High Noise Data

In this paper we make progress on the unsupervised task of mining arbitr...
research
11/27/2018

Adaptive Wavelet Clustering for Highly Noisy Data

In this paper we make progress on the unsupervised task of mining arbitr...
research
02/14/2023

Multi-Prototypes Convex Merging Based K-Means Clustering Algorithm

K-Means algorithm is a popular clustering method. However, it has two li...
research
04/24/2020

Non-Exhaustive, Overlapping Co-Clustering: An Extended Analysis

The goal of co-clustering is to simultaneously identify a clustering of ...
research
06/13/2023

PaVa: a novel Path-based Valley-seeking clustering algorithm

Clustering methods are being applied to a wider range of scenarios invol...
research
01/17/2022

Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions

We address general-shaped clustering problems under very weak parametric...
