Density-based Clustering with Best-scored Random Forest

06/24/2019
by Hanyuan Hang, et al.

The single-level density-based approach has long been acknowledged as a conceptually and mathematically convincing clustering method. In this paper, we propose an algorithm called "best-scored clustering forest" that can obtain the optimal level and determine the corresponding clusters. The term "best-scored" means that the one random tree with the best empirical performance is selected from a certain number of purely random tree candidates. From a theoretical perspective, we first show that the consistency of our proposed algorithm can be guaranteed. Moreover, under certain mild restrictions on the underlying density functions and target clusters, fast convergence rates can also be achieved. Finally, comparisons with other state-of-the-art clustering methods in numerical experiments demonstrate the accuracy of our algorithm on both synthetic data and several real benchmark data sets.
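The two ideas named in the abstract, "best-scored" selection among purely random candidates and clustering at a single density level, can be illustrated with a minimal one-dimensional sketch. It is not the paper's algorithm: the restriction to 1-D, the histogram-style density estimate, the held-out log-likelihood used as the score, and every function name below are assumptions made for illustration only.

```python
import bisect
import math
import random


def random_partition(rng, low, high, n_splits):
    """Purely random partition of [low, high]: cuts ignore the data."""
    cuts = (rng.uniform(low, high) for _ in range(n_splits))
    # The set removes duplicate edges, so every cell has positive width.
    return sorted({low, high, *cuts})


def cell_index(edges, x):
    """Index of the cell containing x, clamped to the valid range."""
    i = bisect.bisect_right(edges, x) - 1
    return min(max(i, 0), len(edges) - 2)


def histogram_density(edges, xs):
    """Piecewise-constant density estimate on the cells given by edges."""
    counts = [0] * (len(edges) - 1)
    for x in xs:
        counts[cell_index(edges, x)] += 1
    n = len(xs)
    return [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]


def heldout_score(edges, dens, xs_val, eps=1e-12):
    """Mean held-out log-likelihood (an assumed scoring rule)."""
    logs = [math.log(dens[cell_index(edges, x)] + eps) for x in xs_val]
    return sum(logs) / len(logs)


def best_scored_partition(rng, xs_train, xs_val, low, high,
                          n_candidates=20, n_splits=15):
    """Keep the candidate partition with the best held-out score."""
    best = None
    for _ in range(n_candidates):
        edges = random_partition(rng, low, high, n_splits)
        dens = histogram_density(edges, xs_train)
        score = heldout_score(edges, dens, xs_val)
        if best is None or score > best[0]:
            best = (score, edges, dens)
    return best


def level_set_clusters(edges, dens, level):
    """Merge adjacent cells with density >= level into intervals."""
    clusters, start = [], None
    for i, d in enumerate(dens):
        if d >= level and start is None:
            start = edges[i]
        elif d < level and start is not None:
            clusters.append((start, edges[i]))
            start = None
    if start is not None:
        clusters.append((start, edges[-1]))
    return clusters


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic bimodal sample, generated only for this demonstration.
    data = [rng.gauss(-2, 0.5) for _ in range(200)]
    data += [rng.gauss(2, 0.5) for _ in range(200)]
    rng.shuffle(data)
    train, val = data[:300], data[300:]
    score, edges, dens = best_scored_partition(
        rng, train, val, min(data), max(data))
    print(level_set_clusters(edges, dens, level=0.05))
```

Each returned interval is one cluster at the chosen level. The level is fixed by hand here, whereas the abstract states that the proposed method obtains the optimal level itself.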


research
05/27/2019

Best-scored Random Forest Classification

We propose an algorithm named best-scored random forest for binary class...
research
04/29/2019

Clustering Optimization: Finding the Number and Centroids of Clusters by a Fourier-based Algorithm

We propose a Fourier-based approach for optimization of several clusteri...
research
05/09/2019

Best-scored Random Forest Density Estimation

This paper presents a brand new nonparametric density estimation strateg...
research
05/09/2019

Two-stage Best-scored Random Forest for Large-scale Regression

We propose a novel method designed for large-scale regression problems, ...
research
07/15/2015

Unsupervised Decision Forest for Data Clustering and Density Estimation

An algorithm to improve performance parameter for unsupervised decision ...
research
04/20/2021

Bisecting for selecting: using a Laplacian eigenmaps clustering approach to create the new European football Super League

We use European football performance data to select teams to form the pr...
research
09/13/2017

Efficient Computation of Multiple Density-Based Clustering Hierarchies

HDBSCAN*, a state-of-the-art density-based hierarchical clustering metho...
