A Population Background for Nonparametric Density-Based Clustering

08/06/2014
by José E. Chacón, et al.

Despite the popularity of clustering, it is widely recognized that some of its theoretical aspects have received relatively little study. One of the main reasons for this lack of theoretical results is surely that, whereas for other statistical problems (such as regression or classification) the theoretical population goal is clearly defined, for some clustering methodologies it is difficult to specify the population goal that data-based clustering algorithms should try to approach. This paper aims to provide some insight into the theoretical foundations of clustering by focusing on two main objectives: to provide an explicit formulation of the ideal population goal of the modal clustering methodology, which understands clusters as regions of high density; and to present two new loss functions, applicable in fact to any clustering methodology, for evaluating the performance of a data-based clustering algorithm with respect to the ideal population goal. In particular, it is shown that only mild conditions on a sequence of density estimators are needed to ensure that the sequence of modal clusterings they induce is consistent.
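To make the idea of a modal clustering induced by a density estimator concrete, the following is a minimal one-dimensional sketch. It is not the construction used in the paper, whose formal definitions are not given in this abstract; it assumes a Gaussian kernel density estimate and uses the standard mean-shift iteration to send each data point to a nearby mode of that estimate. The function names (`mean_shift_mode`, `modal_clustering`), the bandwidth `h`, and the tolerance `merge_tol` are hypothetical and chosen only for illustration.

```python
import math


def mean_shift_mode(x, data, h, tol=1e-8, max_iter=1000):
    """Run the Gaussian mean-shift iteration from x until the step is below tol.

    Each step moves x to the kernel-weighted average of the data, which
    increases the value of the Gaussian kernel density estimate at x.
    """
    for _ in range(max_iter):
        weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data]
        new_x = sum(w * xi for w, xi in zip(weights, data)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


def modal_clustering(data, h, merge_tol=1e-3):
    """Label each point by the mode its mean-shift path converges to.

    Assumption for this sketch: two limit points closer than merge_tol
    are treated as the same mode.
    """
    modes = []
    labels = []
    for xi in data:
        m = mean_shift_mode(xi, data, h)
        for k, mk in enumerate(modes):
            if abs(m - mk) < merge_tol:
                labels.append(k)
                break
        else:
            # No existing mode is close enough, so register a new one.
            modes.append(m)
            labels.append(len(modes) - 1)
    return labels, modes


if __name__ == "__main__":
    sample = [0.0, 0.1, 0.2, -0.1, 5.0, 5.1, 4.9, 5.2]
    labels, modes = modal_clustering(sample, h=0.5)
    print(labels, modes)
```

In this sketch the clustering depends on the data only through the density estimate, so changing the bandwidth `h` changes the estimate and may change the number of clusters. The consistency statement in the abstract concerns how such induced clusterings behave along a sequence of density estimators.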

