A Fuzzy Clustering Algorithm for the Mode Seeking Framework

06/27/2014
by Thomas Bonis, et al.

In this paper, we propose a new fuzzy clustering algorithm based on the mode-seeking framework. Given a dataset in R^d, we define regions of high density that we call cluster cores. We then consider a random walk on a neighborhood graph built on top of the data points, designed to be attracted by high-density regions. The strength of this attraction is controlled by a temperature parameter β > 0. The membership of a point in a given cluster is then the probability that the random walk started at that point hits the corresponding cluster core before any other. While many properties of random walks (such as hitting times and commute distances) have been shown to eventually encode purely local information as the number of data points grows, we show that the regularization introduced by the use of cluster cores solves this issue. Empirically, we show how the choice of β influences the behavior of our algorithm: for small values of β the result is close to hard mode-seeking, whereas when β is close to 1 the result is similar to the output of a (fuzzy) spectral clustering. Finally, we demonstrate the scalability of our approach by providing the fuzzy clustering of a protein configuration dataset containing a million data points in 30 dimensions.
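
The membership definition above (probability of hitting one cluster core before any other) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a row-stochastic transition matrix `P` is already given and does not reproduce the density-biased walk or the role of β. The function name `hitting_memberships` and its arguments are hypothetical. It uses the standard absorbing-Markov-chain fact that first-hit probabilities solve a linear system on the non-core nodes.

```python
import numpy as np

def hitting_memberships(P, cores):
    """Fuzzy memberships as first-hit probabilities (illustrative sketch).

    P     : (n, n) row-stochastic transition matrix of a random walk
            (assumed given; the paper's density-biased walk is not built here).
    cores : list of disjoint index lists, one per cluster core.
    Returns an (n, k) array M where M[i, c] is the probability that the
    walk started at node i reaches core c before any other core.
    """
    n = P.shape[0]
    k = len(cores)
    M = np.zeros((n, k))
    in_core = np.zeros(n, dtype=bool)
    for c, idx in enumerate(cores):
        M[idx, c] = 1.0          # a core node belongs fully to its own cluster
        in_core[idx] = True
    free = ~in_core
    # On non-core nodes, M is harmonic: M_f = P_ff M_f + P_fc M_c,
    # i.e. (I - P_ff) M_f = P_fc M_c.
    A = np.eye(free.sum()) - P[np.ix_(free, free)]
    B = P[np.ix_(free, in_core)] @ M[in_core]
    M[free] = np.linalg.solve(A, B)
    return M

# Toy usage: simple random walk on a path of 5 nodes, cores at both ends.
P_demo = np.zeros((5, 5))
for i in range(5):
    nbrs = [j for j in (i - 1, i + 1) if 0 <= j < 5]
    P_demo[i, nbrs] = 1.0 / len(nbrs)
memberships = hitting_memberships(P_demo, cores=[[0], [4]])
```

In this toy case the middle node is split evenly between the two cores, and nodes closer to an end lean toward that end. The dense solve is only suitable for small graphs; a dataset at the scale mentioned in the abstract would need a sparse or iterative solver.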

research
05/03/2015

Risk Bounds For Mode Clustering

Density mode clustering is a nonparametric clustering method. The cluste...
research
02/10/2011

How the result of graph clustering methods depends on the construction of the graph

We study the scenario of graph-based clustering algorithms such as spect...
research
03/25/2023

Hybrid Fuzzy-Crisp Clustering Algorithm: Theory and Experiments

With the membership function being strictly positive, the conventional f...
research
09/25/2021

Random Walk-steered Majority Undersampling

In this work, we propose Random Walk-steered Majority Undersampling (RWM...
research
12/14/2013

Clustering using Vector Membership: An Extension of the Fuzzy C-Means Algorithm

Clustering is an important facet of explorative data mining and finds ex...
research
04/19/2023

Community Detection Using Revised Medoid-Shift Based on KNN

Community detection becomes an important problem with the booming of soc...
research
04/01/2021

MeanShift++: Extremely Fast Mode-Seeking With Applications to Segmentation and Object Tracking

MeanShift is a popular mode-seeking clustering algorithm used in a wide ...
