Scalable Laplacian K-modes

10/31/2018
by Imtiaz Masud Ziko, et al.

We advocate Laplacian K-modes for joint clustering and density mode finding, and propose a concave-convex relaxation of the problem, which yields a parallel algorithm that scales to large datasets and high dimensions. We optimize a tight bound (auxiliary function) of our relaxation, which, at each iteration, amounts to computing an independent update for each cluster-assignment variable, with guaranteed convergence. Therefore, our bound optimizer can be trivially distributed for large-scale datasets. Furthermore, we show that the density modes can be obtained as byproducts of the assignment variables via simple maximum-value operations whose additional computational cost is linear in the number of data points. Our formulation requires neither storing a full affinity matrix nor computing its eigenvalue decomposition, and it avoids expensive projection steps and Lagrangian-dual inner iterates for the simplex constraints of each point. Moreover, unlike mean-shift, our density-mode estimation does not require inner-loop gradient-ascent iterates. It has a complexity independent of the feature-space dimension, yields modes that are valid data points in the input set, and is applicable to discrete domains as well as arbitrary kernels. We report comprehensive experiments over various datasets, which show that our algorithm yields very competitive performance in terms of optimization quality (i.e., the value of the discrete-variable objective at convergence) and clustering accuracy.
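
The two ingredients the abstract describes, an independent update per assignment variable and modes read off the assignments by a maximum-value operation, might be sketched as follows. The abstract does not give the update formulas, so everything here is an assumption made for illustration: the softmax form of the per-point update, the Gaussian kernel, the symmetric k-nearest-neighbor affinity, and all function and parameter names (`lam`, `sigma`, `k`, `init_idx`). It is not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise Gaussian kernel values between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def knn_affinity(X, k=5):
    # Symmetric 0/1 k-nearest-neighbor affinity (assumed choice).
    # Stored densely here only for brevity; a scalable version would
    # keep it sparse and avoid the full N x N distance matrix.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)              # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((len(X), len(X)))
    W[np.repeat(np.arange(len(X)), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)

def softmax_rows(S):
    # Row-wise softmax: each row lands on the probability simplex
    # without any projection step.
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def modes_from_assignments(X, Z):
    # For each cluster, pick the data point with the largest assignment
    # value: one max over N values per cluster, no gradient ascent.
    return X[np.argmax(Z, axis=0)]

def laplacian_kmodes_sketch(X, init_idx, lam=1.0, sigma=1.0, k=5, iters=10):
    W = knn_affinity(X, k)
    modes = X[np.asarray(init_idx)]           # initial modes are data points
    Z = softmax_rows(gaussian_kernel(X, modes, sigma))
    for _ in range(iters):
        A = gaussian_kernel(X, modes, sigma)  # point-to-mode affinities
        # Each row of Z is updated from the previous Z only, so the
        # rows could be computed in parallel.
        Z = softmax_rows(A + lam * (W @ Z))
        modes = modes_from_assignments(X, Z)
    return Z, modes
```

In this sketch the returned modes are always rows of `X`, and the mode step touches only the assignment matrix `Z`, so its cost does not depend on the feature dimension. Hard cluster labels can be read off with `Z.argmax(axis=1)`.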

