Generalizing Lloyd's algorithm for graph clustering

03/03/2023
by Tareq Zaman, et al.

Clustering is a commonplace problem in many areas of data science, with applications in biology and bioinformatics, understanding chemical structure, image segmentation, building recommender systems, and many other fields. While there are many different clustering variants (based on a given distance or graph structure, probability distributions, or data density), we consider here the problem of clustering nodes in a graph, motivated by the problem of aggregating discrete degrees of freedom in multigrid and domain decomposition methods for solving sparse linear systems. Specifically, we consider the challenge of forming balanced clusters in the graph of a sparse matrix for use in algebraic multigrid, although the algorithm has general applicability. Based on an extension of the Bellman-Ford algorithm, we generalize Lloyd's algorithm for partitioning subsets of R^n to balance the number of nodes in each cluster; this is accompanied by a rebalancing algorithm that reduces the overall energy in the system. The algorithm provides control over the number of clusters and leads to "well centered" partitions of the graph. Theoretical results establish linear complexity, and numerical results in the context of algebraic multigrid highlight the benefits of improved clustering.
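To make the starting point concrete, the following is a minimal sketch of a plain (unbalanced) Lloyd-style iteration on a weighted graph, with a multi-source Bellman-Ford pass playing the role of the nearest-center assignment. It is not the authors' balanced algorithm or their rebalancing step. The function names, the adjacency-list format, and the recentering rule (choose the cluster node that minimizes the sum of squared in-cluster shortest-path distances) are assumptions made for illustration only.

```python
import math

def bellman_ford_multi(adj, centers):
    """Multi-source Bellman-Ford: shortest distance and nearest center per node.

    adj maps each node to a list of (neighbor, positive_weight) pairs.
    """
    dist = {v: math.inf for v in adj}
    owner = {v: None for v in adj}
    for c in centers:
        dist[c] = 0.0
        owner[c] = c
    changed = True
    while changed:  # repeat relaxation sweeps until nothing improves
        changed = False
        for u in adj:
            for v, w in adj[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
                    owner[v] = owner[u]
                    changed = True
    return dist, owner

def recenter(adj, members):
    """Illustrative recentering rule (an assumption, not from the source):
    pick the member minimizing the sum of squared in-cluster distances."""
    member_set = set(members)
    sub = {u: [(v, w) for v, w in adj[u] if v in member_set] for u in members}
    best, best_cost = None, math.inf
    for c in sorted(members):
        d, _ = bellman_ford_multi(sub, [c])
        cost = sum(x * x for x in d.values())
        if cost < best_cost:
            best, best_cost = c, cost
    return best

def lloyd_graph(adj, centers, max_iter=20):
    """Alternate assignment and recentering until the centers stop moving."""
    centers = list(centers)
    for _ in range(max_iter):
        _, owner = bellman_ford_multi(adj, centers)
        clusters = {c: [v for v in adj if owner[v] == c] for c in centers}
        new_centers = [recenter(adj, clusters[c]) for c in centers]
        if new_centers == centers:
            break
        centers = new_centers
    _, owner = bellman_ford_multi(adj, centers)
    return centers, owner

# Path graph 0-1-2-3-4-5 with unit edge weights.
path = {i: [(j, 1.0) for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
centers, owner = lloyd_graph(path, [0, 1])
print(centers, owner)
```

On this small path graph the sketch stops with one cluster of two nodes and one of four. Uneven cluster sizes of this kind are what the balancing described in the abstract is meant to address.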


