Capacity Releasing Diffusion for Speed and Locality

06/19/2017
by Di Wang, et al.

Diffusions and related random walk procedures are of central importance in many areas of machine learning, data analysis, and applied mathematics. Because they spread mass agnostically at each iteration, they can sometimes spread mass "too aggressively," thereby failing to find the "right" clusters. We introduce a novel Capacity Releasing Diffusion (CRD) Process, which is both faster and more local than the classical spectral diffusion process. As an application, we use our CRD Process to develop an improved local algorithm for graph clustering. Our local graph clustering method can find local clusters in a model of clustering where one begins the CRD Process in a cluster whose vertices are connected better internally than externally by an O(log^2 n) factor, where n is the number of nodes in the cluster. Thus, our CRD Process is the first local graph clustering algorithm that is not subject to the well-known quadratic Cheeger barrier. Our result requires a certain smoothness condition, which we expect to be an artifact of our analysis. Our empirical evaluation demonstrates improved results, in particular for realistic social graphs where there are moderately good (but not very good) clusters.
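
To make concrete what "spreading mass agnostically" means for a classical diffusion, the following is a minimal sketch of a lazy random walk on a toy graph. Everything specific here is an illustrative assumption rather than something taken from the paper: the graph (two 5-cliques joined by a single bridge edge), the function names, and the 20-step horizon. It shows only the classical baseline that the abstract contrasts against, not the CRD Process itself.

```python
from fractions import Fraction


def two_cliques(k):
    """Adjacency sets for two k-cliques joined by one bridge edge (toy graph)."""
    adj = {v: set() for v in range(2 * k)}
    for base in (0, k):
        for i in range(base, base + k):
            for j in range(base, base + k):
                if i != j:
                    adj[i].add(j)
    # Bridge between the last node of the first clique and the first of the second.
    adj[k - 1].add(k)
    adj[k].add(k - 1)
    return adj


def conductance(adj, cluster):
    """Cut edges divided by the smaller of the two side volumes (degree sums)."""
    cluster = set(cluster)
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol_in = sum(len(adj[u]) for u in cluster)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    return Fraction(cut, min(vol_in, vol_out))


def lazy_walk_step(adj, mass):
    """One lazy random-walk step: keep half the mass, split half evenly over neighbors."""
    new = {v: Fraction(0) for v in adj}
    for u, m in mass.items():
        new[u] += m / 2
        share = m / (2 * len(adj[u]))
        for v in adj[u]:
            new[v] += share
    return new


adj = two_cliques(5)
cluster = set(range(5))           # the seed's "right" cluster
phi = conductance(adj, cluster)

mass = {v: Fraction(0) for v in adj}
mass[0] = Fraction(1)             # all mass starts on one seed node
for _ in range(20):               # 20 steps is an arbitrary choice
    mass = lazy_walk_step(adj, mass)

leaked = sum(mass[v] for v in adj if v not in cluster)
print("conductance of seed cluster:", phi)
print("mass outside the cluster after 20 steps:", float(leaked))
```

Every step pushes mass across every edge in proportion to degree, including the bridge, so some mass leaves the seed cluster regardless of how well connected the cluster is internally. Exact rational arithmetic (`Fraction`) is used only so that mass conservation can be checked without floating-point error.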


research
04/30/2013

Local Graph Clustering Beyond Cheeger's Inequality

Motivated by applications of large-scale graph clustering, we study rand...
research
03/11/2019

Diffusion K-means clustering on manifolds: provable exact recovery via semidefinite relaxations

We introduce the diffusion K-means clustering method on Riemannian subm...
research
10/22/2020

Cluster-and-Conquer: When Randomness Meets Graph Locality

K-Nearest-Neighbors (KNN) graphs are central to many emblematic data min...
research
12/27/2019

Evolutionary Clustering via Message Passing

We are often interested in clustering objects that evolve over time and ...
research
03/11/2023

Distributed Solution of the Inverse Rig Problem in Blendshape Facial Animation

The problem of rig inversion is central in facial animation as it allows...
research
05/20/2020

p-Norm Flow Diffusion for Local Graph Clustering

Local graph clustering and the closely related seed set expansion proble...
research
10/15/2018

Learning by Unsupervised Nonlinear Diffusion

This paper proposes and analyzes a novel clustering algorithm that combi...
