Flow-Based Local Graph Clustering with Better Seed Set Inclusion

11/29/2018
by Nate Veldt, et al.

Flow-based methods for local graph clustering have received significant recent attention for their theoretical cut improvement and runtime guarantees. In this work, we present two improvements for using flow-based methods in real-world semi-supervised clustering problems. Our first contribution is a generalized objective function that allows practitioners to place strict and soft penalties on excluding specific seed nodes from the output set. This feature avoids the tendency, often exhibited by previous flow-based methods, to contract a large seed set into a small set of nodes that does not contain all or even most of the seed nodes. Our second contribution is a fast algorithm for minimizing our generalized objective function, based on a variant of the push-relabel algorithm for computing preflows. We make our approach fast in practice by implementing a global relabeling heuristic and employing a warm-start procedure to quickly solve related cut problems. Empirically, our algorithm is faster than previous related flow-based methods and is more robust in detecting ground truth target regions in a graph, thanks to its ability to better incorporate semi-supervised information about target clusters.
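
To make the idea of strict and soft seed penalties concrete, the following is a minimal sketch of a seeded minimum s-t cut on a toy graph. It is a simplified illustration only: it does not reproduce the paper's generalized objective, and it uses a plain augmenting-path (Edmonds-Karp) max-flow rather than the push-relabel variant described above. The function names, the locality parameter `alpha`, the node labels `"s"` and `"t"`, and the toy graph are all assumptions made for this example. Each seed is attached to the source with capacity equal to its exclusion penalty (infinite for a strict seed), and each non-seed node is attached to the sink with capacity `alpha` times its degree.

```python
from collections import defaultdict, deque

def min_cut_source_side(capacity, source, sink):
    """Run Edmonds-Karp max-flow and return the nodes on the source side of a minimum cut."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0.0  # make sure the reverse arc exists

    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return set(parent) - {source}
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u

def seeded_cut(edges, seed_penalty, alpha):
    """Hypothetical helper: build the s-t graph and return the output cluster.

    seed_penalty maps each seed node to the cost of excluding it
    (float('inf') makes the seed strict). alpha is an assumed locality
    parameter that charges non-seed nodes for joining the cluster.
    """
    cap = defaultdict(dict)
    degree = defaultdict(float)
    for u, v, w in edges:
        cap[u][v] = w
        cap[v][u] = w
        degree[u] += w
        degree[v] += w
    for node in degree:
        if node in seed_penalty:
            cap["s"][node] = seed_penalty[node]
        else:
            cap[node]["t"] = alpha * degree[node]
    return min_cut_source_side(cap, "s", "t")

# Toy graph (an assumption): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
EDGES = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
         (2, 3, 1.0)]

if __name__ == "__main__":
    inf = float("inf")
    weak = {0: inf, 1: 0.5, 2: 0.5, 3: 0.5}    # small soft penalties
    strong = {0: inf, 1: 4.0, 2: 4.0, 3: 4.0}  # large soft penalties
    print(sorted(seeded_cut(EDGES, weak, alpha=0.75)))
    print(sorted(seeded_cut(EDGES, strong, alpha=0.75)))
```

In this toy setting, seed node 3 sits across the bridge from the other seeds. With small soft penalties the cut drops it and returns only the first triangle, while larger penalties keep it in the output set, which mirrors the seed-inclusion behavior the abstract describes.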

