Mixed Membership Graph Clustering via Systematic Edge Query

by Shahana Ibrahim, et al.

This work considers clustering the nodes of a largely incomplete graph. In this problem setting, only a small number of edge queries can be made, and the entire graph is never observed. The problem finds applications in large-scale data clustering using limited annotations, community detection under restricted survey resources, and graph topology inference under hidden or removed node interactions. Prior works treated this problem as a convex optimization-based matrix completion task. However, this line of work is designed to learn a single cluster membership for each node under disjoint clusters, whereas nodes with mixed (multiple) cluster memberships and overlapping clusters often arise in practice. Existing works also rely on uniformly random edge query patterns and nuclear norm-based optimization, which give rise to a number of implementation and scalability challenges. This work aims at learning the mixed memberships of nodes in overlapping clusters using edge queries. Our method offers membership learning guarantees under systematic query patterns (as opposed to random ones). The query patterns can be controlled and adjusted by system designers to accommodate implementation challenges, e.g., to avoid querying edges that are physically hard to acquire. Our framework also features a lightweight and scalable algorithm. Real-data experiments on crowdclustering and community detection showcase the effectiveness of our method.
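To make the contrast between systematic and uniformly random edge queries concrete, the following is a minimal sketch. The abstract does not specify which systematic patterns the method uses, so the block-structured pattern here (all pairs inside each node group, plus pairs between consecutive groups) is an assumption chosen only for illustration. The function names `block_query_mask` and `random_query_mask` are hypothetical and do not come from the paper.

```python
import numpy as np

def block_query_mask(n_nodes, n_blocks):
    """Systematic pattern (illustrative assumption, not the paper's design):
    query all pairs inside each node group and between consecutive groups."""
    # Hypothetical partition of the nodes into contiguous groups.
    groups = np.array_split(np.arange(n_nodes), n_blocks)
    mask = np.zeros((n_nodes, n_nodes), dtype=bool)
    for b, rows in enumerate(groups):
        # Pairs within the same group.
        mask[np.ix_(rows, rows)] = True
        # Pairs between this group and the next one.
        if b + 1 < n_blocks:
            cols = groups[b + 1]
            mask[np.ix_(rows, cols)] = True
            mask[np.ix_(cols, rows)] = True
    # A node is never queried against itself.
    np.fill_diagonal(mask, False)
    return mask

def random_query_mask(n_nodes, budget, seed=0):
    """Baseline pattern: `budget` distinct node pairs drawn uniformly at random."""
    rng = np.random.default_rng(seed)
    iu, ju = np.triu_indices(n_nodes, k=1)
    pick = rng.choice(iu.size, size=budget, replace=False)
    mask = np.zeros((n_nodes, n_nodes), dtype=bool)
    mask[iu[pick], ju[pick]] = True
    # Mirror the upper triangle so the mask is symmetric.
    return mask | mask.T

systematic = block_query_mask(n_nodes=12, n_blocks=4)
n_queries = int(systematic.sum()) // 2  # each undirected pair is counted twice
baseline = random_query_mask(n_nodes=12, budget=n_queries)
```

Both masks spend the same query budget. The difference is that the designer decides exactly which pairs the systematic mask covers, so pairs that are hard to acquire can be left out by construction, whereas the random mask may land on any pair.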


