Hypergraph clustering with categorical edge labels

10/22/2019
by Ilya Amburg, et al.

Graphs and networks are a standard model for describing data or systems based on pairwise interactions. Often, the underlying relationships involve more than two entities at a time, and hypergraphs are a more faithful model. However, there are fewer rigorous methods for extracting insight from such representations. Here, we develop a computational framework for the problem of clustering hypergraphs with categorical edge labels (that is, different interaction types), where clusters correspond to groups of nodes that frequently participate in the same type of interaction. Our methodology is based on a combinatorial objective function that is related to correlation clustering but enables the design of much more efficient algorithms. When there are only two label types, our objective can be optimized in polynomial time using an algorithm based on minimum cuts. Minimizing our objective becomes NP-hard with more than two label types, but we develop fast approximation algorithms, based on linear programming relaxations, that have theoretical guarantees on cluster quality. We demonstrate the efficacy of our algorithms and the scope of the model through problems in edge-label community detection, clustering with temporal data, and exploratory data analysis.
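The abstract does not spell out the objective, so the following is only a minimal sketch under a stated assumption: each node receives exactly one label, and a hyperedge counts as a "mistake" unless every node it contains is assigned that hyperedge's own label. The function names (`count_mistakes`, `majority_vote`, `brute_force`) and the toy data are hypothetical and purely illustrative. The majority-vote heuristic and the exhaustive search are simple stand-ins, not the minimum-cut or linear-programming algorithms the abstract refers to.

```python
from collections import Counter
from itertools import product


def count_mistakes(edges, assignment):
    """Count hyperedges whose nodes are not all assigned the edge's label.

    edges: list of (nodes, label) pairs, where nodes is a tuple of node ids.
    assignment: dict mapping each node id to a label.
    This mistake count is an ASSUMED form of the objective.
    """
    return sum(
        1
        for nodes, label in edges
        if any(assignment[v] != label for v in nodes)
    )


def majority_vote(edges):
    """Baseline heuristic: each node takes its most frequent incident label."""
    votes = {}
    for nodes, label in edges:
        for v in nodes:
            votes.setdefault(v, Counter())[label] += 1
    return {v: c.most_common(1)[0][0] for v, c in votes.items()}


def brute_force(edges, labels):
    """Exhaustive search over all labelings; only feasible for tiny inputs."""
    nodes = sorted({v for ns, _ in edges for v in ns})
    best_cost, best_assignment = None, None
    for combo in product(labels, repeat=len(nodes)):
        assignment = dict(zip(nodes, combo))
        cost = count_mistakes(edges, assignment)
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment


# Toy hypergraph with two interaction types (hypothetical data).
edges = [
    (("a", "b", "c"), "red"),
    (("a", "b"), "red"),
    (("c", "d"), "blue"),
    (("d", "e", "f"), "blue"),
    (("b", "d"), "red"),
]

heuristic = majority_vote(edges)
optimal_cost, optimal = brute_force(edges, ["red", "blue"])
print("majority-vote mistakes:", count_mistakes(edges, heuristic))
print("optimal mistakes:", optimal_cost)
```

Exhaustive search takes time exponential in the number of nodes, which is why the polynomial-time and approximation algorithms described in the abstract matter for anything beyond toy instances.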


