A PAC-Bayesian Analysis of Graph Clustering and Pairwise Clustering

09/02/2010
by Yevgeny Seldin, et al.

We formulate weighted graph clustering as a prediction problem: given a subset of edge weights, we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering, as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generalization bound for graph clustering. The bound shows that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations has already been shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization guarantees, and offering a more accurate way to deal with finite sample issues. We derive a bound minimization algorithm and show that it provides good results in real-life problems and that the derived PAC-Bayesian bound is reasonably tight.
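To make the trade-off concrete, the sketch below scores a soft clustering by an empirical fit term plus a weighted mutual-information term. It is an illustration only, not the bound or the algorithm from the paper: the squared-error loss, the uniform distribution over nodes, the trade-off weight `beta`, and all function names are assumptions introduced here.

```python
import numpy as np

def mutual_information(q):
    """I(X;C) in nats for soft assignments q[x, c] = q(c|x).

    Assumes a uniform distribution over the n nodes (an assumption of this sketch).
    """
    n = q.shape[0]
    marginal = q.mean(axis=0)  # q(c) = (1/n) * sum_x q(c|x)
    # Entries with q(c|x) = 0 contribute 0 by the usual 0*log(0) convention.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(q > 0, np.log(q / marginal), 0.0)
    return float((q * log_ratio).sum() / n)

def empirical_loss(q, cluster_weights, edges):
    """Mean squared error of predicting observed edge weights from clusters.

    edges is an array of rows (i, j, w); the weight of edge (i, j) is predicted
    as sum_{a,b} q(a|i) * cluster_weights[a, b] * q(b|j).
    Squared error is a hypothetical choice of loss for this illustration.
    """
    i = edges[:, 0].astype(int)
    j = edges[:, 1].astype(int)
    w = edges[:, 2]
    pred = np.einsum("ea,ab,eb->e", q[i], cluster_weights, q[j])
    return float(np.mean((pred - w) ** 2))

def tradeoff_objective(q, cluster_weights, edges, beta):
    """Empirical fit plus beta times the information clusters keep about nodes."""
    return empirical_loss(q, cluster_weights, edges) + beta * mutual_information(q)

# Four nodes in two hard clusters: {0, 1} and {2, 3}.
q = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
cluster_weights = np.array([[1.0, 0.0], [0.0, 1.0]])  # heavy within, light across
edges = np.array([[0, 1, 1.0], [0, 2, 0.0], [2, 3, 1.0]])  # observed (i, j, weight)
score = tradeoff_objective(q, cluster_weights, edges, beta=0.5)
```

In this toy case the clustering fits the observed edges exactly, so the score consists only of the information term. Merging all nodes into one cluster would drive the information term to zero at the cost of a worse fit, which is the tension the abstract describes.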

