Spectral clustering in the Gaussian mixture block model

04/29/2023
by Shuangping Li, et al.

Gaussian mixture block models are distributions over graphs that strive to model modern networks: to generate a graph from such a model, we associate each vertex i with a latent feature vector u_i ∈ ℝ^d sampled from a mixture of Gaussians, and we add edge (i,j) if and only if the feature vectors are sufficiently similar, in that ⟨u_i, u_j⟩ ≥ τ for a pre-specified threshold τ. The different components of the Gaussian mixture represent the fact that there may be different types of nodes with different distributions over features; for example, in a social network each component represents the attributes of a distinct community. Natural algorithmic tasks associated with these networks are embedding (recovering the latent feature vectors) and clustering (grouping nodes by their mixture component). In this paper we initiate the study of clustering and embedding graphs sampled from high-dimensional Gaussian mixture block models, where the dimension of the latent feature vectors d → ∞ as the size of the network n → ∞. This high-dimensional setting is most appropriate in the context of modern networks, in which we think of the latent feature space as being high-dimensional. We analyze the performance of canonical spectral clustering and embedding algorithms for such graphs in the case of 2-component spherical Gaussian mixtures, and begin to sketch out the information-computation landscape for clustering and embedding in these models.
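The generative process and a spectral clustering step can be illustrated with a minimal sketch in Python with NumPy. Several details here are assumptions made only for illustration and are not taken from the abstract: the two component means are placed at +mu and -mu, the spherical covariance is scaled as I_d/d, components are chosen with equal probability, and clustering is done by the sign of the leading eigenvector of the mean-centered adjacency matrix. The function names `sample_gmbm` and `spectral_labels` are hypothetical, and the paper's exact algorithm and parameter regime may differ.

```python
import numpy as np


def sample_gmbm(n, d, mu, tau, rng):
    """Sample a graph from a 2-component spherical Gaussian mixture block model.

    Assumptions (illustrative only): means are +mu and -mu, each component has
    covariance I_d / d, and each vertex picks a component with probability 1/2.
    """
    labels = rng.integers(0, 2, size=n)            # hidden component of each vertex
    signs = 2 * labels - 1                         # map {0, 1} -> {-1, +1}
    u = rng.standard_normal((n, d)) / np.sqrt(d)   # spherical Gaussian noise
    u = u + signs[:, None] * mu[None, :]           # shift by the component mean
    gram = u @ u.T                                 # all inner products <u_i, u_j>
    adj = (gram >= tau).astype(float)              # edge iff <u_i, u_j> >= tau
    np.fill_diagonal(adj, 0.0)                     # no self-loops
    return adj, labels, u


def spectral_labels(adj):
    """Cluster vertices by the sign of the leading eigenvector.

    The adjacency matrix is centered by its average entry so that the
    leading eigenvector reflects community structure rather than density.
    """
    centered = adj - adj.mean()
    eigvals, eigvecs = np.linalg.eigh(centered)    # symmetric eigendecomposition
    top = eigvecs[:, np.argmax(np.abs(eigvals))]   # largest-magnitude eigenvalue
    return (top >= 0).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 20
    mu = np.zeros(d)
    mu[0] = 1.0                                    # well-separated means (assumed)
    adj, labels, _ = sample_gmbm(n=300, d=d, mu=mu, tau=0.0, rng=rng)
    pred = spectral_labels(adj)
    # Cluster labels are only identifiable up to swapping the two names.
    accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
    print(f"fraction of vertices clustered correctly: {accuracy:.3f}")
```

The demo uses a separation large enough that same-component pairs almost always clear the threshold while cross-component pairs almost never do, so it shows only the easy regime and says nothing about the harder regimes the paper studies.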


research
11/01/2019

Optimality of Spectral Clustering for Gaussian Mixture Model

Spectral clustering is one of the most popular algorithms to group high ...
research
10/12/2019

Spectral clustering in the weighted stochastic block model

This paper is concerned with the statistical analysis of a real-valued s...
research
12/07/2020

Spectral clustering via adaptive layer aggregation for multi-layer networks

One of the fundamental problems in network analysis is detecting communi...
research
05/09/2017

Semiparametric spectral modeling of the Drosophila connectome

We present semiparametric spectral modeling of the complete larval Droso...
research
06/20/2022

Flow-based clustering and spectral clustering: a comparison

We propose and study a novel graph clustering method for data with an in...
research
09/01/2023

Consistency of Lloyd's Algorithm Under Perturbations

In the context of unsupervised learning, Lloyd's algorithm is one of the...
research
05/02/2013

Learning Mixtures of Bernoulli Templates by Two-Round EM with Performance Guarantee

Dasgupta and Shulman showed that a two-round variant of the EM algorithm...
