
Spectral clustering in the weighted stochastic block model

by Ian Gallagher, et al.

This paper is concerned with the statistical analysis of a real-valued symmetric data matrix. We assume a weighted stochastic block model: the matrix indices, taken to represent nodes, can be partitioned into communities so that all entries corresponding to a given community pair are replicates of the same random variable. Extending results previously known only for unweighted graphs, we provide a limit theorem showing that the point cloud obtained from spectrally embedding the data matrix follows a Gaussian mixture model where each community is represented with an elliptical component. We can therefore formally evaluate how well the communities separate under different data transformations, for example, whether it is productive to "take logs". We find that performance is invariant to affine transformation of the entries, but this expected and desirable feature hinges on adaptively selecting the eigenvectors according to eigenvalue magnitude and using Gaussian clustering. We present a network anomaly detection problem with cyber-security data where the matrix of log p-values, as opposed to p-values, has both theoretical and empirical advantages.
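The embedding step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration rather than the paper's own setup: the Gaussian entry distribution, the two-community mean matrix, the noise level, the scaling of eigenvectors by the square root of the absolute eigenvalue, and the helper names `simulate_weighted_sbm` and `spectral_embed`. The final check assigns nodes to the nearest community centroid using the true labels, which is only an oracle stand-in for Gaussian mixture clustering, not an implementation of it.

```python
import numpy as np


def simulate_weighted_sbm(labels, means, sd, rng):
    """Symmetric matrix whose (i, j) entry is Gaussian with mean means[z_i, z_j].

    Gaussian entries are an illustrative assumption; the model in the abstract
    only requires replicates of one random variable per community pair.
    """
    n = len(labels)
    noise = rng.normal(0.0, sd, size=(n, n))
    upper = np.triu(means[np.ix_(labels, labels)] + noise)
    # Mirror the strict upper triangle so the matrix is symmetric.
    return upper + np.triu(upper, 1).T


def spectral_embed(A, d):
    """Embed nodes using the d eigenvectors with largest |eigenvalue|."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    # Scaling by sqrt(|eigenvalue|) is a common convention, assumed here.
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)

# Hypothetical mean matrix with eigenvalues 4 and -2: picking eigenvectors by
# largest algebraic eigenvalue would miss the informative negative one.
means = np.array([[1.0, 3.0], [3.0, 1.0]])

A = simulate_weighted_sbm(labels, means, sd=1.0, rng=rng)
X = spectral_embed(A, d=2)

# Oracle check: nearest-centroid assignment with centroids from true labels.
centroids = np.array([X[labels == k].mean(axis=0) for k in (0, 1)])
dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
accuracy = np.mean(dists.argmin(axis=1) == labels)
print(f"nearest-centroid agreement with true communities: {accuracy:.3f}")
```

In practice the labels are unknown, so the point cloud `X` would be passed to a Gaussian mixture fit with full (elliptical) covariances, in line with the limit theorem stated in the abstract.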

