Spectral Clustering Oracles in Sublinear Time

01/14/2021
by Grzegorz Głuch, et al.

Given a graph $G$ that can be partitioned into $k$ disjoint expanders with outer conductance upper bounded by $\epsilon \ll 1$, can we efficiently construct a small space data structure that allows quickly classifying vertices of $G$ according to the expander (cluster) they belong to? Formally, we would like an efficient local computation algorithm that misclassifies at most an $O(\epsilon)$ fraction of vertices in every expander. We refer to such a data structure as a spectral clustering oracle. Our main result is a spectral clustering oracle with query time $O^*(n^{1/2+O(\epsilon)})$ and preprocessing time $2^{O(\frac{1}{\epsilon} k^4 \log^2(k))} n^{1/2+O(\epsilon)}$ that provides misclassification error $O(\epsilon \log k)$ per cluster for any $\epsilon \ll 1/\log k$. More generally, query time can be reduced at the expense of increasing the preprocessing time appropriately (as long as the product is about $n^{1+O(\epsilon)}$); in particular, this gives a nearly linear time spectral clustering primitive. The main technical contribution is a sublinear time oracle that provides dot product access to the spectral embedding of $G$ by estimating distributions of short random walks from vertices in $G$. The distributions themselves provide a poor approximation to the spectral embedding, but we show that an appropriate linear transformation can be used to achieve high precision dot product access. We then show that dot product access to the spectral embedding is sufficient to design a clustering oracle. At a high level our approach amounts to hyperplane partitioning in the spectral embedding of $G$, but crucially operates on a nested sequence of carefully defined subspaces in the spectral embedding to achieve per cluster recovery guarantees.
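As a rough illustration of the random-walk idea only, the sketch below estimates distributions of short random walks on a toy graph and compares their dot products. It is not the oracle from the paper: it omits the linear transformation that the abstract says is needed for high precision, and it makes no attempt to run in sublinear time. The toy graph (two cliques joined by a single edge), the helper names `two_clique_graph`, `walk_distribution` and `dot`, and the parameters (4 steps, 2000 walks) are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def two_clique_graph(size):
    """Adjacency lists for two cliques of `size` vertices joined by one edge.

    Hypothetical toy input: vertices 0..size-1 form one cluster and
    size..2*size-1 form the other.
    """
    adj = {v: [] for v in range(2 * size)}
    for base in (0, size):
        for u in range(base, base + size):
            for v in range(base, base + size):
                if u != v:
                    adj[u].append(v)
    # A single bridge edge keeps the outer conductance of each cluster small.
    adj[size - 1].append(size)
    adj[size].append(size - 1)
    return adj


def walk_distribution(adj, start, steps, num_walks, rng):
    """Empirical endpoint distribution of `num_walks` random walks of length `steps`."""
    counts = Counter()
    for _ in range(num_walks):
        v = start
        for _ in range(steps):
            v = rng.choice(adj[v])
        counts[v] += 1
    return {v: c / num_walks for v, c in counts.items()}


def dot(p, q):
    """Dot product of two sparse distributions stored as dicts."""
    return sum(prob * q.get(v, 0.0) for v, prob in p.items())


rng = random.Random(0)  # fixed seed so the sketch is reproducible
adj = two_clique_graph(8)
dist = {v: walk_distribution(adj, v, steps=4, num_walks=2000, rng=rng)
        for v in (0, 1, 15)}

same_cluster = dot(dist[0], dist[1])    # vertices 0 and 1 share a clique
cross_cluster = dot(dist[0], dist[15])  # vertices 0 and 15 do not
print("same cluster:", same_cluster, "cross cluster:", cross_cluster)
```

On this toy input the dot product for the same-cluster pair should come out noticeably larger than for the cross-cluster pair, which is the kind of signal a clustering oracle can exploit. The abstract's point is that these raw distributions alone approximate the spectral embedding poorly, so this comparison should be read as intuition rather than as the paper's guarantee.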


