Spectral Analysis Of Weighted Laplacians Arising In Data Clustering

09/13/2019
by Franca Hoffmann, et al.

Graph Laplacians computed from weighted adjacency matrices are widely used to identify geometric structure in data, and clusters in particular; their spectral properties play a central role in a number of unsupervised and semi-supervised learning algorithms. When suitably scaled, graph Laplacians approach limiting continuum operators in the large data limit. Studying these limiting operators, therefore, sheds light on learning algorithms. This paper is devoted to the study of a parameterized family of divergence form elliptic operators that arise as the large data limit of graph Laplacians. The link between a three-parameter family of graph Laplacians and a three-parameter family of differential operators is explained. The spectral properties of these differential operators are analyzed in the situation where the data comprises two nearly separated clusters, in a sense which is made precise. In particular, we investigate how the spectral gap depends on the three parameters entering the graph Laplacian and on a parameter measuring the size of the perturbation from the perfectly clustered case. Numerical results are presented which exemplify and extend the analysis; in particular, the computations study situations with more than two clusters. The findings provide insight into parameter choices made in learning algorithms which are based on weighted adjacency matrices; they also provide the basis for analysis of the consistency of various unsupervised and semi-supervised learning algorithms, in the large data limit.
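The spectral gap described in the abstract can be illustrated with a small numerical sketch. The code below is not the paper's three-parameter family: it assumes the standard unnormalized graph Laplacian L = D - W with Gaussian edge weights, and every name and value in it (`gaussian_weights`, `two_cluster_sample`, the bandwidth `eps`, the cluster separation) is a hypothetical choice made for illustration. With two nearly separated clusters, the second eigenvalue should be small relative to the third, and the sign of the second eigenvector should recover the clusters.

```python
import numpy as np


def two_cluster_sample(n_per_cluster, separation, seed=0):
    """Draw two Gaussian blobs in the plane, centered at (+-separation/2, 0)."""
    rng = np.random.default_rng(seed)
    left = rng.normal(loc=[-separation / 2, 0.0], scale=0.3, size=(n_per_cluster, 2))
    right = rng.normal(loc=[separation / 2, 0.0], scale=0.3, size=(n_per_cluster, 2))
    return np.vstack([left, right])


def gaussian_weights(points, eps):
    """Weighted adjacency matrix W_ij = exp(-|x_i - x_j|^2 / (2 eps^2)), zero diagonal."""
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * eps ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def unnormalized_laplacian(W):
    """Standard unnormalized graph Laplacian L = D - W (an assumed, simple choice)."""
    return np.diag(W.sum(axis=1)) - W


n = 100
points = two_cluster_sample(n, separation=3.0)
truth = np.repeat([0, 1], n)  # cluster membership used to build the sample

W = gaussian_weights(points, eps=1.0)
L = unnormalized_laplacian(W)

# L is symmetric, so eigh returns real eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(L)

# The second eigenvector (Fiedler vector) changes sign between the clusters.
labels = (eigvecs[:, 1] > 0).astype(int)

# Eigenvector signs are arbitrary, so score agreement up to a label swap.
match = np.mean(labels == truth)
agreement = max(match, 1.0 - match)

print("three smallest eigenvalues:", eigvals[:3])
print("agreement with true clusters:", agreement)
```

Increasing `separation` in this sketch moves the data toward the perfectly clustered case, which is the kind of perturbation parameter the abstract refers to; how the gap responds under the paper's actual family of Laplacians is the subject of the paper itself.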

research
05/23/2018

Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms

Scalings in which the graph Laplacian approaches a differential operator...
research
06/18/2019

Consistency of semi-supervised learning algorithms on graphs: Probit and one-hot methods

Graph-based semi-supervised learning is the problem of propagating label...
research
11/27/2021

Asymptotic spectra of large (grid) graphs with a uniform local structure (part II): numerical applications

In the current work we are concerned with sequences of graphs having a g...
research
11/02/2016

Learning Methods for Dynamic Topic Modeling in Automated Behaviour Analysis

Semi-supervised and unsupervised systems provide operators with invaluab...
research
04/28/2013

Semi-supervised Eigenvectors for Large-scale Locally-biased Learning

In many applications, one has side information, e.g., labels that are pr...
research
09/23/2019

PDE-Inspired Algorithms for Semi-Supervised Learning on Point Clouds

Given a data set and a subset of labels the problem of semi-supervised l...
research
09/04/2019

The spectral properties of Vandermonde matrices with clustered nodes

We study rectangular Vandermonde matrices V with N+1 rows and s irregula...
