Semi-supervised Eigenvectors for Large-scale Locally-biased Learning

04/28/2013
by Toke J. Hansen, et al.

In many applications, one has side information, e.g., labels that are provided in a semi-supervised manner, about a specific target region of a large data set, and one wants to perform machine learning and data analysis tasks "nearby" that prespecified target region. For example, one might be interested in the clustering structure of a data graph near a prespecified "seed set" of nodes, or one might be interested in finding partitions in an image that are near a prespecified "ground truth" set of pixels. Locally-biased problems of this sort are particularly challenging for popular eigenvector-based machine learning and data analysis tools. At root, the reason is that eigenvectors are inherently global quantities, thus limiting the applicability of eigenvector-based methods in situations where one is interested in very local properties of the data. In this paper, we address this issue by providing a methodology to construct semi-supervised eigenvectors of a graph Laplacian, and we illustrate how these locally-biased eigenvectors can be used to perform locally-biased machine learning. These semi-supervised eigenvectors capture successively-orthogonalized directions of maximum variance, conditioned on being well-correlated with an input seed set of nodes that is assumed to be provided in a semi-supervised manner. We show that these semi-supervised eigenvectors can be computed quickly as the solution to a system of linear equations; and we also describe several variants of our basic method that have improved scaling properties. We provide several empirical examples demonstrating how these semi-supervised eigenvectors can be used to perform locally-biased learning; and we discuss the relationship between our results and recent machine learning algorithms that use global eigenvectors of the graph Laplacian.
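
The claim that a locally-biased vector can be obtained from a linear system can be illustrated with a small sketch. The abstract does not state the system itself, so the form used here, (L - gamma D) x = D s, is an assumption drawn from the general locally-biased spectral literature rather than the authors' stated method. The function name, the parameter `gamma`, and the toy graph are hypothetical. The sketch computes only a single seed-biased vector, not the successively-orthogonalized sequence described above.

```python
import numpy as np

def locally_biased_vector(A, seed, gamma):
    """Illustrative sketch: solve (L - gamma * D) x = D s for a seed-biased x.

    A     : dense symmetric adjacency matrix with positive degrees
    seed  : list of seed node indices
    gamma : negative scalar (assumed); more negative means more localized
    """
    d = A.sum(axis=1)
    D = np.diag(d)
    L = D - A  # combinatorial graph Laplacian

    # Seed indicator, made D-orthogonal to the all-ones vector, with unit D-norm.
    s = np.zeros(len(d))
    s[seed] = 1.0
    s -= (d @ s) / d.sum()
    s /= np.sqrt(s @ D @ s)

    # For gamma < 0 the matrix L - gamma * D is positive definite,
    # so a direct solve is well defined.
    x = np.linalg.solve(L - gamma * D, D @ s)

    # Rescale to unit D-norm.
    return x / np.sqrt(x @ D @ x)

# Toy graph (hypothetical): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

x = locally_biased_vector(A, seed=[0], gamma=-0.1)
print(np.round(x, 3))
```

On this toy graph the resulting vector is largest at the seed node and takes higher values on the seed's triangle than on the other triangle, which is the kind of local bias the abstract describes. A dense solve is used only for brevity; on a large graph one would presumably use a sparse or iterative solver instead.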

research
01/26/2020

Searching for polarization in signed graphs: a local spectral approach

Signed graphs have been used to model interactions in social networks, ...
research
11/29/2018

Flow-Based Local Graph Clustering with Better Seed Set Inclusion

Flow-based methods for local graph clustering have received significant ...
research
09/13/2016

Mapping the Similarities of Spectra: Global and Locally-biased Approaches to SDSS Galaxy Data

We apply a novel spectral graph technique, that of locally-biased semi-s...
research
05/23/2018

Large Data and Zero Noise Limits of Graph-Based Semi-Supervised Learning Algorithms

Scalings in which the graph Laplacian approaches a differential operator...
research
07/01/2013

Semi-supervised clustering methods

Cluster analysis methods seek to partition a data set into homogeneous s...
research
09/13/2019

Spectral Analysis Of Weighted Laplacians Arising In Data Clustering

Graph Laplacians computed from weighted adjacency matrices are widely us...
research
09/14/2017

Fast semi-supervised discriminant analysis for binary classification of large data-sets

High-dimensional data requires scalable algorithms. We propose and analy...
