Concentration of kernel matrices with application to kernel spectral clustering

09/07/2019
by   Arash A. Amini, et al.

We study the concentration of random kernel matrices around their mean. We derive nonasymptotic exponential concentration inequalities for Lipschitz kernels, assuming that the data points are independent draws from a class of multivariate distributions on R^d that includes the strongly log-concave distributions under affine transformations. A feature of our result is that the data points need not be identically distributed or have zero mean, which is key in applications such as clustering. For comparison, we also derive the companion result for the Euclidean (inner product) kernel under a slightly modified set of distributional assumptions, namely a class of sub-Gaussian vectors. A notable difference between the two cases is that, in the Lipschitz case, the concentration inequality does not depend on the mean of the underlying vectors, whereas in the Euclidean case it does. As an application of these inequalities, we derive a bound on the misclassification rate of a kernel spectral clustering (KSC) algorithm under a perturbed nonparametric mixture model. We show an example where this bound establishes the high-dimensional consistency (as d → ∞) of KSC, applied with a Gaussian kernel, to a signal consisting of nested nonlinear manifolds (e.g., spheres) plus noise.
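To make the setting in the abstract concrete, here is a minimal sketch of one common form of kernel spectral clustering on noisy nested spheres. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not specify the KSC variant, so the choice of clustering the rows of the top-k eigenvectors of the unnormalized Gaussian kernel matrix with Lloyd's k-means is an assumption, as are all function names, the radii, the noise level, and the bandwidth.

```python
from itertools import permutations

import numpy as np


def nested_spheres(n_per, d, radii, noise_std, seed=0):
    """Sample n_per points near each sphere in R^d (hypothetical data model)."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for label, r in enumerate(radii):
        G = rng.standard_normal((n_per, d))
        S = r * G / np.linalg.norm(G, axis=1, keepdims=True)  # uniform on the sphere
        X.append(S + noise_std * rng.standard_normal((n_per, d)))  # additive noise
        y.append(np.full(n_per, label))
    return np.vstack(X), np.concatenate(y)


def gaussian_kernel_matrix(X, bandwidth):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negatives from round-off
    np.fill_diagonal(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))


def lloyd_kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd iterations with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def kernel_spectral_clustering(X, k, bandwidth, seed=0):
    """Assumed KSC variant: k-means on rows of the top-k eigenvectors of K."""
    K = gaussian_kernel_matrix(X, bandwidth)
    _, eigvecs = np.linalg.eigh(K)  # eigenvalues are returned in ascending order
    U = eigvecs[:, -k:]             # eigenvectors of the k largest eigenvalues
    return lloyd_kmeans(U, k, seed=seed), K


def misclassification_rate(y_true, y_pred, k):
    """Fraction of mislabeled points, minimized over label permutations."""
    best = 1.0
    for perm in permutations(range(k)):
        mapped = np.array(perm)[y_pred]
        best = min(best, float(np.mean(mapped != y_true)))
    return best


if __name__ == "__main__":
    X, y = nested_spheres(n_per=100, d=50, radii=(1.0, 3.0), noise_std=0.05)
    labels, K = kernel_spectral_clustering(X, k=2, bandwidth=2.0)
    print("misclassification rate:", misclassification_rate(y, labels, k=2))
```

The printed rate depends on the bandwidth, the dimension, and the k-means initialization; this sketch makes no claim about matching the bound or the consistency regime described in the abstract.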


research
03/23/2022

Variations and extensions of the Gaussian concentration inequality, Part II

Pisier's version of the Gaussian concentration inequality is transformed...
research
08/28/2020

On the Non-Asymptotic Concentration of Heteroskedastic Wishart-type Matrix

This paper focuses on the non-asymptotic concentration of the heterosked...
research
02/16/2021

Concentration of measure and generalized product of random vectors with an application to Hanson-Wright-like inequalities

Starting from concentration of measure hypotheses on m random vectors Z_...
research
12/16/2019

A Robust Spectral Clustering Algorithm for Sub-Gaussian Mixture Models with Outliers

We consider the problem of clustering datasets in the presence of arbitr...
research
02/23/2017

Spectral Clustering using PCKID - A Probabilistic Cluster Kernel for Incomplete Data

In this paper, we propose PCKID, a novel, robust, kernel function for sp...
research
09/01/2023

Consistency of Lloyd's Algorithm Under Perturbations

In the context of unsupervised learning, Lloyd's algorithm is one of the...
research
12/05/2018

Relative concentration bounds for the kernel matrix spectrum

In this paper, we study the concentration properties of the kernel matri...
