Kernel Sketching yields Kernel JL

by Samory Kpotufe, et al.

The main contribution of the paper is to show that Gaussian sketching of a kernel Gram matrix K yields an operator whose counterpart in an RKHS H is a random projection operator, in the spirit of the Johnson-Lindenstrauss (JL) lemma. To be precise, given a random matrix Z with i.i.d. Gaussian entries, we show that a sketch ZK corresponds to a particular random operator in the (infinite-dimensional) Hilbert space H that maps functions f ∈ H to a low-dimensional space R^d, while preserving a weighted RKHS inner product of the form ⟨f, g⟩_Σ ≐ ⟨f, Σ^3 g⟩_H, where Σ is the covariance operator induced by the data distribution. In particular, under assumptions similar to those of kernel PCA (KPCA) or kernel k-means (K-k-means), well-separated subsets of the feature space {K(·, x): x ∈ X} remain well-separated after this operation, which suggests benefits similar to those of KPCA and/or K-k-means, albeit at the much cheaper cost of a random projection. Moreover, our convergence rates suggest that, given a large dataset {X_i}_{i=1}^N of size N, we can build the Gram matrix K on a much smaller subsample of size n ≪ N, so that the sketch ZK is very cheap to obtain and to subsequently apply as a projection operator on the original data {X_i}_{i=1}^N. We verify these insights empirically on synthetic data and on real-world clustering applications.
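The subsample-then-sketch pipeline described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. The RBF kernel, the helper names `rbf_kernel` and `kernel_sketch_embed`, the 1/(n√d) scaling, and the choice to embed each point x through its kernel evaluations against the subsample (x ↦ Z·[K(X_j, x)]_j) are all assumptions made for illustration; the paper's exact operator and normalization may differ.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B.
    The kernel choice is an assumption; the abstract does not fix one."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_sketch_embed(X, n, d, gamma=1.0, seed=0):
    """Embed all N rows of X into R^d using a Gaussian sketch built
    on a random subsample of size n << N (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    idx = rng.choice(N, size=n, replace=False)   # subsample of size n
    landmarks = X[idx]
    Z = rng.standard_normal((d, n))              # i.i.d. Gaussian sketching matrix
    K_cross = rbf_kernel(landmarks, X, gamma)    # n x N kernel evaluations
    # Scaling by 1/(n * sqrt(d)) is an illustrative assumption.
    return (Z @ K_cross).T / (n * np.sqrt(d))    # N x d embedding

# Two well-separated synthetic clusters in the plane.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (500, 2)),
               rng.normal(3.0, 0.3, (500, 2))])
E = kernel_sketch_embed(X, n=50, d=10, gamma=0.5)
print(E.shape)
```

The cost is dominated by the n × N kernel evaluations and a d × n matrix product, rather than by an N × N Gram matrix or an eigendecomposition as in KPCA. The rows of `E` could then be passed to an ordinary k-means routine.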




