Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time

02/09/2022
by David P. Woodruff, et al.

We give an input sparsity time sampling algorithm for spectrally approximating the Gram matrix corresponding to the q-fold column-wise tensor product of q matrices using a nearly optimal number of samples, improving upon all previously known methods by poly(q) factors. Furthermore, for the important special case of the q-fold self-tensoring of a dataset, which is the feature matrix of the degree-q polynomial kernel, the leading term of our method's runtime is proportional to the size of the dataset and has no dependence on q. Previous techniques either incur a poly(q) factor slowdown in their runtime, or remove the dependence on q at the expense of a sub-optimal target dimension and a runtime that depends quadratically on the number of data points. Our sampling technique relies on a collection of q partially correlated random projections which can be simultaneously applied to a dataset X in total time that depends only on the size of X, and at the same time their q-fold Kronecker product acts as a near-isometry for any fixed vector in the column span of X^⊗q. We show that our sampling methods generalize to other classes of kernels beyond polynomial, such as Gaussian and Neural Tangent kernels.
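As a rough illustration of the mechanism described in the abstract, the toy Python sketch below applies q independent random projections to a dataset X and checks that their Kronecker product approximately preserves the norm of a fixed vector in the column span of X^⊗q. The dimensions and the choice of independent dense Gaussian maps are assumptions made for readability; the paper's projections are structured and partially correlated precisely so that all q of them can be applied in input sparsity time with a nearly optimal target dimension, which this dense toy does not attempt to reproduce.

```python
# Minimal sketch (assumed toy setup, not the paper's algorithm): q independent
# Gaussian projections S_1, ..., S_q, each applied directly to X, whose Kronecker
# product implicitly sketches X^⊗q without forming an m^q x d^q matrix.
import numpy as np

rng = np.random.default_rng(0)
n, d, q, m = 100, 100, 2, 64   # points, features, tensor order, per-factor sketch size

X = rng.standard_normal((d, n))  # columns are data points

def colwise_tensor(A, B):
    """Column-wise tensor (Khatri-Rao) product: column i is A[:, i] ⊗ B[:, i]."""
    return np.einsum('ik,jk->ijk', A, B).reshape(A.shape[0] * B.shape[0], -1)

# Z = X^⊗q: the d^q x n feature matrix of the degree-q polynomial kernel.
Z = X
for _ in range(q - 1):
    Z = colwise_tensor(Z, X)

# Each S_i touches X only once; the mixed-product property gives
# (S_1 ⊗ ... ⊗ S_q) x^⊗q = (S_1 x) ⊗ ... ⊗ (S_q x), so column i of SZ is the
# sketch of column i of Z.
S = [rng.standard_normal((m, d)) / np.sqrt(m) for _ in range(q)]
SZ = S[0] @ X
for Si in S[1:]:
    SZ = colwise_tensor(SZ, Si @ X)

# Near-isometry check for a fixed vector y = Z c in the column span of X^⊗q.
c = rng.standard_normal(n)
y, Ty = Z @ c, SZ @ c
print(np.linalg.norm(Ty) / np.linalg.norm(y))  # typically close to 1
```

With independent Gaussian factors the variance of the sketched norm grows with q, which is why a naive version like this needs a larger target dimension; correlating the q projections is what lets the paper keep the target dimension nearly optimal.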


