Fast Randomized Kernel Methods With Statistical Guarantees

11/02/2014
by   Ahmed El Alaoui, et al.

One approach to improving the running time of kernel-based machine learning methods is to build a small sketch of the input and use it in place of the full kernel matrix in the machine learning task of interest. Here, we describe a version of this approach that comes with running time guarantees as well as improved guarantees on its statistical performance. By extending the notion of statistical leverage scores to the setting of kernel ridge regression, our main statistical result is to identify an importance sampling distribution that reduces the size of the sketch (i.e., the required number of columns to be sampled) to the effective dimensionality of the problem. This quantity is often much smaller than previous bounds that depend on the maximal degrees of freedom. Our main algorithmic result is to present a fast algorithm to compute approximations to these scores. This algorithm runs in time that is linear in the number of samples n: more precisely, the running time is O(np^2), where the parameter p depends only on the trace of the kernel matrix and the regularization parameter. It can be applied to the matrix of feature vectors, without having to form the full kernel matrix. This is obtained via a variant of length-squared sampling that we adapt to the kernel setting in a way that is of independent interest. Lastly, we provide empirical results illustrating our theory, and we discuss how this new notion of the statistical leverage of a data point captures, at a fine-grained level, the difficulty of the original statistical learning problem.
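
To make the quantities in the abstract concrete, here is a minimal NumPy sketch, not the paper's fast algorithm. It assumes the commonly used definition of ridge leverage scores as the diagonal of K(K + nλI)^{-1}, takes the effective dimensionality to be their sum, and builds a plain Nyström approximation from columns sampled in proportion to the scores. The function names, the RBF kernel, and the values of `gamma` and `lam` are illustrative assumptions. The scores are computed exactly here, which costs O(n^3) and is the very cost the O(np^2) approximation is meant to avoid.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix; the kernel choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def ridge_leverage_scores(K, lam):
    """Exact scores: diagonal of K (K + n*lam*I)^{-1} (assumed definition).

    K and (K + n*lam*I)^{-1} commute, so solving (K + n*lam*I) Z = K
    yields the same matrix. This exact route is O(n^3).
    """
    n = K.shape[0]
    Z = np.linalg.solve(K + n * lam * np.eye(n), K)
    return np.diag(Z).copy()

def sample_columns(scores, p, rng):
    """Draw p column indices with probability proportional to the scores."""
    probs = scores / scores.sum()
    idx = rng.choice(len(scores), size=p, replace=True, p=probs)
    return idx, probs

def nystrom(K, idx):
    """Plain Nystrom sketch C W^+ C^T built from the sampled columns."""
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    # rcond truncates near-zero eigenvalues, e.g. from repeated indices.
    return C @ np.linalg.pinv(W, rcond=1e-10, hermitian=True) @ C.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    K = rbf_kernel(X, gamma=0.5)
    scores = ridge_leverage_scores(K, lam=1e-2)
    d_eff = scores.sum()               # effective dimensionality
    d_mof = len(scores) * scores.max() # maximal degrees of freedom
    idx, _ = sample_columns(scores, p=40, rng=rng)
    L = nystrom(K, idx)
    rel_err = np.linalg.norm(K - L) / np.linalg.norm(K)
    print(f"d_eff={d_eff:.2f}, d_mof={d_mof:.2f}, rel. error={rel_err:.3f}")
```

Because the sum of the scores can never exceed n times the largest score, the effective dimensionality is at most the maximal degrees of freedom; the gap between the two is large when a few data points carry much higher leverage than the rest.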


research
03/15/2018

Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling

Ridge leverage scores provide a balance between low-rank approximation a...
research
11/22/2017

Leverage Score Sampling for Faster Accelerated Regression and ERM

Given a matrix A∈R^n× d and a vector b ∈R^d, we show how to compute an ϵ...
research
08/21/2021

Fast Sketching of Polynomial Kernels of Polynomial Degree

Kernel methods are fundamental in machine learning, and faster algorithm...
research
03/27/2018

Distributed Adaptive Sampling for Kernel Matrix Approximation

Most kernel-based methods, such as kernel or Gaussian process regression...
research
06/23/2013

A Statistical Perspective on Algorithmic Leveraging

One popular method for dealing with large-scale data sets is sampling. F...
research
06/03/2021

Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning

Empirical risk minimization (ERM) is the workhorse of machine learning, ...
research
05/21/2018

Effective Dimension of Exp-concave Optimization

We investigate the role of the effective (a.k.a. statistical) dimension ...
