A Unified Analysis of Random Fourier Features
We provide the first unified theoretical analysis of supervised learning with random Fourier features, covering the different types of loss functions characteristic of kernel methods developed for this setting. More specifically, we investigate learning with squared error and Lipschitz continuous loss functions and give the sharpest expected risk convergence rates for problems in which random Fourier features are sampled using either the spectral measure corresponding to a shift-invariant kernel or the ridge leverage score function proposed by Avron et al. (2017). The trade-off between the number of features and the expected risk convergence rate is expressed in terms of the regularization parameter and the effective dimension of the problem. While the former can effectively capture the complexity of the target hypothesis, the latter is known for expressing the fine structure of the kernel with respect to the marginal distribution of a data-generating process (Caponnetto and De Vito, 2007). In addition to our theoretical results, we propose an approximate leverage score sampler for large-scale problems and show that it can be significantly more effective than the spectral measure sampler.
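To make the spectral measure sampler concrete, the following is a minimal sketch of plain random Fourier features. It assumes a Gaussian kernel with bandwidth `sigma`, whose spectral measure is a zero-mean Gaussian with standard deviation `1 / sigma`. The function names and parameters are illustrative and are not taken from the paper, and the leverage score sampler proposed there is not implemented here.

```python
import math
import random


def sample_rff_params(dim, num_features, sigma, rng):
    """Draw frequencies from N(0, sigma^-2 I), the spectral measure of the
    Gaussian kernel (an assumed kernel choice), and phases from U[0, 2*pi)."""
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return freqs, phases


def rff_map(x, freqs, phases):
    """Map x to the feature vector with entries sqrt(2/s) * cos(w.x + b)."""
    scale = math.sqrt(2.0 / len(freqs))
    return [scale * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
            for w, b in zip(freqs, phases)]


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def approx_kernel(x, y, freqs, phases):
    """Approximate k(x, y) by the inner product of the two feature vectors."""
    zx = rff_map(x, freqs, phases)
    zy = rff_map(y, freqs, phases)
    return sum(a * b for a, b in zip(zx, zy))
```

With `s` features, the inner product of the feature vectors is an unbiased Monte Carlo estimate of the kernel value, and its error shrinks roughly like `1 / sqrt(s)`. A leverage score sampler would replace the Gaussian draw in `sample_rff_params` with a data-dependent density and reweight the features accordingly.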