
Divide and Conquer Kernel Ridge Regression: A Distributed Algorithm with Minimax Optimal Rates

05/22/2013
by Yuchen Zhang, et al.
UC Berkeley

We establish optimal convergence rates for a decomposition-based scalable approach to kernel ridge regression. The method is simple to describe: it randomly partitions a dataset of size N into m subsets of equal size, computes an independent kernel ridge regression estimator on each subset, then averages the local solutions into a global predictor. This partitioning leads to a substantial reduction in computation time relative to the standard approach of performing kernel ridge regression on all N samples. Our two main theorems establish that, despite the computational speed-up, statistical optimality is retained: as long as m is not too large, the partition-based estimator achieves the statistical minimax rate over all estimators using the full set of N samples. As concrete examples, our theory guarantees that the number of processors m may grow nearly linearly in N for finite-rank and Gaussian kernels, and polynomially in N for Sobolev spaces, which in turn allows for substantial reductions in computational cost. We conclude with experiments on both simulated data and a music-prediction task that complement our theoretical results, exhibiting the computational and statistical benefits of our approach.
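As a concrete illustration of the algorithm described above, below is a minimal sketch of the divide-and-conquer estimator, assuming a Gaussian (RBF) kernel. The function names, the regularization and bandwidth values, and the toy data are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth):
    """Gaussian kernel matrix: K[i, j] = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def local_krr(X, y, lam, bandwidth):
    """Fit kernel ridge regression on one subset: solve (K + n*lam*I) alpha = y."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    # Local predictor: f(x) = sum_i alpha_i * k(x, x_i).
    return lambda X_new: gaussian_kernel(X_new, X, bandwidth) @ alpha

def divide_and_conquer_krr(X, y, m, lam, bandwidth, seed=None):
    """Randomly partition the N samples into m equal-size subsets, fit KRR
    independently on each, and return the averaged global predictor."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X.shape[0])
    predictors = [local_krr(X[idx], y[idx], lam, bandwidth)
                  for idx in np.array_split(perm, m)]
    return lambda X_new: np.mean([f(X_new) for f in predictors], axis=0)

# Toy usage: N = 2000 noisy samples of sin(3x), split across m = 8 subsets.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2000, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(2000)
f_hat = divide_and_conquer_krr(X, y, m=8, lam=1e-3, bandwidth=0.3, seed=0)
print(f_hat(np.array([[0.5]])))  # should be close to sin(1.5) ~ 0.997
```

The computational gain comes from the cubic cost of the local solves: one exact KRR fit on N samples costs O(N^3) time and O(N^2) memory, whereas each of the m subsets costs only O((N/m)^3), and the m local fits can run in parallel before the single averaging step.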


Related papers

Kernel Ridge Regression via Partitioning (08/05/2016)
In this paper, we investigate a divide and conquer approach to Kernel Ri...

Parallelizing Spectral Algorithms for Kernel Learning (10/24/2016)
We consider a distributed learning approach in supervised learning for a...

Uncertainty quantification for distributed regression (05/24/2021)
The ever-growing size of the datasets renders well-studied learning tech...

Optimal Rates of Distributed Regression with Imperfect Kernels (06/30/2020)
Distributed machine learning systems have been receiving increasing atte...

Distributed Kernel Ridge Regression with Communications (03/27/2020)
This paper focuses on generalization performance analysis for distribute...

Distributed Learning with Dependent Samples (02/10/2020)
This paper focuses on learning rate analysis of distributed kernel ridge...

ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions (06/23/2021)
We introduce ParK, a new large-scale solver for kernel ridge regression....