Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression

05/30/2022
by Lechao Xiao, et al.

As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves that characterize how the prediction error depends on the number of samples is restricted to either large-sample asymptotics (m→∞) or, for certain simple data distributions, to the high-dimensional asymptotics in which the number of samples scales linearly with the dimension (m∝ d). There is a wide gulf between these two regimes, including all higher-order scaling relations m∝ d^r, which are the subject of the present paper. We focus on the problem of kernel ridge regression for dot-product kernels and present precise formulas for the test error, bias, and variance, for data drawn uniformly from the sphere in the rth-order asymptotic scaling regime m→∞ with m/d^r held constant. We observe a peak in the learning curve whenever m ≈ d^r/r! for any integer r, leading to multiple sample-wise descent and nontrivial behavior at multiple scales.
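To make the scaling relation concrete, the following minimal sketch tabulates where the abstract places the learning-curve peaks, m ≈ d^r/r!, for a given input dimension d. It is plain arithmetic on that formula and does not reproduce the paper's test-error, bias, or variance expressions. The function names (`peak_sample_size`, `scaling_ratio`, `nearest_peak_order`) and the cutoff `max_r` are hypothetical, chosen only for illustration.

```python
from math import factorial, log


def peak_sample_size(d: int, r: int) -> float:
    """Sample size m ~ d^r / r! at which the abstract reports a peak for order r."""
    return d ** r / factorial(r)


def scaling_ratio(m: float, d: int, r: int) -> float:
    """The quantity m / d^r that is held constant in the rth-order regime."""
    return m / d ** r


def nearest_peak_order(m: float, d: int, max_r: int = 6) -> int:
    """Order r in 1..max_r whose peak d^r / r! is closest to m on a log scale.

    max_r is an illustrative cutoff, not something taken from the paper.
    """
    return min(
        range(1, max_r + 1),
        key=lambda r: abs(log(m) - log(peak_sample_size(d, r))),
    )


if __name__ == "__main__":
    d = 50  # illustrative input dimension
    for r in range(1, 4):
        m_peak = peak_sample_size(d, r)
        print(f"r={r}: peak near m = {m_peak:.1f}, m/d^r = {scaling_ratio(m_peak, d, r):.4f}")
```

At a peak, the ratio m/d^r equals 1/r! by construction, which is why the peaks sit at very different sample sizes for each order r even though each occurs at a constant value of the scaling variable.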


