Parameter-free Statistically Consistent Interpolation: Dimension-independent Convergence Rates for Hilbert kernel regression

06/07/2021
by Partha P Mitra, et al.

Statistical textbook wisdom has long held that interpolating noisy data generalizes poorly, but recent work has shown that data interpolation schemes can generalize well. This could explain why overparameterized deep nets do not necessarily overfit. Optimal data interpolation schemes have been exhibited that achieve theoretical lower bounds for excess risk in any dimension in the large-data limit (Statistically Consistent Interpolation). These are non-parametric Nadaraya-Watson estimators with singular kernels. The recently proposed weighted interpolating nearest neighbors method (wiNN) is in this class, as is the previously studied Hilbert kernel interpolation scheme, in which the estimator has the form f̂(x) = ∑_i y_i w_i(x), where w_i(x) = ‖x-x_i‖^{-d} / ∑_j ‖x-x_j‖^{-d}. This estimator is unique in being completely parameter-free. While statistical consistency was previously proven, convergence rates were not established. Here, we comprehensively study the finite-sample properties of Hilbert kernel regression. We prove that the excess risk is asymptotically equivalent pointwise to σ^2(x)/ln(n), where σ^2(x) is the noise variance. We show that the excess risk of the plug-in classifier is less than 2|f(x)-1/2|^{1-α} (1+ε)^α σ^α(x) (ln(n))^{-α/2}, for any 0<α<1, where f is the regression function x ↦ 𝔼[y|x]. We derive asymptotic equivalents of the moments of the weight functions w_i(x) for large n; for instance, for β>1, 𝔼[w_i^β(x)] ∼_{n→∞} ((β-1) n ln(n))^{-1}. We derive an asymptotic equivalent for the Lagrange function and exhibit the nontrivial extrapolation properties of this estimator. We present heuristic arguments for a universal w^{-2} power-law behavior of the probability density of the weights in the large-n limit.
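As a concrete illustration of the estimator f̂(x) = ∑_i y_i w_i(x) described above, here is a minimal NumPy sketch (not from the paper; the function name, numerical tolerance, and example data are illustrative assumptions). It evaluates the Hilbert kernel weights w_i(x) = ‖x-x_i‖^{-d} / ∑_j ‖x-x_j‖^{-d} at a query point and applies the plug-in classification rule by thresholding the regression estimate at 1/2.

```python
import numpy as np

def hilbert_kernel_regression(x, X_train, y_train, eps=1e-12):
    """Hilbert kernel (Shepard-type) interpolating estimator.

    f_hat(x) = sum_i y_i * w_i(x), with
    w_i(x) = ||x - x_i||^{-d} / sum_j ||x - x_j||^{-d},
    where d is the input dimension. The kernel is singular at the training
    points, so the estimator interpolates the training data exactly.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    d = X_train.shape[1]
    dists = np.linalg.norm(X_train - x, axis=1)

    # At (or numerically at) a training point, its weight dominates all
    # others in the limit, so return that point's label directly.
    hit = dists < eps
    if hit.any():
        return float(y_train[hit][0])

    w = dists ** (-d)          # singular Hilbert kernel weights
    w /= w.sum()               # normalize so the weights sum to one
    return float(np.dot(w, y_train))


# Usage example with illustrative synthetic data:
# the plug-in classifier predicts class 1 when f_hat(x) > 1/2.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0.5).astype(float)
x_query = np.array([0.7, 0.3])
print(hilbert_kernel_regression(x_query, X, y) > 0.5)
```

Note that, consistent with the parameter-free nature of the estimator, the sketch requires no bandwidth or neighbor count; the only numerical choice is the tolerance used to detect coincidence with a training point.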


