How isotropic kernels learn simple invariants

06/17/2020
by Jonas Paccolat, et al.

We investigate how the training curve of isotropic kernel methods depends on the symmetry of the task to be learned, in several settings. (i) We consider a regression task, where the target function is a Gaussian random field that depends only on d_∥ variables, fewer than the input dimension d. We compute the expected test error ϵ, which follows ϵ ∼ p^{-β}, where p is the size of the training set. We find that β ∼ 1/d independently of d_∥, supporting previous findings that the presence of invariants does not resolve the curse of dimensionality for kernel regression. (ii) Next we consider support-vector binary classification and introduce the stripe model, where the data label depends on a single coordinate, y(x) = y(x_1), corresponding to parallel decision boundaries separating labels of different signs, with no margin at these interfaces. We argue and confirm numerically that for large bandwidth, β = (d-1+ξ)/(3d-3+ξ), where ξ ∈ (0,2) is the exponent characterizing the singularity of the kernel at the origin. This estimate improves on classical bounds obtained from Rademacher complexity. In this setting there is no curse of dimensionality, since β → 1/3 as d → ∞. (iii) We confirm these findings for the spherical model, for which y(x) = y(‖x‖). (iv) In the stripe model, we show that if the data are compressed along their invariants by some factor λ (an operation believed to take place in deep networks), the test error is reduced by a factor λ^{-2(d-1)/(3d-3+ξ)}.
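The exponent in (ii) is straightforward to probe numerically. Below is a minimal sketch (not the paper's code) of such a measurement: it trains a hard-margin SVM with a Laplacian kernel (ξ = 1 in the notation above) on a single-interface variant of the stripe model, y(x) = sign(x_1), and fits the learning-curve slope β from the test error at increasing p. The dimension d, the bandwidth, and the training-set sizes below are illustrative choices, not values from the paper.

```python
# Sketch: estimate the learning-curve exponent beta for margin-free
# SVM classification in a single-interface stripe model, using a
# Laplacian kernel (singularity exponent xi = 1).
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import laplacian_kernel

rng = np.random.default_rng(0)
d = 5                                  # input dimension (illustrative)
sizes = [128, 256, 512, 1024, 2048]   # training-set sizes p
p_test = 4000
gamma = 0.05                          # small gamma = large bandwidth

def sample(n):
    # Gaussian inputs; the label depends on x_1 only, with no margin
    # at the decision boundary x_1 = 0.
    X = rng.standard_normal((n, d))
    y = np.where(X[:, 0] > 0, 1.0, -1.0)
    return X, y

X_te, y_te = sample(p_test)
errors = []
for p in sizes:
    X_tr, y_tr = sample(p)
    K_tr = laplacian_kernel(X_tr, X_tr, gamma=gamma)
    clf = SVC(C=1e6, kernel="precomputed")   # large C: hard-margin limit
    clf.fit(K_tr, y_tr)
    K_te = laplacian_kernel(X_te, X_tr, gamma=gamma)
    errors.append(np.mean(clf.predict(K_te) != y_te))

# Fit epsilon ~ p^{-beta} as a slope in log-log coordinates.
beta_fit = -np.polyfit(np.log(sizes), np.log(errors), 1)[0]
beta_pred = (d - 1 + 1) / (3 * d - 3 + 1)   # (d-1+xi)/(3d-3+xi), xi = 1
print(f"measured beta ~ {beta_fit:.2f}, predicted {beta_pred:.2f}")
```

With the Laplacian kernel (ξ = 1) the prediction reduces to β = d/(3d-2), about 0.38 for d = 5; the measured slope should approach this for large enough p, up to finite-size fluctuations.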

