Generalisation error in learning with random features and the hidden manifold model

02/21/2020
by   Federica Gerace, et al.

We study generalised linear regression and classification for a synthetically generated dataset encompassing several problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden manifold model. We consider the high-dimensional regime and, using the replica method from statistical physics, provide a closed-form expression for the asymptotic generalisation performance in these problems, valid in both the under- and over-parametrised regimes and for a broad choice of generalised linear model loss functions. In particular, we show how to obtain analytically the so-called double descent behaviour for logistic regression, with a peak at the interpolation threshold; we illustrate the superiority of orthogonal over random Gaussian projections in learning with random features; and we discuss the role played by correlations in the data generated by the hidden manifold model. Beyond the interest in these particular problems, the theoretical formalism introduced in this manuscript provides a path to further extensions to more complex tasks.
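The double descent behaviour summarised above is easy to reproduce numerically, independently of the replica computation. The sketch below is a minimal illustration, not the paper's method: it fits ridgeless (minimum-norm) least squares on random tanh features of Gaussian inputs with a noisy linear teacher, sweeping the number of features p past the number of samples n. All sizes, the teacher model, and the tanh activation are illustrative assumptions; the test error peaks near the interpolation threshold p = n.

# Minimal numerical sketch of double descent with random features
# (an illustration under assumed sizes, not the paper's replica formula).
import numpy as np

rng = np.random.default_rng(0)
d, n, n_test = 50, 200, 2000          # input dim, train/test sizes (assumed)

# Teacher: noisy linear target on Gaussian inputs, a stand-in for the
# data-generating models studied in the paper.
w_star = rng.standard_normal(d) / np.sqrt(d)
X, X_te = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y = X @ w_star + 0.1 * rng.standard_normal(n)
y_te = X_te @ w_star

for p in [20, 50, 100, 150, 200, 250, 400, 800]:   # number of random features
    F = rng.standard_normal((d, p)) / np.sqrt(d)   # random Gaussian projection
    Z, Z_te = np.tanh(X @ F), np.tanh(X_te @ F)    # random-feature maps
    # Ridgeless (minimum-norm) least squares via the pseudo-inverse;
    # the test error spikes as p crosses n, the interpolation threshold.
    a = np.linalg.pinv(Z) @ y
    print(f"p={p:4d}  test MSE={np.mean((Z_te @ a - y_te) ** 2):.4f}")

Adding a small ridge penalty (solving (Z.T @ Z + lam * I) a = Z.T @ y instead of using the pseudo-inverse) smooths out the peak, in line with the standard picture of how regularisation tempers double descent in random features regression.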


