Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension

01/31/2022
by Bruno Loureiro, et al.

From the sampling of data to the initialisation of parameters, randomness is ubiquitous in modern Machine Learning practice. Understanding the statistical fluctuations engendered by the different sources of randomness in prediction is therefore key to understanding robust generalisation. In this manuscript we develop a quantitative and rigorous theory for the study of fluctuations in an ensemble of generalised linear models trained on different, but correlated, features in high dimensions. In particular, we provide a complete description of the asymptotic joint distribution of the empirical risk minimiser for generic convex loss and regularisation in the high-dimensional limit. Our result encompasses a rich set of classification and regression tasks, such as the lazy regime of overparametrised neural networks, or equivalently the random features approximation of kernels. Besides allowing a direct study of the mitigating effect of ensembling (or bagging) on the bias-variance decomposition of the test error, our analysis also helps disentangle the contribution of statistical fluctuations and the singular role played by the interpolation threshold, which are at the root of the "double-descent" phenomenon.
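The setting described above, an ensemble of generalised linear models fitted on different but correlated random-feature maps of the same data, is easy to probe numerically. The Python sketch below is an illustration, not the paper's exact asymptotic theory: K ridge learners are trained on independent tanh random-feature maps, and the feature dimension p is swept past the interpolation threshold p = n to expose the double-descent peak and the mitigating effect of averaging. All sizes, the linear teacher, the activation, and the regularisation strength are illustrative assumptions.

    # Minimal numerical sketch of ensembled random-features ridge regression.
    # Each of the K learners sees its own random projection of the same inputs,
    # so the learners' features are different but correlated.
    import numpy as np

    rng = np.random.default_rng(0)

    n_train, n_test, d = 200, 1000, 50            # samples and input dimension
    w_star = rng.standard_normal(d) / np.sqrt(d)  # linear "teacher" (assumed)

    X_train = rng.standard_normal((n_train, d))
    X_test = rng.standard_normal((n_test, d))
    y_train = X_train @ w_star + 0.1 * rng.standard_normal(n_train)
    y_test = X_test @ w_star

    def ensemble_rf_ridge(p, K=10, lam=1e-6):
        """Average test predictions of K ridge learners, each on an
        independent p-dimensional tanh random-feature map of the data."""
        preds = np.zeros(n_test)
        for _ in range(K):
            F = rng.standard_normal((d, p)) / np.sqrt(d)   # random projection
            Z_tr, Z_te = np.tanh(X_train @ F), np.tanh(X_test @ F)
            a = np.linalg.solve(Z_tr.T @ Z_tr + lam * np.eye(p),
                                Z_tr.T @ y_train)
            preds += Z_te @ a
        return preds / K

    # Sweep the number of features past the interpolation threshold p = n_train:
    # with K = 1 the test error peaks near p = 200, while averaging K = 10
    # learners suppresses the variance-driven part of the peak.
    for p in (50, 100, 200, 400, 800):
        for K in (1, 10):
            mse = np.mean((ensemble_rf_ridge(p, K=K) - y_test) ** 2)
            print(f"p={p:4d}  K={K:2d}  test MSE = {mse:.4f}")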
