Gaussian Universality of Linear Classifiers with Random Labels in High-Dimension

05/26/2022
by Federica Gerace, et al.

While classical in many theoretical settings, the assumption of Gaussian i.i.d. inputs is often perceived as a strong limitation in the analysis of high-dimensional learning. In this study, we redeem this line of work in the case of generalized linear classification with random labels. Our main contribution is a rigorous proof that data coming from a range of generative models in high dimensions have the same minimum training loss as Gaussian data with the corresponding data covariance. In particular, our theorem covers data created by an arbitrary mixture of homogeneous Gaussian clouds, as well as multi-modal generative neural networks. In the limit of vanishing regularization, we further demonstrate that the training loss is independent of the data covariance. Finally, we show that this universality property is observed in practice with real datasets and random labels.
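The universality claim can be probed numerically. The following is a minimal sketch, not the paper's experiment: it assumes a ridge-regularized square loss (chosen only because its minimum has a closed form), a symmetric two-cluster mixture with means ±mu and identity cluster covariance, and a comparison Gaussian with covariance I + mu muᵀ. The function names, the sizes `n` and `d`, the value of `lam`, and the reading of "corresponding data covariance" as the full covariance of the mixture are all assumptions made for illustration.

```python
import numpy as np


def min_ridge_training_loss(X, y, lam):
    """Minimum over w of (1/(2n)) * ||y - X w||^2 + (lam/2) * ||w||^2."""
    n, d = X.shape
    # Closed-form ridge minimizer; the square loss is an illustrative choice.
    w = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)
    residual = y - X @ w
    return 0.5 * np.mean(residual ** 2) + 0.5 * lam * float(w @ w)


def sample_mixture(n, mu, rng):
    """Two Gaussian clouds centred at +mu and -mu, identity covariance each."""
    signs = rng.choice([-1.0, 1.0], size=(n, 1))
    return signs * mu + rng.standard_normal((n, mu.shape[0]))


def sample_matched_gaussian(n, mu, rng):
    """Single zero-mean Gaussian with covariance I + mu mu^T (assumed match)."""
    g = rng.standard_normal((n, 1))
    return g * mu + rng.standard_normal((n, mu.shape[0]))


rng = np.random.default_rng(0)
n, d, lam = 2000, 1000, 0.1           # hypothetical sizes and regularization
mu = np.zeros(d)
mu[0] = 2.0                           # hypothetical cluster mean

y = rng.choice([-1.0, 1.0], size=n)   # random labels, independent of the inputs

loss_mixture = min_ridge_training_loss(sample_mixture(n, mu, rng), y, lam)
loss_gaussian = min_ridge_training_loss(sample_matched_gaussian(n, mu, rng), y, lam)

print(f"mixture  data: {loss_mixture:.4f}")
print(f"Gaussian data: {loss_gaussian:.4f}")
```

Under these assumptions the two printed values should differ only by finite-size fluctuations, which shrink as `n` and `d` grow at a fixed ratio. Other convex losses, such as the logistic or hinge loss, would need an iterative solver in place of the closed-form step.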


research
02/17/2023

Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

In this manuscript we consider the problem of generalized linear estimat...
research
06/14/2021

Generalized kernel distance covariance in high dimensions: non-null CLTs and power universality

Distance covariance is a popular dependence measure for two random vecto...
research
04/06/2023

Classification of Superstatistical Features in High Dimensions

We characterise the learning of a mixture of two clouds of data points w...
research
02/17/2023

Universality laws for Gaussian mixtures in generalized linear models

Let (x_i, y_i)_i=1,…,n denote independent samples from a general mixture...
research
06/25/2020

The Gaussian equivalence of generative models for learning with two-layer neural networks

Understanding the impact of data structure on learning in neural network...
research
02/16/2021

Capturing the learning curves of generic features maps for realistic data sets with a teacher-student model

Teacher-student models provide a powerful framework in which the typical...
research
10/04/2018

Gaussian approximation of Gaussian scale mixture

For a given positive random variable V>0 and a given Z∼ N(0,1) independe...
