Precise Asymptotic Generalization for Multiclass Classification with Overparameterized Linear Models

06/23/2023
by David X. Wu, et al.

We study the asymptotic generalization of an overparameterized linear model for multiclass classification under the Gaussian covariates bi-level model introduced in Subramanian et al. '22, where the number of data points, features, and classes all grow together. We fully resolve the conjecture posed in Subramanian et al. '22, matching the predicted regimes for generalization. Furthermore, our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate converges asymptotically to either 0 or 1. One surprising consequence of our tight results is that the min-norm interpolating classifier can be asymptotically suboptimal relative to noninterpolating classifiers in the regime where the min-norm interpolating regressor is known to be optimal. The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels. As an application, we show that the same type of analysis extends to the related multilabel classification problem under the same bi-level ensemble.
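
For readers unfamiliar with the min-norm interpolating classifier mentioned above, the following is a minimal sketch of the general idea, not the paper's setup. It assumes isotropic Gaussian features (rather than the bi-level covariance), plain one-hot label encoding, labels given by the argmax of the first k features, and arbitrary small sizes n, d, k; the function names are hypothetical.

```python
import numpy as np

def min_norm_interpolator(X, Y):
    # Minimum-Frobenius-norm W satisfying X @ W = Y.
    # Exact interpolation requires d >= n and X of full row rank.
    return np.linalg.pinv(X) @ Y

def predict(W, X):
    # Classify each row by the largest of the k linear scores.
    return np.argmax(X @ W, axis=1)

# Illustrative sizes only (assumption): n points, d features, k classes, d >> n.
rng = np.random.default_rng(0)
n, d, k = 40, 400, 4

# Assumption: isotropic Gaussian features, not the bi-level ensemble.
X = rng.standard_normal((n, d))

# Assumption: the label is the index of the largest of the first k features.
labels = np.argmax(X[:, :k], axis=1)
Y = np.eye(k)[labels]  # one-hot encoding, shape (n, k)

W = min_norm_interpolator(X, Y)

# The classifier interpolates: training scores reproduce the one-hot labels.
interpolates = np.allclose(X @ W, Y, atol=1e-8)
train_acc = np.mean(predict(W, X) == labels)
```

Because the scores on the training set equal the one-hot targets, training accuracy is perfect by construction; the paper's question is how such a classifier behaves on fresh data as n, d, and k grow together.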

research
10/21/2022

A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models

We prove a new generalization bound that shows for any class of linear p...
research
05/01/2023

Exactly Tight Information-Theoretic Generalization Error Bound for the Quadratic Gaussian Problem

We provide a new information-theoretic generalization error bound that i...
research
02/19/2020

Truly Tight-in-Δ Bounds for Bipartite Maximal Matching and Variants

In a recent breakthrough result, Balliu et al. [FOCS'19] proved a determ...
research
06/03/2022

Generalization for multiclass classification with overparameterized linear models

Via an overparameterized linear model with Gaussian features, we provide...
research
11/01/2018

Semi-Finite Length Analysis for Information Theoretic Tasks

We focus on the optimal value for various information-theoretical tasks....
research
07/06/2023

Efficiency of Self-Adjusting Heaps

Since the invention of the pairing heap by Fredman et al., it has been a...
research
05/22/2018

Fully Understanding the Hashing Trick

Feature hashing, also known as the hashing trick, introduced by Weinber...
