High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization

12/01/2022
by   Jiahui Cheng, et al.

Label shift is widely believed to harm the generalization performance of machine learning models. Researchers have proposed many approaches to mitigate its impact, e.g., balancing the training data. However, these methods typically consider the underparametrized regime, where the sample size is much larger than the data dimension. Research in the overparametrized regime remains very limited. To bridge this gap, we propose a new asymptotic analysis of the Fisher Linear Discriminant classifier for binary classification with label shift. Specifically, we prove that a phase transition phenomenon exists: in a certain overparametrized regime, the classifier trained on imbalanced data outperforms its counterpart trained on reduced, balanced data. Moreover, we investigate the impact of regularization on label shift: the aforementioned phase transition vanishes as the regularization becomes strong.
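The comparison described in the abstract can be explored numerically. The sketch below is not the paper's experimental setup; it is a minimal simulation under assumptions made here for illustration: two Gaussian classes with identity covariance and means ±mu, a ridge-regularized Fisher Linear Discriminant with a midpoint threshold, a balanced test set standing in for the shifted label distribution, and hypothetical function names and default sizes (`d`, `n_minor`, `n_major`, `ridge`).

```python
import numpy as np


def fld_fit(X, y, ridge):
    """Ridge-regularized Fisher Linear Discriminant (illustrative variant).

    Returns (w, b) with w = (S + ridge * I)^{-1} (mu1 - mu0), where S is the
    pooled within-class covariance. The midpoint threshold b is an assumption.
    """
    mu0 = X[y == 0].mean(axis=0)
    mu1 = X[y == 1].mean(axis=0)
    centered = np.vstack([X[y == 0] - mu0, X[y == 1] - mu1])
    S = centered.T @ centered / len(X)
    w = np.linalg.solve(S + ridge * np.eye(X.shape[1]), mu1 - mu0)
    b = -w @ (mu0 + mu1) / 2.0
    return w, b


def sample(n0, n1, mu, rng):
    """Draw n0 points from N(-mu, I) with label 0 and n1 from N(+mu, I) with label 1."""
    d = mu.shape[0]
    X = np.vstack([rng.standard_normal((n0, d)) - mu,
                   rng.standard_normal((n1, d)) + mu])
    y = np.concatenate([np.zeros(n0, dtype=int), np.ones(n1, dtype=int)])
    return X, y


def accuracy(w, b, X, y):
    """Fraction of points whose predicted label sign(w.x + b) matches y."""
    return float(np.mean((X @ w + b > 0).astype(int) == y))


def compare(d=200, n_minor=20, n_major=80, ridge=1.0, seed=0):
    """Test accuracy of FLD trained on all imbalanced data vs. a reduced balanced subset."""
    rng = np.random.default_rng(seed)
    mu = np.full(d, 2.0 / np.sqrt(d))          # assumed signal with norm 2
    X, y = sample(n_minor, n_major, mu, rng)   # d > n: overparametrized

    # Reduced balanced data: keep every minority point, subsample the majority.
    idx_minor = np.flatnonzero(y == 0)
    idx_major = rng.choice(np.flatnonzero(y == 1), size=n_minor, replace=False)
    keep = np.concatenate([idx_minor, idx_major])

    X_test, y_test = sample(1000, 1000, mu, rng)  # balanced test set (label shift)
    acc_imbalanced = accuracy(*fld_fit(X, y, ridge), X_test, y_test)
    acc_balanced = accuracy(*fld_fit(X[keep], y[keep], ridge), X_test, y_test)
    return acc_imbalanced, acc_balanced
```

Calling `compare()` for a range of `d` and `ridge` values gives a rough empirical view of how the two training strategies compare as the dimension and the regularization strength change. Any single run is noisy, so averaging over several seeds is advisable before reading anything into the gap.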


