Linear Classifiers Under Infinite Imbalance

06/10/2021
by Paul Glasserman, et al.

We study the behavior of linear discriminant functions for binary classification in the infinite-imbalance limit, where the sample size of one class grows without bound while the sample size of the other remains fixed. The coefficients of the classifier minimize an expected loss specified through a weight function. We show that for a broad class of weight functions, the intercept diverges but the rest of the coefficient vector has a finite limit under infinite imbalance, extending prior work on logistic regression. The limit depends on the left tail of the weight function, for which we distinguish three cases: bounded, asymptotically polynomial, and asymptotically exponential. The limiting coefficient vectors reflect robustness or conservatism properties in the sense that they optimize against certain worst-case alternatives. In the bounded and polynomial cases, the limit is equivalent to an implicit choice of upsampling distribution for the minority class. We apply these ideas in a credit risk setting, with particular emphasis on performance in the high-sensitivity and high-specificity regions.
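The divergence of the intercept alongside a stabilizing coefficient vector can be observed numerically in the logistic regression case that the abstract cites as prior work. The following is a minimal sketch under stated assumptions, not the paper's general weight-function framework: the one-dimensional Gaussian classes, the sample sizes, the seed, and the helper name `fit_logistic` are all illustrative choices that do not come from the source. It keeps a fixed minority sample, grows the majority sample, and refits an unpenalized logistic regression by Newton's method each time.

```python
import numpy as np

def fit_logistic(x, y, iters=100, tol=1e-10):
    """Unpenalized logistic regression (intercept + one slope) via Newton's method."""
    X = np.column_stack([np.ones_like(x), x])
    n_pos = y.sum()
    # Start the intercept at the log odds of the class counts, slope at zero.
    theta = np.array([np.log(n_pos / (len(y) - n_pos)), 0.0])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))
        grad = X.T @ (y - p)                       # gradient of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # negative Hessian
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

rng = np.random.default_rng(0)

# Hypothetical data: the minority class stays fixed at 20 points,
# while the majority class is a growing prefix of one long sample.
x_minority = rng.normal(loc=1.0, scale=1.0, size=20)
x_majority_all = rng.normal(loc=0.0, scale=1.0, size=100_000)

majority_sizes = [100, 1_000, 10_000, 100_000]
intercepts, slopes = [], []
for n_major in majority_sizes:
    x = np.concatenate([x_minority, x_majority_all[:n_major]])
    y = np.concatenate([np.ones(len(x_minority)), np.zeros(n_major)])
    intercept, slope = fit_logistic(x, y)
    intercepts.append(intercept)
    slopes.append(slope)
    print(f"N_majority={n_major:>7d}  intercept={intercept:8.3f}  slope={slope:6.3f}")
```

In runs of this kind, the printed intercept should keep falling as the majority sample grows, while the slope should change little between the largest sample sizes. This illustrates the qualitative behavior only; the limits for general weight functions and their three tail cases are the subject of the paper itself.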
