Label-Imbalanced and Group-Sensitive Classification under Overparameterization

03/02/2021
by Ganesh Ramachandra Kini, et al.

Label-imbalanced and group-sensitive classification seeks to modify standard training algorithms so that they optimize relevant metrics such as balanced error and/or equal opportunity. For label imbalances, recent works have proposed a logit-adjusted loss modification to standard empirical risk minimization. We show that this might be ineffective in general and, in particular, in the overparameterized regime, where training continues after the training error has reached zero. Specifically, for binary linear classification of a separable dataset, we show that training with the modified loss converges to the max-margin SVM classifier despite the logit adjustment. Instead, we propose a more general vector-scaling loss that directly relates to the cost-sensitive SVM (CS-SVM), thus favoring a larger margin for the minority class. Through a sharp asymptotic analysis for a Gaussian-mixtures data model, we demonstrate the efficacy of CS-SVM in balancing the errors of the minority and majority classes. Our analysis also leads to a simple strategy for optimally tuning the margin-ratio parameter involved. We then show how our results extend naturally to binary classification with sensitive groups, thus treating the two common types of imbalance (label and group) in a unified way. We corroborate our theoretical findings with numerical experiments on both synthetic and real-world datasets.
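
To make the distinction between an additive logit adjustment and a multiplicative (vector-scaling) one concrete, here is a minimal sketch of a binary loss with both kinds of parameters. The abstract does not state the formula, so the form used here, `omega * log(1 + exp(iota) * exp(-delta * y * f))`, and the names `vs_loss`, `delta`, `iota`, and `omega` are assumptions made for illustration, not necessarily the authors' exact definition.

```python
import math


def vs_loss(y, f, delta=1.0, iota=0.0, omega=1.0):
    """Illustrative binary loss for a label y in {+1, -1} and a logit f.

    Assumed form (not given in the abstract):
        omega * log(1 + exp(iota) * exp(-delta * y * f))
    delta: multiplicative logit scaling (assumed class-dependent)
    iota:  additive logit adjustment (assumed class-dependent)
    omega: per-class weight
    """
    z = iota - delta * y * f
    # Numerically stable evaluation of log(1 + exp(z)).
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return omega * softplus


# With the default parameters the loss is the ordinary logistic loss.
plain = vs_loss(+1, 2.0)

# A smaller multiplicative factor for a class shrinks its effective margin,
# so the same logit incurs a larger loss for that class.
scaled = vs_loss(+1, 2.0, delta=0.5)

# An additive adjustment shifts the logit by a constant instead.
shifted = vs_loss(+1, 2.0, iota=1.0)
```

In this sketch, setting `delta` below 1 for the minority class penalizes a given margin more heavily for that class, which is one way to read the abstract's claim that the loss favors a larger minority margin. The additive `iota` only shifts the logit by a constant.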


research
12/05/2012

Cost-Sensitive Support Vector Machines

A new procedure for learning cost-sensitive SVM (CS-SVM) classifiers is p...
research
06/21/2021

Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation

The growing literature on "benign overfitting" in overparameterized mode...
research
12/01/2022

High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization

Label Shift has been widely believed to be harmful to the generalization...
research
06/04/2013

∝SVM for learning with label proportions

We study the problem of learning with label proportions in which the tra...
research
03/14/2023

On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data

Various logit-adjusted parameterizations of the cross-entropy (CE) loss ...
research
06/19/2022

Primal Estimated Subgradient Solver for SVM for Imbalanced Classification

We aim to demonstrate in experiments that our cost sensitive PEGASOS SVM...
research
08/04/2014

Multithreshold Entropy Linear Classifier

Linear classifiers separate the data with a hyperplane. In this paper we...
