Equity-Directed Bootstrapping: Examples and Analysis

08/14/2021
by Harish S. Bhat, et al.

When faced with severely imbalanced binary classification problems, we often train models on bootstrapped data in which the classes occur in a more favorable ratio, e.g., one to one. We view algorithmic inequity through the lens of imbalanced classification: to balance the performance of a classifier across groups, we can bootstrap to obtain training sets that are balanced with respect to both labels and group identity. For an example problem with severe class imbalance (prediction of suicide death from administrative patient records), we illustrate how an equity-directed bootstrap can bring test set sensitivities and specificities much closer to satisfying the equal odds criterion. In the context of naïve Bayes and logistic regression, we analyze the equity-directed bootstrap, demonstrating that it works by bringing odds ratios close to one, and linking it to methods involving intercept adjustment, thresholding, and weighting.
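The resampling idea in the abstract can be illustrated with a minimal sketch. It assumes that "balanced with respect to both labels and group identity" means drawing the same number of rows, with replacement, from every (label, group) cell; the paper's exact procedure may differ. The function name `equity_directed_bootstrap`, the parameter `n_per_cell`, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def equity_directed_bootstrap(labels, groups, n_per_cell, seed=0):
    """Return row indices, sampled with replacement, such that every
    (label, group) cell contributes exactly n_per_cell rows.

    Illustrative sketch only: equal cell sizes are an assumption about
    what a label- and group-balanced bootstrap looks like.
    """
    rng = random.Random(seed)

    # Collect the row indices that fall into each (label, group) cell.
    cells = defaultdict(list)
    for i, (label, group) in enumerate(zip(labels, groups)):
        cells[(label, group)].append(i)

    # Every label must be observed in every group; an empty cell
    # cannot be resampled.
    for label in set(labels):
        for group in set(groups):
            if (label, group) not in cells:
                raise ValueError(f"no rows with label={label!r}, group={group!r}")

    # Draw n_per_cell indices with replacement from each cell.
    idx = []
    for key in sorted(cells):
        idx.extend(rng.choices(cells[key], k=n_per_cell))
    rng.shuffle(idx)
    return idx


# Toy data (hypothetical): label 1 is rare, and group "B" is a minority.
labels = [0] * 900 + [1] * 20 + [0] * 75 + [1] * 5
groups = ["A"] * 920 + ["B"] * 80

idx = equity_directed_bootstrap(labels, groups, n_per_cell=200)
counts = Counter((labels[i], groups[i]) for i in idx)
print(counts)
```

A model would then be trained on the rows selected by `idx`, while sensitivity and specificity per group are still measured on an untouched test set. Rows in very small cells, such as the five positive rows in group "B" here, are repeated many times, which is a known cost of oversampling.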

