KDE sampling for imbalanced class distribution

10/17/2019
by Firuz Kamalov et al.

An imbalanced response variable distribution is a common occurrence in data science. One standard way to combat class imbalance is to resample the minority class to achieve a more balanced distribution. In this paper, we investigate the performance of a sampling method based on kernel density estimation (KDE). We illustrate how KDE-based sampling is less prone to overfitting than other standard sampling methods, and numerical experiments show that it can outperform other sampling techniques on a range of classifiers and real-life datasets.
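The paper itself does not include code, but the core idea can be sketched: fit a KDE to the minority-class feature vectors and draw synthetic observations from the estimated density until the classes are balanced. The sketch below uses scikit-learn's KernelDensity; the function name kde_oversample, the Gaussian kernel, the bandwidth value, and the balancing strategy in the usage comments are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def kde_oversample(X_minority, n_new, bandwidth=0.5, random_state=0):
    """Generate synthetic minority-class samples by fitting a Gaussian KDE
    to the minority observations and drawing new points from it."""
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)
    kde.fit(X_minority)
    # Draw n_new synthetic samples from the estimated density.
    return kde.sample(n_samples=n_new, random_state=random_state)

# Usage (assumed setup): X is a feature matrix, y holds binary labels,
# and class 1 is the minority class.
# X_min = X[y == 1]
# n_needed = (y == 0).sum() - (y == 1).sum()
# X_synth = kde_oversample(X_min, n_needed)
# X_bal = np.vstack([X, X_synth])
# y_bal = np.concatenate([y, np.ones(len(X_synth))])
```

The bandwidth controls how far synthetic points spread around the observed minority samples; in practice it would typically be tuned (e.g., by cross-validation) rather than fixed as in this sketch.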


Related research

- 12/12/2017: CUSBoost: Cluster-based Under-sampling with Boosting for Imbalanced Classification. Class imbalance classification is a challenging research problem in data...
- 10/25/2021: Kernel density estimation-based sampling for neural network classification. Imbalanced data occurs in a wide range of scenarios. The skewed distribu...
- 10/16/2018: An empirical evaluation of imbalanced data strategies from a practitioner's point of view. This research tested the following well known strategies to deal with bi...
- 07/11/2022: Partial Resampling of Imbalanced Data. Imbalanced data is a frequently encountered problem in machine learning....
- 03/27/2023: Evaluating XGBoost for Balanced and Imbalanced Data: Application to Fraud Detection. This paper evaluates XGboost's performance given different dataset sizes...
- 11/17/2021: Sampling To Improve Predictions For Underrepresented Observations In Imbalanced Data. Data imbalance is common in production data, where controlled production...
- 08/25/2022: An Empirical Analysis of the Efficacy of Different Sampling Techniques for Imbalanced Classification. Learning from imbalanced data is a challenging task. Standard classifica...
