Influence of Resampling on Accuracy of Imbalanced Classification

07/12/2017
by Evgeny Burnaev, et al.

In many real-world binary classification tasks (e.g., detecting certain objects in images), the available dataset is imbalanced, i.e., it has far fewer representatives of one class (the minor class) than of the other. Accurate prediction of the minor class is generally crucial, but it is hard to achieve because little information about the minor class is available. One approach to this problem is to resample the dataset beforehand, i.e., to add new elements to the dataset or remove existing ones. Resampling can be done in various ways, which raises the problem of choosing the most appropriate one. In this paper we experimentally investigate the impact of resampling on classification accuracy, compare resampling methods, and highlight key points and difficulties of resampling.
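
To make the idea of adding or removing elements concrete, here is a minimal sketch of two simple baseline schemes: random oversampling (duplicating minor-class elements) and random undersampling (dropping major-class elements). The abstract does not say which resampling methods the paper compares, so these are illustrative assumptions rather than the authors' methods, and the function names `random_oversample` and `random_undersample` are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Add duplicates of randomly chosen rows until every class
    has as many rows as the largest class (illustrative baseline)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Indices of the original rows that belong to this class.
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = rng.choice(idx)
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Remove randomly chosen rows until every class has as many
    rows as the smallest class (illustrative baseline)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, target))
    keep.sort()  # preserve the original row order
    return [X[i] for i in keep], [y[i] for i in keep]


if __name__ == "__main__":
    # Toy imbalanced dataset: 8 rows of class 0, 2 rows of class 1.
    X = [[i] for i in range(10)]
    y = [0] * 8 + [1] * 2
    print(Counter(random_oversample(X, y)[1]))
    print(Counter(random_undersample(X, y)[1]))
```

In a sketch like this, resampling would normally be applied to the training split only, so that the test set keeps the original class proportions.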

research
02/28/2021

A Minimax Probability Machine for Non-Decomposable Performance Measures

Imbalanced classification tasks are widespread in many real-world applic...
research
01/06/2020

Identifying and Compensating for Feature Deviation in Imbalanced Deep Learning

We investigate learning a ConvNet classifier with class-imbalanced data....
research
06/16/2023

GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification

Class imbalance is the phenomenon that some classes have much fewer inst...
research
11/30/2020

Binary Classification: Counterbalancing Class Imbalance by Applying Regression Models in Combination with One-Sided Label Shifts

In many real-world pattern recognition scenarios, such as in medical app...
research
04/05/2021

Procrustean Training for Imbalanced Deep Learning

Neural networks trained with class-imbalanced data are known to perform ...
research
11/16/2019

An "outside the box" solution for imbalanced data classification

A common problem of the real-world data sets is the class imbalance, whi...
research
11/05/2021

Divide-and-Conquer Hard-thresholding Rules in High-dimensional Imbalanced Classification

In binary classification, imbalance refers to situations in which one cl...
