Learning Classifiers for Imbalanced and Overlapping Data

10/22/2022
by   Shivaditya Shivganesh, et al.

This study examines how to induce classifiers from imbalanced data, in which a minority class is under-represented relative to the majority classes. The first part of the study focuses on the main characteristics of data that give rise to this problem. Following a review of relevant prior research, a variety of artificial, imbalanced data sets shaped by these key factors were created. These data sets were used to build decision trees and rule-based classifiers. The second part of the study looks into how to improve classifiers by pre-processing the data with resampling approaches. The subsequent experiments compare the performance of several pre-processing resampling methods: two variants of random over-sampling and the focused under-sampling method NCR (Neighbourhood Cleaning Rule). The paper further addresses class imbalance with a new method called Sparsity, in which the data is made sparser relative to its class centers, making it more homogeneous.
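
To illustrate the simplest of the resampling approaches mentioned above, the following is a minimal sketch of plain random over-sampling, assuming the basic variant that duplicates randomly chosen minority examples until every class matches the largest one. The function name `random_oversample`, its `seed` parameter, and the toy data are hypothetical and chosen for illustration only; this is not the authors' implementation, and it does not cover NCR or the Sparsity method.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen examples of each smaller class until
    every class has as many examples as the largest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the largest class

    # Start from a copy of the original data so nothing is discarded.
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # All original examples belonging to this class.
        pool = [x for x, lab in zip(X, y) if lab == label]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Toy data: five majority examples (class 0), one minority example (class 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.9]]
y = [0, 0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Because the added examples are exact copies, this kind of over-sampling balances the class counts without adding new information, which is one reason it is usually compared against more focused methods.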


research
11/04/2019

A Study of Data Pre-processing Techniques for Imbalanced Biomedical Data Classification

Biomedical data are widely accepted in developing prediction models for ...
research
11/09/2020

Synthetic Over-sampling with the Minority and Majority classes for imbalance problems

Class imbalance is a substantial challenge in classifying many real-worl...
research
04/06/2021

Survey of Imbalanced Data Methodologies

Imbalanced data set is a problem often found and well-studied in financi...
research
05/09/2014

Hellinger Distance Trees for Imbalanced Streams

Classifiers trained on data sets possessing an imbalanced class distribu...
research
11/25/2019

A Self-Adaptive Synthetic Over-Sampling Technique for Imbalanced Classification

Traditionally, in supervised machine learning, (a significant) part of t...
research
06/28/2016

Reviving Threshold-Moving: a Simple Plug-in Bagging Ensemble for Binary and Multiclass Imbalanced Data

Class imbalance presents a major hurdle in the application of data minin...
research
03/29/2018

Modified SMOTE Using Mutual Information and Different Sorts of Entropies

SMOTE is one of the oversampling techniques for balancing the datasets a...
