A review of ensemble learning and data augmentation models for class imbalanced problems: combination, implementation and evaluation

04/06/2023
by Azal Ahmad Khan, et al.

Class imbalance (CI) in classification problems arises when the number of observations belonging to one class is lower than in the other classes. Ensemble learning, which combines multiple models to obtain a more robust model, has been prominently used with data augmentation methods to address class imbalance problems. In the last decade, a number of strategies have been added to enhance ensemble learning and data augmentation methods, along with new methods such as generative adversarial networks (GANs). Combinations of these have been applied in many studies, but establishing the true rank of the different combinations requires a computational review. In this paper, we present a computational review that evaluates data augmentation and ensemble learning methods on prominent benchmark CI problems. We propose a general framework that evaluates 10 data augmentation and 10 ensemble learning methods for CI problems. Our objective is to identify the most effective combination for improving classification performance on imbalanced datasets. The results indicate that combinations of data augmentation methods with ensemble learning can significantly improve classification performance on imbalanced datasets. These findings have important implications for the development of more effective approaches for handling imbalanced datasets in machine learning applications.
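To make the idea of pairing data augmentation with ensemble learning concrete, the following is a minimal sketch in plain Python. It is not the framework proposed in the paper: random oversampling stands in for the augmentation step and bagged decision stumps stand in for the ensemble, both chosen only for brevity. The abstract does not name the 10 + 10 methods that were evaluated, so these choices, the toy dataset, and all function names (`random_oversample`, `fit_stump`, `fit_bagging`, `predict_bagging`) are assumptions made for illustration.

```python
import random
from collections import Counter

def random_oversample(X, y, rng):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def fit_stump(X, y):
    """Fit a one-feature threshold rule by exhaustive search over training error (binary labels 0/1)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for hi_label in (0, 1):
                errors = sum(
                    (hi_label if x[f] >= t else 1 - hi_label) != lab
                    for x, lab in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, hi_label)
    _, f, t, hi_label = best
    return f, t, hi_label

def predict_stump(stump, x):
    f, t, hi_label = stump
    return hi_label if x[f] >= t else 1 - hi_label

def fit_bagging(X, y, n_estimators, rng):
    """Bagging: train each stump on a bootstrap resample of the already balanced data."""
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_bagging(stumps, x):
    """Majority vote over the stumps (an odd n_estimators avoids ties for two classes)."""
    votes = Counter(predict_stump(s, x) for s in stumps)
    return votes.most_common(1)[0][0]

rng = random.Random(0)
# Hypothetical toy data: 95 majority points (label 0) and 5 minority points (label 1).
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(95)]
X += [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(5)]
y = [0] * 95 + [1] * 5

X_bal, y_bal = random_oversample(X, y, rng)      # augmentation step
ensemble = fit_bagging(X_bal, y_bal, 11, rng)    # ensemble step
print(Counter(y_bal))                            # class counts after balancing
print(predict_bagging(ensemble, (3.0, 3.0)))     # predicted label for a new point
```

In this sketch the augmentation is applied once, before bootstrapping. Resampling inside each bootstrap instead is another common way to combine the two steps, and which ordering works better is the kind of question a computational comparison is meant to settle.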


