Characterizing the Effect of Class Imbalance on the Learning Dynamics

07/01/2022
by Emanuele Francazi, et al.

Data imbalance is a common problem in machine learning that can critically affect the performance of a model. Various solutions exist, such as those based on resampling or data generation, but their impact on the convergence of gradient-based optimizers used in deep learning is not understood. Here we elucidate the significant negative impact of data imbalance on learning, showing that the learning curves for minority and majority classes follow sub-optimal trajectories when training with a gradient-based optimizer. The reason is not only that the gradient signal neglects the minority classes, but also that the minority classes are subject to larger directional noise, which slows their learning by an amount related to the imbalance ratio. To address this problem, we propose a new algorithmic solution and provide a detailed analysis of its convergence behavior. We show both theoretically and empirically that this new algorithm behaves better, with more stable learning curves for each class and better generalization performance.
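The claim that the gradient signal neglects the minority classes can be illustrated with a small numerical sketch. This is not the algorithm proposed in the paper, which the abstract does not specify. It is a toy logistic-regression setup, assumed here for illustration only, that splits the full-batch gradient into the part contributed by each class. The function name `per_class_gradient_norms`, the Gaussian data, and the 900/100 split are all hypothetical choices, and the directional-noise effect mentioned in the abstract is not modeled.

```python
import numpy as np

def per_class_gradient_norms(n_major=900, n_minor=100, dim=5, seed=0):
    """Norm of each class's contribution to the full-batch gradient of the
    mean logistic loss, evaluated at zero weights (illustrative toy setup)."""
    rng = np.random.default_rng(seed)
    # Two Gaussian blobs: class 0 is the majority, class 1 the minority.
    x0 = rng.normal(loc=-1.0, size=(n_major, dim))
    x1 = rng.normal(loc=+1.0, size=(n_minor, dim))
    x = np.vstack([x0, x1])
    y = np.concatenate([np.zeros(n_major), np.ones(n_minor)])

    w = np.zeros(dim)                    # untrained linear model
    p = 1.0 / (1.0 + np.exp(-x @ w))     # predicted P(y = 1) for each sample
    per_sample = (p - y)[:, None] * x    # gradient of each sample's loss w.r.t. w

    n = len(y)
    # Each class's share of the mean gradient: sum over its samples, divided by n.
    g_major = per_sample[y == 0].sum(axis=0) / n
    g_minor = per_sample[y == 1].sum(axis=0) / n
    return np.linalg.norm(g_major), np.linalg.norm(g_minor)

major_norm, minor_norm = per_class_gradient_norms()
print(f"majority contribution: {major_norm:.3f}")
print(f"minority contribution: {minor_norm:.3f}")
```

In this sketch the majority class contributes a much larger share of the gradient simply because it has more samples, so the first updates are driven mostly by the majority class. Changing `n_major` and `n_minor` shows how the gap scales with the imbalance ratio.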


research
05/06/2023

Rethinking Class Imbalance in Machine Learning

Imbalance learning is a subfield of machine learning that focuses on lea...
research
09/09/2021

An Experimental Study of Class Imbalance in Federated Learning

Federated learning is a distributed machine learning paradigm that train...
research
12/16/2022

TopoImb: Toward Topology-level Imbalance in Learning from Graphs

Graph serves as a powerful tool for modeling data that has an underlying...
research
11/28/2021

Imbalanced data preprocessing techniques utilizing local data characteristics

Data imbalance, that is the disproportion between the number of training...
research
11/24/2020

Alleviating Class-wise Gradient Imbalance for Pulmonary Airway Segmentation

Automated airway segmentation is a prerequisite for pre-operative diagno...
research
03/04/2022

Deep Learning Neural Networks for Emotion Classification from Text: Enhanced Leaky Rectified Linear Unit Activation and Weighted Loss

Accurate emotion classification for online reviews is vital for business...
research
07/25/2019

Machine learning approach to remove ion interference effect in agricultural nutrient solutions

High concentration agricultural facilities such as vertical farms or pla...
