Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization

06/29/2020
by Kaidi Cao, et al.

Real-world large-scale datasets are heteroskedastic and imbalanced – labels have varying levels of uncertainty and label distributions are long-tailed. Heteroskedasticity and imbalance challenge deep learning algorithms due to the difficulty of distinguishing among mislabeled, ambiguous, and rare examples. Addressing heteroskedasticity and imbalance simultaneously is under-explored. We propose a data-dependent regularization technique for heteroskedastic datasets that regularizes different regions of the input space differently. Inspired by the theoretical derivation of the optimal regularization strength in a one-dimensional nonparametric classification setting, our approach adaptively regularizes the data points in higher-uncertainty, lower-density regions more heavily. We test our method on several benchmark tasks, including a real-world heteroskedastic and imbalanced dataset, WebVision. Our experiments corroborate our theory and demonstrate a significant improvement over other methods in noise-robust deep learning.
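The idea of regularizing higher-uncertainty, lower-density regions more heavily can be illustrated with a small sketch. Everything below is an assumption made for illustration: the function names, the `uncertainty / density` ratio, and the mean normalization are hypothetical and are not the formula derived in the paper. The abstract only states the qualitative direction of the dependence. The per-example uncertainty and density estimates are assumed to be supplied by some other procedure.

```python
from typing import List, Sequence


def adaptive_reg_weights(
    uncertainty: Sequence[float],
    density: Sequence[float],
    base_strength: float = 1.0,
    eps: float = 1e-8,
) -> List[float]:
    """Hypothetical per-example regularization strengths.

    Assumption (not the paper's formula): the strength grows with the
    estimated label uncertainty and shrinks with the estimated data density.
    Weights are rescaled so their mean equals base_strength.
    """
    if len(uncertainty) != len(density):
        raise ValueError("uncertainty and density must have the same length")
    if len(uncertainty) == 0:
        raise ValueError("at least one example is required")

    # Larger uncertainty or smaller density gives a larger raw weight.
    raw = [u / (d + eps) for u, d in zip(uncertainty, density)]
    mean_raw = sum(raw) / len(raw)
    if mean_raw == 0.0:
        # No uncertainty anywhere: fall back to uniform regularization.
        return [base_strength] * len(raw)
    return [base_strength * r / mean_raw for r in raw]


def regularized_loss(
    per_example_loss: Sequence[float],
    per_example_penalty: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Mean of (loss_i + weight_i * penalty_i) over the batch."""
    terms = [
        loss + w * pen
        for loss, pen, w in zip(per_example_loss, per_example_penalty, weights)
    ]
    return sum(terms) / len(terms)


# A clean example in a dense region versus a noisy example in a sparse region.
weights = adaptive_reg_weights(uncertainty=[0.1, 0.5], density=[1.0, 0.2])
total = regularized_loss([0.3, 0.9], [1.0, 1.0], weights)
```

In this sketch the second example (noisier label, sparser region) receives the larger weight, so its penalty term contributes more to the objective than it would under uniform regularization.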

research
10/11/2021

Self-supervised Learning is More Robust to Dataset Imbalance

Self-supervised learning (SSL) is a scalable way to learn general visual...
research
06/13/2020

Rethinking the Value of Labels for Improving Class-Imbalanced Learning

Real-world data often exhibits long-tailed distributions with heavy clas...
research
12/03/2020

ReMix: Calibrated Resampling for Class Imbalance in Deep learning

Class imbalance is a problem of significant importance in applied deep l...
research
02/14/2018

Dealing with Difficult Minority Labels in Imbalanced Mutilabel Data Sets

Multilabel classification is an emergent data mining task with a broad r...
research
05/24/2023

Mixture of Experts with Uncertainty Voting for Imbalanced Deep Regression Problems

Data imbalance is ubiquitous when applying machine learning to real-worl...
research
06/11/2023

Variational Imbalanced Regression

Existing regression models tend to fall short in both accuracy and uncer...
research
07/08/2020

Remix: Rebalanced Mixup

Deep image classifiers often perform poorly when training data are heavi...