Class-Balanced Loss Based on Effective Number of Samples

01/16/2019
by Yin Cui, et al.

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula (1 - β^n)/(1 - β), where n is the number of samples and β ∈ [0, 1) is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
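
The per-class weighting described in the abstract is straightforward to reproduce from the formula. The sketch below, in NumPy, assumes each class weight is the inverse of its effective number of samples and that the weights are normalized to sum to the number of classes; the function name and the normalization choice are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Sketch of class-balanced re-weighting from effective numbers of samples.

    effective_num = (1 - beta^n) / (1 - beta) for a class with n samples;
    the returned weight for each class is the inverse of its effective number,
    normalized (assumption for illustration) so the weights sum to the number
    of classes.
    """
    n = np.asarray(samples_per_class, dtype=np.float64)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights / weights.sum() * len(n)

# Example: a long-tailed distribution over 4 classes.
# Rare classes receive larger weights; beta close to 1 sharpens the effect.
print(class_balanced_weights([5000, 500, 50, 5], beta=0.999))
```

In practice such weights would be multiplied into a standard per-sample loss (e.g., softmax cross-entropy or sigmoid/focal loss) according to each sample's class label, with β treated as a hyperparameter tuned per dataset.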

