Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces

05/12/2021
by Ankit Singh Rawat, et al.

Negative sampling schemes enable efficient training given a large number of classes by approximating a computationally expensive loss function that takes all labels into account. In this paper, we present a new connection between these schemes and loss modification techniques for countering label imbalance. We show that different negative sampling schemes implicitly trade off performance on dominant versus rare labels. Further, we provide a unified means to explicitly tackle both sampling bias, which arises from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance. We empirically verify our findings on long-tail classification and retrieval benchmarks.
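To make the idea of approximating a full-label loss concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method proposed in the paper: it assumes a softmax cross-entropy loss, uniform negative sampling without replacement, and the standard sampled-softmax logit correction for sampling bias. It does not address labeling bias. The function names and parameters are hypothetical.

```python
import math
import random


def _log_sum_exp(scores):
    """Numerically stable log(sum(exp(s))) over a list of scores."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


def full_softmax_loss(logits, target):
    """Cross-entropy over all labels; cost grows with the number of labels."""
    return _log_sum_exp(logits) - logits[target]


def sampled_softmax_loss(logits, target, num_negatives, rng, correct=True):
    """Cross-entropy over the target plus uniformly sampled negatives.

    Assumption: negatives are drawn uniformly without replacement.
    With correct=True, each negative logit is shifted by
    -log(inclusion probability), so the sampled partition sum is an
    unbiased estimate of the full one.
    """
    num_labels = len(logits)
    candidates = [j for j in range(num_labels) if j != target]
    negatives = rng.sample(candidates, num_negatives)
    # Probability that any given negative label ends up in the sample.
    inclusion_prob = num_negatives / (num_labels - 1)
    shift = -math.log(inclusion_prob) if correct else 0.0
    scores = [logits[target]] + [logits[j] + shift for j in negatives]
    return _log_sum_exp(scores) - logits[target]
```

In this sketch, sampling every negative recovers the full loss exactly, while the uncorrected variant with fewer negatives never exceeds the full loss, because it drops terms from the partition sum.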


