Extreme Multi-label Classification from Aggregated Labels

by Yanyao Shen, et al.

Extreme multi-label classification (XMC) is the problem of finding the relevant labels for an input from a very large universe of possible labels. We consider XMC in the setting where labels are available only for groups of samples, not for individual ones. Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC sizes. We develop a new, scalable algorithm to impute individual-sample labels from the group labels; it can be paired with any existing XMC method to solve the aggregated-label problem. We characterize the statistical properties of our algorithm under mild assumptions, and provide a new end-to-end framework for MIML as an extension. Experiments on both aggregated-label XMC and MIML tasks show its advantages over existing approaches.


Multi-Label Learning from Single Positive Labels

Predicting all applicable labels for a given image is known as multi-lab...

How many labelers do you have? A closer look at gold-standard labels

The construction of most supervised learning datasets revolves around co...

Block-wise Partitioning for Extreme Multi-label Classification

Extreme multi-label classification aims to learn a classifier that annot...

Streaming Label Learning for Modeling Labels on the Fly

It is challenging to handle a large volume of labels in multi-label lear...

Statistical Topic Models for Multi-Label Document Classification

Machine learning approaches to multi-label document classification have ...

The Limited Multi-Label Projection Layer

We propose the Limited Multi-Label (LML) projection layer as a new primi...

Stratified Sampling for Extreme Multi-Label Data

Extreme multi-label classification (XML) is becoming increasingly releva...