Probabilistic Label Trees for Extreme Multi-label Classification

09/23/2020
by   Kalina Jasinska-Kobus, et al.
4

Extreme multi-label classification (XMLC) is a learning task of tagging instances with a small subset of relevant labels chosen from an extremely large pool of possible labels. Problems of this scale can be efficiently handled by organizing labels as a tree, like in hierarchical softmax used for multi-class problems. In this paper, we thoroughly investigate probabilistic label trees (PLTs) which can be treated as a generalization of hierarchical softmax for multi-label problems. We first introduce the PLT model and discuss training and inference procedures and their computational costs. Next, we prove the consistency of PLTs for a wide spectrum of performance metrics. To this end, we upperbound their regret by a function of surrogate-loss regrets of node classifiers. Furthermore, we consider a problem of training PLTs in a fully online setting, without any prior knowledge of training instances, their features, or labels. In this case, both node classifiers and the tree structure are trained online. We prove a specific equivalence between the fully online algorithm and an algorithm with a tree structure given in advance. Finally, we discuss several implementations of PLTs and introduce a new one, napkinXC, which we empirically evaluate and compare with state-of-the-art algorithms.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
10/27/2018

A no-regret generalization of hierarchical softmax to extreme multi-label classification

Extreme multi-label classification (XMLC) is a problem of tagging an ins...
research
07/08/2020

Online probabilistic label trees

We introduce online probabilistic label trees (OPLTs), an algorithm that...
research
06/01/2019

On the computational complexity of the probabilistic label tree algorithms

Label tree-based algorithms are widely used to tackle multi-class and mu...
research
10/20/2021

Propensity-scored Probabilistic Label Trees

Extreme multi-label classification (XMLC) refers to the task of tagging ...
research
08/09/2014

Conditional Probability Tree Estimation Analysis and Algorithms

We consider the problem of estimating the conditional probability of a l...
research
06/01/2021

Enabling Efficiency-Precision Trade-offs for Label Trees in Extreme Classification

Extreme multi-label classification (XMC) aims to learn a model that can ...
research
10/14/2016

Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation

We consider multi-class classification where the predictor has a hierarc...

Please sign up or login with your details

Forgot password? Click here to reset