CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification

10/29/2022
by   Siddhant Kharbanda, et al.

Extreme Multi-label Text Classification (XMC) involves learning a classifier that assigns to an input a subset of the most relevant labels from millions of label choices. Recent approaches, such as XR-Transformer and LightXML, leverage a transformer instance to achieve state-of-the-art performance. In doing so, however, they must make various trade-offs between performance and computational requirements. A major shortcoming, compared to the Bi-LSTM based AttentionXML, is that they fail to keep separate feature representations for each resolution in a label tree. We thus propose CascadeXML, an end-to-end multi-resolution learning pipeline that harnesses the multi-layered architecture of a transformer model to attend to different label resolutions with separate feature representations. CascadeXML significantly outperforms all existing approaches, with non-trivial gains on benchmark datasets consisting of up to three million labels. Code for CascadeXML will be made publicly available at <https://github.com/xmc-aalto/cascadexml>.
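
To make the idea of separate feature representations per label resolution concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The abstract states only that different transformer layers attend to different label resolutions, so everything else is an assumption made for illustration: the two-level label tree, the sizes, the random vectors standing in for pooled layer outputs, the linear classifiers, and the top-k shortlisting step that passes coarse predictions to the fine level.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
hidden = 16            # feature dimension of each layer's representation
n_coarse = 4           # clusters at the coarse resolution
children_per = 3       # fine labels under each coarse cluster
n_fine = n_coarse * children_per
top_k = 2              # coarse clusters kept in the shortlist

# Stand-ins for pooled outputs of two different transformer layers for one
# input text. In a real model these would come from the encoder.
feat_coarse = rng.standard_normal(hidden)   # assumed: an earlier layer
feat_fine = rng.standard_normal(hidden)     # assumed: the final layer

# One linear classifier per resolution, each reading its own feature vector.
W_coarse = rng.standard_normal((n_coarse, hidden))
W_fine = rng.standard_normal((n_fine, hidden))


def predict(feat_coarse, feat_fine, top_k):
    """Score coarse clusters, shortlist the best, then rank their children."""
    coarse_scores = W_coarse @ feat_coarse
    shortlist = np.argsort(-coarse_scores)[:top_k]

    # Assumed tree layout: cluster c owns labels
    # c * children_per ... (c + 1) * children_per - 1.
    candidates = np.concatenate(
        [np.arange(c * children_per, (c + 1) * children_per) for c in shortlist]
    )

    # Only the shortlisted labels are scored, using the fine-level features.
    fine_scores = W_fine[candidates] @ feat_fine
    order = np.argsort(-fine_scores)
    return shortlist, candidates[order], fine_scores[order]


shortlist, ranked_labels, ranked_scores = predict(feat_coarse, feat_fine, top_k)
print("shortlisted clusters:", shortlist)
print("ranked fine labels:", ranked_labels)
```

The point of the sketch is the data flow: the coarse classifier and the fine classifier never share a feature vector, and the fine level scores only the labels under the shortlisted clusters rather than all labels.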


