Pretrained Generalized Autoregressive Model with Adaptive Probabilistic Label Clusters for Extreme Multi-label Text Classification

07/05/2020
by   Hui Ye, et al.

Extreme multi-label text classification (XMTC) is the task of tagging a given text with the most relevant labels from an extremely large label set. We propose a novel deep learning method called APLC-XLNet. Our approach fine-tunes the recently released generalized autoregressive pretrained model (XLNet) to learn a dense representation of the input text. We propose Adaptive Probabilistic Label Clusters (APLC) to approximate the cross-entropy loss: APLC exploits the unbalanced label distribution to form clusters, which explicitly reduces computation time. Our experiments, carried out on five benchmark datasets, show that our approach significantly outperforms existing state-of-the-art methods. Our source code is publicly available at https://github.com/huiyegit/APLC_XLNet.
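
The clustering idea in the abstract, grouping labels by how often they occur, can be illustrated with a small sketch. This is not the authors' implementation (see the repository above for that); it only shows one plausible way to split an unbalanced label set into a frequent "head" cluster and rarer "tail" clusters. The function name `build_label_clusters` and the `cutoffs` parameter are hypothetical and chosen for illustration.

```python
from collections import Counter


def build_label_clusters(label_sets, cutoffs):
    """Partition labels into frequency-ordered clusters.

    label_sets: iterable of label lists, one per training document.
    cutoffs: increasing positions in the frequency ranking at which to
             split (hypothetical parameter, e.g. [2] gives a head cluster
             of the 2 most frequent labels and one tail cluster).
    """
    # Count how many documents each label is attached to.
    counts = Counter(label for labels in label_sets for label in labels)
    # Rank labels by descending frequency; ties are broken by label name
    # so the result is deterministic.
    ranked = sorted(counts, key=lambda label: (-counts[label], label))
    # Slice the ranking at the cutoffs: first slice is the head cluster,
    # the remaining slices are tail clusters.
    bounds = [0] + list(cutoffs) + [len(ranked)]
    return [ranked[start:end] for start, end in zip(bounds, bounds[1:])]


# Toy data: label "a" is very frequent, "c", "d", "e" are rare.
docs = [["a", "b"], ["a", "c"], ["a", "b", "d"], ["a", "e"]]
clusters = build_label_clusters(docs, cutoffs=[2])
print(clusters)  # [['a', 'b'], ['c', 'd', 'e']]
```

In a scheme of this kind, a model can score the small head cluster for every input and evaluate a tail cluster only when needed, which is where the reduction in computation would come from. How APLC actually assigns probabilities to clusters is described in the full paper, not in this sketch.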

research
01/10/2022

GUDN: A novel guide network for extreme multi-label text classification

The problem of extreme multi-label text classification (XMTC) is to reca...
research
08/25/2023

MatchXML: An Efficient Text-label Matching Framework for Extreme Multi-label Text Classification

The eXtreme Multi-label text Classification (XMC) refers to training a cl...
research
05/24/2022

Exploiting Dynamic and Fine-grained Semantic Scope for Extreme Multi-label Text Classification

Extreme multi-label text classification (XMTC) refers to the problem of ...
research
03/01/2021

Fast threshold optimization for multi-label audio tagging using Surrogate gradient learning

Multi-label audio tagging consists of assigning sets of tags to audio re...
research
09/11/2022

Learning When to Say "I Don't Know"

We propose a new Reject Option Classification technique to identify and ...
research
10/29/2022

CascadeXML: Rethinking Transformers for End-to-end Multi-resolution Training in Extreme Multi-label Classification

Extreme Multi-label Text Classification (XMC) involves learning a classi...
research
05/30/2023

Cross Encoding as Augmentation: Towards Effective Educational Text Classification

Text classification in education, usually called auto-tagging, is the au...
