Extreme Zero-Shot Learning for Extreme Text Classification

by Yuanhao Xiong, et al.

The eXtreme Multi-label text Classification (XMC) problem concerns finding the most relevant labels for an input text instance from a large label set. However, the XMC setup faces two challenges: (1) it does not generalize to predicting unseen labels in dynamic environments, and (2) it requires a large number of supervised (instance, label) pairs, which can be difficult to obtain for emerging domains. Recently, the generalized zero-shot XMC (GZ-XMC) setup has been studied, and ZestXML has been proposed to handle unseen labels, but it still requires a large number of annotated (instance, label) pairs. In this paper, we consider a more practical scenario called Extreme Zero-Shot XMC (EZ-XMC), in which no supervision is needed and only the raw text of instances and labels is accessible. Few-Shot XMC (FS-XMC), an extension of EZ-XMC with limited supervision, is also investigated. To learn the semantic embeddings of instances and labels from raw text, we propose to pre-train Transformer-based encoders with self-supervised contrastive losses. Specifically, we develop a pre-training method, MACLR, which thoroughly leverages the raw text with techniques including Multi-scale Adaptive Clustering, Label Regularization, and self-training with pseudo positive pairs. Experimental results on four public EZ-XMC datasets demonstrate that MACLR achieves superior performance compared to all other leading baseline methods, in particular with approximately 5-10% improvement in precision and recall on average. Moreover, we show that our pre-trained encoder can be further improved on FS-XMC when a limited number of ground-truth positive pairs are available in training. By fine-tuning the encoder on such a few-shot subset, MACLR still outperforms other extreme classifiers significantly.
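As a rough illustration of the contrastive pre-training idea described above, the sketch below computes a generic in-batch contrastive (InfoNCE-style) loss and ranks labels by cosine similarity to an instance embedding. It is not MACLR's actual objective: the function names (`info_nce_loss`, `top_k_labels`), the temperature value, and the toy two-dimensional vectors are all assumptions made for illustration, standing in for the outputs of a Transformer-based encoder.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit L2 norm so dot products become cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean in-batch contrastive loss (illustrative, not the paper's exact loss).

    positives[i] is the positive for anchors[i]; every other row of
    `positives` acts as a negative for that anchor.
    """
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [dot(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def top_k_labels(instance_vec, label_vecs, k=2):
    """Return indices of the k labels most similar to the instance (cosine)."""
    query = normalize(instance_vec)
    scores = [dot(query, normalize(label)) for label in label_vecs]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy embeddings (hypothetical): correctly paired rows give a lower loss
# than the same rows with the pairing swapped.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
ranked = top_k_labels([1.0, 0.1], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], k=2)
```

In the zero-shot setting, prediction reduces to the ranking step: once instances and labels share an embedding space, labels never seen during training can be scored the same way as any other label.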




