FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification

12/11/2022
by Tingyu Xia, et al.

Weakly-supervised text classification aims to train a classifier using only class descriptions and unlabeled data. Recent research shows that keyword-driven methods can achieve state-of-the-art performance on various tasks. However, these methods not only rely on carefully crafted class descriptions to obtain class-specific keywords but also require a substantial amount of unlabeled data and take a long time to train. This paper proposes FastClass, an efficient weakly-supervised classification approach. It uses dense text representations to retrieve class-relevant documents from an external unlabeled corpus and selects an optimal subset to train a classifier. Compared to keyword-driven methods, our approach is less reliant on initial class descriptions because it no longer needs to expand each class description into a set of class-specific keywords. Experiments on a wide range of classification tasks show that the proposed approach frequently outperforms keyword-driven models in classification accuracy and often trains orders of magnitude faster.
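
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function is a toy hashed bag-of-words stand-in for a real dense text encoder, the fixed `top_k` cutoff is a simplification of the paper's subset selection, and every function name and parameter below is hypothetical.

```python
import math
import zlib

DIM = 64  # size of the toy embedding; an arbitrary choice for this sketch


def embed(text, dim=DIM):
    """Stand-in for a dense encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # crc32 gives a hash that is stable across Python processes
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_pseudo_labels(class_descriptions, corpus, top_k=2):
    """For each class, keep the top_k documents closest to its description.

    Returns a list of (document, label) pairs that could serve as
    pseudo-labeled training data for any downstream classifier.
    """
    doc_vecs = [embed(doc) for doc in corpus]
    pseudo_labeled = []
    for label, description in class_descriptions.items():
        query = embed(description)
        ranked = sorted(
            range(len(corpus)),
            key=lambda i: cosine(query, doc_vecs[i]),
            reverse=True,
        )
        pseudo_labeled.extend((corpus[i], label) for i in ranked[:top_k])
    return pseudo_labeled


if __name__ == "__main__":
    classes = {"sports": "sports football match", "tech": "computer software chip"}
    unlabeled = [
        "the football match ended in a draw",
        "new software update for the computer",
        "a chip shortage slows computer sales",
        "the team won the sports cup",
    ]
    for doc, label in retrieve_pseudo_labels(classes, unlabeled, top_k=2):
        print(label, "<-", doc)
```

In practice the toy `embed` would be replaced by a pretrained sentence encoder, and the retrieved pairs would be passed to a standard supervised classifier. Only the class descriptions themselves are used as queries here, which reflects the abstract's point that no keyword expansion is needed.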


