Label Semantic Aware Pre-training for Few-shot Text Classification

04/14/2022
by   Aaron Mueller, et al.

In text classification tasks, useful information is encoded in the label names. Label semantic aware systems have leveraged this information to improve text classification performance during fine-tuning and prediction. However, the use of label semantics during pre-training has not been extensively explored. We therefore propose Label Semantic Aware Pre-training (LSAP) to improve the generalization and data efficiency of text classification systems. LSAP incorporates label semantics into pre-trained generative models (T5 in our case) by performing secondary pre-training on labeled sentences from a variety of domains. As domain-general pre-training requires large amounts of data, we develop a filtering and labeling pipeline to automatically create sentence-label pairs from unlabeled text. We perform experiments on intent classification (ATIS, Snips, TOPv2) and topic classification (AG News, Yahoo! Answers). LSAP obtains significant accuracy improvements over state-of-the-art models for few-shot text classification while maintaining performance comparable to the state of the art in high-resource settings.
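As a rough illustration of what "sentence-label pairs" for a generative model such as T5 could look like, the sketch below turns raw label identifiers into plain words and pairs each sentence with its label as a text-to-text example. The abstract does not specify the input format, so the `classify: ` prefix, the function names, and the label-cleaning rules are all assumptions made for illustration, not the authors' pipeline.

```python
import re
from typing import List, Tuple


def naturalize_label(label: str) -> str:
    """Turn a raw label identifier into plain lowercase words.

    Hypothetical cleaning rule (not from the paper): split camelCase,
    replace underscores with spaces, lowercase, collapse whitespace.
    """
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", label)  # GetWeather -> Get Weather
    spaced = spaced.replace("_", " ")
    return " ".join(spaced.lower().split())


def build_examples(
    pairs: List[Tuple[str, str]],
    prefix: str = "classify: ",  # assumed task prefix, chosen for illustration
) -> List[Tuple[str, str]]:
    """Map (sentence, raw_label) pairs to (model_input, target_text) pairs.

    The target is the label written out in words, so a generative model
    is trained to produce the label's natural-language form.
    """
    return [(prefix + sentence.strip(), naturalize_label(label)) for sentence, label in pairs]


if __name__ == "__main__":
    demo = [
        ("will it rain tomorrow", "GetWeather"),
        ("book a table for two", "book_restaurant"),
    ]
    for model_input, target in build_examples(demo):
        print(model_input, "->", target)
```

The point of writing the target as words rather than as a class index is that the model sees the label's meaning during training, which is the property the abstract attributes to label semantic aware systems.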


