Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data

09/22/2021
by Dheeraj Mekala, et al.

Existing text classification methods mainly focus on a fixed label set, whereas many real-world applications require extending to new fine-grained classes as the number of samples per label increases. To accommodate such requirements, we introduce a new problem called coarse-to-fine grained classification, which aims to perform fine-grained classification on coarsely annotated data. Instead of asking for new fine-grained human annotations, we opt to leverage label surface names as the only human guidance and weave rich pre-trained generative language models into the iterative weak supervision strategy. Specifically, we first propose a label-conditioned fine-tuning formulation to attune these generators for our task. Furthermore, we devise a regularization objective based on the coarse-fine label constraints derived from our problem setting, giving us even further improvements over the prior formulation. Our framework uses the fine-tuned generative models to sample pseudo-training data for training the classifier, and bootstraps on real unlabeled data for model refinement. Extensive experiments and case studies on two real-world datasets demonstrate superior performance over state-of-the-art zero-shot classification baselines.
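To make the coarse-fine label constraint concrete, here is a minimal sketch of one way it could be used in the bootstrapping step on real unlabeled data. It is not the paper's implementation. The label hierarchy, the function names `constrain_to_coarse` and `pseudo_label`, and the confidence threshold are all assumptions made for illustration. The idea is that each document already carries a coarse label, so a fine-grained prediction can be restricted to the children of that coarse label before a confident pseudo-label is accepted.

```python
from typing import Dict, List, Optional

# Hypothetical hierarchy: each coarse label maps to its fine-grained children.
HIERARCHY: Dict[str, List[str]] = {
    "sports": ["hockey", "baseball"],
    "science": ["space", "medicine"],
}


def constrain_to_coarse(
    fine_probs: Dict[str, float],
    coarse_label: str,
    hierarchy: Dict[str, List[str]],
) -> Dict[str, float]:
    """Keep only fine labels under the known coarse label and renormalize."""
    allowed = hierarchy[coarse_label]
    masked = {label: fine_probs.get(label, 0.0) for label in allowed}
    total = sum(masked.values())
    if total == 0.0:
        # No probability mass on any allowed label: fall back to uniform.
        return {label: 1.0 / len(allowed) for label in allowed}
    return {label: p / total for label, p in masked.items()}


def pseudo_label(
    fine_probs: Dict[str, float],
    coarse_label: str,
    hierarchy: Dict[str, List[str]],
    threshold: float = 0.7,  # assumed value, not taken from the paper
) -> Optional[str]:
    """Return a fine-grained pseudo-label, or None if the prediction is not confident."""
    constrained = constrain_to_coarse(fine_probs, coarse_label, hierarchy)
    best = max(constrained, key=constrained.get)
    return best if constrained[best] >= threshold else None


# Example: the classifier prefers "space", but the document is annotated "sports".
probs = {"hockey": 0.3, "baseball": 0.1, "space": 0.5, "medicine": 0.1}
print(pseudo_label(probs, "sports", HIERARCHY))
```

In this sketch the unconstrained top prediction would violate the coarse annotation, so masking changes the pseudo-label. Documents that stay below the threshold are simply left unlabeled for a later iteration.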

research
05/21/2023

OntoType: Ontology-Guided Zero-Shot Fine-Grained Entity Typing with Weak Supervision from Pre-Trained Language Models

Fine-grained entity typing (FET), which assigns entities in text with co...
research
11/18/2020

Your "Labrador" is My "Dog": Fine-Grained, or Not

Whether what you see in Figure 1 is a "labrador" or a "dog", is the ques...
research
09/29/2021

Active Refinement for Multi-Label Learning: A Pseudo-Label Approach

The goal of multi-label learning (MLL) is to associate a given instance ...
research
09/12/2021

Not All Negatives are Equal: Label-Aware Contrastive Loss for Fine-grained Text Classification

Fine-grained classification involves dealing with datasets with larger n...
research
06/24/2023

Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers

Instead of relying on human-annotated training samples to build a classi...
research
09/19/2023

In-Context Learning for Text Classification with Many Labels

In-context learning (ICL) using large language models for tasks with man...
research
08/19/2021

Fine-Grained Element Identification in Complaint Text of Internet Fraud

Existing system dealing with online complaint provides a final decision ...
