Efficient Path Prediction for Semi-Supervised and Weakly Supervised Hierarchical Text Classification

02/25/2019
by Huiru Xiao, et al.

Hierarchical text classification has many real-world applications, but labeling a large number of documents is costly. In practice, semi-supervised learning or weakly supervised learning (e.g., dataless classification) can reduce the labeling cost. In this paper, we propose a path cost-sensitive learning algorithm that exploits the structural information of the class hierarchy and makes use of unlabeled and weakly labeled data. We use a generative model to leverage the large amount of unlabeled data, and we introduce path constraints into the learning algorithm to incorporate the structure of the class hierarchy. The posterior probabilities of both unlabeled and weakly labeled data can be incorporated with path-dependent scores. Because we add a structure-sensitive cost to the learning algorithm to keep the classification consistent with the class hierarchy, and do not need to reconstruct the feature vectors for different structures, we can significantly reduce the computational cost compared to structural output learning. Experimental results on two hierarchical text classification benchmarks show that our approach is both effective and efficient for semi-supervised and weakly supervised hierarchical text classification.
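
To make the idea of path-dependent scores concrete, the following is a minimal illustrative sketch and not the paper's algorithm. It assumes a toy two-level hierarchy (`PARENT`), per-class posterior probabilities supplied by some classifier (`node_probs`), a path score defined as the sum of log posteriors along a root-to-leaf path, and a simple path cost that counts the classes on the true path that the predicted path misses. All names, the hierarchy, and both definitions are hypothetical choices made for illustration; the abstract does not specify the exact score or cost.

```python
from math import log

# Hypothetical toy hierarchy: child -> parent (None marks a top-level class).
PARENT = {
    "sports": None,
    "science": None,
    "soccer": "sports",
    "tennis": "sports",
    "physics": "science",
    "biology": "science",
}


def path_to(node):
    """Return the top-down path of class names ending at `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def leaves():
    """Return the classes that have no children."""
    parents = set(PARENT.values())
    return [n for n in PARENT if n not in parents]


def path_score(node_probs, leaf):
    """Assumed score: sum of log posteriors of every class on the path."""
    return sum(log(node_probs[n]) for n in path_to(leaf))


def predict_path(node_probs):
    """Pick the root-to-leaf path with the highest assumed path score."""
    best_leaf = max(leaves(), key=lambda leaf: path_score(node_probs, leaf))
    return path_to(best_leaf)


def path_cost(true_leaf, predicted_leaf):
    """Assumed cost: number of classes on the true path that are missed."""
    return len(set(path_to(true_leaf)) - set(path_to(predicted_leaf)))


# Hypothetical posteriors for one document.
node_probs = {
    "sports": 0.7, "science": 0.3,
    "soccer": 0.4, "tennis": 0.6,
    "physics": 0.9, "biology": 0.1,
}
predicted = predict_path(node_probs)
```

Under these assumptions, a prediction that shares the parent class with the true path costs less than one that misses the whole path, which is the kind of structure-sensitive penalty the abstract alludes to. Because whole paths are scored directly from per-class posteriors, the feature vectors do not need to be rebuilt for each candidate structure.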

research
12/29/2018

Weakly-Supervised Hierarchical Text Classification

Hierarchical text classification, which aims to classify text documents ...
research
05/23/2022

Seeded Hierarchical Clustering for Expert-Crafted Taxonomies

Practitioners from many disciplines (e.g., political science) use expert...
research
06/24/2017

Semi-supervised Text Categorization Using Recursive K-means Clustering

In this paper, we present a semi-supervised learning algorithm for class...
research
04/28/2020

Learning Interpretable and Discrete Representations with Adversarial Training for Unsupervised Text Classification

Learning continuous representations from unlabeled textual data has been...
research
01/16/2021

Weakly-Supervised Hierarchical Models for Predicting Persuasive Strategies in Good-faith Textual Requests

Modeling persuasive language has the potential to better facilitate our ...
research
06/06/2019

What you need is a more professional teacher

We propose a simple and efficient method to combine semi-supervised lear...
research
12/11/2022

FastClass: A Time-Efficient Approach to Weakly-Supervised Text Classification

Weakly-supervised text classification aims to train a classifier using o...
