Weakly-Supervised Hierarchical Text Classification

12/29/2018
by   Yu Meng, et al.
0

Hierarchical text classification, which aims to classify text documents into a given hierarchy, is an important task in many real-world applications. Recently, deep neural models are gaining increasing popularity for text classification due to their expressive power and minimum requirement for feature engineering. However, applying deep neural networks for hierarchical text classification remains challenging, because they heavily rely on a large amount of training data and meanwhile cannot easily determine appropriate levels of documents in the hierarchical setting. In this paper, we propose a weakly-supervised neural method for hierarchical text classification. Our method does not require a large amount of training data but requires only easy-to-provide weak supervision signals such as a few class-related documents or keywords. Our method effectively leverages such weak supervision signals to generate pseudo documents for model pre-training, and then performs self-training on real unlabeled data to iteratively refine the model. During the training process, our model features a hierarchical neural structure, which mimics the given hierarchy and is capable of determining the proper levels for documents with a blocking mechanism. Experiments on three datasets from different domains demonstrate the efficacy of our method compared with a comprehensive set of baselines.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
09/02/2018

Weakly-Supervised Neural Text Classification

Deep neural networks are gaining increasing popularity for the classic t...
research
02/25/2019

Efficient Path Prediction for Semi-Supervised and Weakly Supervised Hierarchical Text Classification

Hierarchical text classification has many real-world applications. Howev...
research
10/10/2019

Learning Only from Relevant Keywords and Unlabeled Documents

We consider a document classification problem where document labels are ...
research
01/21/2022

Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing

How to obtain hierarchical representations with an increasing level of a...
research
10/26/2020

Hierarchical Metadata-Aware Document Categorization under Weak Supervision

Categorizing documents into a given label hierarchy is intuitively appea...
research
05/23/2022

Seeded Hierarchical Clustering for Expert-Crafted Taxonomies

Practitioners from many disciplines (e.g., political science) use expert...
research
09/02/2016

On Horizontal and Vertical Separation in Hierarchical Text Classification

Hierarchy is a common and effective way of organizing data and represent...

Please sign up or login with your details

Forgot password? Click here to reset