Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

07/18/2020
by Yu Meng, et al.

Mining a set of meaningful topics organized into a hierarchy is intuitively appealing since topic correlations are ubiquitous in massive text corpora. To account for potential hierarchical topic structures, hierarchical topic models generalize flat topic models by incorporating latent topic hierarchies into their generative modeling process. However, because these models are purely unsupervised, the learned topic hierarchy often deviates from users' particular needs or interests. To guide the hierarchical topic discovery process with minimal user supervision, we propose a new task, Hierarchical Topic Mining, which takes a category tree described by category names only and aims to mine a set of representative terms for each category from a text corpus, helping users comprehend the topics they are interested in. We develop a novel joint tree and text embedding method, along with a principled optimization procedure, that simultaneously models the category tree structure and the corpus generative process in the spherical space for effective discovery of category-representative terms. Our comprehensive experiments show that our model, named JoSH, mines a high-quality set of hierarchical topics with high efficiency and benefits weakly-supervised hierarchical text classification tasks.
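To make the task's input and output concrete, here is a minimal sketch in Python. It is not the JoSH training procedure: it assumes embeddings already exist, uses hand-picked toy vectors, and simply ranks terms by cosine similarity to each category name on the unit sphere. The tree, the vectors, and the names `normalize`, `cosine`, and `representative_terms` are all hypothetical, introduced only for illustration.

```python
import math

def normalize(v):
    """Project a vector onto the unit sphere (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(u, v):
    """Cosine similarity of two unit vectors, i.e. their dot product."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical category tree, described by category names only.
category_tree = {"ROOT": ["sports", "science"], "sports": ["tennis"]}

# Toy, hand-picked embeddings (assumption: a real system would learn these).
category_vecs = {
    "sports": normalize([1.0, 0.0, 0.0]),
    "science": normalize([0.0, 1.0, 0.0]),
    "tennis": normalize([0.9, 0.0, 0.4]),
}
term_vecs = {
    "athlete": normalize([0.9, 0.1, 0.1]),
    "racket": normalize([0.8, 0.0, 0.6]),
    "physics": normalize([0.1, 0.9, 0.0]),
    "laboratory": normalize([0.0, 1.0, 0.2]),
}

def representative_terms(category, k=2):
    """Return the k terms whose directions are closest to the category's."""
    center = category_vecs[category]
    ranked = sorted(term_vecs, key=lambda t: cosine(center, term_vecs[t]), reverse=True)
    return ranked[:k]

def mine_hierarchy(tree, root="ROOT", k=2):
    """Walk the tree top-down and collect top-k terms for every category."""
    result = {}
    stack = list(tree.get(root, []))
    while stack:
        cat = stack.pop()
        result[cat] = representative_terms(cat, k)
        stack.extend(tree.get(cat, []))
    return result

topics = mine_hierarchy(category_tree)
for name, terms in topics.items():
    print(name, terms)
```

The sketch only illustrates the retrieval step under the stated assumptions; the joint modeling of tree structure and corpus generation described in the abstract is what would produce the embeddings in the first place.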


research
09/20/2022

Knowledge-Aware Bayesian Deep Topic Model

We propose a Bayesian generative model for incorporating prior domain kn...
research
10/27/2021

TopicNet: Semantic Graph-Guided Topic Discovery

Existing deep hierarchical topic models are able to extract semantically...
research
05/04/2022

Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds

Discovering latent topics from text corpora has been studied for decades...
research
10/16/2022

HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding

Embedded topic models are able to learn interpretable topics even with l...
research
03/13/2014

Scalable and Robust Construction of Topical Hierarchies

Automated generation of high-quality topical hierarchies for a text coll...
research
10/26/2020

Hierarchical Metadata-Aware Document Categorization under Weak Supervision

Categorizing documents into a given label hierarchy is intuitively appea...
research
10/18/2016

Modeling community structure and topics in dynamic text networks

The last decade has seen great progress in both dynamic network modeling...
