Log In Sign Up

Knowledge-Aware Bayesian Deep Topic Model

by   Dongsheng Wang, et al.

We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling. Although embedded topic models (ETMs) and its variants have gained promising performance in text analysis, they mainly focus on mining word co-occurrence patterns, ignoring potentially easy-to-obtain prior topic hierarchies that could help enhance topic coherence. While several knowledge-based topic models have recently been proposed, they are either only applicable to shallow hierarchies or sensitive to the quality of the provided prior knowledge. To this end, we develop a novel deep ETM that jointly models the documents and the given prior knowledge by embedding the words and topics into the same space. Guided by the provided knowledge, the proposed model tends to discover topic hierarchies that are organized into interpretable taxonomies. Besides, with a technique for adapting a given graph, our extended version allows the provided prior topic structure to be finetuned to match the target corpus. Extensive experiments show that our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.


page 1

page 2

page 3

page 4


Topic Modeling in Embedding Spaces

Topic modeling analyzes documents to learn meaningful patterns of words....

Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding

Mining a set of meaningful topics organized into a hierarchy is intuitiv...

Dirichlet belief networks for topic structure learning

Recently, considerable research effort has been devoted to developing de...

Less is More: Learning Prominent and Diverse Topics for Data Summarization

Statistical topic models efficiently facilitate the exploration of large...

HyperMiner: Topic Taxonomy Mining with Hyperbolic Embedding

Embedded topic models are able to learn interpretable topics even with l...

TopicNet: Semantic Graph-Guided Topic Discovery

Existing deep hierarchical topic models are able to extract semantically...

Prior-aware Dual Decomposition: Document-specific Topic Inference for Spectral Topic Models

Spectral topic modeling algorithms operate on matrices/tensors of word c...