Scalable and Robust Construction of Topical Hierarchies

03/13/2014
by Chi Wang, et al.

Automated generation of high-quality topical hierarchies for a text collection is a dream problem in knowledge engineering with many valuable applications. In this paper, we propose a scalable and robust algorithm for constructing a hierarchy of topics from a text collection. We divide and conquer the problem using a top-down recursive framework based on a tensor orthogonal decomposition technique. In doing so, we solve a critical challenge: performing scalable inference for our newly designed hierarchical topic model. Experiments with various real-world datasets illustrate the method's ability to generate robust, high-quality hierarchies efficiently. Our method reduces construction time by several orders of magnitude, and its robustness makes it possible for users to interactively revise the hierarchy.
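The abstract names tensor orthogonal decomposition as the core technique but gives no algorithmic detail. As a rough illustration only, the sketch below uses tensor power iteration with deflation, one common way to recover the components of a symmetric, orthogonally decomposable third-order tensor. It is not the paper's implementation: the function names (`build_tensor`, `power_iteration`, `decompose`), the fixed iteration count, and the toy 2-dimensional input are all assumptions made for this example, and a real topic model would first build and whiten a moment tensor from word co-occurrence statistics.

```python
import math
import random


def build_tensor(weights, vectors):
    """Build T = sum_i w_i * v_i (x) v_i (x) v_i as nested lists (toy helper)."""
    n = len(vectors[0])
    return [[[sum(w * v[i] * v[j] * v[k] for w, v in zip(weights, vectors))
              for k in range(n)] for j in range(n)] for i in range(n)]


def apply_tensor(T, u):
    """Return the vector T(I, u, u): contract T with u along two modes."""
    n = len(u)
    return [sum(T[i][j][k] * u[j] * u[k] for j in range(n) for k in range(n))
            for i in range(n)]


def normalize(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(T, n_iter=100, seed=0):
    """Estimate one (eigenvalue, eigenvector) pair from a random start."""
    rng = random.Random(seed)
    u = normalize([rng.gauss(0, 1) for _ in range(len(T))])
    for _ in range(n_iter):
        u = normalize(apply_tensor(T, u))
    # The eigenvalue is the full contraction T(u, u, u).
    lam = sum(a * b for a, b in zip(apply_tensor(T, u), u))
    return lam, u


def decompose(T, k):
    """Recover k components by repeated power iteration and deflation."""
    n = len(T)
    components = []
    for _ in range(k):
        lam, u = power_iteration(T)
        components.append((lam, u))
        # Deflate: subtract the rank-one component that was just found.
        T = [[[T[i][j][m] - lam * u[i] * u[j] * u[m] for m in range(n)]
              for j in range(n)] for i in range(n)]
    return components


# Toy input (assumed): two orthonormal directions with weights 3 and 2.
toy = build_tensor([3.0, 2.0], [[0.6, 0.8], [-0.8, 0.6]])
for lam, u in decompose(toy, 2):
    print(round(lam, 6), [round(x, 6) for x in u])
```

In a top-down recursive framework of the kind the abstract describes, a decomposition step like this would be applied at a parent topic to obtain its subtopics and then repeated on each subtopic; that recursion is not shown here.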


