Analysis and tuning of hierarchical topic models based on Renyi entropy approach

01/19/2021
by Sergei Koltcov, et al.

Hierarchical topic modeling is a potentially powerful instrument for determining the topical structure of text collections: it allows one to construct a topical hierarchy that represents levels of topical abstraction. However, tuning the parameters of hierarchical models, including the number of topics on each hierarchical level, remains a challenging and open problem. In this paper, we propose a Renyi entropy-based approach as a partial solution to this problem. First, we propose a Renyi entropy-based quality metric for hierarchical models. Second, we propose a practical concept of hierarchical topic model tuning, tested on datasets with human mark-up. In the numerical experiments, we consider three hierarchical models: the hierarchical latent Dirichlet allocation (hLDA) model, the hierarchical Pachinko allocation model (hPAM), and hierarchical additive regularization of topic models (hARTM). We demonstrate that the hLDA model exhibits a significant level of instability and, moreover, that the derived numbers of topics are far from the true numbers for labeled datasets. For the hPAM model, the Renyi entropy approach allows us to determine only one level of the data structure. For the hARTM model, the proposed approach allows us to estimate the number of topics for two hierarchical levels.
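For readers unfamiliar with the quantity named in the abstract, the following is a minimal sketch of the textbook Renyi entropy of order q for a discrete probability distribution, S_q = ln(sum_i p_i^q) / (1 - q). It is not the paper's quality metric, whose exact construction is not given in the abstract; the function name `renyi_entropy` and the example distributions are assumptions chosen purely for illustration.

```python
import math


def renyi_entropy(probs, q):
    """Textbook Renyi entropy of order q (natural log) for a discrete distribution.

    Illustrative only: this is the standard definition, not the
    hierarchical-model metric proposed in the paper.
    """
    if q <= 0:
        raise ValueError("order q must be positive")
    if not math.isclose(sum(probs), 1.0, rel_tol=1e-9, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    if math.isclose(q, 1.0):
        # In the limit q -> 1 the Renyi entropy reduces to Shannon entropy.
        return -sum(p * math.log(p) for p in probs if p > 0)
    return math.log(sum(p ** q for p in probs if p > 0)) / (1.0 - q)


# Hypothetical word-probability vectors for two topics.
flat_topic = [0.25, 0.25, 0.25, 0.25]    # no word stands out
peaked_topic = [0.7, 0.1, 0.1, 0.1]      # one word dominates

print(renyi_entropy(flat_topic, 2.0))
print(renyi_entropy(peaked_topic, 2.0))
```

A uniform distribution attains the maximum value ln(n) for every order q, while a more concentrated distribution scores lower, which is why entropy-type measures are a natural candidate for comparing topic solutions of different sizes.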
