Deep NMF Topic Modeling

02/24/2021
by Jianyu Wang, et al.

Topic modeling methods based on nonnegative matrix factorization (NMF) rely little on model or data assumptions. However, they are usually formulated as difficult optimization problems, which may suffer from bad local minima and high computational complexity. In this paper, we propose a deep NMF (DNMF) topic modeling framework to alleviate these problems. It first applies an unsupervised deep learning method to learn the latent hierarchical structure of documents, under the assumption that a good representation of the documents, learned by, e.g., a deep model, makes topic word discovery easier. It then uses the output of the deep model to constrain the topic-document distribution when discovering discriminant topic words, which both improves efficacy and reduces computational complexity relative to conventional unsupervised NMF methods. We constrain the topic-document distribution in three ways, which take advantage of the three major sub-categories of NMF (basic NMF, structured NMF, and constrained NMF), respectively. To overcome the weaknesses of deep neural networks in unsupervised topic modeling, we adopt a deep model that is not a neural network: the multilayer bootstrap network. To our knowledge, this is the first time that a deep NMF model has been used for unsupervised topic modeling. We have compared the proposed method with a number of representative reference methods covering the major branches of topic modeling on a variety of real-world text corpora. Experimental results illustrate the effectiveness of the proposed method under various evaluation metrics.
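
The idea of constraining the topic-document distribution with the output of a separate model can be illustrated with a minimal sketch. This is not the authors' DNMF algorithm, and none of the three constraint variants mentioned above is reproduced here. The sketch assumes that some upstream model (standing in for the multilayer bootstrap network) has already assigned each document a hard topic label, fixes the topic-document matrix H to the one-hot encoding of those labels, and then solves only for the nonnegative topic-word matrix W with the standard multiplicative update for the squared-error NMF objective. The function names, the toy vocabulary, and the toy counts are all hypothetical.

```python
import numpy as np


def topic_words_from_labels(X, labels, n_topics, n_iter=200, eps=1e-10, seed=0):
    """Solve min_{W >= 0} ||X - W H||_F^2 with H fixed by document labels.

    X      : (n_words, n_docs) nonnegative term-document matrix.
    labels : length n_docs integer topic ids in [0, n_topics); assumed to
             come from an upstream model (hypothetical stand-in here).
    Returns (W, H) with W of shape (n_words, n_topics).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n_words, n_docs = X.shape

    # Fixed topic-document matrix: one-hot encoding of the labels.
    H = np.zeros((n_topics, n_docs))
    H[labels, np.arange(n_docs)] = 1.0

    # Random positive start; only W is updated, so the problem is convex in W.
    rng = np.random.default_rng(seed)
    W = rng.random((n_words, n_topics)) + eps

    XHt = X @ H.T  # (n_words, n_topics), constant across iterations
    HHt = H @ H.T  # (n_topics, n_topics), constant across iterations
    for _ in range(n_iter):
        # Multiplicative update keeps W nonnegative.
        W *= XHt / (W @ HHt + eps)
    return W, H


def top_words(W, vocab, n):
    """Return the n highest-weighted words for each topic (column of W)."""
    return [[vocab[i] for i in np.argsort(-W[:, k])[:n]] for k in range(W.shape[1])]


# Toy data (hypothetical): rows are words, columns are four documents.
vocab_demo = ["goal", "match", "team", "vote", "party", "election"]
X_demo = np.array([
    [4, 2, 0, 0],
    [2, 2, 0, 1],
    [1, 2, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 2, 2],
    [0, 1, 1, 2],
])
labels_demo = [0, 0, 1, 1]  # assumed output of the upstream document model

if __name__ == "__main__":
    W, H = topic_words_from_labels(X_demo, labels_demo, n_topics=2)
    for k, words in enumerate(top_words(W, vocab_demo, 2)):
        print(f"topic {k}: {words}")
```

Because H is fixed, only one factor is optimized, which hints at why constraining the topic-document distribution can reduce cost relative to alternating over both factors. With a one-hot H, each column of W converges to the mean word-count vector of the documents assigned to that topic.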

