A Novel Method of Fuzzy Topic Modeling based on Transformer Processing

09/18/2023
by Ching-Hsun Tseng, et al.

Topic modeling is a convenient way to monitor market trends. Conventionally, Latent Dirichlet Allocation (LDA) is treated as the default model for obtaining this kind of information. Because LDA deduces keywords from token conditional probabilities, it can identify the most probable or essential topics. However, the results are not intuitive, because the resulting topics do not fully match human knowledge. LDA returns the most probable relevant keywords, which raises a further question: whether a connection based only on statistical probability is reliable. It is also hard to choose the number of topics manually in advance. Building on the growing use of fuzzy membership for clustering and of transformers for word embedding, this work presents a fuzzy topic modeling method based on soft clustering and document embeddings from a state-of-the-art transformer-based model. In a practical application to press release monitoring, fuzzy topic modeling gives more natural results than the traditional output of LDA.
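To make the idea of soft clustering over document embeddings concrete, the following is a minimal sketch and not the authors' implementation. It assumes standard fuzzy c-means as the soft-clustering step, which the abstract does not confirm, and it uses synthetic vectors in place of real transformer embeddings. The function name `fuzzy_c_means`, the fuzzifier `m`, and all parameter values are illustrative assumptions.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means (an assumed stand-in for the paper's soft clustering).

    X is an (n_docs, dim) array of document embeddings.
    Returns cluster centers and an (n_docs, n_clusters) membership matrix
    whose rows sum to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are membership-weighted means of the embeddings.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every document to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Membership update: closer centers receive higher membership.
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-ins for document embeddings (assumption: a real pipeline
# would obtain these vectors from a transformer-based encoder).
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=0.1, size=(20, 8))
group_b = rng.normal(loc=1.0, scale=0.1, size=(20, 8))
embeddings = np.vstack([group_a, group_b])

centers, memberships = fuzzy_c_means(embeddings, n_clusters=2)
# Each document's dominant topic is the cluster with the highest membership.
dominant_topic = memberships.argmax(axis=1)
```

Unlike a hard assignment, each row of `memberships` gives a document's degree of belonging to every topic, so a press release that touches two themes can carry substantial weight in both.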


research
01/22/2014

Parsimonious Topic Models with Salient Word Discovery

We propose a parsimonious topic model for text corpora. In related model...
research
02/02/2021

Deep Autoencoder-based Fuzzy C-Means for Topic Detection

Topic detection is a process for determining topics from a collection of...
research
12/28/2016

Partial Membership Latent Dirichlet Allocation

Topic models (e.g., pLSA, LDA, sLDA) have been widely used for segmentin...
research
05/23/2013

A Supervised Neural Autoregressive Topic Model for Simultaneous Image Classification and Annotation

Topic modeling based on latent Dirichlet allocation (LDA) has been a fra...
research
05/02/2017

Fuzzy Approach Topic Discovery in Health and Medical Corpora

The majority of medical documents and electronic health records (EHRs) a...
research
10/04/2020

Unification of HDP and LDA Models for Optimal Topic Clustering of Subject Specific Question Banks

There has been an increasingly popular trend in Universities for curricu...
research
05/24/2016

Computing Web-scale Topic Models using an Asynchronous Parameter Server

Topic models such as Latent Dirichlet Allocation (LDA) have been widely ...
