Optimized Tracking of Topic Evolution

12/16/2019
by Patrick Kiss, et al.

Topic evolution modeling has been studied for a long time and has attracted considerable interest. A recent state-of-the-art method combines word modeling algorithms with community detection mechanisms to achieve better results more effectively. We analyse the results of this approach and discuss the two major challenges it still faces. First, although the topics it produces are good in general, they are very noisy: many topics are unimportant because of their size, words, or ambiguity. Second, the number of words defining each topic is too large, which makes the topics difficult to analyse in their unsorted state. In this paper, we propose approaches to tackle these challenges by adding topic filtering and using network analysis metrics to define the importance of a topic. We test different combinations of these metrics to see which yields the best results. Furthermore, we add word filtering and ranking to each topic to automatically identify the words with the highest novelty. We evaluate our enhancements in two ways: human qualitative evaluation and automatic quantitative evaluation. Moreover, we created two case studies to test the quality of the clusters and words. In the quantitative evaluation, we use the pairwise mutual information score to test the coherence of topics. The quantitative evaluation also includes an analysis of execution times for each part of the program. The experimental results show that both evaluation methods support the feasibility of the algorithm. Finally, we discuss possible extensions concerning usability and future improvements to the algorithm.
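The abstract does not give the formula behind its mutual information score. As a rough illustration only, the sketch below scores a topic by averaging pointwise mutual information over all pairs of its words, with probabilities estimated from document co-occurrence. The function name `topic_coherence_pmi`, the document-level co-occurrence window, and the smoothing constant `eps` are assumptions made for this example, not details taken from the paper.

```python
import math
from itertools import combinations


def topic_coherence_pmi(topic_words, documents, eps=1e-12):
    """Average PMI over all word pairs of a topic (illustrative sketch).

    topic_words: list of words describing one topic.
    documents:   list of tokenized documents (each a list of words).
    eps:         assumed smoothing constant so that pairs which never
                 co-occur yield a large negative score instead of log(0).
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)
    if n_docs == 0:
        raise ValueError("documents must not be empty")

    def doc_prob(*words):
        # Fraction of documents that contain all of the given words.
        hits = sum(1 for d in doc_sets if all(w in d for w in words))
        return hits / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2 = doc_prob(w1), doc_prob(w2)
        if p1 == 0 or p2 == 0:
            # A word absent from the corpus carries no evidence; skip the pair.
            continue
        p12 = doc_prob(w1, w2)
        scores.append(math.log((p12 + eps) / (p1 * p2)))

    # A topic with no scorable pairs gets a neutral score of 0.0.
    return sum(scores) / len(scores) if scores else 0.0


# Tiny hypothetical corpus: "a" and "b" always co-occur, "a" and "c" never do.
docs = [["a", "b"], ["a", "b"], ["c", "d"], ["c", "d"]]
coherent = topic_coherence_pmi(["a", "b"], docs)
incoherent = topic_coherence_pmi(["a", "c"], docs)
```

Under these assumptions, a topic whose words tend to appear in the same documents scores higher than one whose words rarely appear together, which is the intuition behind using such a score as a coherence measure.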


research
03/29/2019

Re-Ranking Words to Improve Interpretability of Automatically Generated Topics

Topics models, such as LDA, are widely used in Natural Language Processi...
research
10/23/2018

Topic representation: finding more representative words in topic models

The top word list, i.e., the top-M words with highest marginal probabili...
research
10/07/2017

Topic Modeling based on Keywords and Context

Current topic models often suffer from discovering topics not matching h...
research
11/05/2021

Monitoring geometrical properties of word embeddings for detecting the emergence of new topics

Slow emerging topic detection is a task between event detection, where w...
research
01/02/2023

Using meaning instead of words to track topics

The ability to monitor the evolution of topics over time is extremely va...
research
02/20/2023

Persian topic detection based on Human Word association and graph embedding

In this paper, we propose a framework to detect topics in social media b...
research
07/01/2019

Hidden in Plain Sight For Too Long: Using Text Mining Techniques to Shine a Light on Workplace Sexism and Sexual Harassment

Objective: The goal of this study is to understand how people experience...
