Topic words analysis based on LDA model

05/15/2014
by   Xi Qiu, et al.

Social network analysis (SNA), a research field that describes and models the social connections within a group of people, is popular among network services. Our topic words analysis project is an SNA method for visualizing the topic words in emails sent from Obama.com to accounts registered in Columbus, Ohio. Based on the Latent Dirichlet Allocation (LDA) model, a popular topic model in SNA, our project characterizes the senders' preferences for a target group of recipients. Gibbs sampling is used to estimate the topic and word distributions. Our training and testing data are emails from the carbon-free server Datagreening.com. We use the parallel computing tool BashReduce for word processing and generate related words under each latent topic to discover typical information in the political news sent specifically to local Columbus recipients. Running on two instances with BashReduce, our project achieves a speedup of almost 30% compared with processing the contents on one instance locally. The experimental results also show that the LDA model applied in our project achieves a precision rate 53.96% higher than the TF-IDF model in finding target words, provided that an appropriate size of topic words list is selected.
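
To make the estimation step concrete, the following is a minimal sketch of collapsed Gibbs sampling for LDA, one common way to estimate topic and word distributions. It is not the authors' implementation: the function names `lda_gibbs` and `top_words`, the hyperparameter values `alpha` and `beta`, and the toy corpus are all assumptions chosen for illustration, and the BashReduce preprocessing is not shown.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids.
    Returns (topic_word, doc_topic) count matrices.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Conditional weight of each topic given all other tokens.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return topic_word, doc_topic


def top_words(topic_word, vocab, n=3):
    """Return the n most frequently assigned words for each topic."""
    result = []
    for row in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda w: -row[w])
        result.append([vocab[w] for w in ranked[:n]])
    return result


# Hypothetical toy corpus; real input would be tokenized email text.
vocab = ["vote", "election", "campaign", "donate", "fund", "money"]
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 5], [0, 2, 1, 2], [4, 5, 3, 4]]
topic_word, doc_topic = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))
for k, words in enumerate(top_words(topic_word, vocab)):
    print(f"topic {k}: {words}")
```

The per-topic word lists printed at the end play the role of the "topic words list" in the abstract. The list length `n` is the quantity the abstract says must be chosen appropriately for the precision comparison against TF-IDF.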

research
11/10/2014

Modeling Word Relatedness in Latent Dirichlet Allocation

Standard LDA model suffers the problem that the topic assignment of each...
research
08/13/2016

Analysis of Morphology in Topic Modeling

Topic models make strong assumptions about their data. In particular, di...
research
04/28/2016

Detecting "Smart" Spammers On Social Network: A Topic Model Approach

Spammer detection on social network is a challenging problem. The rigid ...
research
02/13/2023

Visualizing Topic Uncertainty in Topic Modelling

Word clouds became a standard tool for presenting results of natural lan...
research
01/15/2020

VSEC-LDA: Boosting Topic Modeling with Embedded Vocabulary Selection

Topic modeling has found wide application in many problems where latent ...
research
02/27/2018

Classifying Idiomatic and Literal Expressions Using Topic Models and Intensity of Emotions

We describe an algorithm for automatic classification of idiomatic and l...
research
08/19/2022

SimLDA: A tool for topic model evaluation

Variational Bayes (VB) applied to latent Dirichlet allocation (LDA) has ...