Clust-LDA: Joint Model for Text Mining and Author Group Inference

10/05/2018
by Shaoyang Ning, et al.

Social media corpora pose unique challenges and opportunities: documents are typically short, and they come with rich metadata such as author characteristics and relationships. This metadata creates great potential for systematic analysis of the enormous user base and thus has implications for industrial strategies such as targeted marketing. Here we propose a novel and statistically principled method, clust-LDA, which incorporates authorship structure into topic modeling. It thereby performs topical inference across documents on the basis of authorship and, simultaneously, identifies groupings among authors. We develop an inference procedure for clust-LDA and demonstrate its performance on simulated data, showing that clust-LDA outperforms "vanilla" LDA on the topic identification task when authors exhibit distinctive topical preferences. We also showcase the empirical performance of clust-LDA on a real-world social media dataset from Reddit.

research
07/11/2012

The Author-Topic Model for Authors and Documents

We introduce the author-topic model, a generative model for documents th...
research
01/24/2013

Transfer Topic Modeling with Ease and Scalability

The increasing volume of short texts generated on social media sites, su...
research
05/30/2023

Utilizing Social Media Attributes for Enhanced Keyword Detection: An IDF-LDA Model Applied to Sina Weibo

With the rapid development of social media such as Twitter and Weibo, de...
research
05/21/2019

A Comparative Analysis of Distributional Term Representations for Author Profiling in Social Media

Author Profiling (AP) aims at predicting specific characteristics from a...
research
02/12/2016

An Empirical Study on Academic Commentary and Its Implications on Reading and Writing

The relationship between reading and writing (RRW) is one of the major t...
research
08/07/2019

The Hitchhiker's Guide to LDA

Latent Dirichlet Allocation (LDA) model is a famous model in the topic m...
research
09/19/2018

Modeling Online Discourse with Coupled Distributed Topics

In this paper, we propose a deep, globally normalized topic model that i...
