Keyword-based Topic Modeling and Keyword Selection

01/22/2020
by Xingyu Wang, et al.
Certain types of documents, such as tweets, are collected by specifying a set of keywords. As topics of interest change over time, it is beneficial to adjust the keywords dynamically. The challenge is that the keywords must be specified before the forthcoming documents and their underlying topics are known. The future topics should mimic past topics of interest, yet contain some novelty. We develop a keyword-based topic model that dynamically selects a subset of keywords to be used to collect future documents. The generative process first selects keywords and then generates the underlying documents based on the specified keywords. The model is trained using a variational lower bound and stochastic gradient optimization. Inference consists of finding a subset of keywords; given a subset, the model predicts the underlying topic-word matrix for the unknown forthcoming documents. We compare the keyword topic model against a benchmark model that combines viral predictions of tweets with a topic model. The keyword-based topic model outperforms this sophisticated baseline model by 67
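
To make the collection step concrete, the following is a minimal sketch of how a chosen keyword subset determines which documents are gathered. It is not the model from the paper: the function name `collect_documents`, the tokenization rule, and the toy stream are all assumptions made for illustration, and the keyword subset is fixed by hand here, whereas the abstract describes a model that selects it.

```python
import re


def collect_documents(documents, keywords):
    """Return the documents that mention at least one keyword.

    Hypothetical helper: matching is case-insensitive and done on
    whole tokens, which is an assumption, not the paper's procedure.
    """
    wanted = {k.lower() for k in keywords}
    collected = []
    for doc in documents:
        # Split into lowercase tokens, keeping hashtag and mention marks.
        tokens = set(re.findall(r"[a-z0-9#@']+", doc.lower()))
        if tokens & wanted:
            collected.append(doc)
    return collected


# Toy stream, invented for illustration only.
stream = [
    "New vaccine trial results announced",
    "Football finals tonight",
    "Election polls tighten ahead of vote",
]

# A hand-picked keyword subset standing in for the model's selection.
selected_keywords = ["vaccine", "election"]
collected = collect_documents(stream, selected_keywords)
```

Changing `selected_keywords` changes which documents are ever observed, which is why the keyword subset has to be chosen before the forthcoming documents and their topics are known.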

research
10/07/2017

Topic Modeling based on Keywords and Context

Current topic models often suffer from discovering topics not matching h...
research
04/13/2020

Keyword Assisted Topic Models

For a long time, many social scientists have conducted content analysis ...
research
01/27/2017

Statistical Analysis on Bangla Newspaper Data to Extract Trending Topic and Visualize Its Change Over Time

Trending topic of newspapers is an indicator to understand the situation...
research
05/03/2022

A Comparison of Approaches for Imbalanced Classification Problems in the Context of Retrieving Relevant Documents for an Analysis

One of the first steps in many text-based social science studies is to r...
research
01/26/2014

Painting Analysis Using Wavelets and Probabilistic Topic Models

In this paper, computer-based techniques for stylistic analysis of paint...
research
07/28/2019

TopicSifter: Interactive Search Space Reduction Through Targeted Topic Modeling

Topic modeling is commonly used to analyze and understand large document...
research
06/20/2016

Comparing the hierarchy of keywords in on-line news portals

The tagging of on-line content with informative keywords is a widespread...
