Topic Modeling on Health Journals with Regularized Variational Inference

01/15/2018
by   Robert Giaquinto, et al.

Topic modeling enables exploration and compact representation of a corpus. The CaringBridge (CB) dataset is a massive collection of journals written by patients and caregivers during a health crisis. Topic modeling on the CB dataset, however, is challenging due to the asynchronous nature of multiple authors writing about their health journeys. To overcome this challenge we introduce the Dynamic Author-Persona topic model (DAP), a probabilistic graphical model designed for temporal corpora with multiple authors. The novelty of the DAP model lies in its representation of authors by a persona, where personas capture the propensity to write about certain topics over time. Further, we present a regularized variational inference algorithm, which we use to encourage the DAP model's personas to be distinct. Our results show significant improvements over competing topic models, particularly after regularization, and highlight the DAP model's unique ability to capture common journeys shared by different authors.
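
The abstract does not state the form of the regularizer, so the following is only an illustrative sketch of how a variational objective might be penalized so that personas stay distinct. It assumes each persona is summarized by a single vector of topic scores (ignoring the time dimension), and it uses mean pairwise cosine similarity as the overlap measure. The names `persona_overlap_penalty`, `regularized_objective`, and `strength` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def persona_overlap_penalty(persona_logits):
    """Mean pairwise cosine similarity between persona topic distributions.

    persona_logits has shape (P, K): one row of unnormalized topic scores
    per persona. This parameterization is an assumption for illustration.
    """
    probs = softmax(persona_logits, axis=1)
    unit = probs / np.linalg.norm(probs, axis=1, keepdims=True)
    similarity = unit @ unit.T
    n_personas = similarity.shape[0]
    # Drop the diagonal: every persona is trivially identical to itself.
    off_diagonal = similarity[~np.eye(n_personas, dtype=bool)]
    return float(off_diagonal.mean())


def regularized_objective(elbo, persona_logits, strength=1.0):
    """Evidence lower bound minus a penalty on persona overlap (assumed form)."""
    return elbo - strength * persona_overlap_penalty(persona_logits)


# Three personas over four topics: identical versus well separated.
identical = np.zeros((3, 4))
separated = 50.0 * np.eye(3, 4)
penalty_identical = persona_overlap_penalty(identical)
penalty_separated = persona_overlap_penalty(separated)
```

In this sketch, personas with identical topic distributions receive the largest penalty and personas concentrated on different topics receive a penalty near zero, so maximizing the regularized objective pushes personas apart. Other overlap measures could play the same role.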


