Graph-Sparse LDA: A Topic Model with Structured Sparsity

10/16/2014
by Finale Doshi-Velez, et al.

Originally designed to model text, topic modeling has become a powerful tool for uncovering latent structure in domains including medicine, finance, and vision. The goals for the model vary depending on the application: in some cases, the discovered topics may be used for prediction or some other downstream task. In other cases, the content of the topic itself may be of intrinsic scientific interest. Unfortunately, even using modern sparse techniques, the discovered topics are often difficult to interpret due to the high dimensionality of the underlying space. To improve topic interpretability, we introduce Graph-Sparse LDA, a hierarchical topic model that leverages knowledge of relationships between words (e.g., as encoded by an ontology). In our model, topics are summarized by a few latent concept-words from the underlying graph that explain the observed words. Graph-Sparse LDA recovers sparse, interpretable summaries on two real-world biomedical datasets while matching state-of-the-art prediction performance.
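As a rough illustration of the idea that a few concept-words from a graph can explain many observed words, the following minimal Python sketch expands a sparse topic over concept-words into a distribution over observed words. It is not the paper's model or inference procedure: the toy ontology, the function name `induced_word_distribution`, and the rule of spreading each concept-word's weight uniformly over the words it explains are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy ontology (illustrative only): each concept-word maps to the
# observed words it can explain, namely itself and its descendants.
ONTOLOGY = {
    "cardiovascular disease": ["cardiovascular disease", "heart failure", "hypertension"],
    "heart failure": ["heart failure"],
    "hypertension": ["hypertension"],
    "diabetes": ["diabetes", "type 2 diabetes"],
    "type 2 diabetes": ["type 2 diabetes"],
}


def induced_word_distribution(concept_weights, ontology):
    """Expand a sparse topic over concept-words into a distribution over observed words.

    Assumption: each concept-word spreads its weight uniformly over the words it
    explains. This is a simplification chosen for the sketch.
    """
    word_probs = defaultdict(float)
    for concept, weight in concept_weights.items():
        explained = ontology[concept]
        for word in explained:
            word_probs[word] += weight / len(explained)
    return dict(word_probs)


# A sparse topic: only two concept-words carry probability mass,
# yet together they account for five observed words.
sparse_topic = {"cardiovascular disease": 0.75, "diabetes": 0.25}
word_distribution = induced_word_distribution(sparse_topic, ONTOLOGY)

for word, prob in sorted(word_distribution.items()):
    print(f"{word}: {prob:.3f}")
```

The point of the sketch is the summary size: the topic is described by two concept-words rather than by five separate word probabilities, which is the kind of compact, graph-based summary the abstract describes.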

