Entities as topic labels: Improving topic interpretability and evaluability combining Entity Linking and Labeled LDA

by Federico Nanni, et al.

To create a corpus exploration method whose topics are easier to interpret than those of standard LDA topic models, we propose combining two techniques: Entity Linking and Labeled LDA. Our method first identifies, in an ontology, a set of descriptive labels for each document in a corpus, and then generates a specific topic for each label. The direct relation between topics and labels makes interpretation easier, and using an ontology as background knowledge limits label ambiguity. Because our topics are described with a limited number of clear-cut labels, they promote interpretability, which may in turn help quantitative evaluation. We illustrate the potential of the approach by applying it to identify the most relevant topics addressed by each party in the European Parliament's fifth mandate (1999-2004).
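
The two-step idea in the abstract (link each document to ontology labels, then learn one topic per label) can be illustrated with a minimal sketch. This is not the authors' implementation: the entity linker is replaced by a hypothetical dictionary lookup (`TOY_ONTOLOGY`), the Labeled LDA step is a small collapsed Gibbs sampler written for illustration, and the names `link_entities` and `labeled_lda`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions.

```python
import random
from collections import defaultdict

# Hypothetical toy ontology (surface form -> entity label). A real pipeline
# would query an entity linker backed by a knowledge base instead.
TOY_ONTOLOGY = {
    "fishing": "Fishing_industry",
    "fisheries": "Fishing_industry",
    "euro": "Euro",
    "currency": "Euro",
}


def link_entities(tokens):
    """Return the sorted set of ontology labels mentioned in one document."""
    return sorted({TOY_ONTOLOGY[t] for t in tokens if t in TOY_ONTOLOGY})


def labeled_lda(docs, doc_labels, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling in which each token can only be assigned to
    one of the labels of its own document. Returns label -> word -> count."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    label_word = defaultdict(lambda: defaultdict(int))  # n(label, word)
    label_total = defaultdict(int)                      # n(label)
    doc_label = [defaultdict(int) for _ in docs]        # n(doc, label)

    def update(d, w, k, delta):
        label_word[k][w] += delta
        label_total[k] += delta
        doc_label[d][k] += delta

    # Random initialisation, restricted to each document's own labels.
    assign = []
    for d, (doc, labels) in enumerate(zip(docs, doc_labels)):
        assign.append([rng.choice(labels) for _ in doc])
        for w, k in zip(doc, assign[d]):
            update(d, w, k, +1)

    for _ in range(n_iter):
        for d, (doc, labels) in enumerate(zip(docs, doc_labels)):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)  # remove the current assignment
                weights = [
                    (doc_label[d][k] + alpha)
                    * (label_word[k][w] + beta)
                    / (label_total[k] + vocab_size * beta)
                    for k in labels
                ]
                assign[d][i] = rng.choices(labels, weights=weights)[0]
                update(d, w, assign[d][i], +1)  # record the new assignment

    return {k: {w: c for w, c in words.items() if c > 0}
            for k, words in label_word.items()}


# Toy corpus (invented for illustration); every document mentions an entity.
docs = [
    ["fishing", "boats", "quota"],
    ["euro", "currency", "bank"],
    ["fishing", "euro", "quota", "bank"],
]
doc_labels = [link_entities(doc) for doc in docs]
topics = labeled_lda(docs, doc_labels)
for label, words in sorted(topics.items()):
    print(label, sorted(words, key=words.get, reverse=True)[:3])
```

Because every topic carries the name of an ontology entity, reading a topic amounts to reading its label and its most frequent words. The sketch assumes each document receives at least one label; documents without any linked entity would need to be skipped or given a fallback label.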


