Learning Supervised Topic Models for Classification and Regression from Crowds

08/17/2018
by Filipe Rodrigues, et al.

The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, which are prone to ambiguity and noise and often involve high volumes of documents, makes learning under a single-annotator assumption unrealistic or impractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed models over state-of-the-art approaches.


research
02/19/2016

Spectral Learning for Supervised Topic Models

Supervised topic models simultaneously model the latent topic structure ...
research
10/25/2012

Nested Hierarchical Dirichlet Processes

We develop a nested hierarchical Dirichlet process (nHDP) for hierarchic...
research
12/05/2022

Federated Neural Topic Models

Over the last years, topic modeling has emerged as a powerful technique ...
research
01/06/2020

Topic Extraction of Crawled Documents Collection using Correlated Topic Model in MapReduce Framework

The tremendous increase in the amount of available research documents im...
research
01/26/2023

Neural Dynamic Focused Topic Model

Topic models and all their variants analyse text by learning meaningful ...
research
01/21/2020

Random-walk Based Generative Model for Classifying Document Networks

Document networks are found in various collections of real-world data, s...
research
06/24/2015

Efficient Learning for Undirected Topic Models

Replicated Softmax model, a well-known undirected topic model, is powerf...
